IntentRecognizer

The IntentRecognizer class uses sentence embeddings to match user utterances against registered command phrases, enabling natural language voice command recognition.

Class Definition

from moonshine_voice import IntentRecognizer, EmbeddingModelArch

recognizer = IntentRecognizer(
    model_path: str,
    model_arch: EmbeddingModelArch = EmbeddingModelArch.GEMMA_300M,
    model_variant: str = "q4",
    threshold: float = 0.7
)

Constructor Parameters

model_path (str, required)

Path to the directory containing embedding model files (gemma-300m model and tokenizer.bin)

model_arch (EmbeddingModelArch, default: EmbeddingModelArch.GEMMA_300M)

Embedding model architecture. Currently only GEMMA_300M is supported.

model_variant (str, default: "q4")

Model quantization variant:

"fp32" - Full precision (largest, most accurate)
"fp16" - Half precision
"q8" - 8-bit quantized
"q4" - 4-bit quantized (recommended, best trade-off between performance and accuracy)
"q4f16" - 4-bit weights, fp16 activations

threshold (float, default: 0.7)

Minimum similarity score (0.0-1.0) to trigger an intent. Higher values are more restrictive.

0.6 - Very permissive, more false positives
0.7 - Balanced (recommended)
0.8 - Strict, fewer false positives

Methods

register_intent

recognizer.register_intent(
    trigger_phrase: str,
    handler: Callable[[str, str, float], None]
)

trigger_phrase (str, required)

The canonical command phrase (e.g., “turn on the lights”)

handler (Callable, required)

Function called when the intent is triggered. Receives:

trigger_phrase (str) - The registered command phrase
utterance (str) - The actual user’s words
similarity (float) - Confidence score 0.0-1.0

Example:

def on_lights_on(trigger, utterance, similarity):
    print(f"Turning on lights (confidence: {similarity:.0%})")
    # Control smart home API here

recognizer.register_intent("turn on the lights", on_lights_on)
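Because the handler is only described as a callable taking three arguments, a callable object should work as well as a plain function. The sketch below uses a hypothetical IntentLog class (not part of moonshine_voice) that records each invocation so matches can be inspected later. It is called directly here to stay self-contained; in real use the recognizer would make these calls.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IntentLog:
    """Hypothetical helper: a callable object usable as an intent handler."""
    events: List[Tuple[str, str, float]] = field(default_factory=list)

    def __call__(self, trigger_phrase: str, utterance: str, similarity: float) -> None:
        # Same (trigger_phrase, utterance, similarity) order described above.
        self.events.append((trigger_phrase, utterance, similarity))

    def best(self) -> Tuple[str, str, float]:
        # Return the recorded event with the highest similarity score.
        return max(self.events, key=lambda event: event[2])


log = IntentLog()
# Simulated invocations with made-up scores.
log("turn on the lights", "switch the lights on", 0.91)
log("turn on the lights", "lights please", 0.74)
print(log.best())
```

Assuming the recognizer accepts any callable, passing log as the handler argument to register_intent would collect real matches the same way.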

unregister_intent

Remove a registered intent.

recognizer.unregister_intent(trigger_phrase: str)

trigger_phrase (str, required)

The command phrase to remove (must match exactly as registered)

process_utterance

Manually process an utterance against all registered intents.

recognizer.process_utterance(utterance: str)

utterance (str, required)

The text to match against registered intents

This checks the utterance against all registered intents and calls handlers for any matches above the threshold.
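The dispatch behavior described here can be pictured with a small stand-in. The ToyRecognizer class below is illustrative only and is not the moonshine_voice implementation: it swaps the embedding model for difflib's character-level ratio, and it assumes that a score equal to the threshold counts as a match, which the text above does not specify.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List

Handler = Callable[[str, str, float], None]


class ToyRecognizer:
    """Illustrative stand-in, NOT the moonshine_voice implementation."""

    def __init__(self, threshold: float = 0.7) -> None:
        self.threshold = threshold
        self.intents: Dict[str, Handler] = {}

    def register_intent(self, trigger_phrase: str, handler: Handler) -> None:
        self.intents[trigger_phrase] = handler

    def score(self, a: str, b: str) -> float:
        # Assumption: character-level similarity replaces the embedding model.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def process_utterance(self, utterance: str) -> None:
        # Call every handler whose phrase scores at or above the threshold.
        for phrase, handler in self.intents.items():
            similarity = self.score(phrase, utterance)
            if similarity >= self.threshold:
                handler(phrase, utterance, similarity)


calls: List[str] = []
toy = ToyRecognizer(threshold=0.7)
toy.register_intent("turn on the lights", lambda t, u, s: calls.append(t))
toy.register_intent("stop moving", lambda t, u, s: calls.append(t))
toy.process_utterance("turn on the lights")
print(calls)
```

Only the first handler fires, because the second phrase scores well below the threshold against this utterance.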

set_threshold

Change the similarity threshold.

recognizer.set_threshold(threshold: float)

threshold (float, required)

New threshold value (0.0-1.0)

get_threshold

Get the current threshold.

recognizer.get_threshold() -> float

Returns: Current threshold value

get_intent_count

Get the number of registered intents.

recognizer.get_intent_count() -> int

Returns: Number of registered intents

clear_intents

Remove all registered intents.

recognizer.clear_intents()

Usage as Event Listener

IntentRecognizer implements TranscriptEventListener, so you can attach it to a transcriber:

from moonshine_voice import MicTranscriber, IntentRecognizer

# Create transcriber
transcriber = MicTranscriber(
    model_path="/path/to/asr/models"
)

# Create intent recognizer
recognizer = IntentRecognizer(
    model_path="/path/to/embedding/models"
)

# Register commands
recognizer.register_intent(
    "turn on the lights",
    lambda t, u, s: print("Lights ON")
)

recognizer.register_intent(
    "turn off the lights",
    lambda t, u, s: print("Lights OFF")
)

# Attach to transcriber - now intents are detected automatically
transcriber.add_listener(recognizer)

transcriber.start()
# Speak: "Please turn on the lights" -> Triggers "Lights ON"

Example: Smart Home Control

from moonshine_voice import MicTranscriber, IntentRecognizer

class SmartHome:
    def __init__(self):
        self.lights_on = False
    
    def lights_on_handler(self, trigger, utterance, similarity):
        self.lights_on = True
        print(f"✓ Lights ON ({similarity:.0%} match)")
    
    def lights_off_handler(self, trigger, utterance, similarity):
        self.lights_on = False
        print(f"✓ Lights OFF ({similarity:.0%} match)")
    
    def temperature_handler(self, trigger, utterance, similarity):
        # This is a simplified example
        # In practice, you'd parse the number from the utterance
        print(f"Setting temperature ({similarity:.0%} match)")

# Set up smart home
home = SmartHome()

# Create recognizer
recognizer = IntentRecognizer(
    model_path="/path/to/embedding/models",
    threshold=0.65  # More permissive for casual speech
)

# Register intents
recognizer.register_intent("turn on the lights", home.lights_on_handler)
recognizer.register_intent("turn off the lights", home.lights_off_handler)
recognizer.register_intent("set temperature to 72", home.temperature_handler)

# Attach to transcriber
transcriber = MicTranscriber(model_path="/path/to/asr/models")
transcriber.add_listener(recognizer)

print("Smart home voice control ready")
transcriber.start()

Now you can say:

“Please turn on the lights” → Matches “turn on the lights”
“Switch off the lights” → Matches “turn off the lights”
“Make it 72 degrees” → Matches “set temperature to 72”

Example: Dynamic Commands

from moonshine_voice import IntentRecognizer

recognizer = IntentRecognizer(model_path="/path/to/models")

# Robot movement commands
commands = [
    "move forward",
    "move backward",
    "turn left",
    "turn right",
    "stop moving"
]

def robot_handler(trigger, utterance, similarity):
    print(f"Command: {trigger}")
    print(f"User said: {utterance}")
    print(f"Confidence: {similarity:.0%}")
    # Send command to robot here

# Register all commands with same handler
for command in commands:
    recognizer.register_intent(command, robot_handler)

print(f"Registered {recognizer.get_intent_count()} commands")

# Test
recognizer.process_utterance("go forward")  # Matches "move forward"
recognizer.process_utterance("go backwards")  # Matches "move backward"

Semantic Matching

Unlike exact keyword matching, the intent recognizer understands semantic similarity:

recognizer.register_intent("turn on the lights", handler)

# These all match (scores are illustrative):
recognizer.process_utterance("turn on the lights")     # 100% match
recognizer.process_utterance("switch on the lights")   # ~95% match
recognizer.process_utterance("lights on please")       # ~90% match
recognizer.process_utterance("enable the lights")      # ~85% match
recognizer.process_utterance("illuminate the room")    # ~75% match

# These don't match (below threshold):
recognizer.process_utterance("turn off the lights")    # Opposite meaning
recognizer.process_utterance("what time is it")        # Unrelated
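To make the idea of a similarity score concrete, here is a minimal sketch that assumes the score is a cosine similarity between embedding vectors. That is a common choice for sentence embeddings, but this page does not state which metric the recognizer uses. The three-dimensional vectors are made up for illustration; real embedding models produce far larger vectors.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for real sentence embeddings.
registered = [0.9, 0.1, 0.2]   # stands in for "turn on the lights"
paraphrase = [0.8, 0.2, 0.3]   # stands in for "switch on the lights"
unrelated = [0.1, 0.9, 0.1]    # stands in for "what time is it"

threshold = 0.7
for name, vector in [("paraphrase", paraphrase), ("unrelated", unrelated)]:
    score = cosine_similarity(registered, vector)
    print(f"{name}: {score:.2f} match={score >= threshold}")
```

Vectors pointing in nearly the same direction score close to 1.0, while vectors pointing elsewhere fall below the threshold.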

Threshold Tuning

How to choose the right threshold

Threshold 0.6: Very permissive

More false positives
Good for casual, varied language
Use when: Users speak naturally, informally

Threshold 0.7: Balanced (recommended)

Good accuracy with flexibility
Handles most variations
Use when: General voice command systems

Threshold 0.8: Strict

Fewer false positives
Requires closer matches
Use when: Safety-critical commands, confirmation required
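One way to choose between these settings is to score a handful of labelled utterances and count the errors at each candidate threshold. The sketch below assumes the similarity scores have already been collected (for example, from handler callbacks); the sample data is hypothetical, not measured from any model.

```python
from typing import List, Tuple

# Hypothetical labelled samples: (similarity score, should_trigger).
samples: List[Tuple[float, bool]] = [
    (0.92, True), (0.81, True), (0.74, True), (0.66, True),
    (0.71, False), (0.63, False), (0.55, False), (0.40, False),
]


def count_errors(threshold: float, data: List[Tuple[float, bool]]) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) at the given threshold."""
    false_pos = sum(1 for score, wanted in data if score >= threshold and not wanted)
    false_neg = sum(1 for score, wanted in data if score < threshold and wanted)
    return false_pos, false_neg


for threshold in (0.6, 0.7, 0.8):
    fp, fn = count_errors(threshold, samples)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this made-up data, raising the threshold trades false positives for missed commands, which is the trade-off described above.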

Command-Line Usage

python -m moonshine_voice.intent_recognizer \
  --intents "turn left, turn right, stop, go forward"

Command-line options

--intents - Comma-separated list of command phrases
--embedding-model - Path to embedding model
--quantization - Model variant (fp32, fp16, q8, q4, q4f16)
--threshold - Similarity threshold (0.0-1.0)
--language - ASR language for transcription
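A script of your own could accept the same kind of comma-separated list. The parser below is a hypothetical sketch built with argparse, not the module's actual command-line code. It mirrors three of the option names listed above, and its defaults are assumptions borrowed from the constructor defaults.

```python
import argparse
from typing import List


def parse_intents(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty phrases."""
    return [phrase.strip() for phrase in raw.split(",") if phrase.strip()]


# Hypothetical parser; defaults are assumed, not taken from the real CLI.
parser = argparse.ArgumentParser(description="Intent CLI sketch")
parser.add_argument("--intents", type=parse_intents, default=[])
parser.add_argument("--threshold", type=float, default=0.7)
parser.add_argument("--quantization", default="q4",
                    choices=["fp32", "fp16", "q8", "q4", "q4f16"])

args = parser.parse_args(
    ["--intents", "turn left, turn right, stop, go forward", "--threshold", "0.8"]
)
print(args.intents, args.threshold, args.quantization)
```

Stripping whitespace around each phrase matters because the example invocation separates phrases with a comma followed by a space.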

Performance

Model Variant	Size	Latency	Accuracy
fp32	1.2 GB	50ms	Best
fp16	600 MB	40ms	Excellent
q8	300 MB	30ms	Very good
q4 (recommended)	150 MB	25ms	Good
q4f16	200 MB	28ms	Very good

Use the q4 variant for the best balance of speed, size, and accuracy.
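When memory is the main constraint, the sizes in the table can drive the choice of variant. The helper below is a hypothetical sketch: it treats 1.2 GB as roughly 1200 MB and picks the largest variant that fits a budget, on the assumption that a larger variant is at least as accurate, which is how the table reads.

```python
from typing import Optional

# Approximate sizes in MB, copied from the table above.
VARIANT_SIZE_MB = {"fp32": 1200, "fp16": 600, "q8": 300, "q4f16": 200, "q4": 150}


def largest_variant_within(budget_mb: int) -> Optional[str]:
    """Return the largest variant that fits the budget, or None if none fit."""
    fitting = {name: size for name, size in VARIANT_SIZE_MB.items() if size <= budget_mb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


print(largest_variant_within(256))
print(largest_variant_within(100))
```

The returned name could then be passed as model_variant when constructing the recognizer.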

Limitations

The current implementation matches full phrases, not individual words or slots. A phrase such as “set temperature to 72” is matched as a whole, so the handler must parse the number out of the utterance itself.

Future versions will support “slot filling” to extract parameters like numbers, times, and names from utterances.
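Until slot filling is available, a handler can pull the value out of the utterance it receives. The sketch below uses a regular expression to find the first number written in digits; extract_number is a hypothetical helper, and it does not handle spelled-out numbers such as “seventy two”, which a transcriber may produce.

```python
import re
from typing import Optional


def extract_number(utterance: str) -> Optional[float]:
    """Return the first number written in digits, or None if there is none."""
    match = re.search(r"-?\d+(?:\.\d+)?", utterance)
    return float(match.group()) if match else None


def temperature_handler(trigger: str, utterance: str, similarity: float) -> None:
    # Parse the value from what the user actually said, not from the trigger.
    value = extract_number(utterance)
    if value is None:
        print("Heard a temperature command but no number")
    else:
        print(f"Setting temperature to {value:g}")


# Direct call with a made-up score; a recognizer would normally invoke this.
temperature_handler("set temperature to 72", "make it 68.5 degrees", 0.82)
```

Reading the number from the utterance rather than the trigger phrase lets one registered intent cover any temperature the user asks for.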
