Constants

Version

MOONSHINE_HEADER_VERSION
int32_t
default:"20000"
Header file version. Pass this to moonshine_load_transcriber_from_files() to ensure compatibility.
Format: MAJOR * 10000 + MINOR * 100 + PATCH
  • Version 2.0.0 = 20000
  • Version 2.3.7 = 20307
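The encoding above can be packed and unpacked with plain integer arithmetic. The following is a minimal sketch; version_parts, version_encode, and version_decode are hypothetical helper names and are not part of the Moonshine API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the Moonshine API. */
struct version_parts {
  int32_t major;
  int32_t minor;
  int32_t patch;
};

/* Pack MAJOR.MINOR.PATCH as MAJOR * 10000 + MINOR * 100 + PATCH. */
int32_t version_encode(int32_t major, int32_t minor, int32_t patch) {
  /* MINOR and PATCH each occupy two decimal digits in this format. */
  assert(major >= 0 && minor >= 0 && minor < 100 && patch >= 0 && patch < 100);
  return major * 10000 + minor * 100 + patch;
}

/* Split an encoded version back into its three components. */
struct version_parts version_decode(int32_t encoded) {
  struct version_parts parts;
  assert(encoded >= 0);
  parts.major = encoded / 10000;
  parts.minor = (encoded / 100) % 100;
  parts.patch = encoded % 100;
  return parts;
}
```

Decoding is mostly useful for logging, for example to print both MOONSHINE_HEADER_VERSION and the value returned by moonshine_get_version() in dotted form.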

Model Architectures

MOONSHINE_MODEL_ARCH_TINY
uint32_t
default:"0"
Tiny model (26M parameters, 12.66% WER for English)
MOONSHINE_MODEL_ARCH_BASE
uint32_t
default:"1"
Base model (58M parameters, 10.07% WER for English)
MOONSHINE_MODEL_ARCH_TINY_STREAMING
uint32_t
default:"2"
Tiny streaming model (34M parameters, 12.00% WER for English)
MOONSHINE_MODEL_ARCH_BASE_STREAMING
uint32_t
default:"3"
Base streaming model (58M parameters)
MOONSHINE_MODEL_ARCH_SMALL_STREAMING
uint32_t
default:"4"
Small streaming model (123M parameters, 7.84% WER for English)
MOONSHINE_MODEL_ARCH_MEDIUM_STREAMING
uint32_t
default:"5"
Medium streaming model (245M parameters, 6.65% WER for English)

Error Codes

MOONSHINE_ERROR_NONE
int32_t
default:"0"
Operation completed successfully
MOONSHINE_ERROR_UNKNOWN
int32_t
default:"-1"
Unknown error occurred
MOONSHINE_ERROR_INVALID_HANDLE
int32_t
default:"-2"
Invalid transcriber or stream handle
MOONSHINE_ERROR_INVALID_ARGUMENT
int32_t
default:"-3"
Invalid function argument

Flags

MOONSHINE_FLAG_FORCE_UPDATE
uint32_t
default:"1 << 0"
Force stream analysis even if less than 200ms of new audio has been added

Intent Recognition Model Architectures

MOONSHINE_EMBEDDING_MODEL_ARCH_GEMMA_300M
uint32_t
default:"0"
Gemma 300M embedding model for intent recognition

Data Structures

transcriber_option_t

struct transcriber_option_t {
  const char *name;
  const char *value;
};
Advanced configuration options for transcriber creation.
name
const char*
Option name
value
const char*
Option value as a string

transcript_line_t

struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};
Represents a single segment of speech (phrase or sentence).
text
const char*
UTF-8 encoded transcription text
audio_data
const float*
Raw audio data for this segment (16kHz float PCM, -1.0 to 1.0)
audio_data_count
size_t
Number of audio samples in audio_data
start_time
float
Time offset from start of stream in seconds
duration
float
Duration of the segment in seconds
id
uint64_t
Stable 64-bit identifier for this line (remains constant across updates)
is_complete
int8_t
Streaming only: 1 if speaker has finished this segment, 0 if still speaking
is_updated
int8_t
Streaming only: 1 if line changed since last moonshine_transcribe_stream() call
is_new
int8_t
Streaming only: 1 if line was newly added since last call
has_text_changed
int8_t
Streaming only: 1 if text changed since last call
has_speaker_id
int8_t
1 if speaker_id has been calculated
speaker_id
uint64_t
Randomly-generated 64-bit identifier for the speaker (for diarization)
speaker_index
uint32_t
Order in which this speaker appeared in the transcript (0 = first speaker)
last_transcription_latency_ms
uint32_t
Streaming only: Latency of last transcription in milliseconds
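The timing fields can be combined to place a line on the stream timeline. The sketch below redeclares the struct from this reference so that it compiles on its own (real code would include the Moonshine header instead). line_end_time and line_audio_seconds are hypothetical helpers, and the 16 kHz constant is taken from the audio_data description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Redeclared from the reference above so the sketch is self-contained. */
struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

/* Sample rate stated in the audio_data field description. */
#define LINE_AUDIO_SAMPLE_RATE_HZ 16000

/* End of the segment, in seconds from the start of the stream. */
float line_end_time(const struct transcript_line_t *line) {
  assert(line != NULL);
  return line->start_time + line->duration;
}

/* Length of the attached audio in seconds, derived from the sample count. */
float line_audio_seconds(const struct transcript_line_t *line) {
  assert(line != NULL);
  return (float)line->audio_data_count / (float)LINE_AUDIO_SAMPLE_RATE_HZ;
}
```

Whether line_audio_seconds() always equals duration is not stated in this reference, so treat the two values as independent until verified against the library.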

transcript_t

struct transcript_t {
  struct transcript_line_t *lines;
  uint64_t line_count;
};
Complete transcription of an audio stream or file.
lines
transcript_line_t*
Array of transcript lines in chronological order
line_count
uint64_t
Number of lines in the transcript

moonshine_intent_callback

typedef void (*moonshine_intent_callback)(void *user_data,
                                         const char *trigger_phrase,
                                         const char *utterance,
                                         float similarity);
Callback function for intent recognition.
user_data
void*
User data pointer passed to moonshine_register_intent()
trigger_phrase
const char*
The registered trigger phrase that matched
utterance
const char*
The actual utterance that was recognized
similarity
float
Similarity score between 0 and 1
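A callback typically uses user_data to reach application state. The following sketch records the most recent match; intent_log and record_intent are hypothetical names, and the typedef is redeclared from this reference so that the example compiles on its own. The trigger phrase is copied because the reference does not say how long the string arguments remain valid after the callback returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Redeclared from the reference above so the sketch is self-contained. */
typedef void (*moonshine_intent_callback)(void *user_data,
                                          const char *trigger_phrase,
                                          const char *utterance,
                                          float similarity);

/* Hypothetical application state reached through user_data. */
struct intent_log {
  char last_trigger[128];
  float last_similarity;
  int match_count;
};

/* Matches the moonshine_intent_callback signature. */
void record_intent(void *user_data, const char *trigger_phrase,
                   const char *utterance, float similarity) {
  struct intent_log *log = (struct intent_log *)user_data;
  (void)utterance; /* not needed in this sketch */
  assert(log != NULL); /* this callback must be registered with a log */

  /* snprintf truncates safely and always NUL-terminates. */
  snprintf(log->last_trigger, sizeof(log->last_trigger), "%s",
           trigger_phrase != NULL ? trigger_phrase : "");
  log->last_similarity = similarity;
  log->match_count++;
}
```

To use it, pass a pointer to an intent_log as the user_data argument of moonshine_register_intent() together with record_intent as the callback.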

Transcriber Functions

moonshine_get_version

int32_t moonshine_get_version(void);
Returns the loaded Moonshine library version.
return
int32_t
Library version in format MAJOR * 10000 + MINOR * 100 + PATCH
This may differ from MOONSHINE_HEADER_VERSION if a newer shared library is loaded.

moonshine_error_to_string

const char *moonshine_error_to_string(int32_t error);
Converts error code to human-readable string.
error
int32_t
Error code from an API call
return
const char*
Human-readable error description

moonshine_transcript_to_string

const char *moonshine_transcript_to_string(const struct transcript_t *transcript);
Converts transcript to human-readable string for debugging.
transcript
const transcript_t*
Transcript to convert
return
const char*
String representation (valid until next call to this function)

moonshine_load_transcriber_from_files

int32_t moonshine_load_transcriber_from_files(
    const char *path,
    uint32_t model_arch,
    const struct transcriber_option_t *options,
    uint64_t options_count,
    int32_t moonshine_version);
Loads transcriber models from the filesystem.
path
const char*
required
Directory containing model files:
  • encoder_model.ort
  • decoder_model_merged.ort
  • tokenizer.bin
model_arch
uint32_t
required
Model architecture (e.g., MOONSHINE_MODEL_ARCH_BASE_STREAMING)
options
const transcriber_option_t*
Array of custom options (can be NULL)
options_count
uint64_t
Number of options in the array
moonshine_version
int32_t
required
Should be MOONSHINE_HEADER_VERSION for compatibility
return
int32_t
Non-negative transcriber handle on success, negative error code on failure
Example:
int32_t transcriber_handle = moonshine_load_transcriber_from_files(
  "path/to/models", MOONSHINE_MODEL_ARCH_BASE, NULL, 0,
  MOONSHINE_HEADER_VERSION);
if (transcriber_handle < 0) {
  fprintf(stderr, "Error: %s\n", moonshine_error_to_string(transcriber_handle));
}

moonshine_load_transcriber_from_memory

int32_t moonshine_load_transcriber_from_memory(
    const uint8_t *encoder_model_data,
    size_t encoder_model_data_size,
    const uint8_t *decoder_model_data,
    size_t decoder_model_data_size,
    const uint8_t *tokenizer_data,
    size_t tokenizer_data_size,
    uint32_t model_arch,
    const struct transcriber_option_t *options,
    uint64_t options_count,
    int32_t moonshine_version);
Loads transcriber models from memory buffers.
encoder_model_data
const uint8_t*
required
Binary data for encoder model
encoder_model_data_size
size_t
required
Size of encoder model data in bytes
decoder_model_data
const uint8_t*
required
Binary data for decoder model
decoder_model_data_size
size_t
required
Size of decoder model data in bytes
tokenizer_data
const uint8_t*
required
Binary data for tokenizer
tokenizer_data_size
size_t
required
Size of tokenizer data in bytes
model_arch
uint32_t
required
Model architecture constant
options
const transcriber_option_t*
Array of custom options (can be NULL)
options_count
uint64_t
Number of options
moonshine_version
int32_t
required
Should be MOONSHINE_HEADER_VERSION
return
int32_t
Non-negative transcriber handle on success, negative error code on failure

moonshine_free_transcriber

void moonshine_free_transcriber(int32_t transcriber_handle);
Releases all resources used by the transcriber.
transcriber_handle
int32_t
required
Handle returned by moonshine_load_transcriber_from_files() or moonshine_load_transcriber_from_memory()
After freeing, the library may reuse the handle value for a later transcriber, so discard every stored copy of it.

moonshine_transcribe_without_streaming

int32_t moonshine_transcribe_without_streaming(
    int32_t transcriber_handle,
    float *audio_data,
    uint64_t audio_length,
    int32_t sample_rate,
    uint32_t flags,
    struct transcript_t **out_transcript);
Transcribes complete audio array (for files or recordings).
transcriber_handle
int32_t
required
Transcriber handle
audio_data
float*
required
PCM audio data array (values between -1.0 and 1.0)
audio_length
uint64_t
required
Number of samples in audio_data
sample_rate
int32_t
required
Sample rate in Hz (16000 recommended)
flags
uint32_t
Reserved for future use (pass 0)
out_transcript
transcript_t**
required
Pointer to receive transcript result
return
int32_t
MOONSHINE_ERROR_NONE (0) on success, error code on failure
The transcript data is owned by the transcriber and remains valid until the next API call on the same transcriber or until the transcriber is freed.
Example:
struct transcript_t *transcript = NULL;
int32_t error = moonshine_transcribe_without_streaming(
  transcriber_handle, audio_data, audio_length, 16000, 0, &transcript);
if (error == MOONSHINE_ERROR_NONE) {
  for (uint64_t i = 0; i < transcript->line_count; i++) {
    printf("%s\n", transcript->lines[i].text);
  }
} else {
  fprintf(stderr, "Error: %s\n", moonshine_error_to_string(error));
}
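The audio_data parameter expects float samples between -1.0 and 1.0, while many capture APIs and WAV files deliver 16-bit signed integers. A minimal conversion sketch follows; pcm16_to_float is a hypothetical helper and is not part of the Moonshine API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert 16-bit signed PCM to float samples in the range [-1.0, 1.0). */
void pcm16_to_float(const int16_t *in, float *out, size_t count) {
  assert(count == 0 || (in != NULL && out != NULL));
  for (size_t i = 0; i < count; i++) {
    /* Dividing by 32768 maps INT16_MIN to exactly -1.0. */
    out[i] = (float)in[i] / 32768.0f;
  }
}
```

The converted buffer can then be passed as audio_data, with count as audio_length.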

Streaming Functions

moonshine_create_stream

int32_t moonshine_create_stream(int32_t transcriber_handle, uint32_t flags);
Creates a new audio stream for real-time transcription.
transcriber_handle
int32_t
required
Transcriber handle
flags
uint32_t
Reserved (pass 0)
return
int32_t
Non-negative stream handle on success, negative error code on failure

moonshine_free_stream

int32_t moonshine_free_stream(int32_t transcriber_handle, int32_t stream_handle);
Releases stream resources.
transcriber_handle
int32_t
required
Transcriber handle
stream_handle
int32_t
required
Stream handle to free
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure

moonshine_start_stream

int32_t moonshine_start_stream(int32_t transcriber_handle, int32_t stream_handle);
Starts a new transcription session on the stream.
transcriber_handle
int32_t
required
Transcriber handle
stream_handle
int32_t
required
Stream handle
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure
Clears any previous transcript data. Call this after audio input discontinuities (e.g., when user unmutes).

moonshine_stop_stream

int32_t moonshine_stop_stream(int32_t transcriber_handle, int32_t stream_handle);
Stops the current transcription session.
transcriber_handle
int32_t
required
Transcriber handle
stream_handle
int32_t
required
Stream handle
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure
Any active lines will be marked as complete. Call moonshine_transcribe_stream() after stopping to get final results.

moonshine_transcribe_add_audio_to_stream

int32_t moonshine_transcribe_add_audio_to_stream(
    int32_t transcriber_handle,
    int32_t stream_handle,
    const float *new_audio_data,
    uint64_t audio_length,
    int32_t sample_rate,
    uint32_t flags);
Adds audio data to the stream buffer.
transcriber_handle
int32_t
required
Transcriber handle
stream_handle
int32_t
required
Stream handle
new_audio_data
const float*
required
PCM audio samples (values between -1.0 and 1.0)
audio_length
uint64_t
required
Number of samples
sample_rate
int32_t
required
Sample rate in Hz
flags
uint32_t
Reserved (pass 0)
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure
This function only buffers audio and does not perform transcription. Call moonshine_transcribe_stream() to get results. Safe to call from time-critical threads.

moonshine_transcribe_stream

int32_t moonshine_transcribe_stream(
    int32_t transcriber_handle,
    int32_t stream_handle,
    uint32_t flags,
    struct transcript_t **out_transcript);
Analyzes buffered audio and returns updated transcript.
transcriber_handle
int32_t
required
Transcriber handle
stream_handle
int32_t
required
Stream handle
flags
uint32_t
Bitwise OR of flags:
  • MOONSHINE_FLAG_FORCE_UPDATE: Force analysis even if < 200ms of new audio
out_transcript
transcript_t**
required
Pointer to receive updated transcript
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure
By default, full analysis only occurs if 200ms+ of new audio has been added. Use MOONSHINE_FLAG_FORCE_UPDATE to override this throttling.
Example:
while (audio_available) {
  // Buffer the new chunk; this call does not run transcription.
  moonshine_transcribe_add_audio_to_stream(
    transcriber, stream, audio_chunk, chunk_size, 16000, 0);

  struct transcript_t *transcript = NULL;
  int32_t error = moonshine_transcribe_stream(transcriber, stream, 0, &transcript);
  if (error != MOONSHINE_ERROR_NONE) {
    fprintf(stderr, "Error: %s\n", moonshine_error_to_string(error));
    break;
  }

  // Check for new or updated lines
  for (uint64_t i = 0; i < transcript->line_count; i++) {
    if (transcript->lines[i].is_new || transcript->lines[i].has_text_changed) {
      printf("Updated: %s\n", transcript->lines[i].text);
    }
  }
}
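The 200ms throttle translates into a sample count that depends on the sample rate. The sketch below shows one way a caller might decide when to pass MOONSHINE_FLAG_FORCE_UPDATE. It assumes the application counts the samples it has added since its last moonshine_transcribe_stream() call; the library's own counter is not exposed in this reference. samples_for_ms and stream_flags are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the Flags section; guarded in case the real header
   is already included. */
#ifndef MOONSHINE_FLAG_FORCE_UPDATE
#define MOONSHINE_FLAG_FORCE_UPDATE (1u << 0)
#endif

/* Throttling window described above, in milliseconds. */
#define UPDATE_WINDOW_MS 200u

/* Number of samples that make up `ms` milliseconds at `sample_rate` Hz. */
uint64_t samples_for_ms(int32_t sample_rate, uint32_t ms) {
  assert(sample_rate > 0);
  return ((uint64_t)sample_rate * ms) / 1000u;
}

/* Choose flags for moonshine_transcribe_stream(): force an update only when
   the caller needs a result now (for example at end of input) and fewer
   samples than the throttling window have been added. */
uint32_t stream_flags(uint64_t samples_since_update, int32_t sample_rate,
                      int need_result_now) {
  uint64_t window = samples_for_ms(sample_rate, UPDATE_WINDOW_MS);
  if (need_result_now && samples_since_update < window) {
    return MOONSHINE_FLAG_FORCE_UPDATE;
  }
  return 0;
}
```

At 16000 Hz the window is 3200 samples, so a caller feeding 10ms chunks (160 samples) would cross it after 20 chunks.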

Intent Recognition Functions

moonshine_create_intent_recognizer

int32_t moonshine_create_intent_recognizer(
    const char *model_path,
    uint32_t model_arch,
    const char *model_variant,
    float threshold);
Creates an intent recognizer for voice command matching.
model_path
const char*
required
Path to directory containing embedding model files
model_arch
uint32_t
required
Model architecture (currently only MOONSHINE_EMBEDDING_MODEL_ARCH_GEMMA_300M supported)
model_variant
const char*
Model quantization: "fp32", "fp16", "q8", "q4", "q4f16" (NULL defaults to "q4")
threshold
float
Minimum similarity score (0.0-1.0) to trigger intent (default 0.7)
return
int32_t
Non-negative recognizer handle on success, negative error code on failure

moonshine_free_intent_recognizer

void moonshine_free_intent_recognizer(int32_t intent_recognizer_handle);
Frees intent recognizer resources.
intent_recognizer_handle
int32_t
required
Intent recognizer handle

moonshine_register_intent

int32_t moonshine_register_intent(
    int32_t intent_recognizer_handle,
    const char *trigger_phrase,
    moonshine_intent_callback callback,
    void *user_data);
Registers an intent with a trigger phrase and callback.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
trigger_phrase
const char*
required
Phrase to match (e.g., "Turn on the lights")
callback
moonshine_intent_callback
required
Function to call when intent is triggered
user_data
void*
User data passed to callback (can be NULL)
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure
Example:
void on_lights_on(void *user_data, const char *trigger, 
                  const char *utterance, float similarity) {
  printf("Turning on lights (%.0f%% match)\n", similarity * 100);
  // Actually turn on lights...
}

moonshine_register_intent(recognizer, "Turn on the lights", 
                         on_lights_on, NULL);

moonshine_unregister_intent

int32_t moonshine_unregister_intent(
    int32_t intent_recognizer_handle,
    const char *trigger_phrase);
Removes a registered intent.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
trigger_phrase
const char*
required
Trigger phrase to remove
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure

moonshine_process_utterance

int32_t moonshine_process_utterance(
    int32_t intent_recognizer_handle,
    const char *utterance);
Processes an utterance and invokes matching intent callback.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
utterance
const char*
required
Utterance to match against registered intents
return
int32_t
1 if intent recognized, 0 if not, negative error code on failure

moonshine_set_intent_threshold

int32_t moonshine_set_intent_threshold(
    int32_t intent_recognizer_handle,
    float threshold);
Sets the similarity threshold for intent matching.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
threshold
float
required
New threshold (0.0-1.0)
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure

moonshine_get_intent_threshold

float moonshine_get_intent_threshold(int32_t intent_recognizer_handle);
Gets the current similarity threshold.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
return
float
Current threshold (>= 0) on success, negative error code on failure

moonshine_get_intent_count

int32_t moonshine_get_intent_count(int32_t intent_recognizer_handle);
Gets the number of registered intents.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
return
int32_t
Number of intents (>= 0) on success, negative error code on failure

moonshine_clear_intents

int32_t moonshine_clear_intents(int32_t intent_recognizer_handle);
Removes all registered intents.
intent_recognizer_handle
int32_t
required
Intent recognizer handle
return
int32_t
MOONSHINE_ERROR_NONE on success, error code on failure

Streaming Guarantees

When using streaming transcription, the library provides these guarantees:
  1. Lines are never removed - only added
  2. Only the last line may be incomplete - all others are finalized
  3. Line IDs are stable - use them to track lines across updates
  4. Empty strings indicate detected speech with no transcription
  5. Line indexes are stable references - remember line_count to process only new lines
  6. Speaker IDs are set when confident - or when line completes
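Guarantees 1, 2, and 5 make it possible to handle each finalized line exactly once with a single index. The following sketch keeps a cursor per stream; line_cursor and advance_past_final_lines are hypothetical names, and the structs are redeclared from this reference so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Redeclared from the reference above so the sketch is self-contained. */
struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

struct transcript_t {
  struct transcript_line_t *lines;
  uint64_t line_count;
};

/* Hypothetical per-stream cursor kept by the application. */
struct line_cursor {
  uint64_t next_index; /* first line not yet handled as final */
};

/* Advance the cursor past every completed line and return how many lines
   became final since the previous call. The newly final lines occupy
   indexes [old next_index, new next_index). */
uint64_t advance_past_final_lines(const struct transcript_t *transcript,
                                  struct line_cursor *cursor) {
  uint64_t newly_final = 0;
  assert(transcript != NULL && cursor != NULL);
  while (cursor->next_index < transcript->line_count &&
         transcript->lines[cursor->next_index].is_complete) {
    cursor->next_index++;
    newly_final++;
  }
  return newly_final;
}
```

Because moonshine_start_stream() clears previous transcript data, the cursor should be reset to 0 whenever a new session starts on the stream.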

Best Practices

Memory Management

Always copy transcript data before:
  • Making another API call on the same transcriber
  • Freeing the transcriber
The library owns all returned transcript memory.
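One way to follow this rule is to snapshot the text of every line into memory the application owns. The sketch below copies only the text field; audio_data would need the same treatment if it is used later. owned_transcript, owned_transcript_copy, and owned_transcript_free are hypothetical names, and the structs are redeclared from this reference so that the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Redeclared from the reference above so the sketch is self-contained. */
struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

struct transcript_t {
  struct transcript_line_t *lines;
  uint64_t line_count;
};

/* Hypothetical application-owned snapshot of the transcript text. */
struct owned_transcript {
  char **texts;
  uint64_t count;
};

/* Release a snapshot created by owned_transcript_copy(). */
void owned_transcript_free(struct owned_transcript *copy) {
  if (copy == NULL) {
    return;
  }
  for (uint64_t i = 0; i < copy->count; i++) {
    free(copy->texts[i]);
  }
  free(copy->texts);
  copy->texts = NULL;
  copy->count = 0;
}

/* Copy each line's text into heap memory owned by the caller.
   Returns 0 on success, -1 on allocation failure (partial copy released). */
int owned_transcript_copy(const struct transcript_t *src,
                          struct owned_transcript *dst) {
  assert(src != NULL && dst != NULL);
  dst->texts = NULL;
  dst->count = 0;
  if (src->line_count == 0) {
    return 0;
  }
  dst->texts = calloc((size_t)src->line_count, sizeof(char *));
  if (dst->texts == NULL) {
    return -1;
  }
  for (uint64_t i = 0; i < src->line_count; i++) {
    const char *text = src->lines[i].text != NULL ? src->lines[i].text : "";
    size_t len = strlen(text) + 1; /* include the terminating NUL */
    dst->texts[i] = malloc(len);
    if (dst->texts[i] == NULL) {
      owned_transcript_free(dst);
      return -1;
    }
    memcpy(dst->texts[i], text, len);
    dst->count = i + 1;
  }
  return 0;
}
```

Call owned_transcript_copy() immediately after a successful transcription call, before any other call on the same transcriber, and release the snapshot with owned_transcript_free() when it is no longer needed.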

Performance

  • Use 16kHz audio to avoid resampling overhead
  • For streaming, call moonshine_transcribe_stream() at intervals matching your latency requirements (e.g., every 500ms)
  • Add audio in whatever chunk sizes your audio source provides - the library handles buffering
  • Use multiple streams on one transcriber to share model resources across audio sources

Error Handling

int32_t result = moonshine_create_stream(transcriber, 0);
if (result < 0) {
  fprintf(stderr, "Stream creation failed: %s\n", 
          moonshine_error_to_string(result));
  return;
}
int32_t stream_handle = result;

See Also

C API Overview

High-level concepts and architecture

Python API

Higher-level Python bindings (recommended)
