C API Functions Reference - Moonshine Voice

Constants

Version

MOONSHINE_HEADER_VERSION

int32_t

default:"20000"

Header file version. Pass this to moonshine_load_transcriber_from_files() to ensure compatibility. Format: MAJOR * 10000 + MINOR * 100 + PATCH

Version 2.0.0 = 20000
Version 2.3.7 = 20307
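
The packing formula can be checked with a small sketch. The names example_version, example_encode_version and example_decode_version are hypothetical helpers written for this illustration and are not part of the Moonshine API. The decoding assumes MINOR and PATCH each stay below 100, which the formula implies but this page does not state outright.

```c
#include <stdint.h>

/* Hypothetical helper type for this example; not part of the Moonshine API. */
struct example_version {
  int32_t major;
  int32_t minor;
  int32_t patch;
};

/* Packs a version as MAJOR * 10000 + MINOR * 100 + PATCH.
   Assumes minor and patch are both in the range 0..99. */
int32_t example_encode_version(int32_t major, int32_t minor, int32_t patch) {
  return major * 10000 + minor * 100 + patch;
}

/* Splits a packed version number back into its three components. */
struct example_version example_decode_version(int32_t packed) {
  struct example_version v;
  v.major = packed / 10000;
  v.minor = (packed / 100) % 100;
  v.patch = packed % 100;
  return v;
}
```

Decoding the value returned by moonshine_get_version() this way gives a readable version for log messages.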

Model Architectures

MOONSHINE_MODEL_ARCH_TINY

uint32_t

default:"0"

Tiny model (26M parameters, 12.66% WER for English)

MOONSHINE_MODEL_ARCH_BASE

uint32_t

default:"1"

Base model (58M parameters, 10.07% WER for English)

MOONSHINE_MODEL_ARCH_TINY_STREAMING

uint32_t

default:"2"

Tiny streaming model (34M parameters, 12.00% WER for English)

MOONSHINE_MODEL_ARCH_BASE_STREAMING

uint32_t

default:"3"

Base streaming model (58M parameters)

MOONSHINE_MODEL_ARCH_SMALL_STREAMING

uint32_t

default:"4"

Small streaming model (123M parameters, 7.84% WER for English)

MOONSHINE_MODEL_ARCH_MEDIUM_STREAMING

uint32_t

default:"5"

Medium streaming model (245M parameters, 6.65% WER for English)

Error Codes

MOONSHINE_ERROR_NONE

int32_t

default:"0"

Operation completed successfully

MOONSHINE_ERROR_UNKNOWN

int32_t

default:"-1"

Unknown error occurred

MOONSHINE_ERROR_INVALID_HANDLE

int32_t

default:"-2"

Invalid transcriber or stream handle

MOONSHINE_ERROR_INVALID_ARGUMENT

int32_t

default:"-3"

Invalid function argument

Flags

MOONSHINE_FLAG_FORCE_UPDATE

uint32_t

default:"1 << 0"

Force stream analysis even if less than 200ms of new audio has been added

Intent Recognition Model Architectures

MOONSHINE_EMBEDDING_MODEL_ARCH_GEMMA_300M

uint32_t

default:"0"

Gemma 300M embedding model for intent recognition

Data Structures

transcriber_option_t

struct transcriber_option_t {
  const char *name;
  const char *value;
};

Advanced configuration options for transcriber creation.

name

const char*

Option name

value

const char*

Option value as a string

transcript_line_t

struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

Represents a single segment of speech (phrase or sentence).

text

const char*

UTF-8 encoded transcription text

audio_data

const float*

Raw audio data for this segment (16kHz float PCM, -1.0 to 1.0)

audio_data_count

size_t

Number of audio samples in audio_data

start_time

float

Time offset from start of stream in seconds

duration

float

Duration of the segment in seconds

id

uint64_t

Stable 64-bit identifier for this line (remains constant across updates)

is_complete

int8_t

Streaming only: 1 if speaker has finished this segment, 0 if still speaking

is_updated

int8_t

Streaming only: 1 if line changed since last moonshine_transcribe_stream() call

is_new

int8_t

Streaming only: 1 if line was newly added since last call

has_text_changed

int8_t

Streaming only: 1 if text changed since last call

has_speaker_id

int8_t

1 if speaker_id has been calculated

speaker_id

uint64_t

Randomly-generated 64-bit identifier for the speaker (for diarization)

speaker_index

uint32_t

Order in which this speaker appeared in the transcript (0 = first speaker)

last_transcription_latency_ms

uint32_t

Streaming only: Latency of last transcription in milliseconds
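
As an illustration of how these fields fit together, the sketch below formats one line as a timestamped string. The struct definition is copied from this page only so the sketch compiles on its own; real code would include the library's header instead. The function example_format_line is a hypothetical helper, not part of the API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Copied from the reference above so the sketch is self-contained. */
struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

/* Hypothetical helper: writes "[start-end] speaker N: text" into out.
   The speaker part is only included when has_speaker_id is set.
   Returns the snprintf() result (length that was or would be written). */
int example_format_line(const struct transcript_line_t *line,
                        char *out, size_t out_size) {
  const char *text = line->text ? line->text : "";
  float end_time = line->start_time + line->duration;
  if (line->has_speaker_id) {
    return snprintf(out, out_size, "[%.2fs-%.2fs] speaker %u: %s",
                    line->start_time, end_time,
                    (unsigned)line->speaker_index, text);
  }
  return snprintf(out, out_size, "[%.2fs-%.2fs] %s",
                  line->start_time, end_time, text);
}
```

The end time is derived as start_time + duration, since the struct stores no end time of its own.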

transcript_t

struct transcript_t {
  struct transcript_line_t *lines;
  uint64_t line_count;
};

Complete transcription of an audio stream or file.

lines

transcript_line_t*

Array of transcript lines in chronological order

line_count

uint64_t

Number of lines in the transcript

moonshine_intent_callback

typedef void (*moonshine_intent_callback)(void *user_data,
                                         const char *trigger_phrase,
                                         const char *utterance,
                                         float similarity);

Callback function for intent recognition.

user_data

void*

User data pointer passed to moonshine_register_intent()

trigger_phrase

const char*

The registered trigger phrase that matched

utterance

const char*

The actual utterance that was recognized

similarity

float

Similarity score between 0 and 1

Transcriber Functions

moonshine_get_version

int32_t moonshine_get_version(void);

Returns the loaded Moonshine library version.

return

int32_t

Library version in format MAJOR * 10000 + MINOR * 100 + PATCH

This may differ from MOONSHINE_HEADER_VERSION if a newer shared library is loaded.

moonshine_error_to_string

const char *moonshine_error_to_string(int32_t error);

Converts error code to human-readable string.

error

int32_t

Error code from an API call

return

const char*

Human-readable error description

moonshine_transcript_to_string

const char *moonshine_transcript_to_string(const struct transcript_t *transcript);

Converts transcript to human-readable string for debugging.

transcript

const transcript_t*

Transcript to convert

return

const char*

String representation (valid until next call to this function)

moonshine_load_transcriber_from_files

int32_t moonshine_load_transcriber_from_files(
    const char *path,
    uint32_t model_arch,
    const struct transcriber_option_t *options,
    uint64_t options_count,
    int32_t moonshine_version);

Loads transcriber models from the filesystem.

path

const char*

required

Directory containing model files:

encoder_model.ort
decoder_model_merged.ort
tokenizer.bin

model_arch

uint32_t

required

Model architecture (e.g., MOONSHINE_MODEL_ARCH_BASE_STREAMING)

options

const transcriber_option_t*

Array of custom options (can be NULL)

options_count

uint64_t

Number of options in the array

moonshine_version

int32_t

required

Should be MOONSHINE_HEADER_VERSION for compatibility

return

int32_t

Non-negative transcriber handle on success, negative error code on failure

Example:

int32_t transcriber_handle = moonshine_load_transcriber_from_files(
  "path/to/models", MOONSHINE_MODEL_ARCH_BASE, NULL, 0,
  MOONSHINE_HEADER_VERSION);
if (transcriber_handle < 0) {
  fprintf(stderr, "Error: %s\n", moonshine_error_to_string(transcriber_handle));
}

moonshine_load_transcriber_from_memory

int32_t moonshine_load_transcriber_from_memory(
    const uint8_t *encoder_model_data,
    size_t encoder_model_data_size,
    const uint8_t *decoder_model_data,
    size_t decoder_model_data_size,
    const uint8_t *tokenizer_data,
    size_t tokenizer_data_size,
    uint32_t model_arch,
    const struct transcriber_option_t *options,
    uint64_t options_count,
    int32_t moonshine_version);

Loads transcriber models from memory buffers.

encoder_model_data

const uint8_t*

required

Binary data for encoder model

encoder_model_data_size

size_t

required

Size of encoder model data in bytes

decoder_model_data

const uint8_t*

required

Binary data for decoder model

decoder_model_data_size

size_t

required

Size of decoder model data in bytes

tokenizer_data

const uint8_t*

required

Binary data for tokenizer

tokenizer_data_size

size_t

required

Size of tokenizer data in bytes

model_arch

uint32_t

required

Model architecture constant

options

const transcriber_option_t*

Array of custom options (can be NULL)

options_count

uint64_t

Number of options

moonshine_version

int32_t

required

Should be MOONSHINE_HEADER_VERSION

return

int32_t

Non-negative transcriber handle on success, negative error code on failure

moonshine_free_transcriber

void moonshine_free_transcriber(int32_t transcriber_handle);

Releases all resources used by the transcriber.

transcriber_handle

int32_t

required

Handle returned by moonshine_load_transcriber_from_files() or moonshine_load_transcriber_from_memory()

After freeing, the handle may be reused for future transcribers. Remove all references to it.
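
One way to follow this advice is to reset the variable holding the handle as soon as it is freed. The sketch below is an assumption-laden illustration: example_release_handle and EXAMPLE_NO_HANDLE are hypothetical names, and the free function is passed in as a pointer so the sketch compiles without the library. In real code you would pass moonshine_free_transcriber; example_fake_free is only a stand-in.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no handle"; valid handles are non-negative. */
#define EXAMPLE_NO_HANDLE (-1)

/* Stand-in for moonshine_free_transcriber() so the sketch runs on its own. */
int example_free_calls = 0;
void example_fake_free(int32_t handle) {
  (void)handle;
  example_free_calls++;
}

/* Frees *handle through free_fn and then overwrites the caller's variable,
   so a stale value cannot be passed to the library by accident later.
   Does nothing if the handle is already invalid. */
void example_release_handle(int32_t *handle, void (*free_fn)(int32_t)) {
  if (handle == NULL || *handle < 0) {
    return;
  }
  free_fn(*handle);
  *handle = EXAMPLE_NO_HANDLE;
}
```

Calling the helper twice on the same variable is harmless, because the second call sees the sentinel and returns early.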

moonshine_transcribe_without_streaming

int32_t moonshine_transcribe_without_streaming(
    int32_t transcriber_handle,
    float *audio_data,
    uint64_t audio_length,
    int32_t sample_rate,
    uint32_t flags,
    struct transcript_t **out_transcript);

Transcribes complete audio array (for files or recordings).

transcriber_handle

int32_t

required

Transcriber handle

audio_data

float*

required

PCM audio data array (values between -1.0 and 1.0)

audio_length

uint64_t

required

Number of samples in audio_data

sample_rate

int32_t

required

Sample rate in Hz (16000 recommended)

flags

uint32_t

Reserved for future use (pass 0)

out_transcript

transcript_t**

required

Pointer to receive transcript result

return

int32_t

MOONSHINE_ERROR_NONE (0) on success, error code on failure

The transcript data is owned by the transcriber and valid until the next call or until the transcriber is freed.

Example:

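struct transcript_t *transcript = NULL;
int32_t error = moonshine_transcribe_without_streaming(
  transcriber_handle, audio_data, audio_length, 16000, 0, &transcript);
if (error == MOONSHINE_ERROR_NONE) {
  // The transcript is owned by the transcriber; copy anything you need to keep.
  for (uint64_t i = 0; i < transcript->line_count; i++) {
    printf("%s\n", transcript->lines[i].text);
  }
} else {
  fprintf(stderr, "Error: %s\n", moonshine_error_to_string(error));
}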
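
Audio often arrives as signed 16-bit PCM, while audio_data expects floats between -1.0 and 1.0. A minimal conversion sketch follows. The function example_pcm16_to_float is a hypothetical helper, not part of the Moonshine API, and dividing by 32768 is one common scaling convention rather than something this page prescribes.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts signed 16-bit PCM samples to floats in [-1.0, 1.0).
   Dividing by 32768 maps INT16_MIN to exactly -1.0. */
void example_pcm16_to_float(const int16_t *in, float *out, size_t count) {
  for (size_t i = 0; i < count; i++) {
    out[i] = (float)in[i] / 32768.0f;
  }
}
```

The converted buffer can then be passed as audio_data, with audio_length set to the number of samples converted.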

Streaming Functions

moonshine_create_stream

int32_t moonshine_create_stream(int32_t transcriber_handle, uint32_t flags);

Creates a new audio stream for real-time transcription.

transcriber_handle

int32_t

required

Transcriber handle

flags

uint32_t

Reserved (pass 0)

return

int32_t

Non-negative stream handle on success, negative error code on failure

moonshine_free_stream

int32_t moonshine_free_stream(int32_t transcriber_handle, int32_t stream_handle);

Releases stream resources.

transcriber_handle

int32_t

required

Transcriber handle

stream_handle

int32_t

required

Stream handle to free

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

moonshine_start_stream

int32_t moonshine_start_stream(int32_t transcriber_handle, int32_t stream_handle);

Starts a new transcription session on the stream.

transcriber_handle

int32_t

required

Transcriber handle

stream_handle

int32_t

required

Stream handle

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

Clears any previous transcript data. Call this after audio input discontinuities (e.g., when user unmutes).

moonshine_stop_stream

int32_t moonshine_stop_stream(int32_t transcriber_handle, int32_t stream_handle);

Stops the current transcription session.

transcriber_handle

int32_t

required

Transcriber handle

stream_handle

int32_t

required

Stream handle

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

Any active lines will be marked as complete. Call moonshine_transcribe_stream() after stopping to get final results.

moonshine_transcribe_add_audio_to_stream

int32_t moonshine_transcribe_add_audio_to_stream(
    int32_t transcriber_handle,
    int32_t stream_handle,
    const float *new_audio_data,
    uint64_t audio_length,
    int32_t sample_rate,
    uint32_t flags);

Adds audio data to the stream buffer.

transcriber_handle

int32_t

required

Transcriber handle

stream_handle

int32_t

required

Stream handle

new_audio_data

const float*

required

PCM audio samples (values between -1.0 and 1.0)

audio_length

uint64_t

required

Number of samples

sample_rate

int32_t

required

Sample rate in Hz

flags

uint32_t

Reserved (pass 0)

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

This function only buffers audio and does not perform transcription. Call moonshine_transcribe_stream() to get results. Safe to call from time-critical threads.

moonshine_transcribe_stream

int32_t moonshine_transcribe_stream(
    int32_t transcriber_handle,
    int32_t stream_handle,
    uint32_t flags,
    struct transcript_t **out_transcript);

Analyzes buffered audio and returns updated transcript.

transcriber_handle

int32_t

required

Transcriber handle

stream_handle

int32_t

required

Stream handle

flags

uint32_t

Bitwise OR of flags:

MOONSHINE_FLAG_FORCE_UPDATE: Force analysis even if < 200ms of new audio

out_transcript

transcript_t**

required

Pointer to receive updated transcript

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

By default, full analysis only occurs if 200ms+ of new audio has been added. Use MOONSHINE_FLAG_FORCE_UPDATE to override this throttling.

Example:

while (audio_available) {
  moonshine_transcribe_add_audio_to_stream(
    transcriber, stream, audio_chunk, chunk_size, 16000, 0);

  struct transcript_t *transcript = NULL;
  int32_t error =
    moonshine_transcribe_stream(transcriber, stream, 0, &transcript);
  if (error != MOONSHINE_ERROR_NONE) {
    fprintf(stderr, "Error: %s\n", moonshine_error_to_string(error));
    break;
  }

  // Check for new or updated lines
  for (uint64_t i = 0; i < transcript->line_count; i++) {
    if (transcript->lines[i].is_new || transcript->lines[i].has_text_changed) {
      printf("Updated: %s\n", transcript->lines[i].text);
    }
  }
}
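
To relate the 200ms throttle to buffer sizes, it can help to convert a duration into a sample count. The helper below, example_samples_for_ms, is a hypothetical name used only for this arithmetic sketch. It assumes a positive sample rate and rounds down.

```c
#include <stdint.h>

/* Number of whole samples covering the given duration at the given rate.
   Assumes sample_rate_hz is positive; the result is rounded down. */
uint64_t example_samples_for_ms(int32_t sample_rate_hz, uint32_t milliseconds) {
  return ((uint64_t)sample_rate_hz * milliseconds) / 1000u;
}
```

At 16000 Hz, 200ms corresponds to 3200 samples, so chunks smaller than that may not trigger a full analysis on every call unless MOONSHINE_FLAG_FORCE_UPDATE is passed.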

Intent Recognition Functions

moonshine_create_intent_recognizer

int32_t moonshine_create_intent_recognizer(
    const char *model_path,
    uint32_t model_arch,
    const char *model_variant,
    float threshold);

Creates an intent recognizer for voice command matching.

model_path

const char*

required

Path to directory containing embedding model files

model_arch

uint32_t

required

Model architecture (currently only MOONSHINE_EMBEDDING_MODEL_ARCH_GEMMA_300M supported)

model_variant

const char*

Model quantization: "fp32", "fp16", "q8", "q4", "q4f16" (NULL defaults to "q4")

threshold

float

Minimum similarity score (0.0-1.0) to trigger intent (default 0.7)

return

int32_t

Non-negative recognizer handle on success, negative error code on failure

moonshine_free_intent_recognizer

void moonshine_free_intent_recognizer(int32_t intent_recognizer_handle);

Frees intent recognizer resources.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

moonshine_register_intent

int32_t moonshine_register_intent(
    int32_t intent_recognizer_handle,
    const char *trigger_phrase,
    moonshine_intent_callback callback,
    void *user_data);

Registers an intent with a trigger phrase and callback.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

trigger_phrase

const char*

required

Phrase to match (e.g., "Turn on the lights")

callback

moonshine_intent_callback

required

Function to call when intent is triggered

user_data

void*

User data passed to callback (can be NULL)

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

Example:

void on_lights_on(void *user_data, const char *trigger, 
                  const char *utterance, float similarity) {
  printf("Turning on lights (%.0f%% match)\n", similarity * 100);
  // Actually turn on lights...
}

moonshine_register_intent(recognizer, "Turn on the lights", 
                         on_lights_on, NULL);
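
The user_data pointer lets a callback update application state without global variables. The sketch below shows a callback with the moonshine_intent_callback signature that does this. The struct example_light_state and the function example_on_lights_on are hypothetical names for this illustration; only the callback signature comes from this page.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical application state; not part of the Moonshine API. */
struct example_light_state {
  int is_on;
  uint32_t trigger_count;
  float last_similarity;
};

/* Matches the moonshine_intent_callback signature. Recovers the state
   from user_data and records that the intent fired. */
void example_on_lights_on(void *user_data, const char *trigger_phrase,
                          const char *utterance, float similarity) {
  struct example_light_state *state =
      (struct example_light_state *)user_data;
  if (state == NULL) {
    return;  /* Registered without state; nothing to update. */
  }
  state->is_on = 1;
  state->trigger_count++;
  state->last_similarity = similarity;
  printf("Heard \"%s\" (matched \"%s\", %.0f%% match)\n",
         utterance, trigger_phrase, similarity * 100.0f);
}
```

To use it, pass a pointer to the state as the user_data argument of moonshine_register_intent(). The state object presumably needs to stay alive for as long as the intent remains registered.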

moonshine_unregister_intent

int32_t moonshine_unregister_intent(
    int32_t intent_recognizer_handle,
    const char *trigger_phrase);

Removes a registered intent.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

trigger_phrase

const char*

required

Trigger phrase to remove

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

moonshine_process_utterance

int32_t moonshine_process_utterance(
    int32_t intent_recognizer_handle,
    const char *utterance);

Processes an utterance and invokes matching intent callback.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

utterance

const char*

required

Utterance to match against registered intents

return

int32_t

1 if intent recognized, 0 if not, negative error code on failure

moonshine_set_intent_threshold

int32_t moonshine_set_intent_threshold(
    int32_t intent_recognizer_handle,
    float threshold);

Sets the similarity threshold for intent matching.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

threshold

float

required

New threshold (0.0-1.0)

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

moonshine_get_intent_threshold

float moonshine_get_intent_threshold(int32_t intent_recognizer_handle);

Gets the current similarity threshold.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

return

float

Current threshold (>= 0) on success, negative error code on failure

moonshine_get_intent_count

int32_t moonshine_get_intent_count(int32_t intent_recognizer_handle);

Gets the number of registered intents.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

return

int32_t

Number of intents (>= 0) on success, negative error code on failure

moonshine_clear_intents

int32_t moonshine_clear_intents(int32_t intent_recognizer_handle);

Removes all registered intents.

intent_recognizer_handle

int32_t

required

Intent recognizer handle

return

int32_t

MOONSHINE_ERROR_NONE on success, error code on failure

Streaming Guarantees

When using streaming transcription, the library provides these guarantees:

Lines are never removed - only added
Only the last line may be incomplete - all others are finalized
Line IDs are stable - use them to track lines across updates
Empty strings indicate detected speech with no transcription
Line indexes are stable references - remember line_count to process only new lines
Speaker IDs are set when confident - or when line completes
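
These guarantees make it possible to handle each finished line exactly once by remembering an index between calls. The sketch below relies on two of them: indexes are stable, and only the last line may be incomplete. The struct definitions are copied from this page so the sketch compiles without the library; example_line_handler and example_consume_completed_lines are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from the reference above so the sketch is self-contained. */
struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

struct transcript_t {
  struct transcript_line_t *lines;
  uint64_t line_count;
};

/* Hypothetical per-line callback type for this example. */
typedef void (*example_line_handler)(const struct transcript_line_t *line,
                                     void *context);

/* Visits every completed line starting at *next_index and advances
   *next_index past them. Stops at the first incomplete line, which by the
   guarantees above can only be the last one. handler may be NULL.
   Returns the number of lines visited in this call. */
uint64_t example_consume_completed_lines(const struct transcript_t *transcript,
                                         uint64_t *next_index,
                                         example_line_handler handler,
                                         void *context) {
  uint64_t handled = 0;
  while (*next_index < transcript->line_count &&
         transcript->lines[*next_index].is_complete) {
    if (handler != NULL) {
      handler(&transcript->lines[*next_index], context);
    }
    (*next_index)++;
    handled++;
  }
  return handled;
}
```

Call it after each moonshine_transcribe_stream() with the same next_index variable. Since moonshine_start_stream() clears previous transcript data, the index should probably be reset to 0 whenever a new session starts.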

Best Practices

Memory Management

Always copy transcript data before:

Making another API call on the same transcriber
Freeing the transcriber

The library owns all returned transcript memory.
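
A minimal sketch of such a copy follows, assuming only the text and a few scalar fields need to outlive the next API call. The struct definition is copied from this page so the sketch compiles on its own. The names example_owned_line, example_copy_line and example_free_owned_line are hypothetical and not part of the Moonshine API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copied from the reference above so the sketch is self-contained. */
struct transcript_line_t {
  const char *text;
  const float *audio_data;
  size_t audio_data_count;
  float start_time;
  float duration;
  uint64_t id;
  int8_t is_complete;
  int8_t is_updated;
  int8_t is_new;
  int8_t has_text_changed;
  int8_t has_speaker_id;
  uint64_t speaker_id;
  uint32_t speaker_index;
  uint32_t last_transcription_latency_ms;
};

/* Hypothetical application-owned copy of the fields worth keeping. */
struct example_owned_line {
  char *text;  /* Heap copy; release with example_free_owned_line(). */
  uint64_t id;
  float start_time;
  float duration;
  int8_t is_complete;
};

/* Deep-copies the text and copies the scalar fields.
   Returns 0 on success, -1 if memory allocation fails. */
int example_copy_line(const struct transcript_line_t *src,
                      struct example_owned_line *dst) {
  const char *text = src->text ? src->text : "";
  size_t length = strlen(text) + 1;  /* Include the terminating NUL. */
  dst->text = (char *)malloc(length);
  if (dst->text == NULL) {
    return -1;
  }
  memcpy(dst->text, text, length);
  dst->id = src->id;
  dst->start_time = src->start_time;
  dst->duration = src->duration;
  dst->is_complete = src->is_complete;
  return 0;
}

/* Releases the heap copy made by example_copy_line(). */
void example_free_owned_line(struct example_owned_line *line) {
  free(line->text);
  line->text = NULL;
}
```

The audio samples are left out here; if they are needed, audio_data would have to be copied the same way using audio_data_count.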

Performance

Use 16kHz audio to avoid resampling overhead
For streaming, call moonshine_transcribe_stream() at intervals matching your latency requirements (e.g., every 500ms)
Add audio in whatever chunk sizes your audio source provides - the library handles buffering
Use multiple streams on one transcriber to share model resources across audio sources

Error Handling

int32_t result = moonshine_create_stream(transcriber, 0);
if (result < 0) {
  fprintf(stderr, "Stream creation failed: %s\n", 
          moonshine_error_to_string(result));
  return;
}
int32_t stream_handle = result;

See Also

C API Overview

High-level concepts and architecture

Python API

Higher-level Python bindings (recommended)
