DecodingConfig is a dataclass that controls how the model decodes audio and splits text into sentences.
## Class Definition
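The class definition itself is not reproduced on this page. A minimal sketch of what it might look like, assuming a plain dataclass with the field names used in the examples below; the `sentence` field name, the stand-in `SentenceConfig`, and all defaults are assumptions, not the library's actual definition:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentenceConfig:
    """Stand-in for the real SentenceConfig; see its own documentation."""
    pass


@dataclass
class DecodingConfig:
    """Sketch of a decoding configuration; defaults are assumptions."""
    # Beam-search fields documented below.
    beam_size: int = 5
    length_penalty: float = 1.0     # 1.0 = no length penalty
    patience: float = 1.0           # candidates capped at beam_size * patience
    duration_reward: float = 0.0    # TDT-only parameter
    # Sentence splitting; the field name is an assumption.
    sentence: Optional[SentenceConfig] = None
```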
## Fields
Configuration for how to split transcribed text into sentences. See SentenceConfig for configuration options.
### Greedy
Greedy decoding selects the most likely token at each step. This is the fastest decoding method.

### Beam
Beam search decoding explores multiple hypotheses to find better transcriptions. Currently only available for TDT models.

#### Fields
Number of hypotheses to explore simultaneously. Larger values may improve accuracy but increase computation time.

Example: `beam_size=5`

Penalty applied based on sequence length. Higher values favor longer sequences.
- `1.0`: No penalty
- `> 1.0`: Favor longer sequences
- `< 1.0`: Favor shorter sequences
Example: `length_penalty=0.013`

Controls how many candidate hypotheses to explore. Higher values allow more exploration. The maximum number of candidates is `beam_size * patience`.

Example: `patience=3.5`

TDT-only parameter. Controls the balance between token logprobs and duration logprobs.
- `0.0`: Only consider token logprobs
- `0.5`: Equal weight to both
- `1.0`: Only consider duration logprobs
Example: `duration_reward=0.67`

## Examples
### Greedy Decoding
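The code for this example is missing from the page. A minimal sketch, assuming a `from_pretrained` loader, a `transcribe` method, and a `decoding_config` keyword; all of these names are assumptions, as is greedy being the default strategy:

```python
# Hypothetical usage sketch: the import path, loader, and
# transcribe(...) signature are assumptions, not confirmed here.
from parakeet_mlx import from_pretrained, DecodingConfig

model = from_pretrained("model-name")  # placeholder model identifier
config = DecodingConfig()  # greedy decoding assumed to be the default

result = model.transcribe("audio.wav", decoding_config=config)
print(result.text)
```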
### Beam Search Decoding
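The code for this example is also missing. A sketch using the beam-search fields documented above; whether they are passed directly to `DecodingConfig` or via a nested beam config is an assumption here, as are the import and method names:

```python
# Hypothetical usage sketch; field values mirror the examples above.
from parakeet_mlx import from_pretrained, DecodingConfig

model = from_pretrained("model-name")  # placeholder model identifier
config = DecodingConfig(
    beam_size=5,           # explore 5 hypotheses at each step
    length_penalty=1.0,    # 1.0 = no length penalty
    patience=3.5,          # at most beam_size * patience candidates
    duration_reward=0.67,  # TDT-only: weight on duration logprobs
)

result = model.transcribe("audio.wav", decoding_config=config)
print(result.text)
```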
### With Sentence Configuration
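A sketch combining decoding with sentence splitting; the `sentence` field name and the `result.sentences` attribute are assumptions (see SentenceConfig for the actual options):

```python
# Hypothetical usage sketch; the `sentence` field name is assumed.
from parakeet_mlx import from_pretrained, DecodingConfig, SentenceConfig

model = from_pretrained("model-name")  # placeholder model identifier
config = DecodingConfig(sentence=SentenceConfig())  # defaults; see SentenceConfig

result = model.transcribe("audio.wav", decoding_config=config)
for sentence in result.sentences:  # assumed attribute on the result
    print(sentence.text)
```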
## Related
- SentenceConfig - Configure sentence splitting
- BaseParakeet - Use DecodingConfig with model methods
- Beam Decoding Guide - Learn more about beam search