Overview

ParakeetTDTCTC is a hybrid model that pairs a Token-and-Duration Transducer (TDT) decoder with a Connectionist Temporal Classification (CTC) decoder. It inherits all functionality from ParakeetTDT and adds a CTC decoder that serves as an auxiliary training objective.
The .generate() method uses the TDT decoder by default. The CTC decoder is available via .ctc_decoder for specialized use cases.

Class Definition

class ParakeetTDTCTC(ParakeetTDT)
Defined in parakeet.py:909-918

Inheritance

ParakeetTDTCTC inherits all methods from ParakeetTDT:
  • transcribe()
  • transcribe_stream()
  • generate()
  • decode()
See ParakeetTDT documentation for details on these methods.

Additional Properties

ctc_decoder

Type: ConvASRDecoder
CTC decoder component for auxiliary training objectives. This decoder is separate from the main TDT decoder and is primarily used during training.
from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-ctc-variant")

# Access CTC decoder if needed
ctc_decoder = model.ctc_decoder

Usage

ParakeetTDTCTC models are used identically to ParakeetTDT models:
from parakeet_mlx import from_pretrained

# Load a TDT-CTC model (placeholder ID; check https://huggingface.co/mlx-community for actual models)
model = from_pretrained("mlx-community/parakeet-tdt-ctc-variant")

# Transcribe (uses TDT decoder)
result = model.transcribe("audio.wav")
print(result.text)
Why use TDT-CTC? In the hybrid architecture, the TDT and CTC decoders share one encoder and are trained jointly (multi-task learning), which can improve accuracy. At inference time, you keep the benefits of TDT decoding (beam search, duration modeling), while the encoder may be more robust as a result of the auxiliary CTC objective.
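
To make the multi-task idea concrete, here is a minimal sketch of one common way to combine two training losses: a weighted interpolation. This page does not state how these models were actually trained, so the function name hybrid_loss, the ctc_weight parameter, and the interpolation itself are assumptions for illustration only.

```python
def hybrid_loss(tdt_loss: float, ctc_loss: float, ctc_weight: float) -> float:
    """Interpolate two scalar losses (hypothetical helper, not a parakeet-mlx API).

    ctc_weight = 0.0 trains on the TDT loss only; 1.0 trains on the CTC loss only.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be between 0 and 1")
    return (1.0 - ctc_weight) * tdt_loss + ctc_weight * ctc_loss


# Example: a quarter of the training signal comes from the CTC head.
print(hybrid_loss(tdt_loss=2.0, ctc_loss=4.0, ctc_weight=0.25))
```

Because both losses backpropagate through the same encoder, the auxiliary term shapes the shared features even though the CTC head is not used by .generate().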

Architecture

The TDT-CTC model includes:
  1. Shared Conformer Encoder - Processes audio into features
  2. TDT Decoder (primary) - Autoregressive decoder with duration modeling
  3. Joint Network - Combines encoder and decoder outputs
  4. CTC Decoder (auxiliary) - Non-autoregressive CTC head
During inference, only the TDT path (encoder → decoder → joint → tokens) is used by default.
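
The TDT path above can be illustrated with a toy greedy decoding loop. This is a sketch of the general token-and-duration idea, not the library's implementation: the joint callable, the durations list, the blank index, and max_symbols are all stand-ins chosen for the example. At each step the joint yields token scores and duration scores; a non-blank token is emitted, and the frame pointer advances by the predicted duration.

```python
from typing import Callable, List, Sequence, Tuple

# joint(frame_index, emitted_tokens) -> (token_scores, duration_scores)
JointFn = Callable[[int, Sequence[int]], Tuple[Sequence[float], Sequence[float]]]


def argmax(scores: Sequence[float]) -> int:
    return max(range(len(scores)), key=scores.__getitem__)


def tdt_greedy_decode(
    num_frames: int,
    joint: JointFn,
    durations: Sequence[int],
    blank_id: int,
    max_symbols: int = 10,
) -> List[int]:
    """Toy greedy TDT decoding over num_frames encoder frames."""
    tokens: List[int] = []
    t = 0
    symbols_here = 0  # tokens emitted without advancing the frame pointer
    while t < num_frames:
        token_scores, duration_scores = joint(t, tokens)
        token = argmax(token_scores)
        skip = durations[argmax(duration_scores)]
        if token != blank_id:
            tokens.append(token)
        if skip == 0:
            symbols_here += 1
            # Force progress on a blank or after too many emissions at one frame.
            if token == blank_id or symbols_here >= max_symbols:
                skip = 1
        if skip > 0:
            symbols_here = 0
        t += skip
    return tokens


def one_hot(index: int, size: int) -> List[float]:
    return [1.0 if i == index else 0.0 for i in range(size)]


# Scripted stand-in for encoder + decoder + joint:
# (frame, tokens emitted so far) -> (token id, index into DURATIONS)
DURATIONS = [0, 1, 2, 4]
BLANK = 2
SCRIPT = {(0, 0): (0, 0), (0, 1): (1, 2), (2, 2): (BLANK, 1), (3, 2): (0, 3)}


def toy_joint(t: int, history: Sequence[int]):
    token, duration_index = SCRIPT[(t, len(history))]
    return one_hot(token, 3), one_hot(duration_index, len(DURATIONS))


print(tdt_greedy_decode(6, toy_joint, DURATIONS, BLANK))  # [0, 1, 0]
```

In the scripted run, two tokens are emitted at frame 0 (the first with duration 0), a blank advances one frame, and the final token skips past the end of the audio. Skipping several frames in one step is what duration modeling buys over a plain transducer.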

Model Variants

Check Hugging Face for available TDT-CTC models:
from parakeet_mlx import from_pretrained

# Example TDT-CTC model (check https://huggingface.co/mlx-community for actual models)
model = from_pretrained("mlx-community/parakeet-tdt-ctc-variant")
If you need to use the CTC decoder directly for inference, please open an issue on the GitHub repository. The current implementation defaults to TDT decoding.
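
For context on what direct CTC decoding would involve, greedy CTC decoding picks the best-scoring label in each frame, merges consecutive repeats, and drops blanks. The sketch below is generic Python and does not call parakeet-mlx; the function name, the nested-list input, and the blank index are assumptions for illustration, and the real output of ctc_decoder would likely need converting to this shape first.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank_id: int) -> List[int]:
    """Collapse a per-frame argmax path into a token sequence (hypothetical helper)."""
    tokens: List[int] = []
    previous = blank_id
    for frame in frame_scores:
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit only when the label changes and is not the blank symbol.
        if best != blank_id and best != previous:
            tokens.append(best)
        previous = best
    return tokens


# Six frames over a 3-symbol vocabulary; index 2 is assumed to be blank.
frames = [
    [0.90, 0.05, 0.05],  # 0
    [0.80, 0.10, 0.10],  # 0 (repeat, merged)
    [0.10, 0.10, 0.80],  # blank
    [0.70, 0.20, 0.10],  # 0 (new token, separated by blank)
    [0.10, 0.80, 0.10],  # 1
    [0.10, 0.10, 0.80],  # blank
]
print(ctc_greedy_decode(frames, blank_id=2))  # [0, 0, 1]
```

Unlike the TDT path, this produces one label decision per frame with no dependence on previously emitted tokens, which is why the CTC head is non-autoregressive.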

Comparison with Other Models

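| Feature           | TDT-CTC | TDT    | CTC     |
|-------------------|---------|--------|---------|
| Beam search       | ✅ Yes  | ✅ Yes | ❌ No   |
| Duration modeling | ✅ Yes  | ✅ Yes | ❌ No   |
| CTC auxiliary     | ✅ Yes  | ❌ No  | N/A     |
| Speed             | Fast    | Fast   | Fastest |
| Streaming         | ✅ Yes  | ✅ Yes | ✅ Yes  |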
