The speech-to-speech package exposes a single command-line entrypoint, speech-to-speech, that starts the full VAD → STT → LLM → TTS pipeline. The same pipeline can also be launched via the Python module form.
Running speech-to-speech -h prints the full reference.
Argument groups
Every flag belongs to one of the argument classes below. Each class is parsed independently and covers one stage or concern of the pipeline:

| Argument class | Reference page | Scope |
|---|---|---|
| ModuleArguments | Module Args | Top-level mode, backend selection, live transcription |
| VADHandlerArguments | VAD Args | Silero VAD v5 sensitivity, timing, and audio enhancement |
| WhisperSTTHandlerArguments | STT Args | Whisper (--stt whisper) |
| FasterWhisperSTTHandlerArguments | STT Args | Faster-Whisper (--stt faster-whisper) |
| ParakeetTDTSTTHandlerArguments | STT Args | Parakeet TDT (--stt parakeet-tdt) |
| ParaformerSTTHandlerArguments | STT Args | Paraformer / FunASR (--stt paraformer) |
| MLXAudioWhisperSTTHandlerArguments | STT Args | MLX Audio Whisper (--stt mlx-audio-whisper) |
| LanguageModelBaseArguments | LLM Args | Shared model name, chat history, system prompt |
| LanguageModelHandlerArguments | LLM Args | Transformers / mlx-lm local backends |
| ResponsesApiLanguageModelHandlerArguments | LLM Args | OpenAI Responses API backend |
| ChatCompletionsLanguageModelHandlerArguments | LLM Args | OpenAI Chat Completions backend |
| Qwen3TTSHandlerArguments | TTS Args | Qwen3-TTS (--tts qwen3) |
| KokoroTTSHandlerArguments | TTS Args | Kokoro-82M (--tts kokoro) |
| PocketTTSHandlerArguments | TTS Args | Pocket TTS (--tts pocket) |
| ChatTTSHandlerArguments | TTS Args | ChatTTS (--tts chatTTS) |
| FacebookMMSTTSHandlerArguments | TTS Args | Facebook MMS (--tts facebookMMS) |
| SocketReceiverArguments | Connection Args | TCP socket receiver |
| SocketSenderArguments | Connection Args | TCP socket sender |
| WebSocketStreamerArguments | Connection Args | WebSocket streamer |
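To make the "one class per stage" idea concrete, here is a minimal sketch of how several argument classes might share one parser and still come back as separate objects. It uses only the standard library; the two dataclasses and their fields (demo_mode, demo_vad_threshold) are hypothetical stand-ins, not the real ModuleArguments or VADHandlerArguments, and the package's actual parsing code may differ.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class DemoModuleArguments:  # hypothetical stand-in for a top-level class
    demo_mode: str = "example"


@dataclass
class DemoVADArguments:  # hypothetical stand-in for a per-stage class
    demo_vad_threshold: float = 0.5


def parse_into_classes(classes, argv):
    """Register every dataclass field as a flag, then rebuild one instance per class."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for cls in classes:
        for f in fields(cls):
            # Use the default's type as the converter (str, float, ...).
            parser.add_argument(f"--{f.name}", type=type(f.default), default=f.default)
    ns = vars(parser.parse_args(argv))
    # Each class only receives the fields it declared.
    return [cls(**{f.name: ns[f.name] for f in fields(cls)}) for cls in classes]


module_args, vad_args = parse_into_classes(
    [DemoModuleArguments, DemoVADArguments],
    ["--demo_vad_threshold", "0.3"],
)
```

Each stage then reads only its own instance, which is why unrelated flags from other classes do not interfere.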
Flag prefix system
Because multiple STT, LLM, and TTS backends can coexist in the argument namespace, each backend’s flags are namespaced by a prefix:

| Backend | CLI prefix | Example |
|---|---|---|
| Whisper (Transformers) | --stt_ | --stt_model_name openai/whisper-large-v3 |
| Faster-Whisper | --faster_whisper_stt_ | --faster_whisper_stt_model_name large-v3 |
| Parakeet TDT | --parakeet_tdt_ | --parakeet_tdt_device mps |
| Paraformer | --paraformer_stt_ | --paraformer_stt_model_name paraformer-zh |
| MLX Audio Whisper | --mlx_audio_whisper_ | --mlx_audio_whisper_model_name mlx-community/whisper-large-v3-turbo |
| Local LLM (transformers/mlx-lm) | --llm_ | --llm_device cuda |
| Shared LLM (all backends) | (no prefix) | --model_name gpt-4o-mini |
| Responses API / Chat Completions | --responses_api_ | --responses_api_base_url http://localhost:8000/v1 |
| Qwen3-TTS | --qwen3_tts_ | --qwen3_tts_speaker Aiden |
| Kokoro TTS | --kokoro_ | --kokoro_voice bm_fable |
| Pocket TTS | --pocket_tts_ | --pocket_tts_voice jean |
| ChatTTS | --chat_tts_ | --chat_tts_device cuda |
| Facebook MMS | --facebook_mms_ | --facebook_mms_device cuda |
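The sketch below illustrates why the prefixes matter: two backends can both expose a model_name option without colliding, and only the selected backend's options are picked up. It is an illustration built on the standard library's argparse, limited to two prefixes from the table; the helper names are hypothetical and this is not the package's own parser.

```python
import argparse

# Two prefixes taken from the table above; the rest are omitted for brevity.
PREFIXES = {
    "whisper": "stt_",
    "faster-whisper": "faster_whisper_stt_",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--stt", default="whisper", choices=sorted(PREFIXES))
    # Every backend registers its own prefixed copy of model_name.
    for prefix in PREFIXES.values():
        parser.add_argument(f"--{prefix}model_name", default=None)
    return parser


def selected_backend_options(argv):
    """Return the chosen backend's options with the prefix removed."""
    ns = vars(build_parser().parse_args(argv))
    prefix = PREFIXES[ns["stt"]]
    return {k[len(prefix):]: v for k, v in ns.items() if k.startswith(prefix)}


options = selected_backend_options([
    "--stt", "faster-whisper",
    "--faster_whisper_stt_model_name", "large-v3",
    "--stt_model_name", "openai/whisper-large-v3",
])
```

Here options holds only the Faster-Whisper model name; the Whisper flag is parsed but ignored because its prefix does not match the selected backend.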
The gen_kwargs pattern
Generation parameters follow the <handler_prefix>_gen_<param> naming convention. At parse time the pipeline strips the handler prefix and collects every gen_-prefixed field into a gen_kwargs dict that is forwarded directly to the underlying model’s generate() call. See the per-backend reference pages for the available gen_* parameters.
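The prefix-stripping step can be sketched in a few lines of Python. This is a sketch of the idea rather than the pipeline's actual code: the function name collect_gen_kwargs is hypothetical, the field names in the example are illustrative, and it assumes the gen_ marker is also removed from each key so the result can be passed as keyword arguments to generate().

```python
def collect_gen_kwargs(handler_args: dict, prefix: str) -> dict:
    """Strip the handler prefix, then move gen_* fields into a nested gen_kwargs dict."""
    # Step 1: remove the handler prefix from every key that carries it.
    stripped = {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in handler_args.items()
    }
    # Step 2: gather gen_* fields (assumption: the gen_ marker is dropped too).
    gen_kwargs = {
        key[len("gen_"):]: value
        for key, value in stripped.items()
        if key.startswith("gen_")
    }
    # Step 3: keep the remaining fields as ordinary handler options.
    result = {k: v for k, v in stripped.items() if not k.startswith("gen_")}
    result["gen_kwargs"] = gen_kwargs
    return result


# Illustrative field names following the <handler_prefix>_gen_<param> convention.
parsed = collect_gen_kwargs(
    {"llm_device": "cuda", "llm_gen_max_new_tokens": 128, "llm_gen_temperature": 0.7},
    prefix="llm_",
)
```

With this input, parsed contains device alongside a gen_kwargs dict holding max_new_tokens and temperature, ready to be unpacked into a generate() call.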
Passing arguments from a JSON file
When a JSON config file is the only argument (no other CLI flags), the CLI reads its configuration from that file instead of from the command line.

Default command and its full expansion
Running speech-to-speech with no arguments starts the realtime server with Parakeet TDT, the OpenAI Responses API, and Qwen3-TTS. The short form is the bare speech-to-speech command.
The default command requires OPENAI_API_KEY to be set in your environment, or the key to be passed explicitly via --responses_api_api_key. For non-OpenAI providers, also set --responses_api_base_url.
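The key requirement can be checked before launching the pipeline. The sketch below is a hypothetical pre-flight helper, not part of the package; it assumes that an explicitly passed key takes precedence over the environment variable, which the text above does not state.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Return the API key from the flag value or from OPENAI_API_KEY."""
    env = os.environ if env is None else env
    # Assumption: an explicit --responses_api_api_key value wins over the environment.
    key = cli_value or env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("Set OPENAI_API_KEY or pass --responses_api_api_key")
    return key


# Example with an injected environment mapping instead of the real os.environ.
key = resolve_api_key(cli_value=None, env={"OPENAI_API_KEY": "sk-from-env"})
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.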