Signature

KittenTTS.generate(
    text,
    voice="expr-voice-5-m",
    speed=1.0,
    clean_text=False,
) -> numpy.ndarray

Parameters

text
str
required
The input text to synthesize. Pass plain prose — numbers, currencies, and abbreviations are not expanded automatically unless clean_text=True.
voice
str
default:"expr-voice-5-m"
Voice to use for synthesis. Accepts any friendly name from available_voices: Bella, Jasper, Luna, Bruno, Rosie, Hugo, Kiki, Leo. The default "expr-voice-5-m" is the internal ID for Leo. You can pass either the friendly name or the internal ID.
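A small guard can catch a misspelled voice before synthesis starts. The sketch below is hypothetical helper code, not part of KittenTTS: `FRIENDLY_VOICES` is copied from the list above, `is_known_voice` is an illustrative name, and only the one internal ID documented on this page is included. Exact, case-sensitive matching is an assumption.

```python
# Friendly names as listed on this page.
FRIENDLY_VOICES = ("Bella", "Jasper", "Luna", "Bruno", "Rosie", "Hugo", "Kiki", "Leo")

# The only internal ID documented here; other IDs are deliberately left out.
KNOWN_INTERNAL_IDS = ("expr-voice-5-m",)


def is_known_voice(voice: str) -> bool:
    """Return True if `voice` is a listed friendly name or the documented internal ID."""
    return voice in FRIENDLY_VOICES or voice in KNOWN_INTERNAL_IDS
```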
speed
float
default:"1.0"
Speech speed multiplier.
  • 1.0 — normal speed
  • Values below 1.0 slow down speech (e.g., 0.75 is 75% speed)
  • Values above 1.0 speed it up (e.g., 1.5 is 150% speed)
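As a rough worked example, if speed behaves as a plain rate multiplier (an assumption; this page does not state exactly how clip length scales), the duration changes by a factor of 1 / speed. `estimated_duration` is a hypothetical helper used only for the arithmetic.

```python
def estimated_duration(base_seconds: float, speed: float) -> float:
    """Estimate clip length at `speed`, assuming duration scales as 1 / speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return base_seconds / speed


# A clip that lasts 6 s at speed=1.0:
print(estimated_duration(6.0, 0.75))  # 8.0 (slower, so longer)
print(estimated_duration(6.0, 1.5))   # 4.0 (faster, so shorter)
```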
clean_text
bool
default:"False"
If True, runs the TextPreprocessor pipeline before synthesis. This expands numbers, currencies, abbreviations, and more into spoken form. By default this is False: pass text that is already in spoken form, or enable this option to let KittenTTS handle expansion automatically.

Returns

audio
numpy.ndarray
Audio samples as a 1-D float32 numpy array at 24 kHz. You can write this directly to a file with soundfile.write() or play it back with sounddevice.play().
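Where soundfile is not available, the returned samples can also be saved using only the standard library. The sketch below converts float samples to 16-bit PCM and writes them with the `wave` module. It assumes the samples lie in [-1.0, 1.0] (this page does not state the range) and uses a generated sine tone as a stand-in for the array returned by generate(); `write_wav_16bit` is a hypothetical helper name.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 24_000  # generate() returns audio at 24 kHz


def write_wav_16bit(target, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM."""
    # Clip to the assumed range, then scale to the int16 range.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


# Stand-in for the array returned by generate(): 0.5 s of a 440 Hz tone.
samples = [0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(12_000)]

buffer = io.BytesIO()  # pass a path such as "output.wav" to write to disk instead
write_wav_16bit(buffer, samples)
print(len(samples) / SAMPLE_RATE)  # clip length in seconds: 0.5
```

Dividing the number of samples by 24,000 gives the clip length in seconds, which is handy for a quick check before playback.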
clean_text defaults to False in KittenTTS.generate(). If you pass raw text containing numbers or special characters without enabling clean_text, the model may mispronounce them. Either pre-process the text yourself or set clean_text=True.
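For the pre-process-it-yourself route, a minimal sketch might look like the following. It spells out standalone integers from 0 to 99 and a couple of abbreviations. It is an illustration only and not the library's TextPreprocessor; `expand_text`, `number_to_words`, and the `ABBREVIATIONS` table are hypothetical names.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

# Hypothetical abbreviation table; extend it for your own text.
ABBREVIATIONS = {"Dr.": "Doctor", "approx.": "approximately"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0-99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def expand_text(text: str) -> str:
    """Expand known abbreviations and standalone integers 0-99 into words."""
    for short, spoken in ABBREVIATIONS.items():
        text = text.replace(short, spoken)
    # Only one- or two-digit standalone numbers; longer numbers are left untouched.
    return re.sub(r"\b\d{1,2}\b", lambda m: number_to_words(int(m.group())), text)


print(expand_text("Dr. Lee saw 42 patients in approx. 7 hours."))
# Doctor Lee saw forty-two patients in approximately seven hours.
```

This sketch does not handle decimals, currencies, or numbers above 99. For anything beyond simple cases, setting clean_text=True is likely the easier option.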

Usage examples

from kittentts import KittenTTS

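# Create the synthesizer with its default settings.
tts = KittenTTS()
# Synthesize with the default voice and speed; returns a 1-D float32 array at 24 kHz.
audio = tts.generate("Welcome to KittenTTS.")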
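A slightly fuller sketch that exercises every documented parameter is shown below. `synthesize_voices` is a hypothetical helper, not part of the library. It takes the `tts` object as an argument, so it works with a real KittenTTS instance or with a stub in tests, and it relies only on the generate() signature given above.

```python
def synthesize_voices(tts, text, voices, speed=1.0, clean_text=True):
    """Call tts.generate() once per voice and return a {voice: audio} dict."""
    return {
        voice: tts.generate(text, voice=voice, speed=speed, clean_text=clean_text)
        for voice in voices
    }
```

With a real model this could be called as `synthesize_voices(KittenTTS(), "The fee is 12 dollars.", ["Bella", "Leo"], speed=1.25)`, which returns one audio array per voice.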
