model.generate returns a NumPy array of audio samples at 24,000 Hz. You can save or process this array however you like. This guide covers the most common approaches.

soundfile is installed automatically with KittenTTS and supports WAV, FLAC, OGG, and other formats.
import soundfile as sf
import numpy as np
from kittentts import KittenTTS

model = KittenTTS("KittenML/kitten-tts-nano-0.8")
audio = model.generate("Hello world", voice="Bella")

# Save as WAV (recommended)
sf.write("output.wav", audio, 24000)

# Save as FLAC (lossless)
sf.write("output.flac", audio, 24000)

# Save as OGG
sf.write("output.ogg", audio, 24000)
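The third argument to sf.write is the sample rate. Always use 24000 to match KittenTTS output; writing a different value does not resample the audio, it only changes the speed and pitch at playback.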
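sf.write treats a one-dimensional array as mono and a two-dimensional array as (frames, channels). If the array you hold has a leading singleton axis such as (1, N), it is worth flattening it before writing. The following is a minimal sketch of that step: as_frames is a hypothetical helper, not part of KittenTTS, and the sine wave is a synthetic stand-in for model output. Check audio.shape on your own output to see whether the step is needed at all.

```python
import numpy as np

SAMPLE_RATE = 24000  # KittenTTS output rate stated in this guide


def as_frames(audio: np.ndarray) -> np.ndarray:
    """Return mono audio as a 1-D array of frames (hypothetical helper)."""
    audio = np.asarray(audio)
    if audio.ndim == 1:
        return audio
    if audio.ndim == 2 and 1 in audio.shape:
        # Drop a singleton batch or channel axis: (1, N) or (N, 1) -> (N,)
        return audio.reshape(-1)
    raise ValueError(f"expected mono audio, got shape {audio.shape}")


# Synthetic stand-in for model output: 1 second of a 440 Hz tone, shape (1, 24000)
t = np.arange(SAMPLE_RATE, dtype=np.float32) / SAMPLE_RATE
fake_audio = (0.5 * np.sin(2 * np.pi * 440.0 * t)).astype(np.float32)[np.newaxis, :]

frames = as_frames(fake_audio)
print(frames.shape)  # (24000,)
```

The result can then be passed straight to sf.write("output.wav", frames, 24000).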

Using generate_to_file

For convenience, you can skip the intermediate array entirely:
model.generate_to_file(
    "Hello world",
    "output.wav",
    voice="Bella",
    speed=1.0,
    sample_rate=24000
)
generate_to_file accepts the same voice and speed parameters as generate, plus an explicit sample_rate argument.

Inspecting the audio array

The array returned by generate is a standard NumPy array. You can inspect its properties before saving:
audio = model.generate("Hello world", voice="Bella")
sample_rate = 24000  # KittenTTS output rate

print(f"Shape: {audio.shape}")            # e.g. (1, 48000) for ~2 seconds
print(f"dtype: {audio.dtype}")            # float32
print(f"Duration: {audio.shape[-1] / sample_rate:.2f} seconds")
print(f"Sample rate: {sample_rate} Hz")
Audio is returned as 32-bit float samples normalized to the range [-1.0, 1.0]. Most audio libraries accept this format directly.
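Some tools expect 16-bit integer PCM rather than floats. As a rough sketch of that conversion, the example below scales float samples to int16 and writes them with Python's standard-library wave module. float_to_pcm16 is a hypothetical helper, the audio is assumed to be mono, and the tone is a synthetic stand-in for model output.

```python
import os
import tempfile
import wave

import numpy as np

SAMPLE_RATE = 24000  # KittenTTS output rate stated in this guide


def float_to_pcm16(audio: np.ndarray) -> np.ndarray:
    """Convert float samples in [-1.0, 1.0] to 16-bit signed integers."""
    flat = np.asarray(audio, dtype=np.float32).reshape(-1)  # assumes mono
    clipped = np.clip(flat, -1.0, 1.0)  # guard against out-of-range samples
    return np.round(clipped * 32767.0).astype("<i2")  # little-endian int16


# Synthetic stand-in for model output: 0.5 seconds of a 220 Hz tone
t = np.arange(SAMPLE_RATE // 2, dtype=np.float32) / SAMPLE_RATE
fake_audio = (0.8 * np.sin(2 * np.pi * 220.0 * t)).astype(np.float32)

pcm = float_to_pcm16(fake_audio)

path = os.path.join(tempfile.mkdtemp(), "tone.wav")
with wave.open(path, "wb") as wav_file:
    wav_file.setnchannels(1)          # mono
    wav_file.setsampwidth(2)          # 2 bytes = 16 bits per sample
    wav_file.setframerate(SAMPLE_RATE)
    wav_file.writeframes(pcm.tobytes())

# Read the header back to confirm what was written
with wave.open(path, "rb") as wav_file:
    n_frames = wav_file.getnframes()
    rate = wav_file.getframerate()

print(n_frames, rate)  # 12000 24000
```

Clipping before scaling keeps a stray sample outside [-1.0, 1.0] from wrapping around when cast to an integer.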

Format comparison

| Format | Extension | Lossy | Notes |
| --- | --- | --- | --- |
| WAV | .wav | No | Universally compatible, larger files |
| FLAC | .flac | No | Lossless compression, smaller than WAV |
| OGG | .ogg | Yes | Smallest files, slight quality loss |
Use WAV or FLAC when audio quality matters (e.g., post-processing, archiving). Use OGG for streaming or web delivery where file size is a priority.
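To put "larger files" in numbers, uncompressed PCM size follows directly from the sample rate. The sketch below computes the size of the sample data alone, excluding the file header. pcm_data_bytes is a hypothetical helper, and the 2-byte default assumes 16-bit samples; compressed FLAC and OGG sizes depend on the content and are not estimated here.

```python
SAMPLE_RATE = 24000  # Hz, KittenTTS output rate stated in this guide


def pcm_data_bytes(seconds: float, bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Size of uncompressed PCM sample data in bytes, excluding the file header."""
    n_frames = int(round(seconds * SAMPLE_RATE))
    return n_frames * bytes_per_sample * channels


print(pcm_data_bytes(60))                      # 2880000 (16-bit mono, one minute)
print(pcm_data_bytes(60, bytes_per_sample=4))  # 5760000 (32-bit float mono)
```

One minute of 16-bit mono speech therefore comes to just under 3 MB as WAV.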