Method Signature

client.audio.speech.create(
    input: str,
    model: Union[str, Literal["tts-1", "tts-1-hd", "gpt-4o-mini-tts", "gpt-4o-mini-tts-2025-12-15"]],
    voice: Union[str, Literal["alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", "verse", "marin", "cedar"], dict],
    instructions: Optional[str] = None,
    response_format: Optional[Literal["mp3", "opus", "aac", "flac", "wav", "pcm"]] = None,
    speed: Optional[float] = None,
    stream_format: Optional[Literal["sse", "audio"]] = None
) -> BinaryAPIResponse

Parameters

input
str
required
The text to generate audio for. The maximum length is 4096 characters.
model
Union[str, Literal]
required
One of the available TTS models:
  • tts-1 - Standard quality text-to-speech
  • tts-1-hd - High definition text-to-speech
  • gpt-4o-mini-tts - Advanced TTS with instruction support
  • gpt-4o-mini-tts-2025-12-15 - Latest advanced TTS model
voice
Union[str, dict]
required
The voice to use when generating the audio. Supported built-in voices are:
  • alloy
  • ash
  • ballad
  • coral
  • echo
  • fable
  • onyx
  • nova
  • sage
  • shimmer
  • verse
  • marin
  • cedar
You may also provide a custom voice object with an id, for example {"id": "voice_1234"}.
instructions
str
Additional instructions that control the voice of the generated audio. Not supported by tts-1 or tts-1-hd.
response_format
Literal['mp3', 'opus', 'aac', 'flac', 'wav', 'pcm']
The format of the returned audio. Supported formats are:
  • mp3 (default)
  • opus
  • aac
  • flac
  • wav
  • pcm
speed
float
The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.
stream_format
Literal['sse', 'audio']
The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.
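
The constraints listed above (the 4096-character input limit, the 0.25 to 4.0 speed range, and the features that tts-1 and tts-1-hd do not support) can be checked locally before sending a request. The following is a minimal sketch, not part of the SDK: the helper name check_speech_params and the constant names are hypothetical, and it assumes that Python's len() approximates how the service counts characters.

```python
from typing import List, Optional

# Limits taken from the parameter descriptions above.
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0
# Models documented as not supporting `instructions` or the "sse" stream format.
LEGACY_MODELS = {"tts-1", "tts-1-hd"}


def check_speech_params(
    input: str,
    model: str,
    speed: Optional[float] = None,
    instructions: Optional[str] = None,
    stream_format: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if none are found."""
    problems: List[str] = []
    # Assumption: len() counts characters the same way the service does.
    if len(input) > MAX_INPUT_CHARS:
        problems.append(f"input is {len(input)} characters; maximum is {MAX_INPUT_CHARS}")
    if speed is not None and not MIN_SPEED <= speed <= MAX_SPEED:
        problems.append(f"speed {speed} is outside {MIN_SPEED} to {MAX_SPEED}")
    if model in LEGACY_MODELS:
        if instructions is not None:
            problems.append(f"instructions are not supported by {model}")
        if stream_format == "sse":
            problems.append(f"stream_format 'sse' is not supported by {model}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once; an empty list means the documented constraints are satisfied.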

Response

Returns a BinaryAPIResponse object containing the audio file data. The response can be saved to a file or streamed directly.
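
When saving the response to a file, it helps to keep the file extension in step with response_format. The sketch below is an illustration only: audio_path is a hypothetical helper, not an SDK function, and it assumes each format name doubles as a reasonable file extension (raw pcm has no standard extension, so ".pcm" is a convention here).

```python
from pathlib import Path
from typing import Optional

# Formats listed under response_format; mp3 is the documented default.
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def audio_path(stem: str, response_format: Optional[str] = None) -> Path:
    """Hypothetical helper: build an output path whose extension matches the format."""
    fmt = response_format or "mp3"  # None means the default format
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {fmt!r}")
    return Path(f"{stem}.{fmt}")
```

The resulting path can then be written with audio_path("output", "wav").write_bytes(response.content), using the same response.content attribute as the examples on this page.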

Examples

from dedalus_labs import DedalusLabs
from pathlib import Path

client = DedalusLabs()

# Generate speech and save to file
response = client.audio.speech.create(
    input="Hello! Welcome to our text-to-speech demo.",
    model="tts-1",
    voice="alloy"
)

with open("output.mp3", "wb") as f:
    f.write(response.content)

# Generate high-quality speech with custom speed
response = client.audio.speech.create(
    input="This is a test of the high definition text-to-speech system.",
    model="tts-1-hd",
    voice="nova",
    speed=1.2,
    response_format="wav"
)

Path("output.wav").write_bytes(response.content)

# Use advanced model with instructions
response = client.audio.speech.create(
    input="I'm excited to announce our new product launch!",
    model="gpt-4o-mini-tts",
    voice="shimmer",
    instructions="Speak with enthusiasm and energy",
    response_format="opus"
)

with open("announcement.opus", "wb") as f:
    f.write(response.content)

# Using a custom voice
response = client.audio.speech.create(
    input="Custom voice demonstration",
    model="gpt-4o-mini-tts-2025-12-15",
    voice={"id": "voice_abc123"},
    speed=0.9
)

Path("custom_voice.mp3").write_bytes(response.content)
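
Because input is limited to 4096 characters, longer text has to be split into several requests. One possible approach, sketched below, splits on sentence boundaries and falls back to a hard cut for any single sentence that exceeds the limit. The function split_for_tts is hypothetical and not part of the SDK, and the regular expression is a simple heuristic rather than a full sentence tokenizer.

```python
import re
from typing import List

MAX_INPUT_CHARS = 4096  # documented maximum length of `input`


def split_for_tts(text: str, limit: int = MAX_INPUT_CHARS) -> List[str]:
    """Hypothetical helper: split text into chunks of at most `limit` characters."""
    # Heuristic: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed as input to a separate client.audio.speech.create call. Joining the resulting audio segments depends on the chosen response_format and is outside the scope of this sketch.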
