Method Signature

client.audio.translations.create(
    file: FileTypes,
    model: str,
    prompt: Optional[str] = None,
    response_format: Optional[str] = None,
    temperature: Optional[float] = None
) -> TranslationCreateResponse

Parameters

file
FileTypes
required
The audio file to translate. Supported formats include:
  • mp3
  • mp4
  • mpeg
  • mpga
  • m4a
  • wav
  • webm
Maximum file size is 25 MB.
model
str
required
Model ID to use for translation (e.g., "openai/whisper-1").
prompt
str
Optional text to guide the model's style. The prompt should be in English.
response_format
str
The format of the transcript output. Supported formats:
  • json (default) - Simple JSON with text
  • text - Plain text
  • srt - SubRip subtitle format
  • verbose_json - JSON with detailed metadata
  • vtt - Web Video Text Tracks format
temperature
float
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
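Before uploading, you can check a file against the constraints above on the client side. A minimal sketch, assuming local files; the helper name and constants are illustrative and not part of the SDK:

```python
import os

# Formats accepted by the translations endpoint (per the list above).
SUPPORTED_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
MAX_FILE_SIZE = 25 * 1024 * 1024  # 25 MB limit


def validate_audio_file(path: str) -> None:
    """Raise ValueError if the file's extension or size is unsupported."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {ext!r}")
    if os.path.getsize(path) > MAX_FILE_SIZE:
        raise ValueError("File exceeds the 25 MB limit")
```

Validating locally avoids a round trip for files the API would reject anyway.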

Response

The response varies based on the response_format parameter:

JSON Format (default)

text
str
The translated text in English.

Verbose JSON Format

text
str
The translated text in English.
language
str
The language of the output translation (always "english", since translations target English).
duration
float
The duration of the input audio, in seconds.
segments
List[Segment]
Segments of the translated text and their corresponding details.
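If you need custom subtitle handling, the segment start and end times above can be formatted into SRT-style timestamps yourself. A minimal sketch of the conversion; the helper name is illustrative:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a seconds offset as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"
```

For standard subtitle output, prefer response_format="srt" and let the API do this for you.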

Examples

from dedalus_labs import DedalusLabs

client = DedalusLabs()

# Basic translation from any language to English
with open("french_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        file=audio_file,
        model="openai/whisper-1"
    )

print(translation.text)

# Get detailed translation with timestamps
with open("spanish_interview.wav", "rb") as audio_file:
    translation = client.audio.translations.create(
        file=audio_file,
        model="openai/whisper-1",
        response_format="verbose_json"
    )

print(f"Original language: {translation.language}")
print(f"Duration: {translation.duration}s")
print(f"English translation: {translation.text}")

for segment in translation.segments:
    print(f"{segment.start:.2f}s - {segment.end:.2f}s: {segment.text}")

# Generate English SRT subtitles from foreign language audio
with open("german_video.m4a", "rb") as audio_file:
    srt_output = client.audio.translations.create(
        file=audio_file,
        model="openai/whisper-1",
        response_format="srt"
    )

with open("english_subtitles.srt", "w") as f:
    f.write(srt_output.text)

# Use prompt to guide translation style
with open("japanese_presentation.wav", "rb") as audio_file:
    translation = client.audio.translations.create(
        file=audio_file,
        model="openai/whisper-1",
        prompt="This is a technical presentation about AI and machine learning.",
        temperature=0.2
    )

print(translation.text)

# Get plain text translation
with open("mandarin_lecture.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        file=audio_file,
        model="openai/whisper-1",
        response_format="text"
    )

with open("lecture_translation.txt", "w") as f:
    f.write(translation.text)