POST /v1/audio/translations
Translates audio from a non-English language into English text. The request body must be sent as multipart/form-data with the audio file included as a file field. This endpoint maps to the createTranslation operation and is compatible with the OpenAI Whisper translation API.
This endpoint always produces English output. To transcribe audio without translation, use audio transcriptions instead.
Request headers
The provider to route the request to (e.g. openai). Required when not using a config.
Your provider API key.
A JSON config object or config ID that defines routing, fallbacks, retries, and more.
A virtual key ID from Portkey Cloud.
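The header rules above can be sketched as a small helper. This is a minimal illustration, not the official client: the header names (x-portkey-provider, Authorization, x-portkey-config, x-portkey-virtual-key) and the Bearer prefix are assumptions, since the text above describes the headers without naming them, and build_headers is a hypothetical function.

```python
import json


def build_headers(provider=None, api_key=None, config=None, virtual_key=None):
    """Assemble request headers.

    All header names used here are assumptions; check them against the
    gateway's header reference before relying on them.
    """
    # The provider is required when no config is supplied.
    if provider is None and config is None:
        raise ValueError("provider is required when not using a config")

    headers = {}
    if provider is not None:
        headers["x-portkey-provider"] = provider
    if api_key is not None:
        headers["Authorization"] = f"Bearer {api_key}"
    if config is not None:
        # A config may be an ID (string) or a JSON object (dict).
        headers["x-portkey-config"] = (
            config if isinstance(config, str) else json.dumps(config)
        )
    if virtual_key is not None:
        headers["x-portkey-virtual-key"] = virtual_key
    return headers
```

Passing a dict as config serializes it to JSON, while a string is sent unchanged as a config ID.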
Request body
This endpoint accepts multipart/form-data. The audio file must be uploaded as a file field named file.
The audio file to translate. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size is 25 MB.
The speech-to-text model to use (e.g. whisper-1).
Optional English text to guide the model’s output style or provide context. The prompt should be in English even if the source audio is in another language.
The format of the translation output. One of json, text, srt, verbose_json, or vtt.
Sampling temperature between 0 and 1. Higher values produce more varied output.
Response
The response format depends on the response_format parameter.
json (default)
The translated English text.
verbose_json
The full translated text.
The detected language of the source audio.
The duration of the audio file in seconds.
Segment-level data for the translation.
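As an illustration of consuming a verbose_json response, the sketch below parses a response body into a small typed record. The JSON key names (text, language, duration, segments) are assumptions based on the OpenAI-compatible format, since the fields above are described without their names; VerboseTranslation and parse_verbose_json are hypothetical helpers.

```python
import json
from dataclasses import dataclass, field


@dataclass
class VerboseTranslation:
    text: str          # the full translated text
    language: str      # detected language of the source audio
    duration: float    # duration of the audio file in seconds
    segments: list = field(default_factory=list)  # segment-level data


def parse_verbose_json(body: str) -> VerboseTranslation:
    """Parse a verbose_json response body (key names are assumed)."""
    data = json.loads(body)
    return VerboseTranslation(
        text=data["text"],
        language=data.get("language", ""),
        duration=float(data.get("duration", 0.0)),
        segments=data.get("segments", []),
    )
```

Only the text key is treated as mandatory here; the other keys fall back to empty defaults so the parser also accepts a plain json response.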
text, srt, vtt
Plain-text or subtitle format strings.
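The request body constraints documented above can be checked client-side before uploading. The following is a minimal sketch under stated assumptions: the form field names model, prompt, and temperature are assumed from the OpenAI-compatible format (only file and response_format are named in this page), "25 MB" is interpreted as 25 × 1024 × 1024 bytes, and build_form_fields is a hypothetical helper.

```python
import os

SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB


def build_form_fields(filename, size_bytes, model, prompt=None,
                      response_format=None, temperature=None):
    """Validate the inputs and return the non-file form fields as strings."""
    # Check the file extension against the supported audio formats.
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("audio file exceeds the 25 MB limit")

    fields = {"model": model}
    if prompt is not None:
        fields["prompt"] = prompt  # should be English text
    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"invalid response_format: {response_format!r}")
        fields["response_format"] = response_format
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        fields["temperature"] = str(temperature)
    return fields
```

With an HTTP client such as requests, the returned dict could be passed as data and the opened audio file as files={"file": ...}, so that the client encodes the body as multipart/form-data.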