Basic usage
Transcribe an audio file to text.
Async usage
Transcribe audio asynchronously.
Supported formats
The transcription API supports the following audio formats:
- mp3
- mp4
- mpeg
- mpga
- m4a
- wav
- webm
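A quick pre-flight check can catch unsupported files before an upload is attempted. The sketch below is a minimal illustration based only on the extensions listed above; the helper name `is_supported_audio` is hypothetical, and it inspects the file extension rather than the actual audio content.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def is_supported_audio(path):
    """Return True if the file extension matches a supported format."""
    ext = Path(path).suffix.lower().lstrip(".")
    return ext in SUPPORTED_EXTENSIONS


print(is_supported_audio("meeting.MP3"))  # True
print(is_supported_audio("notes.flac"))   # False
```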
File upload methods
- PathLike
- Bytes
- Tuple
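To show how the three input forms relate, the sketch below reduces each of them to a `(filename, bytes)` pair. It is an illustration only: `normalize_upload` is a hypothetical helper, not part of the API, and it assumes the tuple form is a `(filename, content)` pair, which this page does not spell out.

```python
import os
import tempfile
from pathlib import Path


def normalize_upload(file, default_name="audio.wav"):
    """Hypothetical helper: reduce any input form to (filename, bytes)."""
    if isinstance(file, os.PathLike):
        path = Path(file)
        return path.name, path.read_bytes()
    if isinstance(file, (bytes, bytearray)):
        # Raw bytes carry no name, so a placeholder name is used.
        return default_name, bytes(file)
    if isinstance(file, tuple) and len(file) == 2:
        # Assumed layout: (filename, content).
        name, content = file
        return str(name), bytes(content)
    raise TypeError(f"unsupported upload type: {type(file).__name__}")


# Write a tiny placeholder file so the PathLike branch can be exercised.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "clip.wav"
    sample.write_bytes(b"RIFF")
    from_path = normalize_upload(sample)

from_bytes = normalize_upload(b"RIFF")
from_tuple = normalize_upload(("clip.wav", b"RIFF"))
```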
Use Path objects for local files.
Language specification
Improve accuracy by specifying the input language using ISO 639-1 codes.
Common language codes
- en: English
- es: Spanish
- fr: French
- de: German
- it: Italian
- pt: Portuguese
- nl: Dutch
- ja: Japanese
- ko: Korean
- zh: Chinese
- ru: Russian
- ar: Arabic
- hi: Hindi
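When the language is known by name rather than by code, a small lookup keeps the codes consistent. The sketch below covers only the codes listed above; `code_for` is a hypothetical helper, and the `language` key in the final dictionary is an assumed parameter name that should be checked against the API reference.

```python
# Codes copied from the common language list above.
COMMON_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "it": "Italian", "pt": "Portuguese", "nl": "Dutch", "ja": "Japanese",
    "ko": "Korean", "zh": "Chinese", "ru": "Russian", "ar": "Arabic",
    "hi": "Hindi",
}

# Reverse index: lowercase language name -> ISO 639-1 code.
_NAME_TO_CODE = {name.lower(): code for code, name in COMMON_LANGUAGES.items()}


def code_for(language_name):
    """Return the ISO 639-1 code for a language name in the short list."""
    key = language_name.strip().lower()
    if key not in _NAME_TO_CODE:
        raise KeyError(f"{language_name!r} is not in the common language list")
    return _NAME_TO_CODE[key]


# Assumed parameter name; verify it against the API reference.
options = {"language": code_for("Spanish")}
print(options)  # {'language': 'es'}
```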
Response formats
Choose from multiple output formats:
- JSON (default)
- Text
- SRT (subtitles)
- VTT (WebVTT)
- Verbose JSON
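For orientation on what the subtitle formats contain, SRT output is a series of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line followed by the cue text. The sketch below renders that layout from a list of `(start, end, text)` tuples; that input shape is an assumption made for illustration and is not the API's verbose JSON schema.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        cues.append(f"{index}\n{timing}\n{text.strip()}")
    return "\n\n".join(cues) + "\n"


# Hypothetical segments, used only to show the output layout.
print(to_srt([(0.0, 2.5, "Hello."), (2.5, 4.0, "World.")]))
```

WebVTT is closely related: it starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.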
Prompting for context
Provide context to guide the transcription style and content.
The prompt parameter helps the model:
- Maintain consistent spelling of uncommon words
- Match a specific writing style
- Continue from previous audio segments
- Use domain-specific terminology correctly
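One way to act on the points above is to assemble the prompt from a glossary of uncommon terms plus the tail of the previous segment's transcript. The sketch below is illustrative: `build_prompt` is a hypothetical helper, and the `max_chars` cutoff is an arbitrary choice for the example, not a documented limit.

```python
def build_prompt(glossary=(), previous_transcript="", max_chars=400):
    """Hypothetical helper: combine key terms with the tail of earlier text."""
    parts = []
    if glossary:
        # Listing terms encourages consistent spelling of uncommon words.
        parts.append("Glossary: " + ", ".join(glossary) + ".")
    tail = previous_transcript.strip()
    # Keep only the most recent text; guard against the [-0:] slicing pitfall.
    tail = tail[-max_chars:] if max_chars > 0 else ""
    if tail:
        parts.append(tail)
    return " ".join(parts)


prompt = build_prompt(
    glossary=["Kubernetes", "etcd"],
    previous_transcript="We then restarted the control plane.",
)
print(prompt)
# Glossary: Kubernetes, etcd. We then restarted the control plane.
```

The resulting string would then be passed as the prompt parameter of the transcription request.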
Temperature control
Adjust the sampling temperature for different transcription behaviors:
- temperature=0.0: More deterministic and consistent
- temperature=0.5: Balanced (default)
- temperature=1.0: More varied transcriptions
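Naming the three settings above can make call sites easier to read. The sketch below is a small convenience layer, not part of the API: the preset names and the `pick_temperature` helper are hypothetical, and only the numeric values come from this page.

```python
# Values taken from the settings described above; preset names are made up.
TEMPERATURE_PRESETS = {
    "deterministic": 0.0,
    "balanced": 0.5,
    "varied": 1.0,
}


def pick_temperature(preset="balanced"):
    """Return the temperature value for a named preset."""
    if preset not in TEMPERATURE_PRESETS:
        known = ", ".join(sorted(TEMPERATURE_PRESETS))
        raise ValueError(f"unknown preset {preset!r}; expected one of: {known}")
    return TEMPERATURE_PRESETS[preset]


# The value would be passed as the temperature argument of the request.
options = {"temperature": pick_temperature("deterministic")}
print(options)  # {'temperature': 0.0}
```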