This guide will walk you through setting up the project and transcribing your first audio file using the ElevenLabs Scribe API.
Prerequisites
Before you begin, make sure you have:
- Bun installed on your system
- An ElevenLabs API key (sign up for free at ElevenLabs)
- An audio or video file to transcribe
Install dependencies
Clone the repository and install the required packages using Bun by running bun install in the project root. This installs all dependencies from package.json, including:
- @elevenlabs/elevenlabs-js - Official ElevenLabs SDK
- react and react-dom - React framework
- @radix-ui/* - Accessible UI components
- tailwindcss - Styling framework
Start the development server
Launch the development server with hot module reloading. Once it is running, open your browser and navigate to http://localhost:3000 to see the application.
Get your ElevenLabs API key
To use the Speech-to-Text API, you’ll need an ElevenLabs API key:
- Go to ElevenLabs
- Sign up or log in to your account
- Navigate to your profile settings
- Copy your API key
Transcribe your first audio file
Now you’re ready to transcribe audio:
- Enter your API key - Paste your ElevenLabs API key into the “ElevenLabs API Key” field
- Upload an audio file - Click “Choose File” and select an audio or video file (supports MP3, WAV, M4A, AAC, OGG, WebM, and more)
- Configure options (optional) - Customize transcription settings:
  - Model: Choose between Scribe V1 or Scribe V2 (V2 recommended)
  - Diarize: Enable speaker detection for multi-speaker audio
  - Timestamps Granularity: Choose word or character-level timestamps
  - Language Code: Specify a language (e.g., “en”, “es”, “fr”) for better accuracy
- Click “Transcribe Audio” - The application will send your file to the ElevenLabs API
Transcription time depends on your audio file length and the selected options. Most files process in seconds.
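The request behind the “Transcribe Audio” button can be sketched roughly as follows. This is a minimal illustration, not the application's actual code (the project depends on the official SDK, which may be what it uses internally). The endpoint, the xi-api-key header, and the form field names reflect the public ElevenLabs REST API as best understood here, and the model identifiers scribe_v1 and scribe_v2 are assumptions; check all of them against the API reference. TranscribeOptions, buildTranscriptionForm, and transcribe are hypothetical names introduced for this example.

```typescript
// Hypothetical option shape for this sketch; it mirrors the UI fields above.
interface TranscribeOptions {
  modelId: "scribe_v1" | "scribe_v2"; // assumed model identifiers
  diarize?: boolean;
  timestampsGranularity?: "word" | "character";
  languageCode?: string; // e.g. "en", "es", "fr"
}

// Builds the multipart form body. Optional fields are only sent when set,
// so the API falls back to its own defaults otherwise.
function buildTranscriptionForm(
  file: Blob,
  fileName: string,
  options: TranscribeOptions,
): FormData {
  const form = new FormData();
  form.append("file", file, fileName);
  form.append("model_id", options.modelId);
  if (options.diarize !== undefined) {
    form.append("diarize", String(options.diarize));
  }
  if (options.timestampsGranularity) {
    form.append("timestamps_granularity", options.timestampsGranularity);
  }
  if (options.languageCode) {
    form.append("language_code", options.languageCode);
  }
  return form;
}

// Sends the file to the speech-to-text endpoint and returns the parsed JSON.
// The response is typed as unknown because its shape is not defined in this guide.
async function transcribe(
  apiKey: string,
  file: Blob,
  fileName: string,
  options: TranscribeOptions,
): Promise<unknown> {
  const response = await fetch("https://api.elevenlabs.io/v1/speech-to-text", {
    method: "POST",
    headers: { "xi-api-key": apiKey },
    body: buildTranscriptionForm(file, fileName, options),
  });
  if (!response.ok) {
    throw new Error(
      `Transcription failed: ${response.status} ${await response.text()}`,
    );
  }
  return response.json();
}
```

Separating form construction from the network call keeps the option handling easy to check without spending API credits.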
View and interact with results
Once transcription completes, you’ll see:
- Full transcript text - The complete transcription of your audio
- Audio player - Play back the original audio file
- Interactive transcript - Click any word to jump to that timestamp in the audio
- Speaker labels - If you enabled diarization, you can rename speakers for clarity
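The interactive behaviours listed above could be built on word-level timestamps along these lines. This is a sketch under assumptions: the TranscriptWord shape (text, start and end in seconds, optional speakerId) is a simplified stand-in for whatever the API returns, and findWordIndexAt, seekToWord, and displayName are hypothetical helpers, not functions from the project.

```typescript
// Simplified, assumed shape of one transcribed word.
interface TranscriptWord {
  text: string;
  start: number; // seconds from the beginning of the audio
  end: number;
  speakerId?: string;
}

// Binary search for the last word that starts at or before `time`.
// Returns -1 when `time` is before the first word. Assumes `words`
// is sorted by start time.
function findWordIndexAt(words: TranscriptWord[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  let result = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (words[mid]!.start <= time) {
      result = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return result;
}

// Clicking a word moves playback to that word's start time.
// Any object with a writable currentTime works, including an <audio> element.
function seekToWord(player: { currentTime: number }, word: TranscriptWord): void {
  player.currentTime = word.start;
}

// Resolves the label shown for a speaker, preferring a user-chosen name.
function displayName(
  speakerId: string | undefined,
  customNames: Map<string, string>,
): string {
  if (!speakerId) return "Unknown speaker";
  return customNames.get(speakerId) ?? speakerId;
}
```

Calling findWordIndexAt from the player's timeupdate handler is one way to highlight the word currently being spoken, while seekToWord covers the click-to-jump direction.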
Next steps
Now that you’ve transcribed your first audio file, you can:
- Explore advanced options - Try different models, enable entity detection, add custom keyterms, or adjust the temperature parameter
- Read the installation guide - Learn more about setting up the development environment and project structure