This guide covers everything you need to install and run the ElevenLabs Speech-to-Text UI on your local machine.
System requirements
Before installing, ensure your system meets these requirements:
Operating System: macOS, Linux, or Windows (with WSL recommended)
Node.js: Not required - Bun is an all-in-one JavaScript runtime
RAM: At least 4GB available
Disk Space: ~500MB for dependencies and build files
Install Bun
This project uses Bun as its JavaScript runtime, package manager, and dev server. Bun is designed to start up and install packages faster than Node.js with npm.
Install Bun on macOS or Linux
Run the installation script: curl -fsSL https://bun.sh/install | bash
This will download and install the latest version of Bun.
Install Bun on Windows
On Windows, you can install Bun using PowerShell: powershell -c "irm bun.sh/install.ps1 | iex"
Alternatively, use WSL (Windows Subsystem for Linux) and follow the macOS/Linux instructions.
Verify Bun installation
Check that Bun is installed correctly: bun --version
You should see a version number like 1.3.4 or higher.
Bun serves as both the package manager (like npm) and the runtime (like Node.js), so you don't need to install Node.js or npm separately.
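As a rough illustration of the "1.3.4 or higher" requirement, the sketch below compares dotted version strings numerically. The helper name `isVersionAtLeast` is hypothetical and not part of this project. Under Bun the running version is available as `Bun.version`; a plain string stands in for it here so the sketch also runs outside Bun.

```typescript
// Hypothetical helper: compare two dotted version strings numerically.
// Missing or non-numeric segments are treated as 0.
function isVersionAtLeast(current: string, minimum: string): boolean {
  const parse = (v: string): number[] =>
    v.split(".").map((part) => Number.parseInt(part, 10) || 0);
  const a = parse(current);
  const b = parse(minimum);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0;
    const right = b[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // the two versions are equal
}

// Stand-in for `Bun.version`, so the sketch does not depend on the Bun runtime.
const detectedVersion = "1.3.10";
console.log(isVersionAtLeast(detectedVersion, "1.3.4")); // true
```

Comparing segment by segment avoids the trap of plain string comparison, where "1.3.10" would sort before "1.3.4".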
Clone and setup the project
Clone the repository
Clone the project to your local machine: git clone <repository-url>
cd <project-directory>
Install dependencies
Install all required packages using Bun: bun install
This command reads from package.json and installs all dependencies:
{
  "dependencies": {
    "@elevenlabs/elevenlabs-js": "^2.34.0",
    "@radix-ui/react-checkbox": "^1.3.3",
    "@radix-ui/react-label": "^2.1.7",
    "@radix-ui/react-progress": "^1.1.8",
    "@radix-ui/react-select": "^2.2.6",
    "@radix-ui/react-slot": "^1.2.3",
    "bun-plugin-tailwind": "^0.1.2",
    "class-variance-authority": "^0.7.1",
    "clsx": "^2.1.1",
    "lucide-react": "^0.563.0",
    "react": "^19",
    "react-dom": "^19",
    "tailwind-merge": "^3.3.1"
  }
}
Bun's package installation is typically 10-20x faster than npm, so this should complete in seconds.
Verify the installation
Ensure everything is set up correctly by starting the development server. The server starts using the configuration in src/index.ts:
import { serve } from "bun";
import index from "./index.html";

const server = serve({
  routes: {
    "/*": index,
  },
  development: process.env.NODE_ENV !== "production" && {
    hmr: true, // Hot module reloading
    console: true, // Echo browser console to server
  },
});

console.log(`🚀 Server running at ${server.url}`);
You should see: 🚀 Server running at http://localhost:3000
Open the application
Navigate to http://localhost:3000 in your web browser. You should see the Speech-to-Text Playground interface with:
An API key input field
An audio file upload section
Configuration options for transcription
A "Transcribe Audio" button
Development scripts
The project includes several scripts defined in package.json, which you run with Bun:
Start development server
Runs the development server with hot module reloading enabled. Any changes to source files will automatically refresh the browser.
Run production build
Starts the server in production mode (sets NODE_ENV=production). This disables hot reloading and console echoing for better performance.
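To make the effect of NODE_ENV concrete, here is a minimal sketch of how the `development` expression from src/index.ts evaluates. The function name `developmentConfig` and the `DevelopmentOptions` type are hypothetical wrappers added for illustration; only the expression inside mirrors the project code.

```typescript
// Options mirrored from the `development` block in src/index.ts.
type DevelopmentOptions = { hmr: boolean; console: boolean };

// Hypothetical wrapper around the expression used in src/index.ts.
// Returns `false` in production, otherwise the development options object.
function developmentConfig(
  nodeEnv: string | undefined,
): false | DevelopmentOptions {
  return nodeEnv !== "production" && { hmr: true, console: true };
}

console.log(developmentConfig("production")); // false
// Any other value, including an unset NODE_ENV, enables both options.
console.log(developmentConfig(undefined));
```

Because the check is `!== "production"`, leaving NODE_ENV unset keeps hot reloading and console echoing on; only the exact string "production" turns them off.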
Build the project
Executes the custom build script defined in build.ts to compile and bundle the application.
Check for unused code
Runs Knip to identify unused files, dependencies, and exports in your codebase.
Project structure
Here's an overview of the key directories and files:
.
├── src/
│   ├── index.ts                    # Bun server entry point
│   ├── index.html                  # HTML template
│   ├── index.css                   # Global styles
│   ├── App.tsx                     # Root React component
│   ├── frontend.tsx                # Frontend entry point
│   ├── components/
│   │   └── ui/                     # Reusable UI components
│   │       ├── button.tsx
│   │       ├── card.tsx
│   │       ├── input.tsx
│   │       ├── select.tsx
│   │       └── ...
│   └── features/
│       ├── speech-to-text-playground/
│       │   ├── speech-to-text-playground.tsx  # Main playground component
│       │   ├── transcription-form.tsx         # Form for API key and options
│       │   ├── transcription-result.tsx       # Results display
│       │   ├── speech-to-text-types.ts        # TypeScript types
│       │   └── transcript-utils.ts            # Helper functions
│       └── transcript-view/
│           ├── transcript-viewer.tsx          # Interactive transcript UI
│           ├── use-transcript-viewer.ts       # Transcript viewer hooks
│           └── ...
├── package.json                    # Dependencies and scripts
├── tsconfig.json                   # TypeScript configuration
└── build.ts                        # Build script
Understanding the core implementation
The main transcription logic is in src/features/speech-to-text-playground/speech-to-text-playground.tsx:
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

// Initialize the ElevenLabs client with your API key
const browserClient = new ElevenLabsClient({ apiKey });

// Call the Speech-to-Text API
const transcriptResponse = await browserClient.speechToText.convert({
  file,
  modelId: "scribe_v2",
  languageCode: options.languageCode || undefined,
  tagAudioEvents: options.tagAudioEvents || false,
  numSpeakers: options.numSpeakers || undefined,
  timestampsGranularity: options.timestampsGranularity || "character",
  diarize: options.diarize || false,
  useMultiChannel: options.useMultiChannel || false,
  keyterms: options.keyterms || undefined,
  entityDetection: options.entityDetection || undefined,
});
The default configuration is defined in the same file:
const defaultTranscriptOptions: TranscriptOptions = {
  modelId: "scribe_v2",
  tagAudioEvents: false,
  timestampsGranularity: "character",
  diarize: false,
  useMultiChannel: false,
};
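The sketch below shows one way the defaults and the `||` fallbacks could combine before a request is sent. The `TranscriptOptions` shape is an assumption inferred from the fields shown in this guide (the real type lives in speech-to-text-types.ts and may differ), and the helpers `withDefaults` and `toRequestFields` are hypothetical names, not functions from the project.

```typescript
// Assumed shape, inferred from the fields used in this guide; the real type
// in speech-to-text-types.ts may differ.
type TranscriptOptions = {
  modelId: string;
  languageCode?: string;
  tagAudioEvents: boolean;
  numSpeakers?: number;
  timestampsGranularity: string;
  diarize: boolean;
  useMultiChannel: boolean;
};

const defaultTranscriptOptions: TranscriptOptions = {
  modelId: "scribe_v2",
  tagAudioEvents: false,
  timestampsGranularity: "character",
  diarize: false,
  useMultiChannel: false,
};

// Hypothetical helper: overlay user-supplied values on top of the defaults.
function withDefaults(overrides: Partial<TranscriptOptions>): TranscriptOptions {
  return { ...defaultTranscriptOptions, ...overrides };
}

// Hypothetical helper mirroring the `||` fallbacks in the convert() call.
// Falsy values such as "" and 0 are replaced by the fallback.
function toRequestFields(options: TranscriptOptions) {
  return {
    modelId: options.modelId,
    languageCode: options.languageCode || undefined,
    tagAudioEvents: options.tagAudioEvents || false,
    numSpeakers: options.numSpeakers || undefined,
    timestampsGranularity: options.timestampsGranularity || "character",
    diarize: options.diarize || false,
    useMultiChannel: options.useMultiChannel || false,
  };
}

const fields = toRequestFields(withDefaults({ diarize: true, languageCode: "" }));
console.log(fields.diarize); // true
console.log(fields.languageCode); // undefined
```

Note that `||` treats an empty language code or a speaker count of 0 as "not set" and omits it. That suits form inputs, where an empty field usually means the user left it blank.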
Troubleshooting
Port already in use
If you see an error about port 3000 being in use, either:
Stop the process using that port
Modify the server configuration in src/index.ts to use a different port
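For the second option, Bun's `serve` accepts a `port` setting alongside `routes`. The sketch below is one way to choose that value; the helper `resolvePort` and the use of a `PORT` environment variable are assumptions for illustration, not part of the project.

```typescript
// Hypothetical helper: read a port from an environment-style record and fall
// back to 3000 when the value is missing or not a valid TCP port number.
function resolvePort(
  env: Record<string, string | undefined>,
  fallback = 3000,
): number {
  const raw = (env["PORT"] ?? "").trim();
  if (!/^\d+$/.test(raw)) return fallback; // missing or non-numeric
  const port = Number(raw);
  return port >= 1 && port <= 65535 ? port : fallback;
}

// In src/index.ts the result could be passed to `serve`, for example:
//   const server = serve({ port: resolvePort(process.env), routes: { "/*": index } });
console.log(resolvePort({ PORT: "3001" })); // 3001
console.log(resolvePort({})); // 3000
```

Validating the value first means a typo in the environment variable falls back to the default port and does not fail at startup.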
Module not found errors
If you encounter module resolution errors:
This will clear the cache and reinstall all dependencies.
TypeScript errors
The project uses TypeScript with strict type checking. If you see type errors:
Check that all dependencies are installed: bun install
Verify your TypeScript version: bun tsc --version
Review the type definitions in src/features/speech-to-text-playground/speech-to-text-types.ts
Make sure you're using Bun version 1.3.4 or higher. Earlier versions may have compatibility issues with some dependencies.
Next steps
Quickstart guide: Follow the quickstart to transcribe your first audio file
Back to introduction: Return to the introduction page