Overview
The audio playback system provides the controls for playing an audio file and navigating its transcript: a play/pause button, a custom scrub bar for precise seeking, and state that keeps the displayed time synchronized with the audio element.
Audio Component
The viewer uses a hidden HTML5 audio element controlled by React:
function TranscriptViewerAudio({ ...props }: ComponentPropsWithoutRef<"audio">) {
const { audioProps } = useTranscriptViewerContext();
return <audio {...audioProps} {...props} />;
}
Audio Props Configuration
The audio element is configured automatically by the container:
const audioProps = useMemo(
() => ({
ref: audioRef,
controls: false, // Use custom controls
preload: "metadata" as const,
src: audioSrc,
children: <source src={audioSrc} type={audioType} />,
}),
[audioRef, audioSrc, audioType] // audioType is read in children, so it belongs in the dependency list
);
Play/Pause Control
The play/pause button automatically toggles based on playback state:
function TranscriptViewerPlayPauseButton({
className,
children,
onClick,
...props
}: TranscriptViewerPlayPauseButtonProps) {
const { isPlaying, play, pause } = useTranscriptViewerContext();
const Icon = isPlaying ? Pause : Play;
function handleClick(event: React.MouseEvent<HTMLButtonElement>) {
if (isPlaying) pause();
else play();
onClick?.(event);
}
const content =
typeof children === "function"
? (children as RenderChildren)({ isPlaying })
: children;
return (
<Button onClick={handleClick} className={className} {...props}>
{content ?? <Icon />}
</Button>
);
}
Usage
Default Icon
<TranscriptViewerPlayPauseButton />
Displays a Play or Pause icon automatically.
Custom Content
<TranscriptViewerPlayPauseButton>
{({ isPlaying }) => (isPlaying ? "Pause" : "Play")}
</TranscriptViewerPlayPauseButton>
Render custom content based on playback state.
Styled Button
<TranscriptViewerPlayPauseButton
variant="outline"
size="lg"
className="w-full"
/>
All Button props are supported.
Playback Control Functions
The viewer hook provides imperative playback controls:
const play = useCallback(
function play() {
const audio = audioRef.current;
if (!audio) return;
if (audio.paused) {
void audio.play();
}
},
[audioRef]
);
const pause = useCallback(
function pause() {
const audio = audioRef.current;
if (audio && !audio.paused) {
audio.pause();
}
},
[audioRef]
);
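The play function above discards the promise returned by audio.play() with void. Browsers can reject that promise, for example when autoplay is blocked. A minimal sketch of surfacing the rejection instead, assuming a hypothetical playWithErrorHandler helper and a PlayableMedia type that are not part of the viewer:

```typescript
// Minimal structural type: the subset of HTMLMediaElement this sketch relies on.
interface PlayableMedia {
  paused: boolean;
  play(): Promise<void>;
}

// Hypothetical helper (an assumption, not the viewer's API): starts playback
// only when the element is paused, and forwards a rejected play() promise
// to the supplied callback instead of discarding it.
function playWithErrorHandler(
  media: PlayableMedia | null,
  onError: (error: unknown) => void
): void {
  if (!media || !media.paused) return;
  media.play().catch(onError);
}
```

An HTMLAudioElement satisfies PlayableMedia structurally, so audioRef.current could be passed in directly.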
Scrub Bar
The scrub bar provides visual feedback and seeking capabilities:
function TranscriptViewerScrubBar({
className,
showTimeLabels = true,
labelsClassName,
trackClassName,
progressClassName,
thumbClassName,
...props
}: TranscriptViewerScrubBarProps) {
const { duration, currentTime, seekToTime, startScrubbing, endScrubbing } =
useTranscriptViewerContext();
return (
<ScrubBarContainer
duration={duration}
value={currentTime}
onScrub={seekToTime}
onScrubStart={startScrubbing}
onScrubEnd={endScrubbing}
className={className}
{...props}
>
<ScrubBarTrack className={trackClassName}>
<ScrubBarProgress className={progressClassName} />
<ScrubBarThumb className={thumbClassName} />
</ScrubBarTrack>
{showTimeLabels && (
<div className="flex justify-between">
<ScrubBarTimeLabel time={currentTime} />
<ScrubBarTimeLabel time={duration} />
</div>
)}
</ScrubBarContainer>
);
}
Scrub Bar Components
- ScrubBarContainer: Handles mouse/touch events for scrubbing
- ScrubBarTrack: The background track
- ScrubBarProgress: Visual indicator of playback progress
- ScrubBarThumb: Draggable handle for seeking
- ScrubBarTimeLabel: Formatted time display
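The source for these parts is not shown on this page. As a rough sketch of the two calculations they would need, assuming hypothetical helpers named positionToTime (pointer position to seconds) and formatTime (seconds to an m:ss label):

```typescript
// Hypothetical helper: maps a pointer's horizontal position to a time.
// trackLeft and trackWidth would typically come from getBoundingClientRect().
function positionToTime(
  clientX: number,
  trackLeft: number,
  trackWidth: number,
  duration: number
): number {
  if (!(trackWidth > 0) || !Number.isFinite(duration) || duration <= 0) {
    return 0;
  }
  // Clamp so that dragging past either end of the track stays in range.
  const ratio = Math.min(1, Math.max(0, (clientX - trackLeft) / trackWidth));
  return ratio * duration;
}

// Hypothetical helper: formats seconds as minutes:seconds, rounding down.
function formatTime(seconds: number): string {
  if (!Number.isFinite(seconds) || seconds < 0) return "0:00";
  const total = Math.floor(seconds);
  const mins = Math.floor(total / 60);
  const secs = total % 60;
  return mins + ":" + (secs < 10 ? "0" + secs : String(secs));
}
```

The actual ScrubBarContainer and ScrubBarTimeLabel may differ, for instance by showing hours for long recordings.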
Customization
<TranscriptViewerScrubBar
showTimeLabels={true}
trackClassName="h-2"
progressClassName="bg-blue-500"
thumbClassName="w-4 h-4"
labelsClassName="text-sm text-gray-600"
/>
Seeking Functionality
Seek to Time
Seek to a specific time in seconds:
const seekToTime = useCallback(
function seekToTime(time: number) {
const node = audioRef.current;
if (!node) return;
syncSeekTimeState(time);
node.currentTime = time;
},
[audioRef, syncSeekTimeState]
);
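seekToTime assigns the value it receives as-is. Callers that compute the time themselves may want to sanitize it first, since assigning a non-finite number to an audio element's currentTime throws in browsers. A minimal sketch, assuming a hypothetical clampSeekTime helper that is not part of the viewer:

```typescript
// Hypothetical helper: keeps a requested seek time inside [0, duration].
// When the duration is not yet known (NaN, 0, or Infinity), only the lower
// bound is applied so that early seeks are not forced to 0.
function clampSeekTime(time: number, duration: number): number {
  if (!Number.isFinite(time)) return 0;
  const upper = Number.isFinite(duration) && duration > 0 ? duration : Infinity;
  return Math.min(Math.max(time, 0), upper);
}
```

It could be applied at the call site, for example seekToTime(clampSeekTime(requested, duration)).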
Seek to Word
Jump to a specific word in the transcript:
const seekToWord = useCallback(
function seekToWord(word: number | TranscriptWord) {
const target = typeof word === "number" ? words[word] : word;
if (!target) return;
seekToTime(target.startTime);
},
[seekToTime, words]
);
This enables clickable words in the transcript:
<TranscriptViewerWords
renderWord={({ word, status }) => (
<span
className="cursor-pointer hover:underline"
onClick={() => seekToWord(word)}
>
{word.text}
</span>
)}
/>
Scrubbing State Management
While the user drags the scrub bar, the requestAnimationFrame (RAF) loop is paused for performance:
const startScrubbing = useCallback(
function startScrubbing() {
setIsScrubbing(true);
stopRaf(); // Pause the animation frame loop
},
[stopRaf]
);
const endScrubbing = useCallback(
function endScrubbing() {
setIsScrubbing(false);
const node = audioRef.current;
if (node && !node.paused) {
startRaf(); // Resume if still playing
}
},
[audioRef, startRaf]
);
The isScrubbing state can be used to show visual feedback during seeking, such as a larger preview or tooltip.
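The implementations of startRaf and stopRaf are not shown on this page. One way such a start/stop pair could be structured is sketched below, assuming a hypothetical createFrameLoop factory whose scheduler is injected so it can run outside a browser:

```typescript
type Schedule = (callback: () => void) => number;
type Cancel = (handle: number) => void;

// Hypothetical factory (an assumption, not the viewer's code): runs onFrame
// once per scheduled frame until stop() is called.
function createFrameLoop(onFrame: () => void, schedule: Schedule, cancel: Cancel) {
  let running = false;
  let handle = 0;

  function tick(): void {
    if (!running) return;
    onFrame();
    // onFrame may have called stop(), so re-check before rescheduling.
    if (running) handle = schedule(tick);
  }

  return {
    start(): void {
      if (running) return; // calling start twice must not schedule two loops
      running = true;
      handle = schedule(tick);
    },
    stop(): void {
      if (!running) return;
      running = false;
      cancel(handle);
    },
    isRunning: (): boolean => running,
  };
}
```

In a browser, the scheduler arguments would be (cb) => requestAnimationFrame(cb) and (id) => cancelAnimationFrame(id).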
Audio Event Handling
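The viewer subscribes to the audio element's play, pause, ended, timeupdate, seeked, durationchange, and loadedmetadata events, and removes the listeners on cleanup: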
useEffect(
function setupAudioEventListeners() {
const audio = audioRef.current;
if (!audio) return;
function handlePlay() {
setIsPlaying(!audio.paused);
startRaf();
onPlay?.();
}
function handlePause() {
setIsPlaying(!audio.paused);
setCurrentTime(audio.currentTime);
stopRaf();
onPause?.();
}
function handleEnded() {
setIsPlaying(false);
setCurrentTime(audio.currentTime);
stopRaf();
onEnded?.();
}
function handleTimeUpdateEvent() {
setCurrentTime(audio.currentTime);
onTimeUpdate?.(audio.currentTime);
}
function handleSeeked() {
setCurrentTime(audio.currentTime);
handleTimeUpdateRef.current(audio.currentTime);
}
function handleDuration() {
setDuration(Number.isFinite(audio.duration) ? audio.duration : 0);
onDurationChange?.(audio.duration);
}
// Initial sync
setIsPlaying(!audio.paused);
setCurrentTime(audio.currentTime);
setDuration(Number.isFinite(audio.duration) ? audio.duration : 0);
if (!audio.paused) {
startRaf();
}
audio.addEventListener("play", handlePlay);
audio.addEventListener("pause", handlePause);
audio.addEventListener("ended", handleEnded);
audio.addEventListener("timeupdate", handleTimeUpdateEvent);
audio.addEventListener("seeked", handleSeeked);
audio.addEventListener("durationchange", handleDuration);
audio.addEventListener("loadedmetadata", handleDuration);
return function cleanupAudioEventListeners() {
stopRaf();
audio.removeEventListener("play", handlePlay);
audio.removeEventListener("pause", handlePause);
audio.removeEventListener("ended", handleEnded);
audio.removeEventListener("timeupdate", handleTimeUpdateEvent);
audio.removeEventListener("seeked", handleSeeked);
audio.removeEventListener("durationchange", handleDuration);
audio.removeEventListener("loadedmetadata", handleDuration);
};
},
[audioRef, startRaf, stopRaf, onPlay, onPause, onEnded, onTimeUpdate, onDurationChange]
);
Duration Fallback
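If audio metadata isn’t loaded, the viewer falls back to a duration derived from the alignment data: the last character end time if available, otherwise the end time of the last word: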
function getAlignmentFallbackDuration(
alignment: CharacterAlignmentResponseModel | null | undefined,
words: TranscriptWord[]
): number {
const ends = alignment?.characterEndTimesSeconds;
if (Array.isArray(ends) && ends.length) {
const last = ends[ends.length - 1];
return typeof last === "number" && Number.isFinite(last) ? last : 0;
}
if (words.length) {
const lastWord = words[words.length - 1];
const lastWordEnd = lastWord?.endTime;
return typeof lastWordEnd === "number" && Number.isFinite(lastWordEnd)
? lastWordEnd
: 0;
}
return 0;
}
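How the fallback is combined with the audio element's own duration is not shown here. A minimal sketch of one plausible rule, assuming a hypothetical resolveDuration helper that prefers the element's duration whenever it is a positive finite number:

```typescript
// Hypothetical helper: picks the duration to display.
// audioDuration is NaN before metadata loads and can be Infinity for streams,
// so both cases defer to the alignment-based fallback.
function resolveDuration(audioDuration: number, fallbackDuration: number): number {
  if (Number.isFinite(audioDuration) && audioDuration > 0) {
    return audioDuration;
  }
  return Number.isFinite(fallbackDuration) && fallbackDuration > 0
    ? fallbackDuration
    : 0;
}
```

With this rule, the scrub bar can show a usable length as soon as alignment data arrives, then switch to the exact value once loadedmetadata fires.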
State Synchronization
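The viewer keeps several time sources in step: the React currentTime state, the current word index, and the audio element’s own currentTime. A single helper updates the first two: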
const syncSeekTimeState = useCallback(
function syncSeekTimeState(time: number) {
setCurrentTime(time);
handleTimeUpdateRef.current(time);
},
[]
);
When seeking:
- Update React state (currentTime)
- Update the current word index via handleTimeUpdate
- Update the audio element’s currentTime
Always update state before updating the audio element’s currentTime to ensure the UI is in sync when the “seeked” event fires.
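The body of handleTimeUpdate is not shown on this page. As an illustration of the word-index step, here is a sketch of a binary search over words, assuming they are sorted by startTime; findWordIndexAtTime and the TimedWord type are hypothetical names, not the viewer's API:

```typescript
// Only the timing fields that appear elsewhere on this page are assumed.
interface TimedWord {
  startTime: number;
  endTime: number;
}

// Hypothetical helper: returns the index of the last word whose startTime is
// at or before the given time, or -1 if the time precedes the first word.
function findWordIndexAtTime(words: readonly TimedWord[], time: number): number {
  let low = 0;
  let high = words.length - 1;
  let result = -1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    const word = words[mid];
    if (word && word.startTime <= time) {
      result = mid; // candidate; a later word may also qualify
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return result;
}
```

A binary search keeps each lookup logarithmic in the number of words, which matters when the lookup runs on every animation frame. During a gap between words, this version keeps reporting the previous word.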
Complete Example
Here’s a full playback interface:
<TranscriptViewerContainer
audioSrc={result.audioUrl}
audioType={audioType}
alignment={result.alignment}
>
<div className="flex items-center gap-4 p-4 bg-gray-50 rounded">
<TranscriptViewerPlayPauseButton
variant="outline"
size="icon"
/>
<TranscriptViewerScrubBar className="flex-1" />
</div>
<TranscriptViewerWords
className="p-6"
renderWord={({ word, status }) => (
<span
className={cn(
"cursor-pointer transition-all",
status === "current" && "text-blue-600 font-bold scale-110",
status === "spoken" && "text-gray-400",
)}
onClick={() => seekToWord(word)}
>
{word.text}
</span>
)}
/>
<TranscriptViewerAudio />
</TranscriptViewerContainer>