useTranscriptViewer Hook

The useTranscriptViewer hook provides low-level access to transcript viewer state and controls. Use this hook when building custom transcript viewer UIs.

Import

import { useTranscriptViewer } from "@/features/transcript-view/use-transcript-viewer";

Basic Usage

function CustomTranscriptViewer({ alignment, audioSrc }) {
  const viewer = useTranscriptViewer({ alignment });

  return (
    <div>
      <audio ref={viewer.audioRef} src={audioSrc} />
      <button onClick={viewer.isPlaying ? viewer.pause : viewer.play}>
        {viewer.isPlaying ? "Pause" : "Play"}
      </button>
      <div>
        {viewer.currentTime.toFixed(2)} / {viewer.duration.toFixed(2)}
      </div>
    </div>
  );
}

Parameters

alignment

CharacterAlignmentResponseModel

required

Character-level alignment data from ElevenLabs Speech-to-Text API. Contains:

characters - Array of individual characters
characterStartTimesSeconds - Start time for each character (in seconds)
characterEndTimesSeconds - End time for each character (in seconds)
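
As an illustration of the shape described above, the following sketch builds a small alignment object by hand and checks that the three arrays line up index by index. The local `CharacterAlignment` type mirrors only the three fields listed here, and `isConsistentAlignment` is a hypothetical helper, not part of the hook.

```typescript
// Local stand-in for CharacterAlignmentResponseModel, limited to the fields listed above.
type CharacterAlignment = {
  characters: string[];
  characterStartTimesSeconds: number[];
  characterEndTimesSeconds: number[];
};

// Hypothetical helper: true when every character has a start and an end time
// and no character ends before it starts.
function isConsistentAlignment(a: CharacterAlignment): boolean {
  const n = a.characters.length;
  if (a.characterStartTimesSeconds.length !== n) return false;
  if (a.characterEndTimesSeconds.length !== n) return false;
  return a.characterStartTimesSeconds.every((start, i) => {
    const end = a.characterEndTimesSeconds[i];
    return end !== undefined && start <= end;
  });
}

// "Hi you": one entry per character, including the space.
const sampleAlignment: CharacterAlignment = {
  characters: ["H", "i", " ", "y", "o", "u"],
  characterStartTimesSeconds: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
  characterEndTimesSeconds: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
};
```

A check like this can be useful before passing hand-built or cached alignment data to the hook, since mismatched array lengths would otherwise surface as missing timings.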

hideAudioTags

boolean

default: true

Whether to hide audio tags (e.g., [music], [applause]) from the transcript. When true, any text between [ and ] is filtered out during segment composition.
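
To make the filtering rule concrete, here is a minimal sketch of bracket-based tag removal over character-level data. It is not the hook's actual implementation: `stripAudioTags` and the local `CharacterAlignment` type are hypothetical, and the handling of an unclosed `[` (everything after it is dropped) is an assumption of this sketch.

```typescript
// Local stand-in for the alignment shape described under `alignment`.
type CharacterAlignment = {
  characters: string[];
  characterStartTimesSeconds: number[];
  characterEndTimesSeconds: number[];
};

// Hypothetical sketch: remove every character from "[" through the next "]",
// keeping the timing arrays in step with the surviving characters.
function stripAudioTags(a: CharacterAlignment): CharacterAlignment {
  const out: CharacterAlignment = {
    characters: [],
    characterStartTimesSeconds: [],
    characterEndTimesSeconds: [],
  };
  let insideTag = false;
  a.characters.forEach((ch, i) => {
    if (ch === "[") {
      insideTag = true; // tag opens: skip the bracket itself
      return;
    }
    if (ch === "]" && insideTag) {
      insideTag = false; // tag closes: skip the bracket itself
      return;
    }
    if (insideTag) return; // character belongs to the tag body
    out.characters.push(ch);
    out.characterStartTimesSeconds.push(a.characterStartTimesSeconds[i] ?? 0);
    out.characterEndTimesSeconds.push(a.characterEndTimesSeconds[i] ?? 0);
  });
  return out;
}
```

Filtering at the character level, rather than on the joined string, keeps each remaining character paired with its original start and end time.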

segmentComposer

SegmentComposer

Custom function to compose transcript segments from alignment data. If not provided, uses the default composeSegments function.

type SegmentComposer = (
  alignment: CharacterAlignmentResponseModel
) => ComposeSegmentsResult;

type ComposeSegmentsResult = {
  segments: TranscriptSegment[];
  words: TranscriptWord[];
};
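
As a sketch of what a custom composer might look like, the function below splits the alignment into words and whitespace gaps. The types are local stand-ins: the word shape follows the `TranscriptWord` fields documented under Return Value, while the gap shape (`kind`, `segmentIndex`, `text`) is an assumption based on how gaps are used in the Advanced Example. `composeByWhitespace` is a hypothetical name.

```typescript
type CharacterAlignment = {
  characters: string[];
  characterStartTimesSeconds: number[];
  characterEndTimesSeconds: number[];
};
type TranscriptWord = {
  kind: "word";
  segmentIndex: number;
  wordIndex: number;
  text: string;
  startTime: number;
  endTime: number;
};
// Assumed gap shape; the real type may carry more fields.
type TranscriptGap = { kind: "gap"; segmentIndex: number; text: string };
type TranscriptSegment = TranscriptWord | TranscriptGap;
type ComposeSegmentsResult = { segments: TranscriptSegment[]; words: TranscriptWord[] };

// Hypothetical composer: runs of whitespace become gaps, everything else becomes words.
function composeByWhitespace(alignment: CharacterAlignment): ComposeSegmentsResult {
  const segments: TranscriptSegment[] = [];
  const words: TranscriptWord[] = [];
  let text = "";
  let start = 0;
  let end = 0;
  let isGap = false;

  // Emit the run collected so far as either a gap or a word.
  const flush = () => {
    if (text === "") return;
    if (isGap) {
      segments.push({ kind: "gap", segmentIndex: segments.length, text });
    } else {
      const word: TranscriptWord = {
        kind: "word",
        segmentIndex: segments.length,
        wordIndex: words.length,
        text,
        startTime: start,
        endTime: end,
      };
      segments.push(word);
      words.push(word);
    }
    text = "";
  };

  alignment.characters.forEach((ch, i) => {
    const chIsGap = /\s/.test(ch);
    if (text !== "" && chIsGap !== isGap) flush(); // run type changed
    if (text === "") {
      isGap = chIsGap;
      start = alignment.characterStartTimesSeconds[i] ?? 0;
    }
    text += ch;
    end = alignment.characterEndTimesSeconds[i] ?? start;
  });
  flush();
  return { segments, words };
}
```

Under these assumptions it would be passed as `useTranscriptViewer({ alignment, segmentComposer: composeByWhitespace })`. Unlike the default composer described here, this sketch keeps punctuation attached to the adjacent word.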

onPlay

() => void

Callback invoked when audio playback starts

onPause

() => void

Callback invoked when audio playback pauses

onTimeUpdate

(time: number) => void

Callback invoked when audio time updates. Receives the current time in seconds.
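
Judging by the simplified loop shown under Animation Frame Updates, this callback can run on every animation frame during playback, so work such as analytics or network calls usually needs rate limiting. The wrapper below is a minimal sketch of one way to do that; `oncePerInterval` is a hypothetical helper, not something the hook exports.

```typescript
// Hypothetical helper: wraps a callback so it runs at most once per
// `intervalSeconds` bucket of playback time (0..5s, 5..10s, and so on).
function oncePerInterval(
  intervalSeconds: number,
  callback: (time: number) => void
): (time: number) => void {
  let lastBucket = -1;
  return (time) => {
    const bucket = Math.floor(time / intervalSeconds);
    if (bucket === lastBucket) return; // already reported this bucket
    lastBucket = bucket;
    callback(time);
  };
}
```

The wrapped function keeps its state in a closure, so it should be created once (for example with `useMemo`) rather than on every render. Seeking back into an earlier bucket triggers the callback again, which is usually what progress tracking wants.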

onEnded

() => void

Callback invoked when audio playback completes

onDurationChange

(duration: number) => void

Callback invoked when audio duration is loaded or changes. Receives the duration in seconds.

Return Value

The hook returns an object with the following properties:

State

segments

TranscriptSegment[]

All transcript segments including both words and gaps (whitespace/punctuation)

words

TranscriptWord[]

Only the word segments, excluding gaps. Each word contains:

{
  kind: "word";
  segmentIndex: number;
  wordIndex: number;
  text: string;
  startTime: number;
  endTime: number;
}

spokenSegments

TranscriptSegment[]

Segments that have already been spoken (before current word)

unspokenSegments

TranscriptSegment[]

Segments that haven’t been spoken yet (after current word)

currentWord

TranscriptWord | null

The word currently being spoken, or null if no word is active

currentSegmentIndex

number

Index of the current segment in the segments array, or -1 if there is no current word.

currentWordIndex

number

Index of the current word in the words array, or -1 if there is no current word.

isPlaying

boolean

Whether audio is currently playing

isScrubbing

boolean

Whether user is actively scrubbing/seeking through the timeline

duration

number

Total audio duration in seconds. Falls back to alignment data if audio duration isn’t available.

currentTime

number

Current playback position in seconds

audioRef

RefObject<HTMLAudioElement | null>

React ref to attach to your audio element

Actions

play

() => void

Start audio playback. Safe to call even if already playing.

pause

() => void

Pause audio playback. Safe to call even if already paused.

seekToTime

(time: number) => void

Seek to a specific time in seconds. Updates both the audio element and internal state.

viewer.seekToTime(30.5); // Seek to 30.5 seconds

seekToWord

(word: number | TranscriptWord) => void

Seek to the start of a specific word. Accepts either a word index or a TranscriptWord object.

viewer.seekToWord(5); // Seek to the 6th word (0-indexed)
viewer.seekToWord(viewer.words[10]); // Seek to a specific word object
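
The two accepted argument forms can be thought of as resolving to the same thing: a start time to seek to. The sketch below shows that resolution step in isolation. `resolveWordStartTime` is a hypothetical helper, and returning `null` for an out-of-range index is an assumption of this sketch, not documented hook behavior.

```typescript
type TranscriptWord = {
  kind: "word";
  segmentIndex: number;
  wordIndex: number;
  text: string;
  startTime: number;
  endTime: number;
};

// Hypothetical helper: turn a word index or a word object into a seek target.
function resolveWordStartTime(
  words: TranscriptWord[],
  target: number | TranscriptWord
): number | null {
  const word = typeof target === "number" ? words[target] : target;
  return word ? word.startTime : null; // null when the index is out of range
}
```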

startScrubbing

() => void

Call this when the user starts scrubbing. Pauses animation frame updates for smoother scrubbing.

endScrubbing

() => void

Call this when the user stops scrubbing. Resumes animation frame updates if audio is playing.

Implementation Details

Animation Frame Updates

The hook uses requestAnimationFrame for smooth time updates during playback:

// Simplified internal implementation
function startRaf() {
  function tick() {
    const audio = audioRef.current;
    if (!audio) return; // audio element is gone: stop scheduling frames
    const time = audio.currentTime;
    setCurrentTime(time);
    handleTimeUpdateRef.current(time);
    rafRef.current = requestAnimationFrame(tick);
  }
  rafRef.current = requestAnimationFrame(tick);
}

This ensures:

Smooth UI updates at the display's refresh rate (typically 60fps)
Efficient word highlighting without lag
Automatic cleanup when component unmounts

Word Index Tracking

The hook tracks the current word using binary search for efficiency:

// From word-index.ts
function findWordIndex(words: TranscriptWord[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const word = words[mid];
    if (time >= word.startTime && time < word.endTime) {
      return mid;
    }
    if (time < word.startTime) {
      hi = mid - 1;
    } else {
      lo = mid + 1;
    }
  }
  return -1;
}

Optimizations:

Binary search for O(log n) lookups
Sequential forward search when moving to next word
Caches current word index to avoid redundant searches
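
The three optimizations can be combined roughly as follows. This is a sketch under stated assumptions, not the code from word-index.ts: `createWordTracker` and `TimedWord` are hypothetical names, and `findWordIndex` repeats the binary search shown above with a narrower type so the block is self-contained.

```typescript
type TimedWord = { startTime: number; endTime: number };

// Binary search over words sorted by start time; -1 when `time` falls in a gap.
function findWordIndex(words: TimedWord[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const word = words[mid];
    if (word === undefined) return -1;
    if (time >= word.startTime && time < word.endTime) return mid;
    if (time < word.startTime) hi = mid - 1;
    else lo = mid + 1;
  }
  return -1;
}

// Hypothetical tracker: remembers the last index and tries the cheap checks first.
function createWordTracker(words: TimedWord[]): (time: number) => number {
  let cached = -1;
  const contains = (index: number, time: number): boolean => {
    const word = words[index];
    return word !== undefined && time >= word.startTime && time < word.endTime;
  };
  return (time) => {
    if (contains(cached, time)) return cached; // still inside the same word
    if (contains(cached + 1, time)) {
      cached += 1; // normal playback: advanced to the next word
      return cached;
    }
    cached = findWordIndex(words, time); // seek or gap: fall back to binary search
    return cached;
  };
}
```

During ordinary playback almost every frame hits one of the first two checks, so the binary search only runs after a seek or while the playhead sits in a gap between words.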

Duration Fallback

If audio duration isn’t available, the hook calculates it from alignment data:

function getAlignmentFallbackDuration(
  alignment: CharacterAlignmentResponseModel,
  words: TranscriptWord[]
): number {
  const ends = alignment?.characterEndTimesSeconds;
  if (Array.isArray(ends) && ends.length) {
    return ends[ends.length - 1];
  }
  if (words.length) {
    return words[words.length - 1].endTime;
  }
  return 0;
}
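
Deciding when the fallback applies comes down to whether the media element reports a usable duration. An `HTMLMediaElement` reports `NaN` before metadata has loaded and `Infinity` for unbounded streams, so a check along the following lines is one plausible way to choose between the two values. `resolveDuration` is a hypothetical helper; the hook's own condition may differ.

```typescript
// Hypothetical helper: prefer the audio element's duration when it is a
// finite positive number, otherwise use the alignment-based fallback.
function resolveDuration(audioDuration: number, fallbackDuration: number): number {
  return Number.isFinite(audioDuration) && audioDuration > 0
    ? audioDuration
    : fallbackDuration;
}
```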

Event Listener Management

The hook automatically manages audio event listeners:

useEffect(() => {
  const audio = audioRef.current;
  if (!audio) return;

  function handlePlay() {
    setIsPlaying(true);
    startRaf();
    onPlay?.();
  }

  function handlePause() {
    setIsPlaying(false);
    stopRaf();
    onPause?.();
  }

  // ... more event handlers

  audio.addEventListener("play", handlePlay);
  audio.addEventListener("pause", handlePause);
  // ... more listeners

  return () => {
    // Cleanup
    stopRaf();
    audio.removeEventListener("play", handlePlay);
    audio.removeEventListener("pause", handlePause);
    // ... more cleanup
  };
}, [audioRef, onPlay, onPause /* ... */]);

Advanced Example

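import { useRef } from "react";
import { useTranscriptViewer } from "@/features/transcript-view/use-transcript-viewer";

function AdvancedTranscriptViewer({ alignment, audioSrc }) {
  // Last 5-second bucket reported, so analytics fire once per bucket, not once per frame
  const lastBucketRef = useRef(-1);

  const viewer = useTranscriptViewer({
    alignment,
    hideAudioTags: true,
    onPlay: () => console.log("Playback started"),
    onPause: () => console.log("Playback paused"),
    onTimeUpdate: (time) => {
      // Send analytics at most once per 5-second bucket
      // (`analytics` is assumed to be your app's analytics client)
      const bucket = Math.floor(time / 5);
      if (bucket !== lastBucketRef.current) {
        lastBucketRef.current = bucket;
        analytics.track("transcript_progress", { time });
      }
    },
  });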

  return (
    <div>
      {/* Audio element */}
      <audio
        ref={viewer.audioRef}
        src={audioSrc}
        preload="metadata"
      />

      {/* Custom controls */}
      <div className="flex gap-2 mb-4">
        <button onClick={viewer.play} disabled={viewer.isPlaying}>
          Play
        </button>
        <button onClick={viewer.pause} disabled={!viewer.isPlaying}>
          Pause
        </button>
        <button onClick={() => viewer.seekToTime(0)}>
          Restart
        </button>
      </div>

      {/* Progress bar */}
      <div className="mb-4">
        <input
          type="range"
          min={0}
          max={viewer.duration}
          step={0.01}
          value={viewer.currentTime}
          onChange={(e) => viewer.seekToTime(Number(e.target.value))}
          onPointerDown={viewer.startScrubbing}
          onPointerUp={viewer.endScrubbing}
          className="w-full"
        />
        <div className="flex justify-between text-sm">
          <span>{viewer.currentTime.toFixed(2)}s</span>
          <span>{viewer.duration.toFixed(2)}s</span>
        </div>
      </div>

      {/* Transcript with custom rendering */}
      <div className="space-y-2">
        {viewer.segments.map((segment) => {
          if (segment.kind === "gap") {
            return <span key={segment.segmentIndex}>{segment.text}</span>;
          }

          const isSpoken = viewer.spokenSegments.includes(segment);
          const isCurrent = viewer.currentWord === segment;
          const isUnspoken = viewer.unspokenSegments.includes(segment);

          return (
            <span
              key={segment.segmentIndex}
              onClick={() => viewer.seekToWord(segment)}
              className={`
                cursor-pointer
                ${isSpoken ? "text-gray-400" : ""}
                ${isCurrent ? "text-blue-600 font-bold" : ""}
                ${isUnspoken ? "text-black" : ""}
              `}
            >
              {segment.text}
            </span>
          );
        })}
      </div>

      {/* Debug info */}
      <div className="mt-4 p-4 bg-gray-100 rounded">
        <div>Current Word Index: {viewer.currentWordIndex}</div>
        <div>Current Segment Index: {viewer.currentSegmentIndex}</div>
        <div>Is Playing: {viewer.isPlaying ? "Yes" : "No"}</div>
        <div>Is Scrubbing: {viewer.isScrubbing ? "Yes" : "No"}</div>
        <div>Total Words: {viewer.words.length}</div>
        <div>Total Segments: {viewer.segments.length}</div>
      </div>
    </div>
  );
}

Type Definitions

type UseTranscriptViewerProps = {
  alignment: CharacterAlignmentResponseModel;
  segmentComposer?: SegmentComposer;
  hideAudioTags?: boolean;
  onPlay?: () => void;
  onPause?: () => void;
  onTimeUpdate?: (time: number) => void;
  onEnded?: () => void;
  onDurationChange?: (duration: number) => void;
};

type UseTranscriptViewerResult = {
  segments: TranscriptSegment[];
  words: TranscriptWord[];
  spokenSegments: TranscriptSegment[];
  unspokenSegments: TranscriptSegment[];
  currentWord: TranscriptWord | null;
  currentSegmentIndex: number;
  currentWordIndex: number;
  seekToTime: (time: number) => void;
  seekToWord: (word: number | TranscriptWord) => void;
  audioRef: RefObject<HTMLAudioElement | null>;
  isPlaying: boolean;
  isScrubbing: boolean;
  duration: number;
  currentTime: number;
  play: () => void;
  pause: () => void;
  startScrubbing: () => void;
  endScrubbing: () => void;
};
