AudioEngine is the single source of truth for all audio state in 4Stem Band Player. Located at src/lib/audio/AudioEngine.ts, it wraps the Web Audio API into a testable, snapshot-driven class that AppShell.svelte subscribes to. Components never touch AudioContext directly — they call engine methods and re-render from the immutable AudioEngineSnapshot objects the engine emits.

Location and Constants

// src/lib/audio/AudioEngine.ts

export const STEM_ORDER = ['vocals', 'guitar', 'strings', 'drums', 'bass', 'fx', 'other'] as const;
STEM_ORDER controls the preferred display order for stem tracks in the mixer and waveform list. Stems not present in a song are simply absent from the snapshot — the order applies to whatever subset is loaded.
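A display-order sort driven by STEM_ORDER might look like the sketch below. The sortStems helper is hypothetical and not part of the documented API. It also assumes that names missing from STEM_ORDER sort after the known ones, which the text above does not specify.

```typescript
// STEM_ORDER is copied from the source; sortStems is a hypothetical helper.
const STEM_ORDER = ['vocals', 'guitar', 'strings', 'drums', 'bass', 'fx', 'other'] as const;

// Returns a new array ordered by STEM_ORDER.
// Assumption: names not listed in STEM_ORDER are placed after all known names.
function sortStems<T extends { name: string }>(stems: readonly T[]): T[] {
  const rank = (stemName: string): number => {
    const index = (STEM_ORDER as readonly string[]).indexOf(stemName);
    return index === -1 ? STEM_ORDER.length : index;
  };
  // Array.prototype.sort is stable, so equally ranked stems keep their input order.
  return [...stems].sort((a, b) => rank(a.name) - rank(b.name));
}
```

Copying the input before sorting keeps the caller's array untouched, which fits the snapshot-driven design.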

Constructor and Dependency Injection

The engine is constructed with an EngineOptions object. All fields are optional, which allows unit tests to inject fakes without any browser globals:
export class AudioEngine {
  constructor(options: EngineOptions = {})
}

interface EngineOptions {
  audioContext?: AudioContext;
  fetchArrayBuffer?: (url: string) => Promise<ArrayBuffer>;
  createPitchShiftNode?: (audioContext: AudioContext) => Promise<PitchShiftNodeLike>;
  driftCorrectionIntervalMs?: number;
  wait?: (milliseconds: number) => Promise<void>;
  decodeProfile?: DecodeProfile | null;
  createOfflineAudioContext?: (channels: number, length: number, sampleRate: number) => OfflineRenderContextLike;
  pitchTempoMode?: PitchTempoMode;
  createRenderedBuffer?: RenderedBufferFactory;
}
audioContext
AudioContext
A browser AudioContext instance. Defaults to constructing a new AudioContext (or webkitAudioContext) from window. Inject a fake in tests to avoid browser globals.
fetchArrayBuffer
(url: string) => Promise<ArrayBuffer>
Function used to fetch stem MP3 data. Defaults to a fetch-based implementation that throws on non-OK responses. Inject a stub in tests to return pre-built ArrayBuffer values.
createPitchShiftNode
(audioContext: AudioContext) => Promise<PitchShiftNodeLike>
Factory that creates a SoundTouch AudioWorklet node for real-time pitch/tempo processing. Defaults to registering and constructing a SoundTouchNode.
driftCorrectionIntervalMs
number
Interval in milliseconds at which the engine polls the audio clock and emits position snapshots during playback. Defaults to 80 ms; AppShell raises this to 150 ms on mobile to reduce main-thread load.
decodeProfile
DecodeProfile | null
When provided, decoded stems are downmixed and/or resampled to shrink in-memory footprint. A { mono: true, sampleRate: 22050 } profile drops a six-stem song from ~450 MB to ~110 MB. Omit or pass null for full-fidelity desktop playback.
pitchTempoMode
'realtime' | 'render'
Controls how pitch and tempo changes are applied. 'realtime' routes audio through live SoundTouch worklets (best on desktop). 'render' pre-renders each stem offline whenever pitch or tempo changes and plays plain decoded buffers (best on mobile, where worklets underrun). Defaults to 'realtime'.
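The memory figures quoted for decodeProfile can be checked with rough arithmetic. Decoded Web Audio buffers hold 32-bit float samples, so size scales with stems × seconds × channels × sample rate × 4 bytes. The sketch below assumes a stereo 44.1 kHz source and a song of about 3.5 minutes; neither figure comes from the text above, and estimateDecodedBytes is a hypothetical helper.

```typescript
// Hypothetical helper: estimates decoded size assuming 4-byte float samples.
function estimateDecodedBytes(
  stemCount: number,
  durationSeconds: number,
  channels: number,
  sampleRate: number,
): number {
  return stemCount * durationSeconds * channels * sampleRate * 4;
}

const BYTES_PER_MB = 1e6;

// Assumed example: six stems, 212 seconds long.
const fullBytes = estimateDecodedBytes(6, 212, 2, 44100);    // stereo, 44.1 kHz
const reducedBytes = estimateDecodedBytes(6, 212, 1, 22050); // { mono: true, sampleRate: 22050 }

const fullMb = Math.round(fullBytes / BYTES_PER_MB);       // 449
const reducedMb = Math.round(reducedBytes / BYTES_PER_MB); // 112
```

Halving the channel count and halving the sample rate each cut the footprint in half, which gives the fourfold reduction described above.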

Public Method Signatures

loadSong

async loadSong(song: LoadableSong): Promise<void>
Loads a new song, destroying all resources from the previous song first. Accepted interface:
interface LoadableSong {
  id: string;
  title: string;
  stems: Array<{
    name: string;   // e.g. 'bass', 'drums', 'vocals'
    label: string;  // Display label, e.g. 'Bass'
    url: string;    // Absolute URL to the MP3 file
  }>;
}
Behavior:
  • Calls destroy() to release previous audio buffers and nodes
  • Sets loading: true and emits a snapshot immediately
  • Fetches all stems concurrently with fetchArrayBuffer
  • Decodes each buffer with AudioContext.decodeAudioData()
  • Applies decodeProfile downmix/resample if configured
  • Sets duration to the maximum decoded buffer duration across all stems
  • Collects errors per stem; throws with a combined message if any stem failed
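The fetch, decode, and error-collection steps above can be sketched as follows. This is an illustration under assumptions, not the engine's code: loadStems, the decode callback (standing in for AudioContext.decodeAudioData), and the combined error message format are all hypothetical.

```typescript
interface StemSource { name: string; url: string }
interface DecodedStem { name: string; duration: number }

// Hypothetical sketch: fetches and decodes all stems concurrently,
// collects one error per failed stem, and throws a combined message.
async function loadStems(
  stems: StemSource[],
  fetchArrayBuffer: (url: string) => Promise<ArrayBuffer>,
  decode: (data: ArrayBuffer) => Promise<{ duration: number }>,
): Promise<{ loaded: DecodedStem[]; duration: number }> {
  const errors: string[] = [];
  const results = await Promise.all(
    stems.map(async (stem): Promise<DecodedStem | null> => {
      try {
        const data = await fetchArrayBuffer(stem.url);
        const buffer = await decode(data);
        return { name: stem.name, duration: buffer.duration };
      } catch (error) {
        // A failed stem does not cancel the others; it is only recorded.
        errors.push(`${stem.name}: ${error instanceof Error ? error.message : String(error)}`);
        return null;
      }
    }),
  );
  if (errors.length > 0) {
    throw new Error(`Failed to load stems: ${errors.join('; ')}`); // assumed format
  }
  const loaded = results.filter((r): r is DecodedStem => r !== null);
  // Song duration is the longest decoded stem.
  const duration = loaded.reduce((max, s) => Math.max(max, s.duration), 0);
  return { loaded, duration };
}
```

Catching inside each mapped promise lets every stem finish before the combined error is raised, so the message can name all failing stems at once.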

play

async play(): Promise<void>
Starts synchronized playback from the current position. Returns immediately if already playing, if the engine is still starting up, if no stems are loaded, or if there are load errors. Behavior:
  • Resumes the AudioContext (browsers with autoplay policies keep it suspended until a user gesture)
  • Initializes SoundTouch worklet nodes for any stems that need pitch/tempo shifting (realtime mode)
  • Advances the playback epoch so any in-flight graph work is discarded if superseded
  • Calculates a shared startedAt timestamp from audioContext.currentTime
  • Creates one AudioBufferSourceNode per loaded stem and starts all of them from the same offset — the shared offset guarantees synchronization
  • Starts the drift correction interval timer
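The synchronized start described above comes down to reading the audio clock once and passing the same start time and offset to every source. The sketch below uses minimal stand-in interfaces so it runs outside a browser. startAllInSync and the optional leadSeconds parameter are hypothetical; the start(when, offset) call mirrors AudioBufferSourceNode.start.

```typescript
// Minimal stand-ins for AudioBufferSourceNode and AudioContext.
interface SourceLike { start(when: number, offset: number): void }
interface ClockLike { currentTime: number }

// Hypothetical helper: starts every source at one shared context time and offset.
function startAllInSync(
  clock: ClockLike,
  sources: SourceLike[],
  offsetSeconds: number,
  leadSeconds = 0, // assumption: optional scheduling headroom, not mentioned in the source
): number {
  // Read the clock once. Reading it per source could give each stem a different start time.
  const startedAt = clock.currentTime + leadSeconds;
  for (const source of sources) {
    source.start(startedAt, offsetSeconds);
  }
  return startedAt;
}
```

Returning startedAt lets the caller derive the playhead later from the same clock reading.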

pause

pause(): void
Pauses playback at the current position. Behavior:
  • Captures getPosition() (wall-clock–adjusted playhead) before stopping sources
  • Stops and disconnects all AudioBufferSourceNode instances
  • Sets playing: false and advances the epoch
  • Restores master gain in case a render/transition fade was in progress
  • Stops the drift correction timer

stop

stop(): void
Stops playback and resets the playhead to 0. Behavior:
  • Stops and disconnects all AudioBufferSourceNode instances
  • Sets playing: false and position: 0
  • Advances the epoch so orphaned async operations abandon themselves

seek

seek(time: number): void
Moves the playhead to time seconds, clamped to [0, duration]. Behavior:
  • Clamps the requested position; non-finite values resolve to 0
  • Advances the epoch
  • If currently playing: stops existing source nodes, resets startedAt to audioContext.currentTime, and creates new source nodes starting from the new offset — all stems remain synchronized at the new position
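The clamping rule above fits in a few lines. clampSeekTime is a hypothetical name used for illustration; the behavior follows the description: non-finite input resolves to 0, and everything else is limited to [0, duration].

```typescript
// Hypothetical helper mirroring the documented clamping rule.
function clampSeekTime(time: number, duration: number): number {
  // NaN and ±Infinity all resolve to the start of the song.
  if (!Number.isFinite(time)) return 0;
  return Math.min(Math.max(time, 0), duration);
}
```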

setVolume

setVolume(name: string, volume: number): void
Sets the volume for a named stem. volume is clamped to [0, 1]. The gain change is applied through a short linear ramp (DEFAULT_RAMP_SECONDS = 0.018, i.e. 18 ms) to avoid audible clicks.
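A click-free gain change is typically scheduled on the AudioParam rather than assigned directly. The sketch below shows one common pattern using cancelScheduledValues, setValueAtTime, and linearRampToValueAtTime, all of which exist on AudioParam. The rampGain helper and the GainParamLike interface are hypothetical, and whether the engine schedules its ramp exactly this way is an assumption.

```typescript
// Minimal subset of AudioParam, so the sketch also runs against a fake.
interface GainParamLike {
  value: number;
  cancelScheduledValues(startTime: number): unknown;
  setValueAtTime(value: number, startTime: number): unknown;
  linearRampToValueAtTime(value: number, endTime: number): unknown;
}

const DEFAULT_RAMP_SECONDS = 0.018; // value quoted in the text above

// Hypothetical helper: clamps the target to [0, 1] and ramps to it linearly.
function rampGain(
  param: GainParamLike,
  target: number,
  now: number,
  rampSeconds = DEFAULT_RAMP_SECONDS,
): number {
  const clamped = Math.min(Math.max(target, 0), 1);
  // Drop any ramp still scheduled from an earlier change.
  param.cancelScheduledValues(now);
  // Anchor the ramp at the current value so it starts from where the gain is now.
  param.setValueAtTime(param.value, now);
  param.linearRampToValueAtTime(clamped, now + rampSeconds);
  return clamped;
}
```

In a browser this would be called with a GainNode's gain param and audioContext.currentTime.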

setMuted

setMuted(name: string, muted: boolean): void
Mutes or unmutes a stem. A muted stem has its effective gain set to 0 via its per-stem GainNode, regardless of its volume setting.

setSolo

setSolo(name: string, solo: boolean): void
Solos or un-solos a stem. When any stem has solo: true, all stems without solo: true have their effective gain forced to 0. All gain changes use the short linear ramp.

setTempoRatio

async setTempoRatio(value: number): Promise<void>
Changes the playback speed. value is clamped to [0.5, 1.5]. In realtime mode, live worklet playback rates are updated. In render mode, all stems are re-rendered offline at the new tempo ratio before playback resumes.

setGlobalTransposeSemitones

async setGlobalTransposeSemitones(value: number): Promise<void>
Transposes all pitch-adjustable stems (everything except drums) by the given number of semitones. The value is clamped to the SoundTouch-supported range. In render mode, the call triggers an offline re-render of all stems.
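For reference, a semitone shift maps to a frequency ratio through the equal-temperament formula 2^(n/12). The sketch below shows that formula together with the drums exclusion described above. Both helper names are hypothetical, and the source does not say whether the engine passes semitones or a ratio to SoundTouch.

```typescript
// Equal temperament: each semitone multiplies frequency by 2^(1/12).
function semitonesToPitchRatio(semitones: number): number {
  return Math.pow(2, semitones / 12);
}

// Hypothetical helper: drums ignore the global transpose, as the text states.
function effectivePitchFor(stemName: string, globalTransposeSemitones: number): number {
  return stemName === 'drums' ? 0 : globalTransposeSemitones;
}
```

Twelve semitones up doubles the frequency and twelve down halves it, so semitonesToPitchRatio(12) is 2 and semitonesToPitchRatio(-12) is 0.5.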

adjustGlobalTransposeSemitones

async adjustGlobalTransposeSemitones(delta: number): Promise<void>
Convenience wrapper that adds delta to the current globalTransposeSemitones and calls setGlobalTransposeSemitones.

subscribe

subscribe(listener: (snapshot: AudioEngineSnapshot) => void): () => void
Registers a listener that receives an AudioEngineSnapshot on every state change. Returns an unsubscribe function. The listener is called immediately with the current snapshot when registered.
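The subscribe contract (immediate call, then a call per state change, plus an unsubscribe function) matches the Svelte store contract and can be sketched in isolation. SnapshotEmitter and its update method are hypothetical names used for illustration; the engine's internals may differ.

```typescript
type SnapshotListener<S> = (snapshot: S) => void;

// Hypothetical sketch of a snapshot-emitting store.
class SnapshotEmitter<S extends object> {
  private listeners = new Set<SnapshotListener<S>>();
  private snapshot: S;

  constructor(initial: S) {
    this.snapshot = initial;
  }

  getSnapshot(): S {
    return this.snapshot;
  }

  subscribe(listener: SnapshotListener<S>): () => void {
    this.listeners.add(listener);
    listener(this.snapshot); // called immediately with the current snapshot
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Builds a new frozen snapshot instead of mutating the old one, then notifies.
  update(patch: Partial<S>): void {
    const next = { ...this.snapshot, ...patch } as S;
    Object.freeze(next);
    this.snapshot = next;
    this.listeners.forEach((listener) => listener(next));
  }
}
```

Because each update produces a new object, a component can compare snapshots by reference to decide whether anything changed.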

getSnapshot

getSnapshot(): AudioEngineSnapshot
Returns the current immutable snapshot synchronously. The subscribe callback is the preferred integration path for Svelte components, but getSnapshot is available for one-off reads.

destroy

destroy(): void
Stops all playback, disconnects and discards all audio nodes and buffers, clears stem state, and resets all position/duration/epoch counters. Called automatically by loadSong before loading a new song.

Audio Signal Graph

Every stem has its own gain node, followed by an analyser that drives its VU meter. The per-stem chains feed a single master gain node, which feeds a dynamics compressor acting as a limiter, then the AudioContext destination:
AudioBufferSourceNode (per stem)
  └── [PitchShiftNode (SoundTouch worklet, realtime mode only)]
      └── GainNode (per stem)
          └── AnalyserNode (per stem, for VU meter)
              └── masterGainNode (GainNode)
                  └── masterLimiterNode (DynamicsCompressorNode)
                      └── AudioContext.destination
The limiter uses a high ratio (20:1), fast attack (3 ms), and moderate release (120 ms) to prevent hard clipping from summed stems or time-stretch overshoot.
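The limiter settings above translate to DynamicsCompressorNode parameters as sketched below. Note that attack and release are expressed in seconds on that node. configureLimiter and the stand-in interfaces are hypothetical. The text does not state threshold or knee values, so the sketch leaves them at whatever the node already has.

```typescript
// Minimal stand-ins for AudioParam and DynamicsCompressorNode.
interface ParamValueLike { value: number }
interface CompressorLike {
  ratio: ParamValueLike;
  attack: ParamValueLike;
  release: ParamValueLike;
}

// Hypothetical helper applying the three documented settings.
function configureLimiter(limiter: CompressorLike): void {
  limiter.ratio.value = 20;     // 20:1, the highest ratio the node accepts
  limiter.attack.value = 0.003; // 3 ms, expressed in seconds
  limiter.release.value = 0.12; // 120 ms, expressed in seconds
  // threshold and knee are intentionally untouched: the source does not give them.
}
```

In a browser, the argument would be the result of audioContext.createDynamicsCompressor().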

Gain Routing Rules
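
The mute and solo behavior described for setMuted and setSolo can be condensed into one pure function. The following is a minimal sketch rather than the engine's code: computeEffectiveGains is a hypothetical name, and it assumes that mute wins when a stem is both muted and soloed, which the descriptions above do not settle.

```typescript
interface StemMixState {
  name: string;
  muted: boolean;
  solo: boolean;
  volume: number; // already clamped to [0, 1]
}

// Hypothetical helper: derives each stem's effective gain from mute/solo/volume.
function computeEffectiveGains(stems: StemMixState[]): Record<string, number> {
  const anySolo = stems.some((stem) => stem.solo);
  const gains: Record<string, number> = {};
  for (const stem of stems) {
    // Silenced when muted, or when another stem is soloed and this one is not.
    // Assumption: a stem that is both muted and soloed stays silent.
    const silenced = stem.muted || (anySolo && !stem.solo);
    gains[stem.name] = silenced ? 0 : stem.volume;
  }
  return gains;
}
```

Keeping this computation pure makes it easy to test separately from the audio graph; the resulting values would then be applied to the per-stem GainNode through the short ramp.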

Snapshot Shape

export interface AudioEngineSnapshot {
  songId: string | null;
  title: string | null;
  globalTransposeSemitones: number;
  duration: number;
  position: number;
  tempoRatio: number;
  playing: boolean;
  loading: boolean;
  /** True while stems are being re-rendered offline after a transpose/tempo change. */
  rendering: boolean;
  /** Progress of the in-flight offline render (done of total stems). */
  renderProgress: { done: number; total: number };
  errors: string[];
  stems: Record<string, StemPlaybackState>;
}

export interface StemPlaybackState {
  name: string;
  label: string;
  url: string;
  loading: boolean;
  loaded: boolean;
  error: string | null;
  muted: boolean;
  solo: boolean;
  volume: number;
  effectiveGain: number;
  meterLevel: number;
  pitchAdjustable: boolean;
  effectivePitchSemitones: number;
  pitchShiftError: string | null;
}
Snapshots are plain objects — no class instances, no circular references. Components can safely spread or destructure them.

Concurrency and Safety Guards

Three interlocking mechanisms prevent race conditions when the user rapidly changes songs, seeks, or toggles pitch:

Playback Epoch

Every action that changes playback state (play, pause, stop, seek, load, transpose) increments an integer epoch counter. Async operations that build or restart the audio graph capture the epoch before they start and check it before starting sources — if the epoch has advanced, the operation abandons itself silently.
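The capture-then-check pattern can be sketched in isolation. EpochGuard and startIfCurrent are hypothetical names; the engine's actual fields and call sites are not shown in this document.

```typescript
// Hypothetical sketch of an epoch counter.
class EpochGuard {
  private epoch = 0;

  // Called by every state-changing action; returns the new epoch.
  advance(): number {
    this.epoch += 1;
    return this.epoch;
  }

  isCurrent(captured: number): boolean {
    return captured === this.epoch;
  }
}

// Runs async preparation, then starts only if no newer action has intervened.
async function startIfCurrent(
  guard: EpochGuard,
  prepare: () => Promise<void>,
  start: () => void,
): Promise<boolean> {
  const captured = guard.advance();
  await prepare();
  if (!guard.isCurrent(captured)) {
    return false; // superseded while preparing: abandon silently
  }
  start();
  return true;
}
```

The check happens after the await because that is the only point where another action could have run and advanced the counter.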

Start Guard

A starting boolean prevents duplicate play calls from racing while the SoundTouch worklet is initializing (which can take tens of milliseconds). If play() is called again before initialization completes, the second call returns immediately.
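A guard of this kind is a boolean that is set before the first await and cleared in a finally block. The sketch below is illustrative only; StartGuard and its run method are hypothetical names.

```typescript
// Hypothetical sketch of a re-entrancy guard around async initialization.
class StartGuard {
  private starting = false;

  // Returns false immediately if another start is still initializing.
  async run(initialize: () => Promise<void>): Promise<boolean> {
    if (this.starting) return false;
    this.starting = true;
    try {
      await initialize();
      return true;
    } finally {
      // Clear the flag even if initialization throws, so later calls can retry.
      this.starting = false;
    }
  }
}
```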

Graph Mutation Queue

All audio graph rebuilds run inside runExclusive(), which serializes them through a promise chain (graphMutation). Back-to-back transpose changes cannot overlap into interleaved source sets, even if the user moves a slider quickly.
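Serializing async work through a promise chain can be sketched as below. The names runExclusive and graphMutation come from the text above, but this body is an assumption about how such a queue is commonly written, and the MutationQueue wrapper class is hypothetical.

```typescript
// Hypothetical sketch of a promise-chain mutex.
class MutationQueue {
  private graphMutation: Promise<void> = Promise.resolve();

  // Runs task only after every previously queued task has settled.
  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = this.graphMutation.then(task);
    // Swallow the outcome on the chain so one failed task does not block later ones.
    this.graphMutation = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}
```

The caller still receives the task's own result or rejection; only the internal chain ignores it.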

Pitch and Tempo: Realtime vs. Render Mode

AppShell.svelte sets pitchTempoMode: 'render' automatically on mobile viewports (max-width: 820px) where running several live worklets simultaneously would underrun the audio thread and cause stems to fall out of sync.
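Choosing engine options from the viewport might look like the sketch below. The option values (render mode and a 150 ms interval on mobile, realtime mode and the 80 ms default otherwise) come from this page. The optionsForViewport helper is hypothetical, and using one breakpoint for both settings is an assumption.

```typescript
type PitchTempoMode = 'realtime' | 'render';

interface ViewportEngineOptions {
  pitchTempoMode: PitchTempoMode;
  driftCorrectionIntervalMs: number;
}

// Media query matching the breakpoint quoted above.
const MOBILE_QUERY = '(max-width: 820px)';

// Hypothetical helper: maps a mobile/desktop flag to engine options.
function optionsForViewport(isMobile: boolean): ViewportEngineOptions {
  return isMobile
    ? { pitchTempoMode: 'render', driftCorrectionIntervalMs: 150 }
    : { pitchTempoMode: 'realtime', driftCorrectionIntervalMs: 80 };
}

// In a browser: optionsForViewport(window.matchMedia(MOBILE_QUERY).matches)
```

Keeping the decision in a pure function lets it be unit-tested without a DOM.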

Testing

src/lib/audio/AudioEngine.test.ts exercises the engine by injecting a fake AudioContext and a fake fetchArrayBuffer through the constructor — no real browser audio APIs are needed:
const engine = new AudioEngine({
  audioContext: new FakeAudioContext() as unknown as AudioContext,
  fetchArrayBuffer: async (url) => fakeBuffers[url],
});
Test coverage includes:
  • Stem loading and per-stem snapshot state (loading, loaded, error)
  • Synchronized source starts (all sources started at the same AudioContext.currentTime)
  • Seeking: position clamping, source node recreation while playing
  • Gain state: volume, mute, solo, effective gain computation
  • Resource cleanup: destroy() disconnects all nodes and clears all state
  • Epoch guard: async graph work initiated before a stop is discarded correctly
See also Stack for the Vitest configuration and overall testing approach.
