preprocess

The preprocess module handles audio extraction from video files using FFmpeg. It prepares audio tracks for synchronization analysis by converting them to a standard format.

extract_audio_from_videos

Extracts audio tracks from all video files in a directory using FFmpeg. Produces mono WAV files at a specified sample rate.

from src.preprocess import extract_audio_from_videos

extract_audio_from_videos(
    video_dir="./videos",
    audio_dir="./audio_output",
    target_sr=16000
)

Parameters

video_dir (str, required): Directory containing input video files (.mp4, .mov).

audio_dir (str, required): Output directory for extracted WAV files. Created if it doesn’t exist.

target_sr (int, default: 16000): Target sample rate in Hz for the output audio files. Common values:

16000 - Default, good balance of quality and processing speed
44100 - CD quality
48000 - Professional video standard
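
When choosing a sample rate, it can help to estimate how large the extracted files will be. The sketch below is plain arithmetic for uncompressed 16-bit mono PCM (the output format described on this page); estimate_wav_bytes is a hypothetical helper, not part of src.preprocess, and it ignores the small WAV header.

```python
def estimate_wav_bytes(duration_sec: float, sample_rate: int,
                       channels: int = 1, bytes_per_sample: int = 2) -> int:
    """Approximate PCM payload size in bytes (the WAV header is ignored)."""
    return int(duration_sec * sample_rate * channels * bytes_per_sample)


# Compare the common sample rates for a 5-minute mono track.
for sr in (16000, 44100, 48000):
    size_mb = estimate_wav_bytes(5 * 60, sr) / 1_000_000
    print(f"{sr} Hz: about {size_mb:.1f} MB")
```

At 16000 Hz a 5-minute track comes to about 9.6 MB, and at 48000 Hz it is three times that.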

Returns

No return value. Audio files are written to disk with the same base filename as the video:

video1.mp4 → video1.wav
camera_A.mov → camera_A.wav
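
The naming rule above can be sketched with pathlib. expected_wav_path is a hypothetical helper written for illustration; it only mirrors the documented basename mapping and is not taken from the module's source.

```python
from pathlib import Path


def expected_wav_path(video_path, audio_dir):
    """Return the WAV path a video is expected to map to: same basename, .wav extension."""
    return Path(audio_dir) / (Path(video_path).stem + ".wav")


print(expected_wav_path("videos/video1.mp4", "audio_output"))
print(expected_wav_path("videos/camera_A.mov", "audio_output"))
```

One consequence of this rule is that two videos sharing a basename (for example clip.mp4 and clip.mov) would map to the same WAV file, so distinct basenames are the safer choice.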

Raises

RuntimeError: Raised if FFmpeg is not found in the system PATH.

subprocess.CalledProcessError: Raised if FFmpeg fails to extract audio from a video file.
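
A caller may want to handle both documented exceptions explicitly. The following is a minimal sketch using only the standard library; ffmpeg_available and run_extraction are hypothetical helper names, and the extract argument stands in for extract_audio_from_videos so the example stays self-contained.

```python
import shutil
import subprocess


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on the system PATH."""
    return shutil.which("ffmpeg") is not None


def run_extraction(extract, **kwargs):
    """Call extract(**kwargs) and turn the documented failures into a short message."""
    try:
        extract(**kwargs)
    except RuntimeError as exc:
        # Documented case: FFmpeg is not on PATH.
        return f"FFmpeg not available: {exc}"
    except subprocess.CalledProcessError as exc:
        # Documented case: FFmpeg exited with an error for some video.
        return f"FFmpeg failed (exit code {exc.returncode}): {exc.cmd}"
    return "ok"
```

In real use this would be called as run_extraction(extract_audio_from_videos, video_dir="./videos", audio_dir="./audio_output").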

Output Format

All extracted audio files are:

Format: WAV (uncompressed)
Channels: Mono (1 channel) - stereo tracks are mixed to mono
Sample rate: As specified by target_sr parameter
Bit depth: 16-bit PCM
Naming: Same basename as input video file with .wav extension
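
To confirm that an extracted file matches these properties, the standard-library wave module can read the header. This is a small sketch; check_wav_format is a hypothetical helper and not part of the module.

```python
import wave


def check_wav_format(path, expected_sr=16000):
    """Return True if the file is mono, 16-bit PCM, at the expected sample rate."""
    with wave.open(str(path), "rb") as wav:
        return (
            wav.getnchannels() == 1            # mono
            and wav.getsampwidth() == 2        # 2 bytes per sample = 16-bit
            and wav.getframerate() == expected_sr
        )
```

For example, check_wav_format("./audio_output/video1.wav", expected_sr=16000) would be expected to return True after a default extraction.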

Implementation Details

The function:

1. Scans video_dir for video files with extensions .mp4 or .mov
2. For each video, runs FFmpeg with the following options:
   - -ac 1 - Convert to mono
   - -ar {target_sr} - Resample to target sample rate
   - -vn - Discard video stream (audio only)
3. Writes mono WAV files to audio_dir
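
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual source: find_videos and build_ffmpeg_command are hypothetical names, the case-insensitive extension match is an assumption, and the -y (overwrite) flag is added for convenience rather than documented here.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mov"}  # extensions named in the documentation


def find_videos(video_dir):
    """List video files in video_dir (case-insensitive match is an assumption)."""
    return sorted(
        p for p in Path(video_dir).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def build_ffmpeg_command(video_path, wav_path, target_sr=16000):
    """Build the FFmpeg argument list for one video."""
    return [
        "ffmpeg",
        "-y",                   # assumption: overwrite an existing output file
        "-i", str(video_path),  # input video
        "-vn",                  # discard the video stream
        "-ac", "1",             # mix down to mono
        "-ar", str(target_sr),  # resample to the target rate
        str(wav_path),          # .wav extension selects the WAV container
    ]
```

Passing such a command to subprocess.run(cmd, check=True) raises subprocess.CalledProcessError when FFmpeg exits with a non-zero status, which is consistent with the Raises section.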

Usage Example

import os
from src.preprocess import extract_audio_from_videos
from src import config

# Extract audio for synchronization
video_dir = config.VIDEO_DIR
audio_dir = config.AUDIO_DIR

extract_audio_from_videos(
    video_dir=video_dir,
    audio_dir=audio_dir,
    target_sr=16000
)

# Extracted files are now ready for sync analysis
print(f"Audio files extracted to: {audio_dir}")
print(f"Files: {os.listdir(audio_dir)}")

Integration with Sync Workflow

This function is typically called before audio-based synchronization:

# Complete audio sync workflow
from src.preprocess import extract_audio_from_videos
from src.audio_sync import estimate_offsets_robust

# Step 1: Extract audio
extract_audio_from_videos(
    video_dir="./raw_videos",
    audio_dir="./extracted_audio"
)

# Step 2: Compute offsets using GCC-PHAT
offsets = estimate_offsets_robust(
    audio_dir="./extracted_audio",
    max_offset_sec=10.0
)

print("Computed offsets:", offsets)

The Flask UI handles audio extraction automatically when the audio sync method is selected. Calling this function directly is mainly useful for programmatic or batch processing workflows.

Requirements

FFmpeg must be installed and available in your system PATH:

macOS: brew install ffmpeg
Windows: Download from ffmpeg.org and add to PATH
Linux (Debian/Ubuntu): sudo apt install ffmpeg

Verify installation: ffmpeg -version

Performance

Audio extraction performance depends on:

Video duration and codec
Target sample rate (lower rates process faster)
Disk I/O speed

Typical performance for 1080p videos:

5-minute video: ~5-10 seconds extraction time
30-minute video: ~20-30 seconds extraction time

Processing runs sequentially (one video at a time), but each video is independent.
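
Because each video is independent, a batch script could in principle run several extractions at once. The sketch below is not part of src.preprocess, which processes videos sequentially; run_in_parallel is a hypothetical helper, and the worker is whatever callable performs a single extraction (for instance one that launches FFmpeg for one file).

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(jobs, worker, max_workers=4):
    """Apply worker to each job concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, jobs))


# Stand-in worker for demonstration; a real one would run FFmpeg for one video.
results = run_in_parallel(["video1.mp4", "camera_A.mov"], lambda name: name.upper())
print(results)
```

Threads are sufficient in this setting because the heavy work would happen in separate FFmpeg processes, but whether parallel runs actually help depends on CPU cores and disk I/O.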

See Also

audio_sync: Use extracted audio for GCC-PHAT synchronization
Configuration: Configure audio and video directory paths
