Multi-Camera Video Synchronization

What is Multi-Camera Video Synchronization?

A Flask-based tool for aligning multiple video tracks using visual motion detection or audio cross-correlation. Designed for synchronizing multi-view recordings where start times aren’t perfectly aligned. Whether you’re working with silent videos, noisy environments, or recordings from different camera angles, this tool provides robust synchronization with sub-frame accuracy.

Key Features

Dual Sync Methods

Choose between Visual (Motion) synchronization for silent videos and Audio (GCC-PHAT) synchronization for high-precision alignment using sound
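As a sketch of the audio path: GCC-PHAT whitens the cross-power spectrum so that only phase (delay) information drives the correlation peak, which makes it robust to loudness differences between microphones. The following minimal implementation is illustrative only and is not the repo's actual audio code:

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` (seconds) via GCC-PHAT."""
    n = len(sig) + len(ref)  # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    # Cross-power spectrum, whitened by its magnitude (the PHAT weighting)
    R = SIG * np.conj(REF)
    cc = np.fft.irfft(R / (np.abs(R) + 1e-15), n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Re-center so negative lags precede positive ones
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Example: a noise burst delayed by 25 ms at 8 kHz
fs = 8000
rng = np.random.default_rng(0)
ref = rng.standard_normal(fs)                      # 1 s of noise
delay = int(0.025 * fs)                            # 200 samples
sig = np.concatenate((np.zeros(delay), ref))[:fs]  # same signal, delayed
tau = gcc_phat(sig, ref, fs)                       # expected ~0.025 s
```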

Flask Web UI

Intuitive multi-step wizard covering upload, sync, review, and export; no command line required

Sub-Frame Accuracy

Uses FFmpeg re-encoding with tpad and adelay filters to ensure precise alignment across all video players
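For illustration, a positive offset for one track might translate into FFmpeg filter arguments like the following. The helper function and file names are hypothetical; the tool's actual command construction may differ:

```python
def build_align_command(src, dst, offset_sec):
    """Pad the start of a video by offset_sec using tpad (video) and adelay (audio)."""
    delay_ms = int(round(offset_sec * 1000))
    return [
        "ffmpeg", "-i", src,
        # tpad adds frames at the start of the video for the pad duration
        "-vf", f"tpad=start_duration={offset_sec}",
        # adelay shifts every audio channel by the same amount, in milliseconds
        "-af", f"adelay={delay_ms}:all=1",
        dst,
    ]

cmd = build_align_command("cam2.mp4", "cam2_aligned.mp4", 1.25)
```

Because both streams are re-encoded with the delay baked in, the alignment survives players that ignore container-level start-time metadata.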

Global Optimization

Computes pairwise offsets between all videos and uses weighted least-squares optimization for globally consistent alignment

Evaluation Suite

Fully script-driven pipeline for assessing accuracy, confidence reliability, and efficiency with publication-ready plots

Silent Video Support

Visual motion synchronization works even when videos have no audio or were recorded in noisy environments

How It Works

Motion-Based Synchronization
Step 1: Extract Motion Energy

Process each video frame to build a per-frame motion-energy signal via frame differencing
# From src/visual_sync.py:39-94 (abridged; gap-filling details are illustrative)
from typing import Tuple

import cv2
import numpy as np

def extract_motion_energy(video_path: str,
                          downsample: int = 4,
                          blur_size: int = 5,
                          center_crop: bool = True,
                          step: int = 3) -> Tuple[np.ndarray, float]:
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    motion_energy = []
    prev_gray = None
    frame_idx = -1

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame_idx += 1
        if frame_idx % step != 0:  # sample every `step`-th frame for speed
            continue

        # Center crop to focus on the main subject
        if center_crop:
            h, w = frame.shape[:2]
            frame = frame[int(h*0.25):int(h*0.75), int(w*0.25):int(w*0.75)]

        # Downsample and blur for speed and noise robustness
        h, w = frame.shape[:2]
        small_frame = cv2.resize(frame, (w // downsample, h // downsample))
        gray = cv2.cvtColor(small_frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (blur_size, blur_size), 0)

        if prev_gray is not None:
            # Fraction of pixels that changed between sampled frames
            diff = cv2.absdiff(gray, prev_gray)
            _, thresh = cv2.threshold(diff, 15, 255, cv2.THRESH_BINARY)
            energy = np.sum(thresh) / (thresh.shape[0] * thresh.shape[1] * 255)
            motion_energy.append(energy)
        prev_gray = gray

    cap.release()
    # Effective sampling rate of the motion signal
    return np.array(motion_energy), fps / step
Step 2: Cross-Correlate Motion Signals

Find temporal offsets by correlating the motion time series of every video pair
# From src/visual_sync.py:105-133 (abridged; gap-filling details are illustrative)
from typing import Tuple

import numpy as np
from scipy.signal import correlate

def correlate_motion_signals(sig1: np.ndarray, sig2: np.ndarray,
                             fps: float,
                             max_offset_sec: float = 20.0) -> Tuple[float, float]:
    # Normalize signals to zero mean and unit variance
    sig1_norm = (sig1 - np.mean(sig1)) / (np.std(sig1) + 1e-10)
    sig2_norm = (sig2 - np.mean(sig2)) / (np.std(sig2) + 1e-10)

    cc = correlate(sig1_norm, sig2_norm, mode='full')

    # Find peak within the +/- max_offset_sec search window
    center = len(sig2_norm) - 1
    max_lag = int(max_offset_sec * fps)
    lo, hi = max(0, center - max_lag), min(len(cc), center + max_lag + 1)
    window = cc[lo:hi]
    lag_idx = lo + int(np.argmax(window))
    lag_frames = lag_idx - center
    offset_seconds = lag_frames / fps

    # Confidence: peak height in standard deviations above the window mean
    peak, mean_cc, std_cc = cc[lag_idx], np.mean(window), np.std(window)
    confidence = (peak - mean_cc) / (std_cc + 1e-10)
    return offset_seconds, confidence
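To see the cross-correlation step in isolation, here is a self-contained toy run, independent of the repo code, that recovers a known 15-frame delay between two synthetic motion-energy signals:

```python
import numpy as np
from scipy.signal import correlate

fps = 30.0
rng = np.random.default_rng(1)
base = rng.random(300)                       # ~10 s of synthetic motion energy
sig2 = base
sig1 = np.concatenate([np.zeros(15), base])  # same signal, delayed 15 frames

# Normalize, correlate, and convert the peak lag back to seconds
s1 = (sig1 - sig1.mean()) / (sig1.std() + 1e-10)
s2 = (sig2 - sig2.mean()) / (sig2.std() + 1e-10)
cc = correlate(s1, s2, mode='full')
lag = int(np.argmax(cc)) - (len(s2) - 1)
offset_seconds = lag / fps  # expected +0.5 s: sig1 lags sig2 by 15 frames
```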
Step 3: Global Optimization

Use weighted least-squares optimization to find globally consistent offsets across all videos
Visual sync works even when cameras have different angles - the timing of motion events (walking, gestures) remains consistent across views.
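The global step can be sketched as a weighted least-squares problem: each pairwise measurement offset_ij ~ t[j] - t[i] contributes one equation, and the first video is pinned at t = 0 to fix the otherwise-free global shift. This is a minimal illustration under those assumptions; the repo's solver may differ in weighting details:

```python
import numpy as np

def solve_global_offsets(n, pairs):
    """Solve for per-video start offsets t (t[0] pinned to 0) from
    pairwise measurements: pairs = [(i, j, offset_ij, weight)],
    where offset_ij ~ t[j] - t[i]."""
    rows, rhs = [], []
    for i, j, off, w in pairs:
        row = np.zeros(n)
        row[i], row[j] = -w, w   # weighted equation: t[j] - t[i] = off
        rows.append(row)
        rhs.append(w * off)
    # Gauge constraint: anchor the first video at t = 0
    anchor = np.zeros(n)
    anchor[0] = 1.0
    rows.append(anchor)
    rhs.append(0.0)
    A, b = np.vstack(rows), np.array(rhs)
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Three cameras; a slightly inconsistent triangle of pairwise offsets,
# with the noisier direct 0->2 measurement down-weighted
pairs = [(0, 1, 0.50, 1.0), (1, 2, 0.30, 1.0), (0, 2, 0.82, 0.5)]
t = solve_global_offsets(3, pairs)
```

Because every pair contributes an equation, a low-confidence (down-weighted) measurement is outvoted by consistent ones instead of corrupting the whole timeline.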

Technical Architecture

Installation

Set up Python environment and install dependencies

Quick Start

Get synchronized videos in under 5 minutes

Configuration

Customize sync method and processing parameters

Use Cases

Synchronize footage from multiple cameras capturing a sporting event from different angles. Visual sync works even when crowd noise varies between camera positions.
Align recordings from multiple cameras and microphones using audio cross-correlation for frame-perfect editing.
Synchronize silent or music-backed performances across multiple camera angles using motion-based alignment.
Academic research requiring precise temporal alignment of multi-camera recordings with sub-frame accuracy.

System Requirements

Python

Python 3.11+ (tested on 3.12). Required by scipy and pandas.

FFmpeg

FFmpeg in system PATH. Required for audio extraction and video manipulation.
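A quick way to verify this prerequisite before running the tool (a convenience snippet, not part of the project; it assumes ffprobe ships alongside ffmpeg, as it does in standard FFmpeg builds):

```python
import shutil

def ffmpeg_available():
    """Return True if ffmpeg and ffprobe are discoverable on PATH."""
    return all(shutil.which(tool) is not None for tool in ("ffmpeg", "ffprobe"))

ok = ffmpeg_available()
```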
Ready to get started? Head to the Installation Guide to set up your environment.
