This guide explains how to interpret synchronization results, validate alignment quality, and troubleshoot issues when sync accuracy is poor.

Understanding Synchronization Offsets

What Are Offsets?

An offset is the temporal shift (in seconds) applied to each video to align them to a common reference point.
Process:
  1. One video is designated as the reference (typically the first uploaded file)
  2. All other videos are analyzed to determine their temporal shift relative to the reference
  3. Positive offsets mean the video starts after the reference
  4. Negative offsets mean the video starts before the reference
Example:
{
  "camera1.mp4": 0.0,      // Reference video (no offset)
  "camera2.mp4": -1.234,   // Starts 1.234s before reference
  "camera3.mp4": 0.567     // Starts 0.567s after reference
}
Source: ui.py:706
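To make the sign convention concrete, here is a small helper (hypothetical, not part of the tool) that maps a timestamp on the reference timeline to each video's local timeline, using an offsets dictionary like the one above:

```python
# Illustrative only: a positive offset means the video starts after the
# reference, so a reference timestamp lands earlier in that video's file.
def reference_to_local(offsets, t_ref):
    """Map reference-timeline time t_ref to each video's local time."""
    return {name: t_ref - off for name, off in offsets.items()}

offsets = {"camera1.mp4": 0.0, "camera2.mp4": -1.234, "camera3.mp4": 0.567}
local = reference_to_local(offsets, 10.0)
# camera2 started 1.234s early, so reference t=10s falls ~11.234s into its file;
# camera3 started 0.567s late, so the same moment falls ~9.433s into its file.
print(local)
```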

Viewing Calculated Offsets

Offsets are logged during synchronization and visible in the UI log viewer:
2026-03-03 14:32:15 - src.ui - INFO - [req_a4b2c9d1] Offsets calculated: 
  {'camera1.mp4': 0.0, 'camera2.mp4': -1.234, 'camera3.mp4': 0.567}
You can also find them in logs/video_sync.log (ui.py:707).
Large offsets (> 5 seconds) are normal if your cameras started recording at different times. The maximum searchable offset is 20 seconds by default (ui.py:669, 696).

Confidence Scores

Confidence scores quantify the reliability of the calculated offsets.

Score Interpretation

confidence (float)
Correlation strength between synchronized signals. Scale:
  • 0.8 - 1.0: Excellent (high certainty, strong correlation)
  • 0.5 - 0.8: Good (reliable, moderate correlation)
  • 0.2 - 0.5: Fair (acceptable, but verify manually)
  • 0.0 - 0.2: Poor (unreliable, likely incorrect)
Minimum threshold: 0.2 (ui.py:671)
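If you are scripting against the log output, the bands above can be expressed as a small helper (illustrative, not part of the tool):

```python
# Illustrative classifier for the confidence bands documented above.
def confidence_band(score: float) -> str:
    if score >= 0.8:
        return "excellent"
    if score >= 0.5:
        return "good"
    if score >= 0.2:
        return "fair"
    return "poor"  # below the 0.2 minimum threshold

print(confidence_band(0.87))  # excellent
```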

What Affects Confidence?

Visual sync — high confidence when:
  • Videos contain synchronized, visible motion (people walking, objects moving)
  • Motion events are distinct and non-repetitive
  • Cameras have overlapping fields of view
  • Lighting conditions are consistent
Visual sync — low confidence when:
  • Minimal motion (static scenes, sleeping subjects)
  • Repetitive motion (oscillating fan, pendulum)
  • Cameras pointed at completely different scenes
  • Extreme lighting differences (indoor vs. outdoor)
Audio sync — high confidence when:
  • Clear, impulsive sounds (claps, door slams, music beats)
  • Low background noise
  • Similar microphone quality across cameras
  • Synchronized audio events in overlapping time
Audio sync — low confidence when:
  • Continuous, non-distinct sounds (white noise, wind)
  • High background noise or echo
  • Mismatched audio quality (studio mic vs. camera mic)
  • Videos have no overlapping audio content
The current implementation logs confidence scores but does not display them in the UI. You must check logs/video_sync.log to view confidence values.

Viewing Confidence Scores

Confidence scores are emitted during the offset estimation phase:
2026-03-03 14:32:10 - src.audio_sync - INFO - Estimated offset for camera2.mp4: 
  -1.234s (confidence: 0.87)
For visual sync, confidence may be implicit in correlation plots saved to visual_sync_debug/ (ui.py:701).

Validating Synchronization Quality

Manual Verification Methods

1. Use the Universal Seek Bar

Drag the seek bar to jump to action-heavy moments:
  • Clap or hand wave at the start of recording
  • Door closing, object dropping
  • People entering/exiting frame
Expected result: events occur at the same timestamp across all videos (±1 frame).
Implementation: ui.py:530-577
2. Check Sync Indicator Images

After synchronization, bounding box overlay images are saved to results/. What to look for:
  • Bounding boxes should align around the same objects across camera views
  • Timestamps should match within ±0.1 seconds
Source: ui.py:709-718
3. Compare Audio Waveforms (Advanced)

Export synchronized videos and load them into an audio editor (Audacity, Adobe Audition):
  1. Import all _synced.mp4 files
  2. Zoom in on a sharp audio transient (clap, snap)
  3. Verify waveform peaks align within 1-2 samples
This method requires external tools but provides sample-level accuracy validation.
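If you prefer to script this check instead of using an audio editor, cross-correlation with NumPy can estimate the residual offset between two decoded mono tracks (a sketch using a synthetic signal; it assumes both tracks share the same sample rate):

```python
import numpy as np

def residual_offset_samples(a: np.ndarray, b: np.ndarray) -> int:
    """Lag (in samples) that best aligns b to a; positive means b lags a.

    After a good sync, the result should be within a couple of samples
    of zero.
    """
    corr = np.correlate(b, a, mode="full")
    return int(np.argmax(corr)) - (len(a) - 1)

# Synthetic check: a clap-like transient shifted by 5 samples
rng = np.random.default_rng(0)
sig = rng.normal(0, 0.01, 1000)
sig[200] = 1.0                 # sharp transient
shifted = np.roll(sig, 5)      # same signal, 5 samples later
print(residual_offset_samples(sig, shifted))  # 5
```

Divide the result by the track's sample rate (e.g. 48000) to convert samples to seconds.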

Automated Validation (Evaluation Suite)

For rigorous accuracy assessment, use the built-in evaluation pipeline:
# 1. Place original videos in evaluation/originals/
cp camera*.mp4 evaluation/originals/

# 2. Generate synthetic offset dataset
python -m evaluation.offset_generation

# 3. Run batch synchronization (both methods)
python -m evaluation.run_batch

# 4. Compute accuracy metrics
python -m evaluation.compute_metrics

# 5. Generate visual reports
python -m evaluation.visualize_results
Source: README.md:72-97

Common Synchronization Issues

Issue: videos look slightly out of sync (a few frames) after export.
Cause: sub-frame rounding errors or player buffering issues.
Diagnosis:
  • Check if offset precision is < 0.01s (10ms)
  • Test in a different video player (VLC, MPV)
  • Verify all videos have the same frame rate
Solution:
  • Re-encode videos to a common frame rate before synchronization:
    ffmpeg -i input.mp4 -r 30 -c:v libx264 output_30fps.mp4
    
Issue: low confidence scores, or synchronization fails outright.
Cause: insufficient overlapping content or poor signal quality.
Diagnosis:
  • Check if videos were recorded at the same time
  • Verify audio tracks exist (for audio sync)
  • Inspect motion content (for visual sync)
Solutions:
  1. Switch methods: Try SYNC_METHOD = "audio" if using visual, or vice versa
  2. Add synchronization markers: Re-record with a visible/audible cue (clap, flash)
  3. Increase max_offset_sec: If videos have large time gaps (ui.py:669, 696)
Issue: sync fails when some cameras are rotated or mirrored.
Cause: synchronization algorithms are not rotation/flip invariant.
Solution: pre-process videos to correct orientation:
# Flip horizontally
ffmpeg -i input.mp4 -vf hflip output.mp4

# Rotate 180 degrees
ffmpeg -i input.mp4 -vf "transpose=2,transpose=2" output.mp4
Issue: one video fails to sync while the others align correctly.
Cause: temporal overlap is insufficient, or one camera has drastically different content.
Diagnosis:
  • Compare video durations and start times
  • Check if one camera has a blocked view or lens cap
  • Verify audio tracks aren’t muted on specific cameras
Solution:
  • Exclude problematic videos from the upload set
  • Synchronize videos in smaller groups (e.g., front cameras vs. side cameras)
  • Manually trim videos to overlapping time windows before upload
Issue: export fails after offsets are calculated.
Cause: apply_video_offsets() failed during FFmpeg re-encoding.
Diagnosis: check the logs for FFmpeg errors:
ERROR - FFmpeg failed: Output file #0 does not contain any stream
Solutions:
  • Verify FFmpeg installation: ffmpeg -version
  • Check disk space in OUTPUT_DIR
  • Try re-encoding input videos to a standard codec:
    ffmpeg -i input.mov -c:v libx264 -c:a aac output.mp4
    

Understanding FFmpeg Processing

Synchronization is applied using FFmpeg’s tpad (video) and adelay (audio) filters.

How Offsets Are Applied

# Pad 1.5 seconds of black frames at the beginning
ffmpeg -i camera2.mp4 -vf "tpad=start_duration=1.5" \
       -af "adelay=1500|1500" output.mp4
Source: ui.py:724 (apply_video_offsets function)
Re-encoding may introduce slight quality loss. Note that FFmpeg cannot combine -c copy (stream copy) with the tpad/adelay filters, so truly lossless synchronization would require modifying apply_video_offsets to use a filter-free approach (for example, keyframe-accurate trimming with -ss plus -c copy).

Verifying FFmpeg Output

Check synchronized video properties:
ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 camera1_synced.mp4
Expected result: All synchronized videos should have identical durations (within 0.1s).
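The check can be scripted. The sketch below wraps the ffprobe call shown above (assuming ffprobe is on PATH) and compares durations against the 0.1s tolerance; the probe helper and file names are illustrative:

```python
import subprocess

def probe_duration(path: str) -> float:
    """Duration in seconds via ffprobe (assumes ffprobe is on PATH)."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip())

def durations_match(durations, tol=0.1):
    """True if all durations agree within tol seconds."""
    return (max(durations) - min(durations)) <= tol

# Example with pre-probed values:
print(durations_match([120.04, 120.08, 120.11]))  # True (spread is 0.07s)
```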

Export Format Details

Filename Convention

Synchronized videos follow this pattern:
{original_basename}_synced.{original_extension}
Examples:
  • camera1.mp4 → camera1_synced.mp4
  • interview_cam2.mov → interview_cam2_synced.mov
Source: ui.py:810-813
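The convention is easy to reproduce in code; this is an illustrative reimplementation, not the actual ui.py logic:

```python
from pathlib import Path

def synced_name(original: str) -> str:
    """Apply the {basename}_synced.{extension} naming convention."""
    p = Path(original)
    return f"{p.stem}_synced{p.suffix}"

print(synced_name("camera1.mp4"))         # camera1_synced.mp4
print(synced_name("interview_cam2.mov"))  # interview_cam2_synced.mov
```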

ZIP Archive Structure

The “Download All (ZIP)” button creates:
synced_videos.zip
├── camera1_synced.mp4
├── camera2_synced.mp4
└── camera3_synced.mp4
Implementation: ui.py:816-836 (in-memory ZIP creation using BytesIO)
The ZIP does not include original raw videos or debug artifacts. Only the final synchronized outputs are bundled.
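For reference, in-memory ZIP creation with BytesIO looks roughly like this (a minimal sketch; the real implementation is in ui.py:816-836, and the file names and contents below are placeholders):

```python
import io
import zipfile

def build_zip(files: dict[str, bytes]) -> bytes:
    """Bundle name → bytes pairs into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

archive = build_zip({
    "camera1_synced.mp4": b"...",
    "camera2_synced.mp4": b"...",
})
print(zipfile.ZipFile(io.BytesIO(archive)).namelist())
# ['camera1_synced.mp4', 'camera2_synced.mp4']
```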

Advanced Analysis Techniques

Extracting Motion Energy Timeseries

For visual sync, motion energy plots are saved to visual_sync_debug/:
1. Locate Debug Files

ls /tmp/video_synchronization/visual_sync_debug/
# Output: motion_energy_plot.png, correlation_matrix.png
2. Interpret Motion Energy Plot

X-axis: Time (seconds)
Y-axis: Motion magnitude (arbitrary units)
What to look for:
  • Peaks should align across all camera traces
  • Flat regions indicate no motion (problematic for sync)
  • Similar peak magnitudes suggest good overlap
3. Interpret Correlation Matrix

Shows pairwise correlation strength between all video pairs.
Color scale: red (high correlation) to blue (low correlation).
Ideal result: bright red diagonal, moderate-to-strong off-diagonal values.
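To reproduce such a matrix from your own motion-energy traces, np.corrcoef gives the pairwise correlations (the trace values below are made up):

```python
import numpy as np

# Made-up motion-energy traces; in practice these come from the
# per-frame motion magnitudes of each video.
traces = {
    "camera1": np.array([0.1, 0.9, 0.2, 0.8, 0.1]),
    "camera2": np.array([0.2, 1.0, 0.3, 0.7, 0.2]),  # similar motion
    "camera3": np.array([0.9, 0.1, 0.8, 0.2, 0.9]),  # opposite pattern
}
corr = np.corrcoef(np.vstack(list(traces.values())))
print(np.round(corr, 2))
# Diagonal is 1.0; camera1-camera2 is strongly positive,
# camera1-camera3 strongly negative.
```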

Frame-Level Accuracy Testing

To verify sub-frame synchronization:
import cv2

# Load synchronized videos
cap1 = cv2.VideoCapture('camera1_synced.mp4')
cap2 = cv2.VideoCapture('camera2_synced.mp4')

# Jump to an action moment (e.g., 10.5 seconds)
cap1.set(cv2.CAP_PROP_POS_MSEC, 10500)
cap2.set(cv2.CAP_PROP_POS_MSEC, 10500)

ret1, frame1 = cap1.read()
ret2, frame2 = cap2.read()

# Visual comparison or feature matching (only if both reads succeeded)
if ret1 and ret2:
    cv2.imshow('Camera 1', frame1)
    cv2.imshow('Camera 2', frame2)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

cap1.release()
cap2.release()
Use feature matching algorithms (ORB, SIFT) to quantify spatial alignment if cameras have overlapping views.

Interpreting Evaluation Metrics

When running the evaluation suite, key metrics to focus on:

Accuracy Metrics

MAE (float)
Mean Absolute Error: average magnitude of offset errors. Good values:
  • Audio: < 0.05s (sub-frame accuracy at 30fps)
  • Visual: < 0.1s (3-frame accuracy at 30fps)
Poor values:
  • > 0.5s: indicates fundamental synchronization failure
RMSE (float)
Root Mean Squared Error: penalizes large errors more than MAE. Interpretation: if RMSE >> MAE, you have outlier cases with very poor sync.
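Both metrics are straightforward to compute by hand from estimated vs. ground-truth offsets (the values below are made up):

```python
import math

# Made-up example: ground-truth vs. estimated offsets in seconds.
true_offsets = [0.0, -1.234, 0.567]
est_offsets = [0.0, -1.250, 0.600]

errors = [e - t for e, t in zip(est_offsets, true_offsets)]
mae = sum(abs(x) for x in errors) / len(errors)
rmse = math.sqrt(sum(x * x for x in errors) / len(errors))
print(f"MAE={mae:.4f}s RMSE={rmse:.4f}s")
```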

Cross-Method Agreement

audio_video_diff (float)
Absolute difference between the audio and visual offset estimates.
Good values: < 0.2s (methods agree)
Poor values: > 1.0s (methods fundamentally disagree; at least one is wrong)

Resource Usage

peak_memory_mb (float)
Maximum RAM usage during synchronization. Typical values:
  • Audio: 500-1000 MB
  • Visual: 1500-3000 MB (higher due to frame decoding)
peak_cpu_percent (float)
Maximum CPU utilization. Typical values: 80-200% (single core plus FFmpeg subprocess)

Best Practices for High-Quality Sync

1. Add Synchronization Markers

At the start of recording:
  • Visual: Hold a clapperboard or make a sharp hand gesture
  • Audio: Clap loudly or use a tone generator
This creates a clear reference point for both methods.
2. Ensure Temporal Overlap

Start all cameras before the action begins and stop after it ends. Aim for at least 30 seconds of overlapping content.
3. Match Camera Settings

Use consistent:
  • Frame rate (30fps or 60fps across all cameras)
  • Resolution (1080p or 4K)
  • Codec (H.264 preferred)
Mismatched settings can introduce sync drift.
4. Test Before Production

Run a quick test recording:
  1. Record 10 seconds with all cameras
  2. Upload to the sync tool
  3. Verify confidence scores > 0.5
This validates your setup before critical recordings.

Next Steps

Using the UI

Learn the upload and review workflow

Configuration

Customize synchronization methods and parameters
