Prerequisites
Source Videos
Place 4 original videos (.mp4, .mov, or .avi) in evaluation/originals/.

- Videos should be at least 10 seconds long
- At least one video should contain audio for audio sync evaluation
- Varied motion levels and audio characteristics enable better sensitivity analysis
Stage 1: Offset Generation
Generate synthetic test cases by applying known temporal shifts to each original video.

Command
What It Does
For each original video and each offset in the offset list [-1000, -500, -100, +100, +500, +1000] ms:

Positive Offset (Padding)
Prepends black frames and silent audio to simulate a delayed start.

FFmpeg Strategy:
- Uses the color and aevalsrc filters to generate black video + silence
- Concatenates the padding with the original using filter_complex
- Example: +500ms → prepend 0.5 seconds of black/silence

Negative Offset (Trimming)
Trims the first |offset| ms of video and audio to simulate an early start.
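The padding strategy above can be sketched as a small command builder. This is a hedged illustration, not the project's actual code: the function name, resolution, frame rate, and sample rate defaults are assumptions.

```python
# Sketch: build the FFmpeg argument list that pads a clip with black
# frames and silence to simulate a positive offset. build_pad_command
# is a hypothetical helper; size/fps/sr defaults are assumptions.
def build_pad_command(src, dst, offset_ms, size="1280x720", fps=30, sr=44100):
    pad_s = offset_ms / 1000.0  # padding duration in seconds
    filter_complex = (
        "[1:v][0:v]concat=n=2:v=1:a=0[v];"  # black frames, then original video
        "[2:a][0:a]concat=n=2:v=0:a=1[a]"   # silence, then original audio
    )
    return [
        "ffmpeg", "-i", src,
        "-f", "lavfi", "-t", f"{pad_s}", "-i", f"color=c=black:s={size}:r={fps}",
        "-f", "lavfi", "-t", f"{pad_s}", "-i", f"aevalsrc=0:s={sr}",
        "-filter_complex", filter_complex,
        "-map", "[v]", "-map", "[a]",
        "-preset", "ultrafast",  # prioritize encoding speed over compression
        dst,
    ]

cmd = build_pad_command("input.mp4", "input_plus500.mp4", 500)
```

The returned list can be passed to subprocess.run; a negative offset would instead use -ss to trim the input.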
Sensitivity Tag Computation
For each original video, the script computes three metadata tags:

Motion Level
Samples the first 300 frames, computes the mean frame-to-frame pixel difference, normalized to [0, 1].
- Low motion (< 0.05): Static shots, minimal movement
- High motion (> 0.2): Fast-paced action, camera motion
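The motion-level tag can be sketched as follows, assuming frames are decoded elsewhere into grayscale uint8 arrays; the function names are illustrative, not the script's actual API.

```python
import numpy as np

# Sketch of the motion-level tag: mean frame-to-frame pixel difference
# over sampled frames, normalized to [0, 1] by the max pixel value.
def motion_level(frames):
    diffs = [
        np.abs(a.astype(np.float32) - b.astype(np.float32)).mean()
        for a, b in zip(frames, frames[1:])
    ]
    return float(np.mean(diffs)) / 255.0

# Thresholds follow the doc: < 0.05 is low, > 0.2 is high.
def motion_tag(level, low=0.05, high=0.2):
    if level < low:
        return "low"
    return "high" if level > high else "medium"
```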
Output
This stage takes ~2-5 minutes per video depending on length and encoding speed. The ultrafast preset is used to prioritize speed over compression.

Stage 2: Batch Synchronization
Run both audio (GCC-PHAT) and visual (motion-based) synchronization on every synthetic test case.

Command
What It Does
For each row in synthetic_metadata.csv:
Load Test Case
- Locate the original video in evaluation/originals/
- Load the corresponding synthetic video from the metadata CSV
Run Audio Sync
- Extract audio from both videos using src.preprocess.extract_audio_from_videos
- Compute GCC-PHAT cross-correlation using src.audio_sync.estimate_offsets_robust
- Extract peak confidence score from the cross-correlation function
- Measure runtime and peak CPU/memory usage via ResourceMonitor
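The core of GCC-PHAT can be sketched in a few lines of numpy. This is a minimal illustration of the technique, not the project's estimate_offsets_robust implementation; a positive lag here means the second signal starts later than the first.

```python
import numpy as np

# Minimal GCC-PHAT sketch: whiten the cross-spectrum (keep phase, drop
# magnitude), inverse-transform, and take the peak as the lag estimate.
# The normalized peak height serves as a rough confidence score.
def gcc_phat(ref, sig, sample_rate):
    n = len(ref) + len(sig)                     # zero-pad to avoid circular wrap
    R = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    R /= np.abs(R) + 1e-12                      # phase transform
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    peak = int(np.argmax(np.abs(cc)))
    lag_samples = peak - max_shift
    confidence = float(np.abs(cc[peak]) / (np.abs(cc).sum() + 1e-12))
    return lag_samples * 1000.0 / sample_rate, confidence
```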
Run Visual Sync
- Extract motion energy timeseries from both videos
- Compute cross-correlation using src.visual_sync.sync_videos_by_motion
- Extract peak confidence score from motion correlation
- Save motion signals to diagnostics/*.npz for later visualization
- Measure runtime and peak CPU/memory usage
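The motion-based method can be sketched as cross-correlating two normalized motion-energy timeseries; function and variable names here are illustrative, not the sync_videos_by_motion API.

```python
import numpy as np

# Sketch of motion-based sync: z-normalize each motion-energy series
# (e.g. mean absolute frame difference per frame), cross-correlate,
# and convert the best lag from frames to milliseconds.
def sync_by_motion(energy_a, energy_b, fps):
    a = np.asarray(energy_a, dtype=float)
    b = np.asarray(energy_b, dtype=float)
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    cc = np.correlate(b, a, mode="full")        # lags -(len(a)-1) .. len(b)-1
    lag_frames = int(np.argmax(cc)) - (len(a) - 1)
    confidence = float(cc.max() / min(len(a), len(b)))  # rough normalized peak
    return lag_frames * 1000.0 / fps, confidence
```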
Record Results
Append two rows to results.csv (one for audio, one for visual) with:
- estimated_offset_ms: Synchronization algorithm's estimate
- absolute_error_ms: |estimated_offset_ms - true_offset_ms|
- confidence_score: Peak correlation value
- runtime_seconds: Total processing time
- peak_cpu_percent, peak_memory_mb: Resource usage
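One such row can be sketched as a plain dict keyed by the columns above; make_row is a hypothetical helper and the values are illustrative.

```python
# Sketch of building one per-method row for results.csv. Field names
# mirror the columns listed above; make_row is a hypothetical helper.
def make_row(method, true_offset_ms, estimated_offset_ms, confidence_score,
             runtime_seconds, peak_cpu_percent, peak_memory_mb):
    return {
        "method": method,
        "estimated_offset_ms": estimated_offset_ms,
        "absolute_error_ms": abs(estimated_offset_ms - true_offset_ms),
        "confidence_score": confidence_score,
        "runtime_seconds": runtime_seconds,
        "peak_cpu_percent": peak_cpu_percent,
        "peak_memory_mb": peak_memory_mb,
    }

audio_row = make_row("audio", true_offset_ms=500, estimated_offset_ms=483,
                     confidence_score=0.91, runtime_seconds=3.2,
                     peak_cpu_percent=55.0, peak_memory_mb=210.0)
```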
Resource Monitoring
The ResourceMonitor context manager samples CPU and memory usage every 200 ms in a background thread:
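A monitor of this shape can be sketched as follows. This is a generic illustration, not the project's ResourceMonitor: the sampler callable is injected to keep the sketch library-agnostic (the real monitor presumably reads CPU/memory via a library such as psutil).

```python
import threading

# Sketch of a peak-tracking context manager: a daemon thread calls
# sampler() every `interval` seconds and keeps the maximum reading.
class PeakMonitor:
    def __init__(self, sampler, interval=0.2):
        self.sampler = sampler      # callable returning the current reading
        self.interval = interval    # 200 ms by default, as in the doc
        self.peak = 0.0
        self._stop = threading.Event()

    def _run(self):
        while not self._stop.is_set():
            self.peak = max(self.peak, self.sampler())
            self._stop.wait(self.interval)  # sleep, but wake early on stop

    def __enter__(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        return False
```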
evaluation/run_batch.py
Output
This stage takes ~5-10 minutes per test case depending on video length. The total runtime for 24 cases is typically 2-4 hours.
Stage 3: Metrics Computation
Aggregate results into statistical summaries across six categories.

Command
What It Does
- Accuracy Metrics
- Cross-Method Agreement
- Confidence Validation
- Efficiency
- Resource Usage
- Grouped (Sensitivity)
Per-method and per-offset analysis of synchronization error:
- MAE (Mean Absolute Error)
- RMSE (Root Mean Square Error)
- Median Error
- Max Error
- Per-Offset MAE: Breakdown by each of the 6 offset values (±100, ±500, ±1000 ms)
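The accuracy aggregation above can be sketched directly from the absolute-error column; the function and key names are illustrative, not the script's actual output schema.

```python
import numpy as np

# Sketch of the accuracy metrics over absolute errors (in ms).
def accuracy_metrics(errors_ms):
    e = np.asarray(errors_ms, dtype=float)
    return {
        "mae": float(e.mean()),                     # Mean Absolute Error
        "rmse": float(np.sqrt((e ** 2).mean())),    # Root Mean Square Error
        "median_error": float(np.median(e)),
        "max_error": float(e.max()),
    }

# Per-offset breakdown: group errors by their true offset value.
def per_offset_mae(rows):
    # rows: iterable of (true_offset_ms, absolute_error_ms)
    groups = {}
    for offset, err in rows:
        groups.setdefault(offset, []).append(err)
    return {off: float(np.mean(errs)) for off, errs in groups.items()}
```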
Output
The summary table is printed to the console for quick reference. The full structured metrics are available in metrics_summary.json for programmatic access.

Stage 4: Visualization
Generate 8 types of publication-ready plots from the results.

Command
What It Does
Reads results.csv and produces high-DPI (300 DPI) PNG files in evaluation/plots/:
Error vs Offset
Grouped bar chart of MAE by offset magnitude and method
Confidence vs Error
Scatter plot with regression lines showing confidence reliability
Audio-Video Diff Histogram
Distribution of cross-method estimate differences
Runtime Comparison
Bar chart of mean runtime by method
Error Distribution
Boxplot of error by method and offset with overlaid scatter points
Resource Usage
Dual bar chart of peak CPU and memory by method
Motion Before/After
Per-case overlay of motion signals before and after alignment
Sync Timelines
Per-case timeline diagrams with offset arrows (pad/trim)
Output
All plots use a clean, publication-ready style with consistent colors (blue for audio, orange for visual) and disabled top/right spines for a modern look.
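That shared style can be sketched as a small helper, assuming matplotlib; the color values and function name are assumptions, not the plotting script's actual code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for batch plot generation
import matplotlib.pyplot as plt

# Assumed method colors: blue for audio, orange for visual.
METHOD_COLORS = {"audio": "tab:blue", "visual": "tab:orange"}

# Sketch of the shared style: 300 DPI figures with the top and right
# spines hidden, as described above.
def styled_axes(title, xlabel, ylabel):
    fig, ax = plt.subplots(figsize=(8, 5), dpi=300)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    return fig, ax
```

Each plot function would then call styled_axes and save with fig.savefig(path).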
Troubleshooting
FFmpeg errors during offset generation
Symptom: FileNotFoundError: [Errno 2] No such file or directory: 'ffmpeg'

Solution: Install FFmpeg and ensure it's in your system PATH:
- macOS: brew install ffmpeg
- Windows: Download from ffmpeg.org and add to PATH
- Linux: sudo apt install ffmpeg
Audio sync fails for some videos
Symptom: RuntimeError: Original video has no audio stream

Solution: Audio sync requires both videos to have audio tracks. Either:
- Use videos with audio for all source files, or
- Skip audio-less videos (they'll be omitted from audio method results)
Out of memory during batch run
Symptom: MemoryError or system slowdown during run_batch.py

Solution: The batch runner processes videos sequentially and cleans up temp files after each case. If memory usage is still high:
- Close other applications
- Reduce the number of source videos
- Process videos in smaller batches by editing the metadata CSV
Plots missing or empty
Symptom: Some plots are skipped or show no data

Solution:
- before_after plots: Require diagnostics .npz files from run_batch.py. Re-run batch sync if missing.
- Resource usage plot: Requires peak_cpu_percent and peak_memory_mb columns in results.csv. Re-run batch sync with the latest version.
Next Steps
Metrics Reference
Complete documentation of all computed metrics
Visualization Gallery
Detailed descriptions and examples of all plot types