Overview
The pdf-visual-diff tool provides two main configuration options:
- Similarity threshold - Controls how strict the visual comparison is
- Output directory - Specifies where results are saved
Similarity threshold
The --threshold parameter controls the sensitivity of visual difference detection using the Structural Similarity Index (SSIM).
How SSIM works
SSIM is computed for each pair of pages and returns a score between 0 and 1:
- 1.0 = Identical images
- 0.9 = Very similar (minor differences)
- 0.7 = Moderately similar
- 0.5 = Significantly different
- 0.0 = Completely different
Source: pdf_visual_diff.py:54
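To make the score concrete, here is a minimal sketch of the SSIM formula applied once over a whole grayscale image (a single global window). It is a simplification for illustration only: the tool's own computation is not shown on this page, and library implementations typically average SSIM over many small sliding windows instead. The function name `global_ssim` and the flat pixel lists are assumptions.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of grayscale pixels.

    Illustrative only: real implementations usually slide a small window
    over the image and average the per-window scores.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


page_a = [10, 20, 30, 40]
page_b = [10, 20, 30, 45]  # one pixel slightly brighter
same = global_ssim(page_a, page_a)     # identical inputs score 1.0 (up to rounding)
changed = global_ssim(page_a, page_b)  # a small change scores slightly below 1.0
```

Even a one-pixel change moves the score below 1.0, which is why thresholds close to 1.0 behave as near pixel-perfect checks.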
Threshold comparison logic
Pages are flagged as different when their SSIM score falls below the threshold.
Source: pdf_visual_diff.py:56-57
A page with SSIM = 0.998 and threshold = 0.999 will be flagged as different because 0.998 < 0.999.
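The comparison described above can be sketched as a strict less-than check. This is a minimal illustration, not the code at pdf_visual_diff.py:56-57; the helper name `flag_different_pages` and the list-of-scores input are assumptions.

```python
def flag_different_pages(ssim_scores, threshold):
    """Return 1-based page numbers whose SSIM score is strictly below the threshold."""
    return [
        page
        for page, score in enumerate(ssim_scores, start=1)
        if score < threshold
    ]


scores = [1.0, 0.998, 0.9995]
flagged = flag_different_pages(scores, threshold=0.999)  # [2]
```

Because the check is strict, a page that scores exactly the threshold value is not flagged, so identical pages (score 1.0) pass even with a threshold of 1.0.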
Choosing the right threshold
- Use case: Regression testing
- Use case: Content verification
- Use case: Layout verification
For regression testing:
Recommended threshold: 0.999 or 1.0
Best for:
- Automated testing pipelines
- Detecting unintended changes
- Verifying pixel-perfect output
At this threshold, the comparison flags:
- Any visual change, no matter how small
- Anti-aliasing differences
- Font rendering variations
Default values
There is an important distinction between CLI and function defaults.
Threshold examples
Example 1: Strict comparison (pixel-perfect)
- Date changes
- Amount updates
- Font smoothing differences
- Compression artifacts
Example 2: Balanced comparison
- Text changes
- Layout shifts
- Image differences
- Color variations
Example 3: Layout-only comparison
- Element repositioning
- Size changes
- Removed/added sections
Debugging threshold issues
If you’re getting unexpected results, check the results.json file for the run. Note that results.json stores the threshold used but not individual page SSIM scores. To see actual SSIM values, you’ll need to modify the source code to log them.
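A small helper can make that first check quicker. The sketch below loads results.json from one run directory and lists its top-level keys rather than assuming a schema, since the exact fields are not documented on this page. The function names and the command-line usage are hypothetical.

```python
import json
import sys
from pathlib import Path


def load_results(run_dir):
    """Load and parse results.json from a single run directory."""
    with open(Path(run_dir) / "results.json", encoding="utf-8") as fh:
        return json.load(fh)


def describe_results(results):
    """Return the sorted top-level keys, making no assumption about the schema."""
    return sorted(results) if isinstance(results, dict) else []


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical): python inspect_results.py <run_dir>
    data = load_results(sys.argv[1])
    for key in describe_results(data):
        print(f"{key}: {data[key]!r}")
```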
Output directory configuration
The --output parameter specifies where results are saved.
Directory structure
The tool creates a timestamped subdirectory for each run.
Implementation: pdf_visual_diff.py:14-17
Output directory examples
Timestamp format
The timestamp uses the format YYYYDDMM_HHMMSS:
- YYYY = 4-digit year
- DD = 2-digit day
- MM = 2-digit month
- HH = 2-digit hour (24-hour format)
- MM = 2-digit minute
- SS = 2-digit second
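The documented format maps onto the `strftime` pattern `%Y%d%m_%H%M%S`. The sketch below shows one way to build and parse such a run directory name. It is not the code at pdf_visual_diff.py:14-17, and the helper names `make_run_dir` and `parse_run_timestamp` are assumptions.

```python
import os
from datetime import datetime

# Year, then day, then month, matching the documented YYYYDDMM_HHMMSS layout.
TIMESTAMP_FORMAT = "%Y%d%m_%H%M%S"


def make_run_dir(output_dir, now=None):
    """Create <output_dir>/<timestamp> and return its path."""
    now = now or datetime.now()
    run_dir = os.path.join(output_dir, now.strftime(TIMESTAMP_FORMAT))
    os.makedirs(run_dir, exist_ok=True)
    return run_dir


def parse_run_timestamp(dir_name):
    """Convert a run directory name back into a datetime."""
    return datetime.strptime(dir_name, TIMESTAMP_FORMAT)


example = datetime(2024, 3, 15, 14, 30, 5).strftime(TIMESTAMP_FORMAT)
# example == "20241503_143005" (15 March 2024, 14:30:05)
```

Because the day precedes the month, sorting directory names alphabetically does not give chronological order; parse the names first when ordering runs by time.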
Output file types
The output directory contains:
- Diff images - PNG files showing visual differences
  - Named: diff_page_N.png where N is the page number
  - Generated for pages below threshold
- Extra page images - PNG files for pages in only one PDF
  - Named: extra_page_N_only_in_pdfX.png
  - Generated when PDFs have different page counts
- Results file - JSON file with comparison metadata
  - Named: results.json
  - Always generated
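The naming scheme above makes it straightforward to sort a run's files by type. The following sketch classifies file names with regular expressions. The helper name is hypothetical, and treating N as digits and X as a short alphanumeric suffix is an assumption about the placeholders.

```python
import re

# Patterns derived from the documented names; N is assumed to be digits.
DIFF_RE = re.compile(r"^diff_page_(\d+)\.png$")
EXTRA_RE = re.compile(r"^extra_page_(\d+)_only_in_pdf(\w+)\.png$")


def classify_output_file(name):
    """Return (kind, page_number) for a file name found in a run directory."""
    if name == "results.json":
        return ("results", None)
    match = DIFF_RE.match(name)
    if match:
        return ("diff", int(match.group(1)))
    match = EXTRA_RE.match(name)
    if match:
        return ("extra", int(match.group(1)))
    return ("unknown", None)


kinds = [classify_output_file(n) for n in ("diff_page_3.png", "results.json")]
# kinds == [("diff", 3), ("results", None)]
```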
Managing output
Cleaning old results
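One possible cleanup approach, sketched here under assumptions, is to delete run subdirectories older than a chosen age. The helper `remove_old_runs` is hypothetical and uses each directory's modification time rather than its timestamped name.

```python
import shutil
import time
from pathlib import Path


def remove_old_runs(output_dir, max_age_days=30, now=None):
    """Delete subdirectories of output_dir last modified more than max_age_days ago.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for entry in Path(output_dir).iterdir():
        if entry.is_dir() and entry.stat().st_mtime < cutoff:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return sorted(removed)
```

Deletion is permanent, so it is worth pointing the function at a copy of the output directory before using it on real results.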
Organizing by project
CI/CD artifact collection
Advanced configuration patterns
Environment-based settings
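As one way to drive the two options from the environment, the sketch below turns environment variables into `--threshold` and `--output` arguments. The variable names `PDF_DIFF_THRESHOLD` and `PDF_DIFF_OUTPUT` are hypothetical; when a variable is unset, the flag is omitted so the tool's own default applies.

```python
import os


def cli_args_from_env(environ=None):
    """Build --threshold/--output arguments from (hypothetical) environment variables."""
    environ = os.environ if environ is None else environ
    args = []
    threshold = environ.get("PDF_DIFF_THRESHOLD")
    if threshold is not None:
        value = float(threshold)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"threshold must be between 0 and 1, got {value}")
        args += ["--threshold", str(value)]
    output = environ.get("PDF_DIFF_OUTPUT")
    if output:
        args += ["--output", output]
    return args


args = cli_args_from_env({"PDF_DIFF_THRESHOLD": "0.95", "PDF_DIFF_OUTPUT": "ci-results"})
# args == ["--threshold", "0.95", "--output", "ci-results"]
```

The returned list can be appended to whatever command line invokes the tool, for example through `subprocess.run`.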
Wrapper script with presets
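A wrapper can map named presets to threshold values. In the sketch below, only the strict value (0.999) comes from the recommendation on this page; the "balanced" and "layout" values are illustrative placeholders to tune for your documents, and the names `PRESETS` and `threshold_args` are hypothetical.

```python
PRESETS = {
    "strict": 0.999,   # recommended above for regression testing
    "balanced": 0.95,  # illustrative value, not from the tool's documentation
    "layout": 0.80,    # illustrative value, not from the tool's documentation
}


def threshold_args(preset):
    """Translate a preset name into --threshold arguments for the CLI."""
    try:
        value = PRESETS[preset]
    except KeyError:
        choices = ", ".join(sorted(PRESETS))
        raise ValueError(f"unknown preset {preset!r}; choose from: {choices}") from None
    return ["--threshold", str(value)]


strict_args = threshold_args("strict")  # ["--threshold", "0.999"]
```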
See also
- Command reference - Complete CLI argument documentation
- Output formats - Understanding generated files
- Basic comparison - Getting started guide