What is PDF Visual Regression Tester?
PDF Visual Regression Tester is a Python-based command-line tool that performs visual regression testing on PDF files. It compares two PDFs page by page and generates annotated images that highlight any visual differences between them. The tool uses structural similarity (SSIM) comparison to identify even subtle differences in PDF rendering, making it well suited to testing document generation systems, validating PDF transformations, and ensuring consistency across document versions.
Key features
Page-by-page comparison
Compares each corresponding page of two PDF files with high-resolution rendering (144 DPI) for accurate difference detection.
Difference highlighting
Generates visual “diff” images that mark the exact areas where differences were detected with red overlay highlights.
Command-line interface
Simple and intuitive CLI that integrates easily into CI/CD pipelines and automated testing workflows.
Smart page count handling
Automatically handles PDFs with different page counts, comparing up to the shorter document’s length and flagging extra pages.
Structural similarity index
Uses SSIM (Structural Similarity Index) from scikit-image for robust comparison that reduces false positives from minor rendering variations.
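The page count handling described above can be sketched in a few lines. This is a minimal illustration rather than the tool's actual code: `plan_comparison` is a hypothetical helper, page numbers are assumed to be 1-based, and the `only_in_pdf1` file name is extrapolated from the `extra_page_3_only_in_pdf2.png` pattern that appears under Output format.

```python
def plan_comparison(pages_in_pdf1, pages_in_pdf2):
    """Split the work into shared pages to compare and extra pages to flag.

    Hypothetical helper: returns (pages_to_compare, extra_page_files).
    """
    shared = min(pages_in_pdf1, pages_in_pdf2)
    longest = max(pages_in_pdf1, pages_in_pdf2)

    # Pages present in both documents are compared pairwise (assumed 1-based).
    pages_to_compare = list(range(1, shared + 1))

    # Pages beyond the shorter document exist in only one PDF.
    owner = "pdf1" if pages_in_pdf1 > pages_in_pdf2 else "pdf2"
    extra_page_files = [
        f"extra_page_{n}_only_in_{owner}.png"
        for n in range(shared + 1, longest + 1)
    ]
    return pages_to_compare, extra_page_files


pages, extras = plan_comparison(2, 3)
print(pages)   # [1, 2]
print(extras)  # ['extra_page_3_only_in_pdf2.png']
```

When both documents have the same number of pages, the second list is empty and every page is compared.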
How it works
The tool leverages several powerful Python libraries to deliver accurate visual comparison:
- PyMuPDF (`fitz`): High-performance rendering of PDF pages into images (pixmaps)
- scikit-image: Provides the `structural_similarity` function for robust image comparison that goes beyond simple pixel-by-pixel checks
- Pillow (PIL): Image manipulation for creating highlighted diff images and saving output
- NumPy: Efficient array operations for image data processing
- ReportLab: Used in the test suite for programmatically generating test PDFs
The SSIM algorithm helps reduce false positives from minor, imperceptible rendering variations that can occur between different PDF renderers or systems.
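The pipeline above (render, compare with SSIM, highlight) might look roughly like the following. This is a sketch under stated assumptions, not the tool's implementation: `render_page_gray` and `compare_pages` are hypothetical names, pages are rendered to grayscale for simplicity, the per-pixel `local_cutoff` and the 50% red blend are illustrative choices, and both pages are assumed to render to the same pixel dimensions.

```python
import numpy as np
from skimage.metrics import structural_similarity


def render_page_gray(pdf_path, page_index, dpi=144):
    """Render one PDF page to a 2-D uint8 grayscale array with PyMuPDF."""
    import fitz  # PyMuPDF; imported here so compare_pages works without it

    zoom = dpi / 72  # PDF user space is 72 units per inch, so 144 DPI is 2x
    with fitz.open(pdf_path) as doc:
        pix = doc[page_index].get_pixmap(
            matrix=fitz.Matrix(zoom, zoom), colorspace=fitz.csGRAY, alpha=False
        )
        flat = np.frombuffer(pix.samples, dtype=np.uint8)
        return flat.reshape(pix.height, pix.width).copy()


def compare_pages(img_a, img_b, threshold=0.999, local_cutoff=0.9):
    """Return (score, is_different, overlay) for two same-shaped gray pages."""
    # full=True also returns the per-pixel SSIM map, used to locate changes.
    score, ssim_map = structural_similarity(img_a, img_b, full=True)

    # Assumption: pixels with low local similarity are treated as changed.
    changed = ssim_map < local_cutoff

    # Build an RGB copy of the second page and blend red over changed pixels.
    overlay = np.stack([img_b] * 3, axis=-1).astype(np.float64)
    overlay[changed] = 0.5 * overlay[changed] + 0.5 * np.array([255.0, 0.0, 0.0])
    return float(score), bool(score < threshold), overlay.astype(np.uint8)


# Synthetic "pages": a blank page and the same page with a black square.
page_a = np.full((64, 64), 255, dtype=np.uint8)
page_b = page_a.copy()
page_b[20:40, 20:40] = 0

score, is_different, overlay = compare_pages(page_a, page_b)
print(is_different)  # True
```

Because `structural_similarity` requires equally shaped inputs, pages with different dimensions would need to be resized or padded before this comparison. The resulting overlay array could be saved as a PNG with Pillow's `Image.fromarray(overlay).save(...)`.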
When to use this tool
Perfect for
- CI/CD pipelines: Automatically verify that PDF generation code changes don’t introduce visual regressions
- Document template testing: Ensure template modifications produce expected visual results
- Cross-system validation: Compare PDFs generated on different systems or with different libraries
- Version comparison: Validate that document updates maintain expected layout and formatting
- Regulatory compliance: Verify that critical documents remain visually consistent across versions
Example use cases
Testing invoice generation systems
Compare generated invoices against reference PDFs to ensure that calculations, formatting, and layout remain consistent after code changes.
Validating government forms
Ensure that official forms maintain exact visual specifications and compliance requirements across system updates.
Report generation QA
Verify that automated report generation produces consistent visual output when data or templates change.
PDF transformation validation
Test that PDF manipulation operations (merging, splitting, watermarking) produce expected visual results.
Output format
When differences are found, the tool generates:
- Diff images: PNG files with red highlights showing exact difference locations (e.g., `diff_page_1.png`)
- Extra page images: Separate images for pages that exist in only one PDF (e.g., `extra_page_3_only_in_pdf2.png`)
- Results JSON: Detailed comparison metadata including timestamps, page counts, and diff locations
- Console summary: Human-readable summary of findings printed to stdout
Output is saved in a timestamped directory (e.g., `diff_output/20261202_171728_diff/`) to maintain comparison history.
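As a rough illustration of this output layout, the sketch below creates a timestamped run directory and writes a small results file into it. The directory name follows the `YYYYMMDD_HHMMSS_diff` pattern of the example path; the `write_results` helper, the `results.json` file name, and the JSON keys are all hypothetical and may not match what the tool actually writes.

```python
import json
from datetime import datetime
from pathlib import Path


def write_results(base_dir, pdf1_pages, pdf2_pages, diff_pages, now=None):
    """Create a timestamped run directory and write a results file into it."""
    now = now or datetime.now()
    run_dir = Path(base_dir) / f"{now:%Y%m%d_%H%M%S}_diff"
    run_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical schema: the real results JSON may use different keys.
    results = {
        "timestamp": now.isoformat(timespec="seconds"),
        "page_counts": {"pdf1": pdf1_pages, "pdf2": pdf2_pages},
        "diff_images": [f"diff_page_{n}.png" for n in diff_pages],
    }
    (run_dir / "results.json").write_text(json.dumps(results, indent=2))
    return run_dir
```

Giving each run its own directory means earlier comparisons are never overwritten, which is what makes the history useful when tracking down when a regression first appeared.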
The tool uses a configurable similarity threshold (default 0.999) to determine when pages are considered different. Higher thresholds are stricter: a value closer to 1.0 flags smaller visual differences.
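To make the effect of the threshold concrete, here is a small sketch. It assumes a page is flagged when its similarity score falls below the threshold; the `pages_flagged` helper and the scores are made up for illustration.

```python
def pages_flagged(scores, threshold=0.999):
    """Return 1-based page numbers whose score falls below the threshold."""
    return [n for n, score in enumerate(scores, start=1) if score < threshold]


scores = [1.0, 0.9995, 0.97]  # made-up per-page similarity scores

print(pages_flagged(scores))          # [3]
print(pages_flagged(scores, 0.9999))  # [2, 3]
print(pages_flagged(scores, 0.95))    # []
```

Raising the threshold catches the near-identical second page as well, while lowering it lets even the clearly changed third page pass.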