Overview
The PdfFormsParserService handles parsing PDF form fields (AcroForm) and filling them with data. It provides UTF-8 support, signature field detection, and automatic integration with the PDF signature service.
Key Features:
- Extract form fields from PDF documents
- Fill PDF forms with field data
- UTF-8 character encoding support
- Automatic signature field detection and handling
- Fallback parsing for problematic PDFs
- Human-readable label generation
Dependencies:
- pdf-forms gem (pdftk wrapper)
- PdfSignatureService for signature operations
Location: app/services/pdf_forms_parser_service.rb
Initialization
Constructor
Absolute path to the PDF file to parse or fill
Public Methods
parse
Extracts all form fields from the PDF document.
Returns an array of field hashes with metadata, or an empty array on error. Each field hash contains:
- Sanitized field name (UTF-8 encoded)
- Original field name from the PDF
- Field type (e.g., Text, Button, Choice, Signature_Field)
- Always empty string (use label_name for original value)
- Available options for choice fields
- Human-readable field label (e.g., “Location Row 1”)
- Original field value from the PDF
- Whether this is a signature field
- Signature metadata (name, signing_time, reason, location, etc.) if signed
Error handling:
- PdfForms::PdftkError: Falls back to the alternative parsing method
- StandardError: Returns an empty array and logs the error
- Automatically retries with the dump_data_fields command if standard extraction fails
fill_form
Fills the PDF form with provided data and applies signatures.
Parameters:
- Path where the filled PDF will be saved
- Array of field hashes with values to fill. Each hash should include:
  - name (string): Field name
  - value (string): Field value
  - original_name (string, optional): Original field name from PDF
  - is_signature (boolean, optional): Whether this is a signature field
  - certificate_path (string, optional): Path to P12/PFX certificate for digital signature
  - certificate_password (string, optional): Certificate password
  - signature_image_path (string, optional): Path to signature image
  - reason (string, optional): Signature reason
  - location (string, optional): Signature location
  - signer_name (string, optional): Name of the signer
Returns: Path to the output PDF file
Behavior:
- Separates normal fields from signature requests
- Fills all non-signature fields using pdftk
- Applies each signature sequentially:
  - If certificate_path is provided: Creates a digital signature with PdfSignatureService.sign
  - If only signature_image_path is provided: Stamps the image with PdfSignatureService.stamp_signature_image
- Returns the path to the final output PDF
Error handling:
- PdfForms::PdftkError: Attempts a retry with an alternative approach
- UTF-8 encoding errors: Automatically sanitizes invalid characters
- Missing signature images: Logs a warning and skips image stamping
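The way fill_form splits its input and chooses a signature path can be sketched in plain Ruby. This is an illustration only, not the service's code: it assumes symbol keys on the field hashes, the helper names `split_fields`, `signature_action`, and `filled?` are hypothetical, and the `:skip` outcome for a request with neither a certificate nor an image is an assumption. The sketch only decides what would happen to each entry; the pdftk and PdfSignatureService calls are left out.

```ruby
# Split fill data into normal fields and signature requests.
# Assumes symbol keys; the real service may use string keys.
def split_fields(fields)
  signatures, normal = fields.partition { |f| f[:is_signature] }
  { normal: normal, signatures: signatures }
end

# True when a value is neither nil nor blank.
def filled?(value)
  !value.nil? && !value.to_s.strip.empty?
end

# Decide how one signature request would be applied.
# Returns a symbol instead of calling a signing service.
def signature_action(request)
  if filled?(request[:certificate_path])
    :digital_signature # would be handed to PdfSignatureService.sign
  elsif filled?(request[:signature_image_path])
    :stamp_image       # would be handed to PdfSignatureService.stamp_signature_image
  else
    :skip              # assumption: nothing to apply
  end
end
```

Checking the certificate first mirrors the documented rule that image stamping applies only when no certificate path is given.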
Field Name Processing
The service provides automatic field name processing:
Sanitization
- Removes invalid UTF-8 characters
- Preserves the original name in the original_name field
- Both sanitized and original names are tried during form fill
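One way to get the sanitization behavior described here is Ruby's built-in `String#scrub`. The sketch below is an assumption about the approach, not the service's actual code; the helper names `sanitize_field_name` and `field_name_pair` and the symbol keys are hypothetical.

```ruby
# Return a valid UTF-8 copy of a raw field name, dropping invalid bytes.
def sanitize_field_name(raw)
  raw.to_s.dup.force_encoding(Encoding::UTF_8).scrub("")
end

# Keep both names so either one can be tried when filling the form.
def field_name_pair(raw)
  { name: sanitize_field_name(raw), original_name: raw }
end
```

Working on a copy (`dup`) leaves the original string untouched, which is what makes it possible to keep it in original_name.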
Human Label Generation
- Location_row_1 → Location Row 1
- customerName → Customer Name
- inspection_date → Inspection Date
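The label examples above are consistent with a simple transformation: split camelCase boundaries, turn underscores into spaces, and capitalize each word. A minimal sketch, assuming that rule (the method name `human_label` is hypothetical and the service may differ in edge cases):

```ruby
# Build a human-readable label from a raw field name.
def human_label(name)
  name.to_s
      .gsub(/([a-z\d])([A-Z])/, '\1 \2') # customerName -> customer Name
      .tr("_", " ")                      # inspection_date -> inspection date
      .split
      .map { |word| word[0].upcase + word[1..-1] }
      .join(" ")
end
```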
Signature Field Detection
The service automatically detects signature fields:
- Checks field type for sig or signature
- Queries PdfSignatureService.list_signature_fields() for additional metadata
- Marks fields with is_signature: true
- Includes signature info if field is already signed
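The type check can be illustrated with a case-insensitive prefix test. This is a sketch under assumptions: the exact matching rule is not given here, the `:type` key and the helper names `signature_type?` and `mark_signature` are hypothetical, and the lookup through PdfSignatureService is omitted.

```ruby
# True when a field type looks like a signature type ("Sig", "Signature", ...).
# A prefix test is used so unrelated words containing "sig" do not match.
def signature_type?(field_type)
  field_type.to_s.downcase.start_with?("sig")
end

# Return a copy of the field hash with the is_signature flag set.
def mark_signature(field)
  field.merge(is_signature: signature_type?(field[:type]))
end
```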
Fallback Parsing
When standard parsing fails, the service uses an alternative method:
- Executes the pdftk dump_data_fields command directly
- Parses text output manually
- Reconstructs field objects
- Applies same filtering and signature detection
The fallback handles:
- PDFs with encoding issues
- Corrupted or non-standard AcroForms
- pdftk library errors
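Manual parsing of dump_data_fields output is mostly line splitting: pdftk prints one block per field, each starting with a `---` line and followed by `Key: value` lines such as FieldType, FieldName, FieldValue, and repeated FieldStateOption entries. The sketch below parses such text into hashes. It is not the service's implementation; the method name `parse_dump_data_fields` and the symbol keys in the result are assumptions, and running pdftk to obtain the text is left out.

```ruby
# Parse the text printed by `pdftk <file> dump_data_fields` into field hashes.
# Blocks are separated by lines containing only "---".
def parse_dump_data_fields(text)
  fields = []
  current = { options: [] }

  text.to_s.each_line do |line|
    line = line.chomp
    if line.strip == "---"
      fields << current if current[:name] # keep only blocks that named a field
      current = { options: [] }
      next
    end

    key, value = line.split(":", 2)
    next if value.nil? # not a "Key: value" line

    value = value.strip
    case key.strip
    when "FieldName"        then current[:name] = value
    when "FieldType"        then current[:type] = value
    when "FieldValue"       then current[:value] = value
    when "FieldStateOption" then current[:options] << value
    end
  end

  fields << current if current[:name] # flush the last block
  fields
end
```

Splitting with a limit of 2 keeps colons inside a value intact, so a value like `10:30` is not truncated.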
Complete Workflow Example
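A minimal sketch of the parse-then-fill flow. Several things here are assumptions rather than confirmed API: that the parser responds to `parse` and to `fill_form(output_path, fields)` in that argument order, that field hashes use symbol keys, and the helper name `fill_from_values`, which is hypothetical. The parser is passed in as an argument so the flow works with any object that offers those two methods.

```ruby
# Parse first, then fill only the non-signature fields that exist in the PDF.
# `parser` is expected to behave like PdfFormsParserService (assumed API).
# `values` maps field names to the values to write.
def fill_from_values(parser, output_path, values)
  fields = parser.parse
  return nil if fields.empty? # parse returns an empty array on error

  fill_data = fields
    .reject { |field| field[:is_signature] }        # signatures need their own options
    .select { |field| values.key?(field[:name]) }
    .map do |field|
      {
        name: field[:name],
        original_name: field[:original_name],
        value: values[field[:name]].to_s
      }
    end

  parser.fill_form(output_path, fill_data)
end
```

With the real service this might be called as `fill_from_values(PdfFormsParserService.new("/path/to/form.pdf"), "/path/to/filled.pdf", { "customerName" => "Ada" })`, assuming the constructor takes the PDF path as its only argument.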
Best Practices
Parse Before Fill
Always parse the PDF first to understand available fields and their types
Handle UTF-8
Service automatically handles UTF-8 characters, but verify field names in parsed output
Separate Signatures
Signature fields are processed separately after normal fields are filled
Error Logging
Monitor Rails logs for encoding issues and fallback parsing events
Related Services
- PdfSignatureService - Digital signature operations
- PdfMergingService - Combine multiple PDFs