Overview
The PDF Form Parser uses Rails Active Storage to handle file uploads, including:
- PDF form templates - Original PDF files with form fields
- Generated PDFs - Completed and merged PDF documents
- Photos - Inspection photos attached to forms
- Signatures - Digital signature images
- User avatars - Profile pictures
Storage Services
The application is configured with three storage services in config/storage.yml:
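As a minimal sketch, three such entries might look like the following. The `digitalocean` service name and the `DO_SPACES_*` variable names are assumptions, not taken from the application, and the Disk roots are written as plain relative paths where a Rails app would normally use `Rails.root.join(...)`. Rails evaluates config/storage.yml as ERB before parsing it as YAML, which the helper below imitates with the standard library.

```ruby
require "erb"
require "yaml"

# Illustrative stand-in for config/storage.yml. The `digitalocean` service name
# and the DO_SPACES_* variable names are assumptions.
STORAGE_YML = <<~YAML
  local:
    service: Disk
    root: storage
  test:
    service: Disk
    root: tmp/storage
  digitalocean:
    service: S3
    endpoint: <%= ENV["DO_SPACES_ENDPOINT"] %>
    access_key_id: <%= ENV["DO_SPACES_KEY"] %>
    secret_access_key: <%= ENV["DO_SPACES_SECRET"] %>
    region: <%= ENV.fetch("DO_SPACES_REGION", "us-east-1") %>
    bucket: <%= ENV["DO_SPACES_BUCKET"] %>
YAML

# Render ERB first, then parse the result as YAML, mirroring how Rails reads the file.
def load_storage_config(yaml_text)
  YAML.safe_load(ERB.new(yaml_text).result)
end

config = load_storage_config(STORAGE_YML)
puts "Configured services: #{config.keys.join(', ')}"
```

Keeping credentials in environment variables rather than in the file itself means the same storage.yml can be committed safely and reused across deployments.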
Local Storage (Development/Test)
Stores files on the local filesystem:
- Development: Files stored in the storage/ directory
- Test: Files stored in tmp/storage/ (cleaned between test runs)
DigitalOcean Spaces (Production)
S3-compatible object storage:
Environment Configuration
The active storage service is set per environment:
Development
Production
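A minimal sketch of the per-environment setting, assuming the service names `local` and `digitalocean` (hypothetical; use whichever keys config/storage.yml defines). Each assignment belongs in its own environment file; they are shown together here for brevity.

```ruby
# config/environments/development.rb
Rails.application.configure do
  # Store uploads on the local disk (assumed service name: local)
  config.active_storage.service = :local
end

# config/environments/production.rb
Rails.application.configure do
  # Store uploads in DigitalOcean Spaces (assumed service name: digitalocean)
  config.active_storage.service = :digitalocean
end
```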
Setting Up DigitalOcean Spaces
1. Create a Space
- Log in to your DigitalOcean account
- Navigate to Spaces in the sidebar
- Click “Create a Space”
- Choose a datacenter region (e.g., NYC3, SFO3)
- Set a unique space name
- Choose public or private access
2. Generate API Keys
- Go to API → Spaces Keys
- Click “Generate New Key”
- Save the Access Key and Secret Key securely
3. Configure Environment Variables
Set these environment variables in your production environment:
- The endpoint URL varies by region (nyc3, sfo3, sgp1, etc.)
- Region should typically be us-east-1 for Spaces compatibility
- The bucket name must be globally unique
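To make these notes concrete, here is a small check that could run at boot or in a console. The `DO_SPACES_*` names are hypothetical placeholders for whatever variables config/storage.yml actually reads, and the endpoint helper relies on the `https://<region>.digitaloceanspaces.com` form that Spaces endpoints take.

```ruby
# Hypothetical variable names; substitute the ones your storage.yml references.
REQUIRED_SPACES_VARS = %w[
  DO_SPACES_KEY
  DO_SPACES_SECRET
  DO_SPACES_ENDPOINT
  DO_SPACES_REGION
  DO_SPACES_BUCKET
].freeze

# Build the endpoint URL for a datacenter slug such as "nyc3" or "sfo3".
def spaces_endpoint(datacenter)
  "https://#{datacenter}.digitaloceanspaces.com"
end

# Return the names of required variables that are unset or blank.
def missing_spaces_vars(env = ENV)
  REQUIRED_SPACES_VARS.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_spaces_vars
warn "Missing Spaces settings: #{missing.join(', ')}" unless missing.empty?
```

Failing loudly on a missing variable at startup is usually easier to diagnose than an access error on the first upload.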
Using Amazon S3
To use Amazon S3 instead of DigitalOcean Spaces:
1. Add S3 Configuration
Add this to config/storage.yml:
2. Update Environment Configuration
3. Set Environment Variables
The aws-sdk-s3 gem is already included in the Gemfile and works with both S3 and S3-compatible services.
File Processing Dependencies
The application requires several gems for file processing:
System Dependencies
Install these system packages:
- pdftk - PDF form field manipulation
- libvips - Fast image processing for variants
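One way to confirm that both packages are visible to the application is to look for their command-line tools on PATH. This is a sketch using only the standard library; `find_executable` is a hypothetical helper name, and the presence of the `vips` command is only a proxy for the libvips library being installed, since some distributions package the command separately.

```ruby
# Return the full path of `command` if it is an executable file on PATH, else nil.
def find_executable(command, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# pdftk installs a `pdftk` binary; libvips tooling installs a `vips` binary.
%w[pdftk vips].each do |tool|
  location = find_executable(tool)
  puts(location ? "#{tool}: #{location}" : "#{tool}: not found on PATH")
end
```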
Image Transformations
Active Storage can generate image variants using image_processing:
- Photo thumbnails in inspection forms
- User avatar display
- Image compression for storage optimization
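A thumbnail for any of the uses above could be requested along these lines. `thumbnail_for` and its `max` parameter are hypothetical names for illustration; `variable?` and `variant` with `resize_to_limit` are the Active Storage calls being sketched.

```ruby
# Return a resized variant when the blob is an image Active Storage can
# transform; otherwise hand back the attachment unchanged.
def thumbnail_for(attached, max: 200)
  return attached unless attached.variable?

  # resize_to_limit shrinks the image to fit within max x max and never upscales.
  attached.variant(resize_to_limit: [max, max])
end
```

In a view, the result can be passed to `image_tag`, for example `image_tag thumbnail_for(user.avatar, max: 100)`, where `user.avatar` stands in for any attached image. Variants are generated lazily, so the resize cost is paid the first time a given size is requested.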
Active Storage Models
Attachments in the Application
The application uses Active Storage attachments:
... data column.
Database Tables
Active Storage creates three tables: active_storage_blobs, active_storage_attachments, and active_storage_variant_records.
File Upload Security
Content Type Validation
The application uses the marcel gem for MIME type detection:
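A minimal sketch of the kind of check a detected type enables, written in plain Ruby so it can sit behind a model validation or a service object. The allow-list, the 10 MiB cap, and the `upload_errors` name are assumptions for illustration; in the application the content type would come from Marcel's detection rather than being passed in by hand. The size cap is included because the same check is a natural home for the file size limit.

```ruby
# Hypothetical allow-list and size cap; tune both to the application's needs.
ALLOWED_CONTENT_TYPES = %w[application/pdf image/jpeg image/png].freeze
MAX_UPLOAD_BYTES = 10 * 1024 * 1024 # 10 MiB

# Return a list of human-readable problems; an empty list means the upload passes.
def upload_errors(content_type:, byte_size:)
  errors = []
  errors << "unsupported type: #{content_type}" unless ALLOWED_CONTENT_TYPES.include?(content_type)
  errors << "file too large: #{byte_size} bytes" if byte_size > MAX_UPLOAD_BYTES
  errors
end
```

With Active Storage, `blob.content_type` and `blob.byte_size` supply the two arguments. Checking the detected type rather than the file extension is what guards against a renamed executable passing as a PDF.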
File Size Limits
Configure maximum file sizes:
Storage Best Practices
Development
- Use local disk storage for simplicity
- Add storage/ to .gitignore (already configured)
- Test file uploads regularly
Production
- Always use cloud storage (S3, Spaces, etc.)
- Enable CORS if accessing files from different domains
- Set proper bucket permissions
  - Private for sensitive documents
  - Public-read for user-uploaded content (if needed)
- Configure CDN for better performance
- Enable versioning for backup
- Set lifecycle policies to manage costs
Security
- Never commit credentials to version control
- Use IAM roles when possible (AWS)
- Rotate access keys regularly
- Validate file types on upload
- Scan uploaded files for malware in production
- Use signed URLs for temporary access
Common Storage Tasks
Direct Upload (Optional)
For large files, enable direct uploads to cloud storage:
Purging Files
Remove attachments:
Downloading Files
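One way to get an attached file onto local disk is through the attachment's `open` method, which streams the blob into a temporary file and yields it. The sketch below wraps that in a hypothetical `export_attachment` helper; the only thing it assumes about `attached` is that it responds to `open` the way an Active Storage attachment does.

```ruby
require "fileutils"

# Copy an attached file to `destination` without reading it all into memory.
# `attached` is expected to respond to `open` and yield a temporary file.
def export_attachment(attached, destination)
  attached.open do |tempfile|
    FileUtils.cp(tempfile.path, destination)
  end
  destination
end
```

A call might look like `export_attachment(form.generated_pdf, "/tmp/form.pdf")`, where `form` and `generated_pdf` are placeholder names. For small files, `attached.download` returns the whole contents as a string instead.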
Monitoring and Maintenance
Storage Usage
Monitor storage consumption:
Cleanup Orphaned Files
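Active Storage purges attachments when their record is destroyed only if the association keeps its default dependent: :purge_later setting, and it never removes blobs that were uploaded but not attached to a record. Use:
Troubleshooting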
Files Not Uploading
- Check environment configuration
- Verify credentials are set
- Check bucket permissions
Image Processing Errors
If image variants fail:
PDF Processing Errors
If PDF operations fail:
Access Denied Errors
- Verify bucket name is correct
- Check access keys are valid
- Ensure bucket region matches configuration
- Verify bucket permissions allow read/write
Alternative Storage Services
The application can work with any S3-compatible service:
Google Cloud Storage
Microsoft Azure
Mirror Service (Multi-Cloud)
Next Steps
- Configure authentication to manage user uploads
- Learn about creating templates and PDF processing
- Explore sync endpoints for offline functionality