Overview
The File Actions module provides AI-powered document processing capabilities, specifically for analyzing PDF files using Google’s Gemini model. It can read, understand, and summarize PDF content.
analyzePdf
Analyzes a PDF document using Gemini AI to extract insights, summarize content, or answer specific questions about the document.
Parameters
Response
- text: AI-generated analysis or summary of the PDF content
- success: Whether the analysis succeeded
- Error message if analysis failed
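The response fields described here could be modeled as a TypeScript type. This is only a sketch: `text` and `success` are the field names this page uses elsewhere, while the `error` field name, the `AnalyzePdfResult` type name, and the `unwrapAnalysis` helper are assumptions made for illustration.

```typescript
// Hypothetical type name; `error` is an assumed field name.
interface AnalyzePdfResult {
  success: boolean; // whether the analysis succeeded
  text?: string;    // AI-generated analysis or summary
  error?: string;   // error message if analysis failed (assumed name)
}

// Returns the analysis text, or throws with the reported error message.
function unwrapAnalysis(result: AnalyzePdfResult): string {
  if (!result.success || result.text === undefined) {
    throw new Error(result.error ?? "PDF analysis failed");
  }
  return result.text;
}
```

Checking `success` before reading `text` keeps a failed call from being treated as an empty summary.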
Features
- Large Document Support: Uses Gemini 2.5 Flash, which supports up to 1M tokens
- Size Validation: Enforces maximum PDF size (configured in MAX_PDF_SIZE_MB)
- Custom Prompts: Flexible prompt system for different analysis needs
- Base64 Processing: Handles PDF conversion internally
Example: Basic Summary
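A basic summary request might be assembled as follows. This is a sketch under assumptions: the `file` field name comes from the error table on this page, but the `prompt` field name, the `buildAnalyzeRequest` helper, and the way `analyzePdf` is invoked are hypothetical, so the call itself is left commented out.

```typescript
// Builds the FormData payload for a PDF analysis request.
// 'file' is the documented field; 'prompt' is an assumed field name.
function buildAnalyzeRequest(pdf: Blob, fileName: string, customPrompt?: string): FormData {
  const formData = new FormData();
  formData.append("file", pdf, fileName);
  if (customPrompt !== undefined) {
    formData.append("prompt", customPrompt); // assumed field name
  }
  return formData;
}

// Placeholder content stands in for a real PDF read from disk or an upload.
const pdf = new Blob(["%PDF-1.4 placeholder"], { type: "application/pdf" });
const request = buildAnalyzeRequest(pdf, "report.pdf");

// const result = await analyzePdf(request); // signature assumed, not confirmed
```

Omitting the prompt here assumes the function falls back to a default summary prompt.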
Example: Custom Analysis
Example: Question Answering
Example: Data Extraction
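When the prompt asks for JSON, the returned text still has to be parsed, and models sometimes wrap JSON in a Markdown code fence. The helper below is a minimal sketch of that step; `parseJsonResponse`, `InvoiceFields`, the prompt wording, and the sample text are all hypothetical and written by hand for illustration, not real model output.

```typescript
// Parses JSON from a model response, tolerating an optional Markdown code fence.
// \u0060 is the backtick character, spelled out to keep this snippet fence-safe.
function parseJsonResponse<T>(text: string): T {
  const fenced = text.match(/\u0060{3}(?:json)?\s*([\s\S]*?)\u0060{3}/);
  const payload = fenced ? fenced[1] : text;
  return JSON.parse(payload.trim()) as T;
}

// Hypothetical shape requested from the model.
interface InvoiceFields {
  invoiceNumber: string;
  total: number;
}

const extractionPrompt =
  'Extract the invoice number and total. Respond only with JSON: {"invoiceNumber": string, "total": number}';

// Hand-written sample of what the `text` field might contain.
const sampleText =
  '\u0060\u0060\u0060json\n{"invoiceNumber": "INV-001", "total": 1250.5}\n\u0060\u0060\u0060';

const fields = parseJsonResponse<InvoiceFields>(sampleText);
```

`JSON.parse` throws on malformed output, so callers would normally wrap this in a `try`/`catch`.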
Use Cases
1. Document Summarization
2. Technical Manual Processing
3. Compliance Review
4. Key Information Extraction
5. Comparative Analysis
Configuration
Environment Variables
- GOOGLE_GENERATIVE_AI_API_KEY: Required for PDF analysis
- MAX_PDF_SIZE_MB: Maximum PDF file size (defined in config/limits)
Model Configuration
- Model: gemini-2.5-flash
- Context Window: Up to 1M tokens (ideal for large documents)
- Input: PDF files via base64 encoding
Size Limits
PDFs are validated against MAX_PDF_SIZE_MB before processing. Files exceeding this limit are rejected with a descriptive error.
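The size check could look roughly like the sketch below. The limit value of 10, the `validatePdfSize` helper, the 1024-based megabyte conversion, and the exact wording after "PDF demasiado grande" are assumptions; the real limit lives in config/limits.

```typescript
// Assumed value for illustration; the real limit is defined in config/limits.
const MAX_PDF_SIZE_MB = 10;

interface SizeCheck {
  ok: boolean;
  error?: string;
}

// Rejects files whose byte size exceeds the limit, given in megabytes.
function validatePdfSize(sizeBytes: number, maxMb: number = MAX_PDF_SIZE_MB): SizeCheck {
  const bytesPerMb = 1024 * 1024; // assumes MB means 2^20 bytes
  if (sizeBytes > maxMb * bytesPerMb) {
    const actualMb = (sizeBytes / bytesPerMb).toFixed(1);
    // Message prefix matches the documented error; the detail format is assumed.
    return { ok: false, error: `PDF demasiado grande (${actualMb} MB, max ${maxMb} MB)` };
  }
  return { ok: true };
}
```

Running the check before any base64 conversion avoids encoding a file that will be rejected anyway.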
Error Handling
The function includes comprehensive error handling.
Common Errors
| Error | Cause | Solution |
|---|---|---|
| "PDF vacío" | No file provided in FormData | Ensure the 'file' field contains a valid PDF |
| "PDF demasiado grande" | File exceeds size limit | Compress the PDF or split it into smaller files |
| "Error desconocido al analizar PDF" | API error or network issue | Check API key and network connectivity |
Error Response Example
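A failed call might return something like the object below, and a caller could map the documented messages to the fixes from the table above. This is a sketch: the `error` field name, the `FailedAnalysis` type, the `suggestFix` helper, and the prefix matching are assumptions, not the module's confirmed behavior.

```typescript
// Assumed failure shape: `error` is a hypothetical field name.
interface FailedAnalysis {
  success: false;
  error: string;
}

const failedResult: FailedAnalysis = {
  success: false,
  error: "PDF demasiado grande",
};

// Maps the documented error messages to the fix suggested in the table.
// Prefix matching is an assumption, in case the real message carries extra detail.
function suggestFix(errorMessage: string): string {
  if (errorMessage.indexOf("PDF vacío") === 0) {
    return "Ensure the 'file' field contains a valid PDF";
  }
  if (errorMessage.indexOf("PDF demasiado grande") === 0) {
    return "Compress the PDF or split it into smaller files";
  }
  // Covers "Error desconocido al analizar PDF" and anything unrecognized.
  return "Check the API key and network connectivity";
}

const hint = suggestFix(failedResult.error);
```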
Performance Considerations
- Token Limit: Gemini 2.5 Flash supports up to 1M tokens, suitable for documents of 500+ pages
- Processing Time: Large PDFs may take 10-30 seconds to process
- Memory: PDFs are loaded into memory as base64, so factor the encoded size into server resources
- Rate Limits: Subject to Google AI API rate limits
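To budget for the memory point above, the encoded size can be estimated from the raw file size. Base64 turns every 3 input bytes into 4 output characters, so the encoded form is about a third larger. The `base64Length` helper and the 20 MiB example size are illustrative assumptions.

```typescript
// Length of the padded base64 encoding for a given number of raw bytes:
// every group of 3 bytes (including a partial final group) becomes 4 characters.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

const rawBytes = 20 * 1024 * 1024;           // example: a 20 MiB PDF
const encodedBytes = base64Length(rawBytes); // 27962028
const overhead = encodedBytes / rawBytes;    // about 1.33
```

The raw buffer and the encoded string may both be held in memory at once, so peak usage can exceed the encoded size alone.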
Best Practices
- Prompt Design: Be specific in your prompts for better results
- File Size: Keep PDFs under the configured limit for optimal performance
- Error Handling: Always check the success field before using text
- Structured Data: Request JSON format in the prompt when extracting structured data
- Language: Prompts can be in Spanish or English; responses match prompt language