Overview

The PDF Processing feature analyzes PDF documents using Google Gemini 2.5 Flash. It can extract content, summarize key points, answer questions about a document, and handle both text-based and scanned PDFs.
Gemini 2.5 Flash has a context window of up to 1 million tokens, which makes it well suited to large documents and technical manuals.

Key Capabilities

Content Extraction

Extract and parse text content from PDF documents

Document Summarization

Generate concise summaries of long documents

Question Answering

Ask specific questions about document content

Multi-Page Support

Process multi-page documents up to 10MB in file size

Server Action

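PDF analysis is implemented as a Next.js server action:
'use server';

import { generateText } from 'ai';
import { google } from '@ai-sdk/google';
// MAX_PDF_SIZE_MB comes from app/config/limits.ts; bytesToMB and logger are
// project helpers whose import paths are not shown on this page.

export async function analyzePdf(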
  formData: FormData
): Promise<{ text: string; success: boolean; error?: string }> {
  try {
    const file = formData.get('file') as File | null;
    const prompt = formData.get('prompt') as string | null;
    
    if (!file) throw new Error('Empty PDF');
    
    // Validate file size
    const sizeInBytes = file.size;
    const sizeInMB = bytesToMB(sizeInBytes);
    
    if (sizeInMB > MAX_PDF_SIZE_MB) {
      throw new Error(
        `PDF too large (${sizeInMB.toFixed(1)}MB). Maximum: ${MAX_PDF_SIZE_MB}MB`
      );
    }
    
    const buffer = await file.arrayBuffer();
    const base64Content = Buffer.from(buffer).toString('base64');
    
    // Process with Gemini Flash
    const result = await generateText({
      model: google('gemini-2.5-flash'),
      messages: [
        {
          role: 'user',
          content: [
            {
              type: 'text',
              text: prompt || 'Analyze this PDF document and summarize its key points.',
            },
            {
              type: 'file',
              data: base64Content,
              mediaType: 'application/pdf',
            },
          ],
        },
      ],
    });
    
    return { text: result.text, success: true };
  } catch (error: unknown) {
    logger.error('PDF analysis error', error as Error);
    const errorMessage = error instanceof Error 
      ? error.message 
      : 'Unknown error while analyzing PDF';
    return { text: '', success: false, error: errorMessage };
  }
}
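
The action calls a bytesToMB helper that is not shown on this page. The following is a minimal sketch of what it might look like, assuming binary megabytes (1 MB = 1024 × 1024 bytes) so that it agrees with MAX_PDF_SIZE_BYTES = 10 * 1024 * 1024. The exceedsLimit function is a hypothetical addition that mirrors the size check in analyzePdf so it can be exercised without a network call.

```typescript
// Hypothetical implementation of the bytesToMB helper used by analyzePdf.
// Assumption: 1 MB = 1024 * 1024 bytes, consistent with MAX_PDF_SIZE_BYTES.
const BYTES_PER_MB = 1024 * 1024;

function bytesToMB(bytes: number): number {
  if (!Number.isFinite(bytes) || bytes < 0) {
    throw new RangeError(`Invalid byte count: ${bytes}`);
  }
  return bytes / BYTES_PER_MB;
}

// Mirrors the comparison in analyzePdf: a file exactly at the limit passes,
// anything above it is rejected.
function exceedsLimit(sizeInBytes: number, maxMB: number): boolean {
  return bytesToMB(sizeInBytes) > maxMB;
}
```

Because the comparison is strict, a file of exactly 10MB is accepted and a file one byte larger is rejected.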

Integration with Chat

PDFs are processed when uploaded to the chat interface:
1. File Detection

The system detects a PDF attachment in the chat input:
const pdfFile = message.files?.find(
  (file) => file.mediaType === 'application/pdf'
);
2. Size Validation

Validate the file size before processing:
const limitBytes = MAX_PDF_SIZE_BYTES;
if (fileSize > limitBytes) {
  throw new Error(
    `The file exceeds the ${bytesToMB(limitBytes)}MB limit`
  );
}
3. Base64 Encoding

Fetch the PDF and read it as a base64 data URL for display in the chat:
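const response = await fetch(pdfFile.url);
if (!response.ok) {
  // Fail early instead of encoding an error page as if it were the PDF
  throw new Error(`Failed to fetch PDF (status ${response.status})`);
}
const blob = await response.blob();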

const fileDataUrl = await new Promise<string>((resolve, reject) => {
  const reader = new FileReader();
  reader.onload = () => resolve(reader.result as string);
  reader.onerror = reject;
  reader.readAsDataURL(blob);
});
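
FileReader.readAsDataURL produces a string of the form data:application/pdf;base64,<payload>, whereas the server action passes a raw base64 payload to the model. If the raw payload is ever needed on the client, the prefix has to be stripped. The helper below is a hypothetical sketch of that step (parseBase64DataUrl is not part of the original code).

```typescript
interface ParsedDataUrl {
  mediaType: string; // e.g. "application/pdf"
  base64: string;    // payload without the "data:...;base64," prefix
}

// Splits a base64 data URL into its media type and raw payload.
// Returns null for anything that is not a base64-encoded data URL.
function parseBase64DataUrl(dataUrl: string): ParsedDataUrl | null {
  const prefix = 'data:';
  if (dataUrl.indexOf(prefix) !== 0) return null;

  const commaIndex = dataUrl.indexOf(',');
  if (commaIndex === -1) return null;

  // Header looks like "application/pdf;base64" (optionally with extra params).
  const header = dataUrl.slice(prefix.length, commaIndex);
  const [mediaType, ...params] = header.split(';');
  if (!mediaType || params.indexOf('base64') === -1) return null;

  return { mediaType, base64: dataUrl.slice(commaIndex + 1) };
}
```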
4. User Message

Add the user message with the PDF attachment to the chat:
setMessages((prev) => [
  ...prev,
  {
    id: `user-${Date.now()}`,
    role: 'user',
    content: userText,
    parts: [
      { type: 'text', text: userText },
      { type: 'file', data: fileDataUrl, mediaType: 'application/pdf' },
    ],
    createdAt: new Date(),
  },
]);
5. Analysis

Call the server action to process the PDF:
const formData = new FormData();
formData.append('file', blob, fileName);
if (customPrompt) {
  formData.append('prompt', customPrompt);
}

const result = await analyzePdf(formData);
6. Display Results

Add the analysis result as an assistant message:
if (result.success && result.text) {
  setMessages((prev) => [
    ...prev,
    {
      id: `analysis-${Date.now()}`,
      role: 'assistant',
      content: result.text,
      parts: [{ type: 'text', text: result.text }],
      createdAt: new Date(),
    },
  ]);
  
  toast.success('PDF analyzed', 'The analysis has been added to the chat');
}
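
The user, assistant, and error messages on this page all share the same shape (id, role, content, a single text part, createdAt). As an illustration only, a small factory could remove that repetition. buildTextMessage and the types below are hypothetical and simply mirror the object literals shown in the steps above.

```typescript
type ChatRole = 'user' | 'assistant';

interface TextPart {
  type: 'text';
  text: string;
}

interface ChatTextMessage {
  id: string;
  role: ChatRole;
  content: string;
  parts: TextPart[];
  createdAt: Date;
}

// Builds a text-only chat message. The timestamp is injectable so the
// function stays deterministic in tests; it defaults to the current time.
function buildTextMessage(
  idPrefix: string,
  role: ChatRole,
  text: string,
  now: Date = new Date(),
): ChatTextMessage {
  return {
    id: `${idPrefix}-${now.getTime()}`,
    role,
    content: text,
    parts: [{ type: 'text', text }],
    createdAt: now,
  };
}
```

With this helper, the assistant message in step 6 would reduce to setMessages((prev) => [...prev, buildTextMessage('analysis', 'assistant', result.text)]).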

File Submission Hook

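The useFileSubmission hook handles both PDF and image uploads. The excerpt below elides how formData and result are prepared:
import { useCallback, useState } from 'react';

export function useFileSubmission({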
  setMessages,
  sendMessage,
  isListening,
  toggleListening,
}: UseFileSubmissionParams): UseFileSubmissionReturn {
  const [isAnalyzing, setIsAnalyzing] = useState(false);
  const [analyzingFileType, setAnalyzingFileType] = useState<'image' | 'pdf' | null>(null);
  
  const handleSubmit = useCallback(async (message: PromptInputMessage) => {
    // Check for supported files
    const imageFile = message.files?.find(
      (file) => file.mediaType?.startsWith('image/')
    );
    const pdfFile = message.files?.find(
      (file) => file.mediaType === 'application/pdf'
    );
    const targetFile = imageFile || pdfFile;
    
    if (targetFile && targetFile.url) {
      setIsAnalyzing(true);
      const isPdf = targetFile.mediaType === 'application/pdf';
      setAnalyzingFileType(isPdf ? 'pdf' : 'image');
      
      try {
        // Process file...
        if (isPdf) {
          result = await analyzePdf(formData);
        } else {
          result = await analyzePartImage(formData);
        }
        
        // Handle results...
      } finally {
        setIsAnalyzing(false);
        setAnalyzingFileType(null);
      }
    }
  }, [setMessages, sendMessage]);
  
  return { handleSubmit, isAnalyzing, analyzingFileType };
}
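
One consequence of the selection logic in the hook is that an image takes precedence when a message carries both an image and a PDF, because the target is computed as imageFile || pdfFile. The sketch below isolates that logic as a pure function so the behavior is easy to check. pickTargetFile and AttachedFile are hypothetical names, not part of the hook.

```typescript
interface AttachedFile {
  mediaType?: string;
  url?: string;
}

type TargetKind = 'image' | 'pdf';

// Mirrors the hook's selection: first image wins, otherwise first PDF.
// Returns null when there is no supported file or the chosen file has no URL.
function pickTargetFile(
  files: AttachedFile[] | undefined,
): { file: AttachedFile; kind: TargetKind } | null {
  const imageFile = files?.find((f) => f.mediaType?.startsWith('image/'));
  const pdfFile = files?.find((f) => f.mediaType === 'application/pdf');
  const file = imageFile || pdfFile;

  if (!file || !file.url) return null;
  return { file, kind: file.mediaType === 'application/pdf' ? 'pdf' : 'image' };
}
```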

Custom Analysis Prompts

Provide specific instructions for document analysis:
const formData = new FormData();
formData.append('file', pdfBlob, 'manual.pdf');
formData.append('prompt', 'Provide an executive summary of this document in 3-5 key points.');

const result = await analyzePdf(formData);

File Size Limits

// From app/config/limits.ts
export const MAX_PDF_SIZE_MB = 10;
export const MAX_PDF_SIZE_BYTES = 10 * 1024 * 1024;

export const SUPPORTED_PDF_MIME_TYPES = ['application/pdf'];
PDF uploads are limited to 10MB. Gemini can handle files up to 2GB, but the lower limit keeps uploads and processing fast and gives a better user experience.
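
These constants can back a client-side check that runs before any upload starts. The following is a sketch under assumptions: validatePdfFile and its result type are hypothetical, the constants are redeclared only to keep the example self-contained, and the empty-file check is an addition not described on this page.

```typescript
// Redeclared here for a self-contained example; in the app these would be
// imported from app/config/limits.ts.
const MAX_PDF_SIZE_BYTES = 10 * 1024 * 1024;
const SUPPORTED_PDF_MIME_TYPES = ['application/pdf'];

type PdfValidation =
  | { ok: true }
  | { ok: false; reason: 'unsupported-type' | 'empty' | 'too-large' };

// Accepts anything with a MIME type and a size, which includes browser File
// and Blob objects.
function validatePdfFile(file: { type: string; size: number }): PdfValidation {
  if (SUPPORTED_PDF_MIME_TYPES.indexOf(file.type) === -1) {
    return { ok: false, reason: 'unsupported-type' };
  }
  if (file.size === 0) {
    return { ok: false, reason: 'empty' };
  }
  if (file.size > MAX_PDF_SIZE_BYTES) {
    return { ok: false, reason: 'too-large' };
  }
  return { ok: true };
}
```

Returning a reason code rather than a message keeps the check independent of the UI language.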

Loading States

Provide feedback during PDF processing:
<ChatStatusIndicators
  isAnalyzingImage={isAnalyzing}
  fileType={analyzingFileType || 'image'}
  // ... other props
/>
The chat displays:
  • “Analyzing PDF…” while processing
  • Progress indicators during upload
  • Success/error messages after completion

Error Handling

Check the success flag on the result and surface failures both as a chat message and as a toast:
if (result.success && result.text) {
  // Success: Display analysis
  displayAnalysis(result.text);
} else {
  // Error: Show user-friendly message
  const errorMsg = `❌ Error analyzing PDF: ${result.error || 'Unknown error'}`;
  
  setMessages((prev) => [
    ...prev,
    {
      id: `error-${Date.now()}`,
      role: 'assistant',
      content: errorMsg,
      parts: [{ type: 'text', text: errorMsg }],
      createdAt: new Date(),
    },
  ]);
  
  toast.error('Analysis error', result.error || 'Could not analyze the PDF');
}

Common Error Types

Error | Cause | Solution
File too large | PDF exceeds 10MB | Compress PDF or split into smaller files
Invalid format | Not a valid PDF | Ensure file is a proper PDF document
Corrupted file | PDF cannot be read | Try re-generating or repairing the PDF
API quota | Gemini API limit reached | Wait for quota reset or upgrade plan
Network error | Connection issue | Check internet connection and retry
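
The error types above could be encoded as a typed lookup so the UI shows a consistent hint for each one. This is an illustrative sketch: the PdfErrorKind codes and describePdfError are hypothetical, and how a raw error is mapped to a code is left out because the page does not specify it.

```typescript
type PdfErrorKind =
  | 'too-large'
  | 'invalid-format'
  | 'corrupted'
  | 'api-quota'
  | 'network';

interface PdfErrorHelp {
  cause: string;
  solution: string;
}

// One entry per row of the error table.
const PDF_ERROR_HELP: Record<PdfErrorKind, PdfErrorHelp> = {
  'too-large': {
    cause: 'PDF exceeds 10MB',
    solution: 'Compress the PDF or split it into smaller files',
  },
  'invalid-format': {
    cause: 'Not a valid PDF',
    solution: 'Ensure the file is a proper PDF document',
  },
  corrupted: {
    cause: 'PDF cannot be read',
    solution: 'Try re-generating or repairing the PDF',
  },
  'api-quota': {
    cause: 'Gemini API limit reached',
    solution: 'Wait for the quota to reset or upgrade the plan',
  },
  network: {
    cause: 'Connection issue',
    solution: 'Check the internet connection and retry',
  },
};

// Formats a user-facing hint for a known error kind.
function describePdfError(kind: PdfErrorKind): string {
  const { cause, solution } = PDF_ERROR_HELP[kind];
  return `${cause}. ${solution}.`;
}
```

Using Record<PdfErrorKind, …> means the compiler flags any error kind that is added without a matching hint.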

Performance Considerations

Large Documents

Processing time increases with document length. Consider splitting very large files into smaller page ranges and analyzing them separately.

Token Limits

Gemini Flash supports 1M tokens (~750k words), sufficient for most technical documents.

Concurrent Requests

Limit concurrent PDF analyses to prevent API rate limiting.
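
One way to cap concurrency is a small in-process limiter that queues extra calls until a slot frees up. The sketch below is a generic pattern, not the application's own code. createLimiter is a hypothetical name, and it only limits calls within a single process, so it would not coordinate across multiple server instances.

```typescript
// Returns a wrapper that runs at most `maxConcurrent` tasks at once.
// Extra tasks wait in FIFO order until a running task finishes.
function createLimiter(maxConcurrent: number) {
  if (!Number.isInteger(maxConcurrent) || maxConcurrent < 1) {
    throw new RangeError('maxConcurrent must be a positive integer');
  }

  let active = 0;
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      next(); // hand the slot directly to the next waiter; `active` is unchanged
    } else {
      active -= 1;
    }
  };

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= maxConcurrent) {
      // Wait until a finishing task hands over its slot.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active += 1;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}
```

A call site might look like const limit = createLimiter(2); followed by limit(() => analyzePdf(formData)) for each upload.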

Caching

Consider caching analysis results for frequently accessed documents.
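
A minimal in-memory cache could look like the sketch below. It assumes the caller supplies a stable content hash for the PDF (how that hash is computed is out of scope here) and that the same document with a different prompt should be cached separately. createAnalysisCache and analysisCacheKey are hypothetical names, and the cache lives only as long as the server process.

```typescript
interface CacheEntry {
  text: string;
  expiresAt: number; // epoch milliseconds
}

// Same document + same prompt => same key.
function analysisCacheKey(contentHash: string, prompt: string): string {
  return `${contentHash}:${prompt}`;
}

// Creates a TTL cache. `now` is injectable so expiry can be tested
// without waiting for real time to pass.
function createAnalysisCache(ttlMs: number, now: () => number = Date.now) {
  const entries = new Map<string, CacheEntry>();

  return {
    async getOrCompute(key: string, compute: () => Promise<string>): Promise<string> {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > now()) {
        return hit.text; // still fresh: skip the model call
      }
      const text = await compute();
      entries.set(key, { text, expiresAt: now() + ttlMs });
      return text;
    },
    size(): number {
      return entries.size;
    },
  };
}
```

Only successful analyses should be passed through compute. Caching an error result would keep returning it until the entry expires.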

Best Practices

1. Specific Prompts

Provide clear, specific prompts for better analysis results:
  • ❌ “Analyze this PDF”
  • ✅ “Extract the maintenance schedule from pages 5-10”
2. File Optimization

Optimize PDFs before upload:
  • Remove unnecessary images
  • Compress when possible
  • Use text-based PDFs instead of scanned images when available
3. Context Preservation

Include relevant context in your prompts:
formData.append('prompt', 
  'This is an industrial maintenance manual. ' +
  'Extract all calibration procedures mentioned.'
);
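
The two practices above (a specific task plus document context) can be combined in a small prompt builder. This is a sketch only: buildAnalysisPrompt and its options are hypothetical, and the English wording of the generated sentences is an assumption rather than the application's actual prompt text.

```typescript
interface PromptOptions {
  context?: string; // what kind of document this is
  task: string;     // what to extract or answer
  pageRange?: { from: number; to: number };
}

// Ensures a fragment ends with sentence punctuation.
function endWithPeriod(sentence: string): string {
  const trimmed = sentence.trim();
  return /[.!?]$/.test(trimmed) ? trimmed : `${trimmed}.`;
}

// Joins context, task, and an optional page range into a single prompt.
function buildAnalysisPrompt(options: PromptOptions): string {
  if (options.task.trim().length === 0) {
    throw new Error('A task description is required');
  }

  const sentences: string[] = [];
  if (options.context && options.context.trim().length > 0) {
    sentences.push(endWithPeriod(options.context));
  }
  sentences.push(endWithPeriod(options.task));

  if (options.pageRange) {
    const { from, to } = options.pageRange;
    if (!Number.isInteger(from) || !Number.isInteger(to) || from < 1 || to < from) {
      throw new RangeError('pageRange must satisfy 1 <= from <= to');
    }
    sentences.push(from === to ? `Focus on page ${from}.` : `Focus on pages ${from}-${to}.`);
  }

  return sentences.join(' ');
}
```

The result can be passed straight to formData.append('prompt', ...).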
4. Error Recovery

Always provide options to retry or modify the request:
  • Allow re-upload with different prompts
  • Suggest file optimization for large files
  • Provide fallback options when analysis fails
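
For transient failures such as network errors, an automatic retry with exponential backoff can run before the user is asked to re-upload. The sketch below assumes the { text, success, error } result shape returned by analyzePdf. analyzeWithRetry, its options, and the default delays are hypothetical. Errors like an oversized file will never succeed on retry, which is why the isRetryable predicate is provided.

```typescript
interface AnalysisResult {
  text: string;
  success: boolean;
  error?: string;
}

interface RetryOptions {
  maxAttempts?: number;  // total attempts, including the first (default 3)
  baseDelayMs?: number;  // delay before the first retry (default 500)
  isRetryable?: (result: AnalysisResult) => boolean;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Re-runs `attempt` until it succeeds, the failure is not retryable,
// or the attempt budget is used up. Delays double after each failure.
async function analyzeWithRetry(
  attempt: () => Promise<AnalysisResult>,
  options: RetryOptions = {},
): Promise<AnalysisResult> {
  const maxAttempts = options.maxAttempts ?? 3;
  const baseDelayMs = options.baseDelayMs ?? 500;
  const isRetryable = options.isRetryable ?? (() => true);
  const sleep = options.sleep ?? defaultSleep;

  let result = await attempt();
  for (let n = 1; n < maxAttempts && !result.success && isRetryable(result); n += 1) {
    await sleep(baseDelayMs * 2 ** (n - 1));
    result = await attempt();
  }
  return result;
}
```

A call site might look like analyzeWithRetry(() => analyzePdf(formData), { isRetryable: (r) => r.error !== undefined && !r.error.includes('too large') }), where the string match is only a placeholder for whatever error classification the app uses.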

Use Cases

  • Extract specifications and requirements
  • Identify maintenance procedures
  • Parse equipment manuals
  • Analyze compliance documents
  • Summarize inspection reports
  • Extract key metrics and findings
  • Identify trends and patterns
  • Generate executive summaries
  • Verify safety procedure compliance
  • Extract hazard warnings
  • Identify required certifications
  • Parse regulatory documents
  • Answer questions about procedures
  • Extract learning objectives
  • Identify required skills
  • Generate training summaries

Configuration

# .env.local
GOOGLE_GENERATIVE_AI_API_KEY=your_api_key_here

# Optional: only takes effect if your config reads this variable; the limit
# shown under File Size Limits is a constant in app/config/limits.ts
MAX_PDF_SIZE_MB=10
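
The constants shown under File Size Limits are hard-coded, so an environment variable would only apply if the config reads it. Below is a sketch of how that could be wired up, with a fallback to the documented default of 10. resolveMaxPdfSizeMb is a hypothetical helper and is not part of the original config file.

```typescript
const DEFAULT_MAX_PDF_SIZE_MB = 10;

// Parses the raw environment value. Missing, empty, non-numeric, or
// non-positive values fall back to the default.
function resolveMaxPdfSizeMb(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === '') {
    return DEFAULT_MAX_PDF_SIZE_MB;
  }
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_PDF_SIZE_MB;
}

// In app/config/limits.ts this might be used as (assumption):
//   export const MAX_PDF_SIZE_MB = resolveMaxPdfSizeMb(process.env.MAX_PDF_SIZE_MB);
//   export const MAX_PDF_SIZE_BYTES = MAX_PDF_SIZE_MB * 1024 * 1024;
```

Deriving MAX_PDF_SIZE_BYTES from MAX_PDF_SIZE_MB keeps the two limits from drifting apart.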

Multimodal Chat

Full chat interface with all modalities

Voice Commands

Voice transcription and commands

Image Analysis

Visual recognition capabilities
