Overview
This endpoint processes uploaded documents (PDFs and images) and extracts their text with Tesseract OCR. The service automatically detects the document type and applies the matching processing method.
Supported formats:
- Images: PNG, JPG, JPEG, WEBP, TIF, TIFF
- Documents: PDF (multi-page support)
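A client can check a file's extension against the formats listed above before uploading. The sketch below is only an illustration: the helper name `is_supported` and the extension sets are assumptions derived from this page, not part of the API, and the server performs its own type detection.

```python
from pathlib import Path

# Extensions taken from the supported-formats list on this page.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".tif", ".tiff"}
DOCUMENT_EXTENSIONS = {".pdf"}


def is_supported(filename: str) -> bool:
    """Return True if the filename has a supported extension (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()  # case-insensitive comparison
    return suffix in IMAGE_EXTENSIONS | DOCUMENT_EXTENSIONS
```

An extension check only catches obvious mistakes early; a renamed or corrupted file can still be rejected by the service.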
Request
file (multipart form field): The document or image file to process. Must be one of the supported formats.
Example Request
curl -X POST https://api.vigia.com/api/v1/ocr/preview \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@document.pdf"
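import requests

url = "https://api.vigia.com/api/v1/ocr/preview"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# Open in binary mode; the context manager closes the file after the upload
with open("document.pdf", "rb") as f:
    response = requests.post(url, files={"file": f}, headers=headers)

response.raise_for_status()  # raise on HTTP 4xx/5xx before parsing the body
print(response.json())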
Response
text (string): The extracted text from the document. For multi-page PDFs, the text from all pages is concatenated with newline separators.
Example Response
{
  "text": "Paciente: C.R.\nEdad: 45 años\nSexo: Masculino\nPeso: 72 kg\n\nProducto sospechoso: Ibuprofeno 400 mg\nFecha de inicio: 2025-08-10\n\nEvento adverso: Presentó urticaria generalizada tras administración del medicamento.\n\nReportante: Dra. María González"
}
OCR Processing Details
Image Processing
For image files, the service runs the following steps:
- Loads the image with OpenCV if it is available, otherwise with PIL
- Converts it to grayscale
- Applies Otsu’s thresholding to binarize the image for better text detection
- Runs Tesseract OCR with Spanish and English language support
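To show what the thresholding step does, here is a dependency-free sketch of Otsu's method on a flat list of 8-bit grayscale values. It is not the service's code: the function names are hypothetical, and with OpenCV the same step is usually a single call such as `cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)`. Whether the service uses exactly that call is not stated here.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for a flat sequence of 8-bit grayscale values."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0  # number of pixels at or below the candidate threshold
    sum_bg = 0     # sum of their intensities
    best_threshold, best_variance = 0, -1.0

    for level in range(256):
        weight_bg += hist[level]
        sum_bg += level * hist[level]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(pixels, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [255 if value > threshold else 0 for value in pixels]
```

The method picks the threshold that best separates dark and light pixels, which is why it works well on scans with clear contrast between text and background.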
PDF Processing
For PDF files, the service uses two strategies:
Primary Method: pdf2image + Poppler
- Converts each PDF page to a high-resolution image
- Applies OCR to each page individually
- Concatenates results from all pages
Fallback Method: PyMuPDF (fitz)
- Rasterizes PDF pages at 200 DPI
- Processes each page with Tesseract
- Used when Poppler is not available
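The primary/fallback arrangement above can be sketched as a loop over rasterizers. This is a minimal illustration under stated assumptions: `ocr_pdf`, `rasterizers`, and `ocr_page` are hypothetical names, and the real backends (for example `pdf2image.convert_from_path` for Poppler, or PyMuPDF page rendering) would be passed in as callables. Joining pages with a newline follows the response description on this page.

```python
from typing import Callable, Iterable, Sequence


def ocr_pdf(
    path: str,
    rasterizers: Sequence[Callable[[str], Iterable[object]]],
    ocr_page: Callable[[object], str],
) -> str:
    """Try each rasterizer in order and OCR the pages of the first one that works."""
    last_error = None
    for rasterize in rasterizers:
        try:
            pages = list(rasterize(path))  # e.g. Poppler first, then PyMuPDF
        except Exception as exc:  # backend missing or failed: try the next one
            last_error = exc
            continue
        # OCR each page individually, then concatenate with newline separators
        return "\n".join(ocr_page(page) for page in pages)
    raise RuntimeError("PDF processing failed: no rasterizer succeeded") from last_error
```

Passing the backends in as callables keeps the fallback order explicit and makes the logic easy to exercise without Poppler or PyMuPDF installed.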
Language Support
The OCR engine attempts text extraction in the following order:
- Spanish (spa)
- Spanish + English (spa+eng)
- English only (eng)
- Default language (fallback)
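The ordering above could be implemented along these lines. This page does not say what makes the service move on to the next language, so the sketch assumes it does so when an attempt raises an error (for example, missing language data) or returns only whitespace. `ocr_with_language_fallback` and `run_ocr` are hypothetical names; in practice `run_ocr` might wrap `pytesseract.image_to_string(image, lang=...)`.

```python
from typing import Callable, Optional, Sequence

# None stands for Tesseract's default language (the final fallback).
LANGUAGE_ORDER: Sequence[Optional[str]] = ("spa", "spa+eng", "eng", None)


def ocr_with_language_fallback(
    run_ocr: Callable[[Optional[str]], str],
    languages: Sequence[Optional[str]] = LANGUAGE_ORDER,
) -> str:
    """Return the first non-empty OCR result, trying languages in order."""
    for lang in languages:
        try:
            text = run_ocr(lang)
        except Exception:
            continue  # assumed trigger: this language is unavailable or OCR failed
        if text.strip():
            return text  # assumed trigger: accept the first non-blank result
    return ""  # nothing could be extracted with any language
```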
Configuration
OCR behavior is configured via environment variables:
- TESSERACT_CMD: Path to the Tesseract executable
- POPPLER_PATH: Path to the Poppler utilities (for PDF processing)
- OCR_CONFIG: Tesseract configuration (default: --oem 3 --psm 6)
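As an illustration of how these variables might be read, the sketch below collects them into a small settings object. The `OcrSettings` class and `load_ocr_settings` function are hypothetical; only the variable names and the `OCR_CONFIG` default come from this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OcrSettings:
    tesseract_cmd: Optional[str]  # None means "rely on the system PATH"
    poppler_path: Optional[str]   # None means Poppler location is not configured
    ocr_config: str               # flags passed through to Tesseract


def load_ocr_settings(env: Mapping[str, str] = os.environ) -> OcrSettings:
    """Read OCR settings from environment variables (hypothetical helper)."""
    return OcrSettings(
        tesseract_cmd=env.get("TESSERACT_CMD"),
        poppler_path=env.get("POPPLER_PATH"),
        ocr_config=env.get("OCR_CONFIG", "--oem 3 --psm 6"),
    )
```

With pytesseract, the executable path is usually applied by assigning it to `pytesseract.pytesseract.tesseract_cmd`.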
Debug Endpoint
A debug endpoint reports the OCR service configuration and the available languages. It returns:
{
  "TESSERACT_CMD_env": "/usr/bin/tesseract",
  "pytesseract_cmd": "/usr/bin/tesseract",
  "tesseract_version": "tesseract 5.3.0",
  "langs": ["eng", "spa", "spa_old"],
  "POPPLER_PATH": "/usr/bin"
}
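One practical use of this payload is confirming that the language packs the service relies on are installed. The helper below is a hypothetical sketch that inspects the `langs` field shown above; the default `required` tuple is an assumption based on the Language Support section.

```python
def missing_languages(debug_info, required=("spa", "eng")):
    """Return the required Tesseract languages absent from the debug payload."""
    available = set(debug_info.get("langs", []))
    return [lang for lang in required if lang not in available]
```

An empty list means all required languages are present; a non-empty list names the language data that still needs to be installed.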
Error Handling
When OCR processing fails, the response contains an error message describing the failure.
Common Errors
- Unsupported file format: The uploaded file type is not supported
- OCR disabled: Tesseract is not properly installed or configured
- PDF processing failed: Poppler/PyMuPDF dependencies missing
- Invalid file: The uploaded file is corrupted or cannot be read
Best Practices
- Image Quality: Upload high-resolution images (300 DPI minimum) for best results
- File Size: Keep files under 10 MB for optimal processing time
- Document Orientation: Ensure text is properly oriented (not rotated)
- Contrast: High contrast between text and background improves accuracy
- Multi-page PDFs: Processing time increases linearly with page count
Typical processing times:
- Single page: ~2-3 seconds
- Multi-page PDF: ~2-3 seconds per page
- Large images: ~3-5 seconds depending on resolution
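Because processing time grows linearly with page count, the per-page figure above gives a rough range for a multi-page PDF. The helper below is a hypothetical back-of-the-envelope sketch; actual times depend on resolution and server load.

```python
def estimate_pdf_seconds(page_count: int) -> tuple:
    """Return a (low, high) estimate in seconds at roughly 2-3 seconds per page."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return (2 * page_count, 3 * page_count)
```

For a 10-page PDF this gives a range of about 20 to 30 seconds, which can help when choosing a client-side request timeout.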
Next Steps
After extracting text with OCR, you can:
- Use Extract Data to parse structured fields from the text
- Use Translate to translate the extracted text to other languages