Available Examples
Romeo and Juliet Extraction
Process entire documents from URLs with parallel processing. Extract characters, emotions, and relationships from the complete text of Shakespeare’s Romeo and Juliet.
Medication Extraction
Extract structured medical information from clinical text. Demonstrates both basic NER and relationship extraction for healthcare applications.
Batch Processing
Save ~50% on costs for large-scale workloads using the Vertex AI Batch API. Includes automatic routing, caching, and fault tolerance.
Japanese Extraction
Extract structured information from Japanese text using UnicodeTokenizer for correct character-based segmentation and alignment.
Key Concepts Demonstrated
Long Document Processing
The Romeo and Juliet example shows how to handle large texts (147,843 characters) with:
- Sequential extraction passes for improved recall
- Parallel processing for speed optimization
- Smart chunking strategies for better accuracy
- Interactive visualization of thousands of entities
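The chunking and parallel-processing ideas above can be sketched with the Python standard library alone. This is a minimal illustration, not the library's implementation: `chunk_text`, `find_names`, `extract_parallel`, and the `max_chars` and `max_workers` parameters are hypothetical names, and the regex-based `find_names` stands in for a real model call.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def chunk_text(text, max_chars):
    """Split text into (offset, chunk) pairs of at most max_chars characters.

    Breaks at the last space inside each window when one exists, so words
    are not cut in half unless a single word exceeds max_chars.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut + 1
        chunks.append((start, text[start:end]))
        start = end
    return chunks


def find_names(chunk):
    """Stand-in for a model call: capitalized words with chunk-local spans."""
    return [(m.start(), m.end(), m.group())
            for m in re.finditer(r"\b[A-Z][a-z]+\b", chunk)]


def extract_parallel(text, max_chars=40, max_workers=4):
    """Run find_names over all chunks in parallel and return document-level spans."""
    chunks = chunk_text(text, max_chars)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with chunks.
        per_chunk = list(pool.map(lambda c: find_names(c[1]), chunks))
    results = []
    for (offset, _), found in zip(chunks, per_chunk):
        for start, end, name in found:
            # Shift chunk-local positions back into document coordinates.
            results.append((offset + start, offset + end, name))
    return results


sample = "Romeo loves Juliet. Tybalt fights Mercutio in the street."
for start, end, name in extract_parallel(sample, max_chars=20):
    print(start, end, name)
```

Keeping each chunk's offset alongside its text is what lets per-chunk results be mapped back to positions in the full document after the parallel step.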
Domain-Specific Extraction
The Medication Extraction examples demonstrate:
- Named Entity Recognition (NER) for medical entities
- Relationship Extraction (RE) using attribute-based grouping
- Position tracking for entity verification
- Structured output for healthcare applications
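Attribute-based grouping and position tracking can be illustrated with a small, self-contained sketch. The `Extraction` dataclass, the helper functions, and the `medication_group` attribute key below are assumptions made for illustration; they are not presented as the library's own data model, and the sample note is invented.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Extraction:
    """Hypothetical record for one extracted entity and its source span."""
    label: str
    text: str
    start: int
    end: int
    attributes: dict = field(default_factory=dict)


def locate(source, label, text, group):
    """Build an Extraction by finding the first occurrence of text in source."""
    start = source.find(text)
    if start == -1:
        raise ValueError(f"{text!r} not found in source")
    return Extraction(label, text, start, start + len(text),
                      {"medication_group": group})


def verify(source, extraction):
    """Position check: the recorded span must reproduce the extracted text."""
    return source[extraction.start:extraction.end] == extraction.text


def group_by_attribute(extractions, key):
    """Relate entities to each other by a shared attribute value."""
    groups = defaultdict(list)
    for ex in extractions:
        groups[ex.attributes.get(key)].append(ex)
    return dict(groups)


note = "Patient takes Lisinopril 10mg daily and Metformin 500mg twice daily."
items = [
    locate(note, "medication", "Lisinopril", "Lisinopril"),
    locate(note, "dosage", "10mg", "Lisinopril"),
    locate(note, "medication", "Metformin", "Metformin"),
    locate(note, "dosage", "500mg", "Metformin"),
]
for group, members in group_by_attribute(items, "medication_group").items():
    print(group, [(m.label, m.text, m.start, m.end) for m in members])
```

Grouping by a shared attribute expresses relationships (which dosage belongs to which medication) without a separate relation schema, and the span check catches extractions whose text does not actually appear at the claimed position.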
Cost Optimization
The Batch Processing guide covers:
- Vertex AI Batch API integration for ~50% cost savings
- Automatic routing between real-time and batch processing
- GCS-based caching for instant result retrieval
- Lifecycle management for storage optimization
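The routing and caching ideas above can be sketched as a small planning step. Everything here is an assumption for illustration: the in-memory `cache` dict stands in for GCS-backed storage, `batch_threshold` is an arbitrary cutoff, and `"example-model"` is a placeholder model name. The guide's actual routing rules may differ.

```python
import hashlib


def cache_key(model_id, prompt):
    """Derive a stable key from the model name and the prompt text."""
    return hashlib.sha256(f"{model_id}\n{prompt}".encode("utf-8")).hexdigest()


def plan_requests(prompts, cache, model_id, batch_threshold=50):
    """Serve cached prompts immediately and pick a route for the rest.

    Returns (route, cached_results, pending_prompts), where route is
    "batch" when enough uncached prompts remain and "realtime" otherwise.
    """
    cached, pending = {}, []
    for prompt in prompts:
        key = cache_key(model_id, prompt)
        if key in cache:
            cached[prompt] = cache[key]
        else:
            pending.append(prompt)
    route = "batch" if len(pending) >= batch_threshold else "realtime"
    return route, cached, pending


# A dict stands in for the persistent cache in this sketch.
cache = {cache_key("example-model", "chunk A"): {"entities": ["Romeo"]}}
route, cached, pending = plan_requests(
    ["chunk A", "chunk B", "chunk C"], cache, "example-model", batch_threshold=2
)
print(route, list(cached), pending)
```

Checking the cache before routing means repeated runs over the same inputs only pay for prompts that have not been seen, and small remainders can skip the batch path entirely.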
Multilingual Support
The Japanese Extraction example illustrates:
- Using UnicodeTokenizer for non-spaced languages
- Correct grapheme segmentation and alignment
- Few-shot examples for multilingual tasks
- Character position tracking in Unicode text
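To see why non-spaced languages need character-based segmentation, the sketch below compares whitespace splitting with a simple per-character tokenizer that records positions. It is an approximation only and is not UnicodeTokenizer's algorithm: `char_tokens` is a hypothetical helper that attaches combining marks to the preceding base character using the standard `unicodedata` module, which covers some grapheme clusters but not all (for example, it does not handle emoji joined with zero-width joiners).

```python
import unicodedata


def char_tokens(text):
    """Return (start, end, token) triples, one per base character.

    Whitespace is skipped. Combining marks that follow a base character
    are kept in the same token so a span never splits a visible character.
    """
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        j = i + 1
        # A nonzero combining class marks a character that modifies the previous one.
        while j < len(text) and unicodedata.combining(text[j]):
            j += 1
        tokens.append((i, j, text[i:j]))
        i = j
    return tokens


sentence = "東京は日本の首都です"
print(len(sentence.split()))        # whitespace splitting sees a single token
for start, end, token in char_tokens(sentence):
    print(start, end, token)
```

Because every token carries its start and end index, an extracted span can be mapped back to the exact characters in the source text, which is what alignment relies on.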
Getting Started
Each example includes:
- Complete, runnable code
- Sample output and visualizations
- Best practices and optimization tips
- Detailed explanations of key parameters