The Gemini provider enables LangExtract to use Google’s Gemini models, including support for structured output, Vertex AI, and batch processing.
## Quick Start
```python
import langextract as lx

result = lx.extract(
    text="Your document text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract key information",
    examples=[...]
)
```
## API Setup

### Option 1: AI Studio (API Key)

Get an API key from AI Studio.
**Environment variable**

```bash
export LANGEXTRACT_API_KEY="your-api-key-here"
```

**.env file**

```bash
# Add to .env file
LANGEXTRACT_API_KEY=your-api-key-here

# Keep the key out of version control
echo '.env' >> .gitignore
```

**Directly in code**

```python
result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-flash",
    api_key="your-api-key-here",
    prompt_description="Extract information",
    examples=[...]
)
```
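If you keep the key in a `.env` file, it must be loaded into the environment before calling `lx.extract`. The `python-dotenv` package does this for you; as a stdlib-only sketch, a minimal loader might look like this (`load_env_file` is an illustrative helper, not part of LangExtract):

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: reads KEY=value lines, skipping comments and blanks.

    Existing environment variables are not overwritten.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

After calling `load_env_file()`, the key is available via `os.environ.get("LANGEXTRACT_API_KEY")`, which is how the examples below read it.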
### Option 2: Vertex AI (Service Accounts)

For production use with Vertex AI:
```python
result = lx.extract(
    text="Your document text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "your-project-id",
        "location": "us-central1"  # or "global"
    }
)
```
Vertex AI requires both `project` and `location` parameters when `vertexai=True`.
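Vertex AI authenticates through Google Cloud credentials rather than an API key. A typical setup (assuming the `gcloud` CLI is installed; adjust to your environment):

```shell
# Local development: use Application Default Credentials
gcloud auth application-default login

# Production: point at a service account key file instead
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```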
## Model Selection

### Available Models

The Gemini provider supports model IDs matching the pattern `^gemini` or `^palm`:
- `gemini-2.5-flash` (recommended): fast, cost-effective, high quality
- `gemini-2.5-pro`: advanced reasoning for complex tasks
- `gemini-1.5-flash`: legacy fast model
- `gemini-1.5-pro`: legacy advanced model
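The prefix matching can be reproduced with a short snippet (illustrative only; LangExtract performs this routing internally when resolving `model_id`):

```python
import re

# Pattern of model IDs claimed by the Gemini provider.
GEMINI_PATTERN = re.compile(r"^gemini|^palm")

for model_id in ["gemini-2.5-flash", "palm-2", "gpt-4o"]:
    claimed = bool(GEMINI_PATTERN.match(model_id))
    print(f"{model_id}: {'Gemini provider' if claimed else 'routed elsewhere'}")
```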
```python
# Recommended for most use cases
result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract information",
    examples=[...]
)

# For complex reasoning tasks
result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-pro",
    prompt_description="Extract complex relationships",
    examples=[...]
)
```
## Configuration Options

### Basic Parameters
```python
result = lx.extract(
    text="Your document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    # Provider-specific parameters:
    temperature=0.0,         # Sampling temperature (0.0 = deterministic)
    max_output_tokens=1000,  # Maximum tokens in the response
    top_p=0.95,              # Nucleus sampling
    top_k=40,                # Top-k sampling
)
```
### Advanced Configuration
```python
from langextract.providers.gemini import GeminiLanguageModel

model = GeminiLanguageModel(
    model_id="gemini-2.5-flash",
    api_key="your-key",
    temperature=0.0,
    max_workers=10,  # Parallel API calls
    # Gemini API configuration:
    safety_settings=[
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"}
    ],
    system_instruction="You are an expert information extractor.",
    stop_sequences=["END", "STOP"]
)
```
## Structured Output
Gemini supports strict JSON schema constraints for reliable structured output:
```python
result = lx.extract(
    text="Your document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract structured data",
    examples=[...],
    use_schema_constraints=True  # Enable strict JSON schema
)
```
When `use_schema_constraints=True`, LangExtract:

1. analyzes your examples to build a JSON schema;
2. passes the schema to Gemini's structured output API;
3. receives output that Gemini guarantees is valid JSON matching the schema.
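The kind of schema produced in the first step can be illustrated with a simplified sketch (`sketch_schema` is a toy illustration of the idea, not LangExtract's actual schema builder, which derives richer constraints from your `ExampleData` objects):

```python
def sketch_schema(extraction_classes):
    """Build a toy JSON schema constraining extractions to known classes."""
    return {
        "type": "object",
        "properties": {
            "extractions": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "extraction_class": {"type": "string", "enum": extraction_classes},
                        "extraction_text": {"type": "string"},
                        "attributes": {"type": "object"},
                    },
                    "required": ["extraction_class", "extraction_text"],
                },
            }
        },
        "required": ["extractions"],
    }

schema = sketch_schema(["person", "location", "date"])
```

A schema like this is what makes the model's JSON output reliably parseable: the classes become an enum, so the model cannot emit an unknown class name.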
Structured output requires `format_type=JSON` (the default). It is not compatible with YAML format.
## Vertex AI Features

### Custom Endpoints

For VPC Service Controls or custom endpoints:
```python
from google.api_core.client_options import ClientOptions

result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract data",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "your-project",
        "location": "us-central1",
        "http_options": ClientOptions(
            api_endpoint="your-vpc-endpoint.googleapis.com"
        )
    }
)
```
### Batch Processing

Save costs on large-scale tasks with the Vertex AI Batch API:
```python
result = lx.extract(
    text="Your document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract information",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "your-project",
        "location": "us-central1",
        "batch": {
            "enabled": True,
            "threshold": 100  # Minimum number of prompts before the batch API is used
        }
    }
)
```
The batch API is triggered only when the number of prompts exceeds the threshold; below the threshold, the real-time API is used automatically.
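The dispatch rule described above can be sketched as follows (illustrative only; the provider makes this decision internally):

```python
def choose_api(num_prompts, batch_config):
    """Pick the batch API only when batching is enabled and the prompt
    count exceeds the configured threshold; otherwise use real-time."""
    if batch_config.get("enabled") and num_prompts > batch_config.get("threshold", 0):
        return "batch"
    return "realtime"

print(choose_api(500, {"enabled": True, "threshold": 100}))  # batch
print(choose_api(50, {"enabled": True, "threshold": 100}))   # realtime
```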
## Parallel Processing
The Gemini provider automatically parallelizes multiple prompts:
```python
result = lx.extract(
    text="Your long document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    max_workers=20,       # Process up to 20 chunks in parallel
    max_chunk_size=3000   # Split the document into 3000-character chunks
)
```
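The chunking step can be illustrated with a naive character-based splitter (a sketch only; LangExtract's actual chunker is smarter, e.g. it avoids cutting mid-sentence where possible):

```python
def split_into_chunks(text, max_chunk_size):
    """Naive chunking: each chunk is at most max_chunk_size characters."""
    return [text[i:i + max_chunk_size] for i in range(0, len(text), max_chunk_size)]

chunks = split_into_chunks("x" * 7500, 3000)
print([len(c) for c in chunks])  # [3000, 3000, 1500]
```

Each chunk becomes one prompt, and `max_workers` controls how many of those prompts are in flight concurrently.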
## Code Examples

### Basic Extraction
```python
import langextract as lx
import os

# Define your task
prompt = "Extract person names, locations, and dates in order of appearance."

examples = [
    lx.data.ExampleData(
        text="Dr. Jane Smith visited Paris on March 15, 2024.",
        extractions=[
            lx.data.Extraction(
                extraction_class="person",
                extraction_text="Dr. Jane Smith",
                attributes={"title": "Dr."}
            ),
            lx.data.Extraction(
                extraction_class="location",
                extraction_text="Paris",
                attributes={"type": "city"}
            ),
            lx.data.Extraction(
                extraction_class="date",
                extraction_text="March 15, 2024",
                attributes={"format": "full date"}
            )
        ]
    )
]

# Run extraction
result = lx.extract(
    text="Prof. John Doe traveled to London on April 20, 2024.",
    model_id="gemini-2.5-flash",
    api_key=os.environ.get("LANGEXTRACT_API_KEY"),
    prompt_description=prompt,
    examples=examples
)

print(f"Found {len(result.extractions)} extractions")
for ext in result.extractions:
    print(f"  {ext.extraction_class}: {ext.extraction_text}")
```
### Using Vertex AI
```python
import langextract as lx

result = lx.extract(
    text="Your production data",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "my-gcp-project",
        "location": "us-central1"
    },
    # Enable strict schema for production reliability
    use_schema_constraints=True
)
```
### Long Document Processing
```python
import langextract as lx

# Process a long document with tuned settings
result = lx.extract(
    text="https://example.com/long-document.txt",  # Or pass text directly
    model_id="gemini-2.5-flash",
    prompt_description="Extract all medication mentions",
    examples=[...],
    # Chunking and parallelization:
    max_chunk_size=3000,     # Smaller chunks for accuracy
    max_workers=20,          # High parallelism for speed
    extraction_passes=3,     # Multiple passes for recall
    # Provider configuration:
    temperature=0.0,         # Deterministic output
    max_output_tokens=2000,  # Allow longer responses
)
```
## Rate Limits

The Gemini API enforces rate limits. For large-scale production use:

- Request Tier 2 quota for higher throughput
- Use Vertex AI with appropriate quotas
- Enable batch processing for cost efficiency
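When quota is exhausted, calls fail at runtime, so wrapping extraction in a retry with exponential backoff can smooth over transient rate-limit errors. A sketch (the helper and its parameters are illustrative, not part of LangExtract):

```python
import time

def extract_with_retries(extract_fn, max_attempts=4, base_delay=1.0):
    """Call extract_fn, retrying with exponential backoff on failure.

    In practice, catch langextract.core.exceptions.InferenceRuntimeError
    rather than bare Exception, so configuration errors fail fast.
    """
    for attempt in range(max_attempts):
        try:
            return extract_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

Usage: pass a zero-argument closure, e.g. `extract_with_retries(lambda: lx.extract(...))`.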
## Error Handling
```python
import langextract as lx
from langextract.core.exceptions import InferenceConfigError, InferenceRuntimeError

try:
    result = lx.extract(
        text="Your text",
        model_id="gemini-2.5-flash",
        prompt_description="Extract data",
        examples=[...]
    )
except InferenceConfigError as e:
    # Configuration errors (missing API key, invalid parameters)
    print(f"Configuration error: {e}")
except InferenceRuntimeError as e:
    # Runtime errors (API errors, timeouts)
    print(f"Runtime error: {e}")
```
## Direct Provider Usage
For advanced use cases, instantiate the provider directly:
```python
from langextract.providers.gemini import GeminiLanguageModel
from langextract.core.types import ScoredOutput  # type of each scored_output below

model = GeminiLanguageModel(
    model_id="gemini-2.5-flash",
    api_key="your-key",
    temperature=0.0,
    max_workers=10
)

# Run inference on a batch of prompts
prompts = ["Extract entities from: ...", "Summarize: ..."]
for outputs in model.infer(prompts):
    for scored_output in outputs:
        print(f"Score: {scored_output.score}, Output: {scored_output.output}")
```
## Next Steps

- **Provider Overview**: learn about the provider architecture
- **OpenAI Provider**: use OpenAI's GPT models
- **Ollama Provider**: run local models
- **Custom Providers**: create your own providers