The Gemini provider enables LangExtract to use Google’s Gemini models, including support for structured output, Vertex AI, and batch processing.

Quick Start

import langextract as lx

result = lx.extract(
    text="Your document text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract key information",
    examples=[...]
)

API Setup

Option 1: AI Studio (API Key)

Get an API key from Google AI Studio, then export it as an environment variable:
export LANGEXTRACT_API_KEY="your-api-key-here"

Option 2: Vertex AI (Service Accounts)

For production use with Vertex AI:
result = lx.extract(
    text="Your document text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "your-project-id",
        "location": "us-central1"  # or "global"
    }
)
Vertex AI requires both project and location parameters when vertexai=True.

Model Selection

Available Models

The Gemini provider supports models matching the pattern ^gemini or ^palm:
  • gemini-2.5-flash (recommended) - Fast, cost-effective, high quality
  • gemini-2.5-pro - Advanced reasoning for complex tasks
  • gemini-1.5-flash - Legacy fast model
  • gemini-1.5-pro - Legacy advanced model
# Recommended for most use cases
result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract information",
    examples=[...]
)

# For complex reasoning tasks
result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-pro",
    prompt_description="Extract complex relationships",
    examples=[...]
)
Gemini models have lifecycle dates with planned retirements. Consult the official model version documentation for current stable versions.

Configuration Options

Basic Parameters

result = lx.extract(
    text="Your document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    # Provider-specific parameters:
    temperature=0.0,         # Sampling temperature (0.0 = deterministic)
    max_output_tokens=1000,  # Maximum tokens in the response
    top_p=0.95,              # Nucleus sampling
    top_k=40,                # Top-K sampling
)

Advanced Configuration

from langextract.providers.gemini import GeminiLanguageModel

model = GeminiLanguageModel(
    model_id="gemini-2.5-flash",
    api_key="your-key",
    temperature=0.0,
    max_workers=10,           # Parallel API calls
    # Gemini API configuration:
    safety_settings=[
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"}
    ],
    system_instruction="You are an expert information extractor.",
    stop_sequences=["END", "STOP"]
)

Structured Output

Gemini supports strict JSON schema constraints for reliable structured output:
result = lx.extract(
    text="Your document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract structured data",
    examples=[...],
    use_schema_constraints=True  # Enable strict JSON schema
)
When use_schema_constraints=True, LangExtract:
  1. Analyzes your examples to build a JSON schema
  2. Passes the schema to Gemini’s structured output API
  3. Gemini guarantees valid JSON matching the schema
Structured output requires format_type=JSON (the default). It’s not compatible with YAML format.
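Conceptually, step 1 above collects the extraction classes and attribute keys seen in your examples and turns them into a schema. The sketch below is illustrative only, not LangExtract's actual implementation; the dict shapes and the schema_from_examples helper are hypothetical.

```python
def schema_from_examples(examples):
    """Collect extraction classes and attribute keys from example data
    and build a JSON-schema-like skeleton, one object per class."""
    classes = {}
    for example in examples:
        for ext in example["extractions"]:
            attrs = classes.setdefault(ext["extraction_class"], set())
            attrs.update(ext.get("attributes", {}))
    return {
        cls: {
            "type": "object",
            "properties": {
                "extraction_text": {"type": "string"},
                # Constrain attributes to the keys observed in the examples.
                "attributes": {k: {"type": "string"} for k in sorted(attrs)},
            },
        }
        for cls, attrs in classes.items()
    }

examples = [{
    "extractions": [
        {"extraction_class": "person", "extraction_text": "Dr. Jane Smith",
         "attributes": {"title": "Dr."}},
        {"extraction_class": "location", "extraction_text": "Paris",
         "attributes": {"type": "city"}},
    ]
}]
schema = schema_from_examples(examples)
```

The point of the sketch: richer examples produce a richer schema, which is why varied, representative examples matter when use_schema_constraints=True.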

Vertex AI Features

Custom Endpoints

For VPC Service Controls or custom endpoints:
from google.api_core.client_options import ClientOptions

result = lx.extract(
    text="Your text",
    model_id="gemini-2.5-flash",
    prompt_description="Extract data",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "your-project",
        "location": "us-central1",
        "http_options": ClientOptions(
            api_endpoint="your-vpc-endpoint.googleapis.com"
        )
    }
)

Batch Processing

Save costs on large-scale tasks with Vertex AI Batch API:
result = lx.extract(
    text="Your document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract information",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "your-project",
        "location": "us-central1",
        "batch": {
            "enabled": True,
            "threshold": 100  # Use batch API once prompt count exceeds this
        }
        }
    }
)
Batch API is triggered only when the number of prompts exceeds the threshold. Below the threshold, the real-time API is used automatically.
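The routing rule above can be sketched as a single predicate. This is a conceptual illustration of the documented behavior, not LangExtract code; use_batch_api is a hypothetical name.

```python
def use_batch_api(num_prompts: int, threshold: int = 100) -> bool:
    """Route to the batch API only when the number of prompts strictly
    exceeds the threshold; otherwise the real-time API handles them."""
    return num_prompts > threshold

# 250 chunked prompts exceed the default threshold of 100, so they would
# go to the batch API; a run producing 80 prompts stays on real-time.
```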

Parallel Processing

The Gemini provider automatically parallelizes multiple prompts:
result = lx.extract(
    text="Your long document",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    max_workers=20,  # Process up to 20 chunks in parallel
    max_chunk_size=3000  # Split document into 3000-char chunks
)
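A back-of-envelope model of these two settings, assuming simple character-based splitting (LangExtract's actual chunk boundaries may differ; chunk_plan is a hypothetical helper):

```python
import math

def chunk_plan(text_length: int, max_chunk_size: int, max_workers: int):
    """Estimate how many chunks a document splits into and how many
    requests can be in flight at once under the given settings."""
    num_chunks = math.ceil(text_length / max_chunk_size)
    in_flight = min(max_workers, num_chunks)
    return num_chunks, in_flight

# A 50,000-character document with the settings above splits into 17
# chunks, all of which can run in parallel since max_workers=20 >= 17.
chunks, workers = chunk_plan(50_000, max_chunk_size=3000, max_workers=20)
```

Raising max_workers beyond the chunk count buys nothing; lowering max_chunk_size trades more API calls for tighter per-chunk context.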

Code Examples

Basic Extraction

import langextract as lx
import os

# Define your task
prompt = "Extract person names, locations, and dates in order of appearance."

examples = [
    lx.data.ExampleData(
        text="Dr. Jane Smith visited Paris on March 15, 2024.",
        extractions=[
            lx.data.Extraction(
                extraction_class="person",
                extraction_text="Dr. Jane Smith",
                attributes={"title": "Dr."}
            ),
            lx.data.Extraction(
                extraction_class="location",
                extraction_text="Paris",
                attributes={"type": "city"}
            ),
            lx.data.Extraction(
                extraction_class="date",
                extraction_text="March 15, 2024",
                attributes={"format": "full date"}
            )
        ]
    )
]

# Run extraction
result = lx.extract(
    text="Prof. John Doe traveled to London on April 20, 2024.",
    model_id="gemini-2.5-flash",
    api_key=os.environ.get('LANGEXTRACT_API_KEY'),
    prompt_description=prompt,
    examples=examples
)

print(f"Found {len(result.extractions)} extractions")
for ext in result.extractions:
    print(f"{ext.extraction_class}: {ext.extraction_text}")

Using Vertex AI

import langextract as lx

result = lx.extract(
    text="Your production data",
    model_id="gemini-2.5-flash",
    prompt_description="Extract entities",
    examples=[...],
    language_model_params={
        "vertexai": True,
        "project": "my-gcp-project",
        "location": "us-central1"
    },
    # Enable strict schema for production reliability
    use_schema_constraints=True
)

Long Document Processing

import langextract as lx

# Process a long document with optimal settings
result = lx.extract(
    text="https://example.com/long-document.txt",  # Or pass text directly
    model_id="gemini-2.5-flash",
    prompt_description="Extract all medication mentions",
    examples=[...],
    # Chunking and parallelization:
    max_chunk_size=3000,      # Smaller chunks for accuracy
    max_workers=20,           # High parallelism for speed
    extraction_passes=3,      # Multiple passes for recall
    # Provider configuration:
    temperature=0.0,          # Deterministic output
    max_output_tokens=2000,   # Allow longer responses
)

Rate Limits

The Gemini API enforces rate limits. For large-scale production use:
  • Request Tier 2 quota for higher throughput
  • Use Vertex AI with appropriate quotas
  • Enable batch processing for cost efficiency
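When you do hit rate limits, a client-side retry with exponential backoff and jitter is a common mitigation. The wrapper below is a generic sketch, not part of the LangExtract API; the call_with_backoff name and the blanket exception handling are illustrative (in practice you would catch the specific rate-limit error your client raises).

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base_delay=1.0):
    """Retry fn with exponential backoff plus jitter. Doubles the delay
    after each failed attempt; re-raises after the final attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)

# Usage (illustrative):
# result = call_with_backoff(lambda: lx.extract(...))
```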

Error Handling

import langextract as lx
from langextract.core.exceptions import InferenceConfigError, InferenceRuntimeError

try:
    result = lx.extract(
        text="Your text",
        model_id="gemini-2.5-flash",
        prompt_description="Extract data",
        examples=[...]
    )
except InferenceConfigError as e:
    # Configuration errors (missing API key, invalid parameters)
    print(f"Configuration error: {e}")
except InferenceRuntimeError as e:
    # Runtime errors (API errors, timeouts)
    print(f"Runtime error: {e}")

Direct Provider Usage

For advanced use cases, instantiate the provider directly:
from langextract.providers.gemini import GeminiLanguageModel
from langextract.core.types import ScoredOutput

model = GeminiLanguageModel(
    model_id="gemini-2.5-flash",
    api_key="your-key",
    temperature=0.0,
    max_workers=10
)

# Run inference on prompts
prompts = ["Extract entities from: ...", "Summarize: ..."]
for outputs in model.infer(prompts):
    for scored_output in outputs:
        print(f"Score: {scored_output.score}, Output: {scored_output.output}")

Next Steps

Provider Overview

Learn about the provider architecture

OpenAI Provider

Use OpenAI’s GPT models

Ollama Provider

Run local models

Custom Providers

Create your own providers
