Function Signature

def create_model(
    config: ModelConfig,
    examples: typing.Sequence[typing.Any] | None = None,
    use_schema_constraints: bool = False,
    fence_output: bool | None = None,
    return_fence_output: bool = False,
) -> base_model.BaseLanguageModel | tuple[base_model.BaseLanguageModel, bool]

Description

Create a language model instance from configuration using the factory pattern. This function handles provider resolution, API key management from environment variables, and optional schema constraint application for structured output.

Parameters

config
ModelConfig
required
Model configuration object specifying the model ID, optional provider, and provider-specific keyword arguments. See ModelConfig for details.
examples
Sequence[Any] | None
default:"None"
Optional examples for schema generation when use_schema_constraints=True. These examples are used to infer the output structure that the model should follow.
use_schema_constraints
bool
default:"False"
Whether to apply schema constraints from examples. When True, the model will be configured to produce structured output matching the example format.
fence_output
bool | None
default:"None"
Explicit fence output preference. If None, automatically computed based on the schema requirements. Fencing wraps output in markdown code blocks for parsing.
return_fence_output
bool
default:"False"
If True, returns a tuple of (model, fence_output_value) instead of just the model. Useful for determining whether output will be fenced.

Returns

model
BaseLanguageModel
An instantiated language model ready for inference. The specific type depends on the resolved provider (e.g., GeminiModel, OpenAIModel).
tuple
tuple[BaseLanguageModel, bool]
When return_fence_output=True, returns a tuple containing:
  • The instantiated model
  • Boolean indicating whether the model requires fenced output

Exceptions

  • ValueError: If neither model_id nor provider is specified in the config.
  • ValueError: If no provider is registered for the given model_id.
  • InferenceConfigError: If provider instantiation fails due to invalid configuration or missing dependencies.
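The documented `ValueError` cases can be handled with an ordinary try/except. The sketch below uses a hypothetical stand-in for `create_model` (it does not import langextract) so the validation rules described above can be shown self-contained; the stub's checks mirror the documented behavior, not the library's actual code:

```python
def create_model_stub(config: dict):
    """Hypothetical stand-in mirroring create_model's documented checks."""
    # Documented: ValueError if neither model_id nor provider is specified.
    if not config.get("model_id") and not config.get("provider"):
        raise ValueError("Either model_id or provider must be specified")
    # Documented: ValueError if no provider is registered for the model_id.
    known_patterns = ("gemini", "gpt")
    model_id = config.get("model_id", "")
    if model_id and not any(p in model_id for p in known_patterns):
        raise ValueError(f"No provider registered for model_id: {model_id!r}")
    return f"model:{model_id or config['provider']}"

try:
    create_model_stub({})  # neither model_id nor provider set
except ValueError as err:
    print(f"Configuration error: {err}")

try:
    create_model_stub({"model_id": "unknown-model"})  # no matching provider
except ValueError as err:
    print(f"Resolution error: {err}")
```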

Usage Examples

Basic Model Creation

from langextract.factory import create_model, ModelConfig

# Create a Gemini model
config = ModelConfig(model_id="gemini-2.5-flash")
model = create_model(config)

# Create an OpenAI model
config = ModelConfig(model_id="gpt-4o")
model = create_model(config)

With Provider-Specific Arguments

from langextract.factory import create_model, ModelConfig

# Create model with custom parameters
config = ModelConfig(
    model_id="gemini-2.5-flash",
    provider_kwargs={
        "api_key": "your-api-key",
        "temperature": 0.7,
        "max_tokens": 1000
    }
)
model = create_model(config)

With Schema Constraints

from langextract.factory import create_model, ModelConfig
from langextract.core.data import ExampleData, Extraction

# Define examples for structured output
examples = [
    ExampleData(text="text1", extractions=[
        Extraction(extraction_class="category", extraction_text="news"),
        Extraction(extraction_class="sentiment", extraction_text="positive"),
    ]),
    ExampleData(text="text2", extractions=[
        Extraction(extraction_class="category", extraction_text="blog"),
        Extraction(extraction_class="sentiment", extraction_text="neutral"),
    ]),
]

# Create model with schema constraints
config = ModelConfig(model_id="gemini-2.5-flash")
model = create_model(
    config,
    examples=examples,
    use_schema_constraints=True
)

Explicit Provider Selection

from langextract.factory import create_model, ModelConfig

# Disambiguate when multiple providers support the same model
config = ModelConfig(
    model_id="gpt-4o",
    provider="OpenAIProvider"  # Explicitly specify provider
)
model = create_model(config)

With Fence Output Control

from langextract.factory import create_model, ModelConfig

# Get model and fence output setting
config = ModelConfig(model_id="gemini-2.5-flash")
model, requires_fence = create_model(
    config,
    return_fence_output=True
)

print(f"Model requires fenced output: {requires_fence}")

Using Environment Variables

import os
from langextract.factory import create_model, ModelConfig

# API keys are automatically loaded from environment
# Set one of: GEMINI_API_KEY, OPENAI_API_KEY, or LANGEXTRACT_API_KEY
os.environ["GEMINI_API_KEY"] = "your-api-key"

config = ModelConfig(model_id="gemini-2.5-flash")
model = create_model(config)  # API key loaded automatically

Environment Variables

The function automatically resolves API keys and configuration from environment variables:
  • GEMINI_API_KEY or LANGEXTRACT_API_KEY: For Gemini models
  • OPENAI_API_KEY or LANGEXTRACT_API_KEY: For GPT models
  • OLLAMA_BASE_URL: For Ollama models (defaults to http://localhost:11434)
If multiple API keys are found (e.g., both GEMINI_API_KEY and LANGEXTRACT_API_KEY), the more specific key takes precedence and a warning is issued.

Notes

  • The factory loads built-in and plugin providers automatically.
  • Provider resolution is based on model ID patterns (e.g., “gemini” in the model ID routes to GeminiProvider).
  • Schema constraints enable structured output extraction with type validation.
  • Vertex AI authentication can be used by setting vertexai=True in provider_kwargs.
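The pattern-based routing described in the notes can be sketched as a substring lookup against a registry; the names and patterns below are illustrative, not langextract's internal registry:

```python
# Toy registry mapping model-ID substrings to provider names, to
# illustrate pattern-based provider resolution (illustrative only).
PROVIDER_PATTERNS = {
    "gemini": "GeminiProvider",
    "gpt": "OpenAIProvider",
}

def resolve_provider(model_id: str) -> str:
    """Return the first provider whose pattern appears in the model ID."""
    for pattern, provider in PROVIDER_PATTERNS.items():
        if pattern in model_id:
            return provider
    raise ValueError(f"No provider registered for model_id: {model_id!r}")

print(resolve_provider("gemini-2.5-flash"))  # routes to the Gemini provider
print(resolve_provider("gpt-4o"))            # routes to the OpenAI provider
```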
