Overview
BaseLanguageModel is an abstract base class that defines the interface for all language model providers in LangExtract. All provider implementations (Gemini, OpenAI, Ollama, etc.) must inherit from this class and implement its abstract methods.
Class Definition
Constructor
Applies constraints when decoding the output. Defaults to no constraint.
Additional keyword arguments passed to the model at initialization. These are stored and merged with runtime kwargs during inference.
Methods
infer()
Performs language model inference on a batch of prompts.
Batch of input prompts for inference. A single-element list can be used for a single input.
Additional arguments for inference, like temperature and max_decode_steps.
Iterator yielding batches of probable output texts, sorted by descending score.
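The output shape described above can be sketched with a small generator that yields one sequence of scored candidates per prompt, best candidate first. This is only an illustration: the ScoredOutput dataclass below is a stand-in with assumed fields, toy_infer is a hypothetical function, and the scores are made up rather than produced by a model.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Sequence


@dataclass
class ScoredOutput:
    """Stand-in for the library's scored output type (fields are assumed)."""
    score: float
    output: str


def toy_infer(
    batch_prompts: Sequence[str], **kwargs: Any
) -> Iterator[Sequence[ScoredOutput]]:
    """Yield one list of candidates per prompt, highest score first."""
    for prompt in batch_prompts:
        candidates = [
            ScoredOutput(score=0.4, output=prompt.lower()),
            ScoredOutput(score=0.9, output=prompt.upper()),
        ]
        # Sort so the most probable candidate comes first.
        yield sorted(candidates, key=lambda c: c.score, reverse=True)


# A single-element list is enough for a single input.
results = list(toy_infer(["hello"]))
```

Because the function is a generator, nothing is computed until the caller iterates, which is why the result is wrapped in list() here.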
infer_batch()
Convenience method for batch inference with configurable batch size.
List of prompts to process.
Batch size for processing (currently unused, reserved for future optimization).
List of lists of ScoredOutput objects, one per input prompt.
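One plausible way to build such a convenience wrapper is to drain the iterator returned by infer() into a list of lists. The sketch below assumes that approach; collect_batches and fake_infer are hypothetical names, plain strings stand in for ScoredOutput objects, and batch_size is accepted but ignored to mirror the "currently unused" note above.

```python
from typing import Callable, Iterator, List, Sequence


def fake_infer(prompts: Sequence[str]) -> Iterator[Sequence[str]]:
    """Hypothetical infer(): yields one sequence of outputs per prompt."""
    for prompt in prompts:
        yield (f"answer to {prompt}",)


def collect_batches(
    infer: Callable[[Sequence[str]], Iterator[Sequence[str]]],
    prompts: Sequence[str],
    batch_size: int = 32,  # accepted but ignored in this sketch
) -> List[List[str]]:
    """Materialize the lazy iterator into one list per input prompt."""
    return [list(outputs) for outputs in infer(prompts)]


collected = collect_batches(fake_infer, ["a", "b"])
```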
parse_output()
Parses raw model output as JSON or YAML.
Raw output string from the model (without code fences).
Parsed Python object (dict or list).
ValueError if output cannot be parsed as JSON or YAML.
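A minimal sketch of this kind of parsing, assuming a JSON-first strategy with a YAML fallback, is shown below. The function name parse_json_or_yaml is hypothetical, the fallback order is an assumption rather than the library's documented behavior, and PyYAML is treated as optional so the JSON path works without it.

```python
import json
from typing import Any

try:
    import yaml  # PyYAML; optional in this sketch
except ImportError:
    yaml = None


def parse_json_or_yaml(output: str) -> Any:
    """Parse fence-free model output as JSON, falling back to YAML."""
    try:
        return json.loads(output)
    except json.JSONDecodeError:
        pass
    if yaml is not None:
        try:
            return yaml.safe_load(output)
        except yaml.YAMLError:
            pass
    # Neither parser accepted the text.
    raise ValueError("Output could not be parsed as JSON or YAML")


parsed = parse_json_or_yaml('{"extractions": [{"name": "Ada"}]}')
```

Note that YAML accepts bare scalars, so with PyYAML installed this sketch returns plain text as a string instead of raising; a stricter version would also check that the result is a dict or list.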
apply_schema()
Applies a schema instance to the provider for structured output.
The schema instance to apply, or None to clear.
set_fence_output()
Sets explicit fence output preference for code block formatting.
- True: Force code fences (json or yaml)
- False: Disable fences (raw JSON/YAML)
- None: Auto-detect based on schema
merge_kwargs()
Merges stored initialization kwargs with runtime kwargs.
Kwargs provided at inference time. These take precedence over stored kwargs.
Merged kwargs dictionary.
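The precedence rule can be illustrated with ordinary dictionary unpacking, where later entries win. The helper merge_kwargs_sketch below is a hypothetical standalone function written for this illustration, not the method itself.

```python
from typing import Any, Dict, Optional


def merge_kwargs_sketch(
    stored: Dict[str, Any], runtime: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return a new dict in which runtime values override stored ones."""
    return {**stored, **(runtime or {})}


init_kwargs = {"temperature": 0.0, "max_decode_steps": 256}
merged = merge_kwargs_sketch(init_kwargs, {"temperature": 0.7})
```

Building a new dictionary leaves the stored initialization kwargs untouched, so one call's overrides do not leak into the next.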
Class Methods
get_schema_class()
Returns the schema class this provider supports (e.g., Pydantic schemas).
The schema class, or None if the provider doesn’t support schemas.
Properties
schema
requires_fence_output
Returns the explicit preference if one was set via set_fence_output(), otherwise computes from schema.
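The override-then-schema resolution can be sketched as a small pure function. Everything here is an assumption made for illustration: resolve_fence_output is a hypothetical name, schema_wants_raw is a hypothetical flag standing in for whatever the schema reports, and defaulting to fences when no schema is applied is a guess rather than documented behavior.

```python
from typing import Optional


def resolve_fence_output(
    override: Optional[bool], schema_wants_raw: Optional[bool]
) -> bool:
    """Explicit override wins; otherwise derive the answer from the schema."""
    if override is not None:
        return override
    if schema_wants_raw is None:
        # No schema applied: this sketch assumes fences are the default.
        return True
    # A schema that wants raw output means fences are not required.
    return not schema_wants_raw


# None means "auto-detect", so the schema decides here.
fence = resolve_fence_output(None, schema_wants_raw=True)
```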
Usage Example
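The sketch below shows the general subclassing pattern under stated assumptions. BaseModelSketch is a simplified stand-in written for this page, not the real BaseLanguageModel, and EchoModel is a hypothetical provider that echoes prompts instead of calling an API; plain strings stand in for ScoredOutput objects.

```python
import abc
from typing import Any, Dict, Iterator, Optional, Sequence


class BaseModelSketch(abc.ABC):
    """Simplified stand-in for BaseLanguageModel (not the library class)."""

    def __init__(self, **kwargs: Any) -> None:
        # Keep initialization kwargs so they can be merged at inference time.
        self._extra_kwargs: Dict[str, Any] = dict(kwargs)

    def merge_kwargs(
        self, runtime_kwargs: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        return {**self._extra_kwargs, **(runtime_kwargs or {})}

    @abc.abstractmethod
    def infer(
        self, batch_prompts: Sequence[str], **kwargs: Any
    ) -> Iterator[Sequence[str]]:
        """Yield one sequence of outputs per prompt."""


class EchoModel(BaseModelSketch):
    """Hypothetical provider that echoes each prompt back."""

    def infer(self, batch_prompts, **kwargs):
        params = self.merge_kwargs(kwargs)
        for prompt in batch_prompts:
            yield [f"{prompt} (temperature={params['temperature']})"]


model = EchoModel(temperature=0.2)
default_run = list(model.infer(["Extract names."]))
override_run = list(model.infer(["Extract names."], temperature=0.9))
```

The first call uses the temperature stored at initialization, while the second overrides it for that call only. Instantiating the stand-in base class directly raises TypeError because infer() is abstract.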
Notes
- All provider implementations must implement the abstract infer() method
- The parse_output() method expects raw JSON/YAML without code fences; fence extraction is handled by the resolver
- Use merge_kwargs() to combine initialization and runtime parameters
- Schema support is optional but recommended for structured output