Overview
The model_provider module provides a unified abstraction layer for working with both OpenAI models (GPT-5, GPT-4.1) and HuggingFace models (MediPhi, MedGemma) deployed via TGI or HuggingFace Inference Endpoints with OpenAI-compatible APIs.
ModelConfig
Configuration dataclass for language models.

Fields
- Short name/key for the model (e.g., 'gpt-4.1', 'mediphi')
- Full model identifier (e.g., 'gpt-4.1', 'microsoft/MediPhi')
- Model provider type
- Environment variable name for the HuggingFace endpoint URL (e.g., 'MEDIPHI_ENDPOINT_URL'). Only used for HuggingFace models; should point to the base endpoint URL without /v1.
- Sampling temperature for the model
Example
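A minimal construction sketch. The import path and the field names name and provider are assumptions based on the descriptions above; model_id, endpoint_url_env, and temperature are referenced elsewhere in this document.

```python
# Illustrative only: `name` and `provider` are assumed field names.
from model_provider import ModelConfig

mediphi_config = ModelConfig(
    name="mediphi",                           # short name/key
    model_id="microsoft/MediPhi-Instruct",    # full model identifier
    provider="huggingface",                   # provider type
    endpoint_url_env="MEDIPHI_ENDPOINT_URL",  # env var holding the base endpoint URL (no /v1)
    temperature=0.0,                          # sampling temperature
)
```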
create_llm()
Factory function to create a language model based on configuration.

Signature
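A plausible signature sketch; the parameter name config is an assumption, and the return type follows the Returns section below.

```python
def create_llm(config: "ModelConfig") -> "ChatOpenAI":
    ...
```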
Parameters
ModelConfig instance specifying the model to create
Returns
Configured language model instance. OpenAI models return standard ChatOpenAI instances. HuggingFace models return ChatOpenAI instances pointing to a TGI/HF Inference Endpoint that exposes the OpenAI-compatible API.
Raises
- ValueError: If the endpoint URL environment variable is not set for HuggingFace models
- ValueError: If the HF_TOKEN environment variable is not set for HuggingFace models
- ValueError: If the provider is unknown
Behavior
OpenAI Models:
- Returns a standard ChatOpenAI instance
- Uses model_id and temperature from the config
- Requires the OPENAI_API_KEY environment variable

HuggingFace Models:
- Returns a ChatOpenAI instance configured for a TGI/HF Inference Endpoint
- Normalizes the endpoint URL to the OpenAI-compatible format (ending in /v1; see the sketch below)
- Requires the environment variable specified in endpoint_url_env
- Requires the HF_TOKEN environment variable for authentication
- Sets max_tokens=512 to prevent infinite loops in TGI models
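The branching above can be illustrated with a rough sketch. This is not the module's actual code; it assumes langchain_openai's ChatOpenAI and string provider values "openai" / "huggingface".

```python
import os
from langchain_openai import ChatOpenAI

def _sketch_create_llm(config):
    if config.provider == "openai":
        # Standard OpenAI client; OPENAI_API_KEY is read from the environment.
        return ChatOpenAI(model=config.model_id, temperature=config.temperature)

    if config.provider == "huggingface":
        base_url = os.environ.get(config.endpoint_url_env)
        if not base_url:
            raise ValueError(f"{config.endpoint_url_env} is not set")
        hf_token = os.environ.get("HF_TOKEN")
        if not hf_token:
            raise ValueError("HF_TOKEN is not set")
        # Normalize to the OpenAI-compatible route exposed by TGI / HF endpoints.
        if not base_url.rstrip("/").endswith("/v1"):
            base_url = base_url.rstrip("/") + "/v1"
        return ChatOpenAI(
            model=config.model_id,
            temperature=config.temperature,
            base_url=base_url,
            api_key=hf_token,
            max_tokens=512,  # guard against runaway generations on TGI
        )

    raise ValueError(f"Unknown provider: {config.provider}")
```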
Example
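A usage sketch, assuming the module is importable as model_provider, the registry is keyed by the short names listed under MODELS_REGISTRY, and the required environment variables are set.

```python
from model_provider import MODELS_REGISTRY, create_llm

# OpenAI model (requires OPENAI_API_KEY)
gpt = create_llm(MODELS_REGISTRY["gpt-5"])

# HuggingFace model behind a TGI / HF Inference Endpoint
# (requires HF_TOKEN and MEDIPHI_ENDPOINT_URL)
mediphi = create_llm(MODELS_REGISTRY["mediphi"])

response = mediphi.invoke("Summarize the contraindications for ibuprofen.")
print(response.content)
```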
get_model_identity()
Resolve provider and model identity from registry hints and the runtime LLM instance. Keeps metadata and pricing lookups consistent across OpenAI and OpenAI-compatible providers.

Signature
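A plausible signature sketch; the parameter names follow the descriptions below, while the type hints are assumptions.

```python
def get_model_identity(model_name: str, llm) -> dict:
    ...
```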
Parameters
Requested model name to look up in registry
Runtime LLM instance to extract model information from
Returns
Dictionary containing:
- provider: Provider type ("openai", "huggingface", or "unknown")
- model_id: Full model identifier
- model_name: Short model name/key
Resolution Logic
- First checks if model_name matches a key in MODELS_REGISTRY
- Then checks if the runtime model name from llm.model_name matches the registry
- Then searches the registry for a matching model_id
- Falls back to inference:
  - If the model ID contains "/", infers "huggingface"
  - Otherwise infers "openai"
- Uses "unknown" if no information is available
Example
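An illustrative call, assuming the same import path and registry keys as above.

```python
from model_provider import MODELS_REGISTRY, create_llm, get_model_identity

llm = create_llm(MODELS_REGISTRY["mediphi"])
identity = get_model_identity("mediphi", llm)
# e.g. {"provider": "huggingface",
#       "model_id": "microsoft/MediPhi-Instruct",
#       "model_name": "mediphi"}
print(identity["provider"], identity["model_id"])
```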
MODELS_REGISTRY
Canonical registry of all available models with their configurations.

Type
Registered Models
OpenAI Models:
- gpt-5: GPT-5 model with temperature 0.0
- gpt-5.2: GPT-5.2 model with temperature 0.0

HuggingFace Models:
- microsoft/MediPhi-Instruct: Medical-specialized Phi model
  - Short name: mediphi
  - Endpoint: MEDIPHI_ENDPOINT_URL environment variable
- google/medgemma-1.5-4b-it: Medical Gemma 1.5B instruction-tuned model
  - Short name: medgemma
  - Endpoint: MEDGEMMA_ENDPOINT_URL environment variable
Example
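A sketch of inspecting the registry, assuming it maps short names to ModelConfig instances; the provider field name is an assumption.

```python
from model_provider import MODELS_REGISTRY

# Inspect a single entry
mediphi_cfg = MODELS_REGISTRY["mediphi"]
print(mediphi_cfg.model_id, mediphi_cfg.temperature)

# List every registered model
for key, cfg in MODELS_REGISTRY.items():
    print(f"{key}: {cfg.model_id} ({cfg.provider})")
```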
Environment Variables
Required for All Models
- OPENAI_API_KEY: API key for OpenAI models (loaded from .env if not set)
Required for HuggingFace Models
- HF_TOKEN: HuggingFace authentication token
- Model-specific endpoint URLs:
  - MEDIPHI_ENDPOINT_URL: Endpoint URL for the MediPhi model
  - MEDGEMMA_ENDPOINT_URL: Endpoint URL for the MedGemma model
Optional
- PRICING_CONFIG_PATH: Path to a custom pricing configuration JSON
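A small pre-flight check (illustrative only) that verifies the variables listed above before creating models.

```python
import os

REQUIRED = ["OPENAI_API_KEY"]
HF_REQUIRED = ["HF_TOKEN", "MEDIPHI_ENDPOINT_URL", "MEDGEMMA_ENDPOINT_URL"]

missing = [v for v in REQUIRED + HF_REQUIRED if not os.environ.get(v)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```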
Utility Functions
load_dotenv_if_needed()
Automatically loads environment variables from the .env file if OPENAI_API_KEY is not already set. Called by create_llm(). Looks for the .env file in the project root (two levels up from the module).
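A rough sketch of the behavior described above, assuming python-dotenv and that "two levels up" corresponds to Path(__file__).resolve().parents[2]; the actual implementation may differ.

```python
import os
from pathlib import Path
from dotenv import load_dotenv

def load_dotenv_if_needed():
    # Only load .env when OPENAI_API_KEY has not been provided some other way.
    if not os.environ.get("OPENAI_API_KEY"):
        project_root = Path(__file__).resolve().parents[2]  # two levels up from the module
        load_dotenv(project_root / ".env")
```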
