SQLMorph uses large language models in two distinct roles: completion models generate natural-language query variants in the JQE and TQA pipelines, and embedding models power the semantic evaluation metrics that compare column names by meaning rather than exact string match. All providers are accessed through a singleDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/dais-polymtl/sqlmorph/llms.txt
Use this file to discover all available pages before exploring further.
ModelManager.create_model() factory that accepts a ModelProvider enum, a ModelType enum, and a provider-specific model name enum. This page shows how to configure each supported provider.
ModelManager factory
create_model() returns an instance of the appropriate class (OpenAIChatCompletion, OllamaChatCompletion, HuggingFaceChatCompletion, or their embedding counterparts). All completion instances expose get_chat_completion(messages) and all embedding instances expose get_embedding(input_data).
Selects the backend. One of
ModelProvider.OPENAI, ModelProvider.OLLAMA, or ModelProvider.HUGGINGFACE.ModelType.COMPLETION for chat/instruction models; ModelType.EMBEDDING for embedding models.A provider-specific enum value identifying the model. Must match the chosen
model_provider.Your OpenAI API key. Required when
model_provider is ModelProvider.OPENAI. Pass os.getenv("OPENAI_API_KEY") after sourcing scripts/load_dotenv.sh.Optional Portkey gateway API key. When set, all OpenAI requests are routed through the Portkey gateway for observability and caching.
Optional Portkey config ID for advanced routing and fallback rules. Used together with
portkey_api_key.Provider configuration
- OpenAI
- Ollama
- HuggingFace
Set Completion modelsEmbedding modelsAvailable OpenAI models
OPENAI_API_KEY in your .env file and source it before running experiments:| Enum value | Model string | Type |
|---|---|---|
OpenAIModel.GPT_52 | gpt-5.2 | Completion |
OpenAIModel.O1_PREVIEW | o1-preview | Completion |
OpenAIModel.O1_MINI | o1-mini | Completion |
OpenAIModel.GPT_4O | gpt-4o | Completion |
OpenAIModel.GPT_4O_MINI | gpt-4o-mini | Completion |
OpenAIModel.GPT_4_TURBO | gpt-4-turbo | Completion |
OpenAIModel.GPT_4 | gpt-4 | Completion |
OpenAIModel.GPT_3_5_TURBO | gpt-3.5-turbo | Completion |
OpenAIModel.TEXT_EMBEDDING_3_SMALL | text-embedding-3-small | Embedding |
OpenAIModel.TEXT_EMBEDDING_3_LARGE | text-embedding-3-large | Embedding |
OpenAIModel.TEXT_EMBEDDING_ADA_002 | text-embedding-ada-002 | Embedding |
GPT-4o (
OpenAIModel.GPT_4O) is the default model for JQE NL query generation. For metrics, set EMBEDDING_MODEL in scripts/metrics_config.sh to one of TEXT_EMBEDDING_3_SMALL, TEXT_EMBEDDING_3_LARGE, or TEXT_EMBEDDING_ADA_002.Choosing a provider
| Use case | Recommended provider | Notes |
|---|---|---|
| JQE NL query generation | OpenAI (GPT_4O) | Default in the JQE pipeline. |
| TQA NL query generation | OpenAI (GPT_4O) | Requires OPENAI_API_KEY. |
| Semantic evaluation metrics | OpenAI (embedding models) | Only OpenAI embeddings are currently supported for metrics. Configure via EMBEDDING_MODEL in scripts/metrics_config.sh. |
| Local / offline experiments | Ollama | No API key required; requires the Ollama daemon. |
| Custom research pipelines | HuggingFace | Full control over model weights; downloads from HuggingFace Hub. |