TrustifAI reads all of its runtime settings from a single YAML file — by default config_file.yaml in your working directory. You pass the path to this file when you instantiate Trustifai or AsyncTrustifai, so different environments (development, staging, production) can each carry their own config without touching application code.
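
For example, a small launcher can pick the config file per environment at startup. This is only a sketch: the file names and the APP_ENV variable are illustrative conventions, not part of TrustifAI; only the Trustifai(path) constructor comes from the library.

import os

from trustifai import Trustifai

# Hypothetical mapping from deployment environment to config file.
CONFIG_BY_ENV = {
    "development": "config.dev.yaml",
    "staging": "config.staging.yaml",
    "production": "config.prod.yaml",
}

env = os.getenv("APP_ENV", "development")
engine = Trustifai(CONFIG_BY_ENV[env])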

File structure overview

config_file.yaml has seven top-level sections. Each section is described below, followed by a complete reference example.
env_file:   # path to your .env file
tracing:    # MLflow experiment tracking
llm:        # language model via LiteLLM
embeddings: # embedding model via LiteLLM
reranker:   # optional reranker model
metrics:    # per-metric thresholds and enable/disable flags
score_weights: # aggregation weights (must sum to ≤ 1.0)

env_file

Points to a .env file containing API keys and secrets. This keeps credentials out of your YAML and out of source control.
env_file: "creds.env"
The .env file follows standard KEY=value syntax. TrustifAI uses LiteLLM under the hood, so any key LiteLLM recognises is valid here:
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
MISTRAL_API_KEY=...
AZURE_API_KEY=...
COHERE_API_KEY=...
OPENROUTER_API_KEY=...
HF_TOKEN=...
# or any other key supported by LiteLLM
env_file is optional. If your keys are already exported as shell environment variables, you can omit this field entirely.
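
For instance, a key exported by the parent process works the same way. The sketch below sets it from Python purely for illustration; exporting OPENAI_API_KEY in your shell or deployment environment before launching the app is equivalent.

import os

from trustifai import Trustifai

# With the key already in the process environment, the env_file field
# can be dropped from config_file.yaml entirely.
os.environ["OPENAI_API_KEY"] = "sk-..."

engine = Trustifai("config_file.yaml")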

tracing

Controls MLflow experiment tracking. Tracing is an optional feature and is disabled by default; install the trustifai[trace] extra and set enabled: true to turn it on.
tracing:
  type: "default"
  params:
    enabled: false
    tracking_uri: null        # e.g. "http://localhost:5000" or an MLflow-hosted URI
    experiment_name: "trustifai_experiment"
Field             Description
enabled           Set to true to activate MLflow logging
tracking_uri      URI of your MLflow tracking server. Leave null to use the local ./mlruns default
experiment_name   Name of the MLflow experiment that runs are grouped under
See Track experiments with MLflow tracing for the full walkthrough.

llm

Configures the language model used for metric evaluation (evidence coverage uses LLM-based NLI by default, and epistemic consistency samples multiple generations).
llm:
  type: "openai"   # LiteLLM provider prefix
  params:
    model_name: "gpt-4o"
    api_type: "chat_completion"   # "chat_completion" (default) or "responses"
  kwargs:
    temperature: 0.01
    max_tokens: 2048
    top_p: 0.95
    max_retries: 5
    timeout: 300
    logprobs: true
The type field is the LiteLLM provider prefix. TrustifAI supports any model that LiteLLM can route, including:
Provider          type value     Example model_name
OpenAI            openai         gpt-4o, gpt-4o-mini
Anthropic         anthropic      claude-3-5-sonnet-20241022
Google Gemini     gemini         gemini/gemini-1.5-pro
Mistral           mistral        mistral/mistral-large-latest
Ollama (local)    ollama         ollama/llama3
Azure AI          azure_ai       azure_ai/gpt-4o
NVIDIA NIM        nvidia_nim     nvidia_nim/meta/llama-3.1-8b-instruct
HuggingFace       huggingface    huggingface/mistralai/Mistral-7B-v0.1
OpenRouter        openrouter     openrouter/anthropic/claude-3.5-sonnet
Set logprobs: true in kwargs if you want the online Confidence Score metric. Models that do not support log probabilities will return a zeroed confidence result.

embeddings

Configures the embedding model used to compute query, answer, and document vector representations.
embeddings:
  type: "openrouter"
  params:
    model_name: "openai/text-embedding-3-small"
  kwargs: {}
The same LiteLLM provider prefixes apply here. Common choices:
# OpenAI
embeddings:
  type: "openai"
  params:
    model_name: "text-embedding-3-small"

# Local via Ollama
embeddings:
  type: "ollama"
  params:
    model_name: "ollama/nomic-embed-text"

reranker

An optional reranker model used by the evidence coverage metric when strategy: "reranker" is set. Omit this section entirely if you use the default "llm" strategy.
reranker:
  type: "cohere"
  params:
    model_name: "rerank-v4.0-fast"
  kwargs: {}
Supported providers include cohere, together_ai, azure_ai, fireworks_ai, and voyage.

metrics

A list of metric configurations. Each entry controls whether the metric is active and sets its classification thresholds.
metrics:
  - type: "evidence_coverage"
    enabled: true
    params:
      strategy: "llm"          # "llm" (default) or "reranker"
      STRONG_GROUNDING: 0.85
      PARTIAL_GROUNDING: 0.60

  - type: "consistency"
    enabled: true
    params:
      STABLE_CONSISTENCY: 0.85
      FRAGILE_CONSISTENCY: 0.60

  - type: "source_diversity"
    enabled: true
    params:
      HIGH_DIVERSITY: 0.85
      MODERATE_DIVERSITY: 0.60

  - type: "semantic_drift"
    enabled: true
    params:
      STRONG_ALIGNMENT: 0.85
      PARTIAL_ALIGNMENT: 0.60

  - type: "trust_score"
    params:
      RELIABLE_TRUST: 0.80
      ACCEPTABLE_TRUST: 0.60
To disable a metric — for example, if your use case does not benefit from source diversity — set enabled: false. Its weight is automatically zeroed out and the remaining weights are re-normalized.
  - type: "source_diversity"
    enabled: false   # metric is skipped; weight is excluded from normalization
    params:
      HIGH_DIVERSITY: 0.85
      MODERATE_DIVERSITY: 0.60

score_weights

Controls how each metric contributes to the final Trust Score. Weights are combined by weighted sum; the library validates that they do not exceed 1.0 and normalizes them automatically after disabled metrics are removed.
score_weights:
  - type: "evidence_coverage"
    params:
      weight: 0.40
  - type: "consistency"
    params:
      weight: 0.20
  - type: "source_diversity"
    params:
      weight: 0.10
  - type: "semantic_drift"
    params:
      weight: 0.30
Weights must sum to at most 1.0 before normalization. If the raw sum exceeds 1.0, TrustifAI raises a ValueError at startup. After a metric is disabled, the remaining weights are automatically re-normalized to sum to exactly 1.0.
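
As a worked example of that behavior (plain Python, mirroring the documented rule rather than the library's internal code): with the default weights above and source_diversity disabled, the remaining 0.40, 0.20, and 0.30 are rescaled by their sum of 0.90.

# Re-normalization after disabling a metric; illustrative only.
weights = {
    "evidence_coverage": 0.40,
    "consistency": 0.20,
    "source_diversity": 0.10,
    "semantic_drift": 0.30,
}
disabled = {"source_diversity"}

active = {name: w for name, w in weights.items() if name not in disabled}
total = sum(active.values())  # 0.90
normalized = {name: w / total for name, w in active.items()}
print(normalized)
# {'evidence_coverage': 0.444..., 'consistency': 0.222..., 'semantic_drift': 0.333...}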

Complete example

The following is the full default config_file.yaml shipped with TrustifAI:
env_file: "creds.env"

tracing:
  type: "default"
  params:
    enabled: false
    tracking_uri: null
    experiment_name: "trustifai_experiment"

llm:
  type: "openai"
  params:
    model_name: "gpt-5.5"
    api_type: "chat_completion"
  kwargs:
    temperature: 0.01
    max_tokens: 2048
    top_p: 0.95
    max_retries: 5
    timeout: 300
    logprobs: true

embeddings:
  type: "openrouter"
  params:
    model_name: "openai/text-embedding-3-small"
  kwargs: {}

reranker:
  type: "cohere"
  params:
    model_name: "rerank-v4.0-fast"
  kwargs: {}

metrics:
  - type: "evidence_coverage"
    enabled: true
    params:
      strategy: "llm"
      STRONG_GROUNDING: 0.85
      PARTIAL_GROUNDING: 0.6

  - type: "consistency"
    enabled: true
    params:
      STABLE_CONSISTENCY: 0.85
      FRAGILE_CONSISTENCY: 0.6

  - type: "source_diversity"
    enabled: true
    params:
      HIGH_DIVERSITY: 0.85
      MODERATE_DIVERSITY: 0.6

  - type: "semantic_drift"
    enabled: true
    params:
      STRONG_ALIGNMENT: 0.85
      PARTIAL_ALIGNMENT: 0.6

  - type: "trust_score"
    params:
      RELIABLE_TRUST: 0.8
      ACCEPTABLE_TRUST: 0.6

score_weights:
  - type: "evidence_coverage"
    params:
      weight: 0.40
  - type: "consistency"
    params:
      weight: 0.20
  - type: "source_diversity"
    params:
      weight: 0.10
  - type: "semantic_drift"
    params:
      weight: 0.30

Loading the config

Pass the path to Trustifai or AsyncTrustifai at instantiation time:
from trustifai import Trustifai
from trustifai.async_pipeline import AsyncTrustifai

# Synchronous
engine = Trustifai("config_file.yaml")

# Async batch
engine = AsyncTrustifai("config_file.yaml")
Config.from_yaml parses the YAML, flattens all metric thresholds into a single MetricThresholds object, and normalizes the weights — all before your first evaluation call.
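
If you want to catch the weight-sum constraint before instantiation, for example in CI, a standalone check with PyYAML is enough. This mirrors the documented rule; it is not the library's internal validation.

import yaml

from trustifai import Trustifai

with open("config_file.yaml") as f:
    cfg = yaml.safe_load(f)

# The raw weights must not exceed 1.0, otherwise TrustifAI raises a ValueError at startup.
raw_sum = sum(entry["params"]["weight"] for entry in cfg["score_weights"])
assert raw_sum <= 1.0, f"score_weights sum to {raw_sum}"

engine = Trustifai("config_file.yaml")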

Batch evaluation

Run concurrent evaluations over large datasets with AsyncTrustifai.

Custom metrics

Add custom metric types and configure their weights in this file.

MLflow tracing

Enable and configure the tracing section for experiment tracking.

Integrations

Connect TrustifAI to LangChain, LlamaIndex, and other frameworks.
