

Online metrics are computed during LLM generation, not after. Rather than evaluating a finished response against retrieved documents, they tap into the generation process itself — specifically the log probabilities that the LLM assigns to each output token. This makes them fast (no extra API calls) and complementary to offline metrics, giving you a real-time view into how certain the model was while producing its response.

The Confidence Score

The Confidence Score is TrustifAI's single online metric. It quantifies how sure the LLM was about its own output by analyzing the probability distribution across generated tokens.

How it works: TrustifAI captures the per-token log probability (logprob) values returned by the LLM alongside the generated text. From these it computes:
  1. Geometric mean probability: exp(mean(logprobs)) gives the normalized per-token probability for the entire sequence. This captures the model's average certainty across all tokens.
  2. Variance penalty: exp(-variance(logprobs)) penalizes sequences where the model oscillated between high- and low-confidence tokens. Consistent uncertainty (uniform logprobs) is rated less harshly than erratic uncertainty.
The final score combines both:
confidence_score = exp(mean(logprobs)) × exp(-variance(logprobs))
The result is a value in [0, 1] where higher means the model was more consistently certain about its output.
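To make the formula concrete, here is a minimal sketch of the computation described above (illustrative only, not TrustifAI's internal implementation; the helper name is hypothetical):

```python
import math

def confidence_score(logprobs: list[float]) -> float:
    # Geometric mean probability: the model's average certainty across all tokens
    mean_lp = sum(logprobs) / len(logprobs)
    # Variance penalty: erratic swings between certain and uncertain tokens lower the score
    variance = sum((lp - mean_lp) ** 2 for lp in logprobs) / len(logprobs)
    return math.exp(mean_lp) * math.exp(-variance)

# Four consistently high-probability tokens -> score close to 1
print(confidence_score([-0.05, -0.02, -0.10, -0.03]))  # ~0.95
```

Because logprobs are never positive, both factors lie in (0, 1], which keeps the combined score in [0, 1].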
The Confidence Score is only available for LLMs that expose token log probabilities. This includes OpenAI models (e.g., gpt-4o, gpt-4, gpt-4-turbo) when logprobs: true is set in your config. Models served through providers that strip logprob data will return score: 0.0 with label N/A.

Threshold labels

| Score | Label | Interpretation |
| --- | --- | --- |
| ≥ 0.90 | High Confidence | Model is highly certain about its output |
| ≥ 0.70 | Medium Confidence | Model shows moderate uncertainty |
| < 0.70 | Low Confidence | Model is uncertain; treat output with caution |
These thresholds have defaults of 0.90 and 0.70. You can override them by adding them to any metric’s params section in your YAML — for example, alongside the trust_score thresholds:
```yaml
metrics:
  - type: "trust_score"
    params:
      RELIABLE_TRUST: 0.80
      ACCEPTABLE_TRUST: 0.60
      HIGH_CONFIDENCE: 0.90    # override confidence threshold
      MEDIUM_CONFIDENCE: 0.70
```
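The label assignment itself reduces to a comparison against these two cut-offs. The sketch below is a hypothetical helper (not part of the SDK) using the default thresholds:

```python
def confidence_label(score: float, high: float = 0.90, medium: float = 0.70) -> str:
    # Mirrors the threshold table: >= high, >= medium, otherwise low
    if score >= high:
        return "High Confidence"
    if score >= medium:
        return "Medium Confidence"
    return "Low Confidence"

print(confidence_label(0.94))  # High Confidence
print(confidence_label(0.65))  # Low Confidence
```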

Using generate() to get Confidence Scores

The generate() method wraps an LLM call and automatically computes the Confidence Score from the returned logprobs.
1. Initialize the engine

The engine reads your LLM config, including the model name and logprob settings, from the YAML file.
```python
from trustifai import Trustifai

trust_engine = Trustifai(config_path="config_file.yaml")
```
2. Call generate()

Pass your prompt (and an optional system prompt). TrustifAI automatically requests logprobs from the LLM.
```python
result = trust_engine.generate(
    prompt="What is the capital of France?",
    system_prompt="You are a helpful assistant."
)
```
3. Read the response and confidence metadata

The return value contains both the generated text and the full confidence breakdown.
print(f"Response:    {result['response']}")
print(f"Confidence:  {result['metadata']['confidence_score']}")
print(f"Label:       {result['metadata']['confidence_label']}")

Interpreting the result dict

generate() returns a dictionary with two top-level keys:
```json
{
    "response": "The capital of France is Paris.",
    "metadata": {
        "confidence_score": 0.94,
        "confidence_label": "High Confidence",
        "confidence_details": {
            "explanation": "Model is highly confident in its response based on logprobs.",
            "avg_logprob": -0.06,
            "variance": 0.03,
            "token_count": 8
        },
        "logprobs_available": true,
        "execution_metadata": {
            "total_cost_usd": 0.000031
        }
    }
}
```
| Field | Description |
| --- | --- |
| `response` | The generated text |
| `metadata.confidence_score` | Confidence Score in [0, 1] |
| `metadata.confidence_label` | High Confidence, Medium Confidence, or Low Confidence |
| `metadata.confidence_details.avg_logprob` | Mean log probability across tokens |
| `metadata.confidence_details.variance` | Variance of log probabilities (higher = more erratic) |
| `metadata.confidence_details.token_count` | Number of tokens scored |
| `metadata.logprobs_available` | false if the LLM did not return logprobs |
| `metadata.execution_metadata.total_cost_usd` | API cost for this call |
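In practice you typically branch on these fields before trusting the output. Continuing from the generate() call above, the routing policy below is an assumption for illustration, not a TrustifAI recommendation:

```python
meta = result["metadata"]

if not meta["logprobs_available"]:
    # Provider stripped logprob data: the score is 0.0 / "N/A", so don't gate on it
    print("No logprobs returned; skipping the confidence check")
elif meta["confidence_label"] == "Low Confidence":
    # Assumed policy: send low-confidence answers to human review
    print(f"Flagging for review: {result['response']}")
else:
    print(f"Serving response: {result['response']}")
```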

Enabling logprobs in config

Make sure your LLM config requests logprobs. TrustifAI sets this automatically when generate() is called, but having it in the config ensures consistency:
```yaml
llm:
  type: "openai"
  params:
    model_name: "gpt-4o"
    api_type: "chat_completion"
  kwargs:
    logprobs: true
    max_tokens: 2048
    temperature: 0.01
```
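For reference, this is roughly what the underlying request looks like if you call the OpenAI Chat Completions API with logprobs enabled yourself (illustrative only; TrustifAI performs the equivalent call for you when generate() runs):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    logprobs=True,
    max_tokens=2048,
    temperature=0.01,
)

# One log probability per generated token; values like these feed the Confidence Score
token_logprobs = [t.logprob for t in completion.choices[0].logprobs.content]
```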
The Confidence Score is only as reliable as the LLM’s calibration. A well-calibrated model assigns high log probabilities to tokens it is genuinely likely to get right, and lower log probabilities when it is uncertain. Many models — especially smaller, fine-tuned, or instruction-tuned ones — are poorly calibrated and may express high confidence even when hallucinating. Treat the Confidence Score as a useful signal, not a guarantee, and always combine it with offline metrics for full trustworthiness evaluation.

Combining online and offline metrics

The Confidence Score operates independently of the offline metrics. A typical production workflow combines both:
```python
from trustifai import Trustifai, MetricContext
from langchain_core.documents import Document

trust_engine = Trustifai(config_path="config_file.yaml")

# Step 1: Generate with real-time confidence
generation = trust_engine.generate(
    prompt="Summarize the key findings from the documents.",
    system_prompt="You are a research assistant."
)

# Step 2: Evaluate the generated answer offline
context = MetricContext(
    query="Summarize the key findings from the documents.",
    answer=generation["response"],
    documents=[Document(page_content="...", metadata={"source": "report.pdf"})]
)
trust_result = trust_engine.get_trust_score(context)

print(f"Online confidence:  {generation['metadata']['confidence_score']}")
print(f"Offline Trust Score: {trust_result['score']} ({trust_result['label']})")
```
This two-step pattern gives you the fullest picture: online confidence tells you how certain the model was during generation, and the offline Trust Score tells you how well the finished response holds up against your retrieved documents.
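If you want to act on both signals in one place, a simple gate works; the thresholds and handler names below are illustrative assumptions, not TrustifAI defaults:

```python
# Assumed acceptance policy: non-low online confidence plus a reasonably high Trust Score
confidence_ok = generation["metadata"]["confidence_label"] != "Low Confidence"
trust_ok = trust_result["score"] >= 0.80

if confidence_ok and trust_ok:
    deliver(generation["response"])               # hypothetical downstream handler
else:
    escalate_for_review(generation["response"])   # hypothetical review queue
```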
