

ConfidenceMetric is TrustifAI’s only online metric — it runs during LLM generation rather than post-hoc. It converts the per-token log probability stream emitted by the LLM into a single confidence score that reflects both the model’s average certainty and the consistency of that certainty across the generated sequence. Unlike offline metrics, you do not call ConfidenceMetric.calculate directly: it is invoked automatically inside Trustifai.generate().
ConfidenceMetric requires a language model that exposes token-level log probabilities. OpenAI-compatible APIs (including Gemini via response_logprobs=True) and most self-hosted models support this. If your LLM does not return logprobs, generate() returns score: 0.0 with label: "N/A".

Static method

class ConfidenceMetric:
    @staticmethod
    def calculate(
        logprobs: List[float],
        evaluator: ThresholdEvaluator,
    ) -> Dict[str, Any]: ...
logprobs (List[float], required)
List of per-token log probability values (negative floats) as returned by the LLM API. The list corresponds to the generated response tokens in order. An empty list yields score: 0.0, label: "N/A".

evaluator (ThresholdEvaluator, required)
A ThresholdEvaluator instance used to map the computed score to a (label, explanation) pair. TrustifAI passes this automatically when calling from generate().

Score computation

The score is derived in three steps:
  1. Average log probability: avg_logprob = mean(logprobs), a length-normalized proxy for sequence probability.
  2. Variance penalty: penalty = exp(−var(logprobs)), which reduces the score when the model was inconsistently uncertain across tokens.
  3. Final score: score = exp(avg_logprob) × penalty, which stays within [0.0, 1.0] by the natural range of the formula (see the sketch below).
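
Putting the three steps together, here is a minimal sketch of the computation. It is illustrative only: the function name and the use of Python's statistics module are assumptions, not TrustifAI's internal implementation.

import math
from statistics import mean, pvariance
from typing import List

def confidence_score(logprobs: List[float]) -> float:
    # Illustrative re-implementation of the three steps above, not TrustifAI's code.
    if not logprobs:
        return 0.0  # empty logprob list -> score 0.0, label "N/A"
    avg_logprob = mean(logprobs)               # step 1: length-normalized certainty
    penalty = math.exp(-pvariance(logprobs))   # step 2: penalize inconsistent certainty
    return math.exp(avg_logprob) * penalty     # step 3: both factors lie in (0, 1], so the score does too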

Return value

calculate returns a plain dict (not a MetricResult) for compatibility with the generate() response envelope:
{
    "score": float,      # 0.0 – 1.0
    "label": str,        # human-readable confidence label
    "details": {
        "explanation": str,
        "avg_logprob": float,   # rounded to 2 d.p.
        "variance": float,      # rounded to 2 d.p.
        "token_count": int,
    }
}
Labels (configurable via thresholds in your config):
  • "High Confidence" — model is highly certain
  • "Medium Confidence" — moderate uncertainty
  • "Low Confidence" — model is uncertain about its output

Usage via Trustifai.generate()

You access ConfidenceMetric through the generate() method, which handles logprob collection and metric calculation automatically:
from trustifai import Trustifai

engine = Trustifai("config_file.yaml")

output = engine.generate(
    prompt="What is the boiling point of water at sea level?",
    system_prompt="You are a helpful assistant."
)

print(output["metadata"]["confidence_score"])
# 0.81

print(output["metadata"]["confidence_label"])
# "High Confidence"
The full generate() return value has two top-level keys — "response" (the generated text) and "metadata" (confidence and cost info):
{
    "response": "Water boils at 100 °C (212 °F) at standard sea-level pressure.",
    "metadata": {
        "confidence_score": 0.81,
        "confidence_label": "High Confidence",
        "confidence_details": {
            "explanation": "Model is highly confident in its response based on logprobs.",
            "avg_logprob": -0.14,
            "variance": 0.07,
            "token_count": 23,
        },
        "logprobs_available": True,
        "execution_metadata": {
            "total_cost_usd": 0.000031
        }
    }
}
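Because providers without logprob support still return the same envelope (with score 0.0 and label "N/A"), it is worth checking logprobs_available before acting on the score. A brief sketch, reusing the engine from the example above:

output = engine.generate(prompt="What is the boiling point of water at sea level?")

if output["metadata"]["logprobs_available"]:
    print(output["metadata"]["confidence_label"], output["metadata"]["confidence_score"])
else:
    # The provider returned no token logprobs, so the confidence score is not meaningful.
    print("Confidence metric unavailable for this model")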
Do not call ConfidenceMetric.calculate directly in production code. The logprobs list must be in the exact format returned by the LLM integration layer. Calling generate() ensures the logprobs are captured and formatted correctly before being passed to the metric.
