
TrustifAI evaluates LLM and RAG responses across multiple independent dimensions and combines them into a single Trust Score — a number between 0 and 1 that answers the question: can you rely on this response? Rather than treating quality as a single axis, TrustifAI treats it as a multi-dimensional signal, which makes it far harder to game and far more informative when something goes wrong.

Multi-dimensional evaluation model

The Trust Score is built from up to five component metrics, computed in parallel:
  • 4 offline metrics — calculated after a response has already been generated, against the retrieved documents.
  • 1 online metric — calculated in real time during generation, using token log probabilities.
Each metric produces a score in the [0, 1] range. These component scores are then combined into the final Trust Score using a configurable weighted sum.

Weighted aggregation formula

The aggregation formula is a simple weighted linear combination:
Trust Score = Σ (weight_i × metric_score_i)
Weights are read from your config_file.yaml and automatically normalized so they always sum to 1.0. This means you can freely adjust relative priorities without worrying about the math.
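
For illustration only, here is a minimal Python sketch of this aggregation with made-up component scores; it mirrors the arithmetic described above and is not TrustifAI's internal code:

# Illustrative only: example weights and component scores (not real output).
weights = {"evidence_coverage": 0.40, "semantic_drift": 0.30,
           "consistency": 0.20, "source_diversity": 0.10}
scores = {"evidence_coverage": 0.9, "semantic_drift": 0.8,
          "consistency": 0.7, "source_diversity": 0.6}

# Metrics with weight 0 are dropped, then the remaining weights are normalized to sum to 1.0.
active = {name: w for name, w in weights.items() if w > 0}
total = sum(active.values())
normalized = {name: w / total for name, w in active.items()}

# Weighted linear combination of the component scores.
trust_score = round(sum(normalized[name] * scores[name] for name in normalized), 2)
print(trust_score)  # 0.8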

Default weights

Metric                  Key                  Default Weight
Evidence Coverage       evidence_coverage    0.40
Semantic Drift          semantic_drift       0.30
Epistemic Consistency   consistency          0.20
Source Diversity        source_diversity     0.10
These defaults reflect a deliberate priority ordering: factual grounding matters most, followed by topical alignment, then consistency across generations, and finally breadth of sourcing.

Configuring weights in YAML

score_weights:
  - type: "evidence_coverage"
    params:
      weight: 0.40   # Highest priority — factual accuracy
  - type: "semantic_drift"
    params:
      weight: 0.30
  - type: "consistency"
    params:
      weight: 0.20
  - type: "source_diversity"
    params:
      weight: 0.10
Weights are proportional, not absolute. Setting evidence_coverage to 0.8 and semantic_drift to 0.2 (with the others at 0) produces the same scores as 0.4 and 0.1, because TrustifAI normalizes the weights before computing.
Any metric with a weight: 0 (or omitted entirely) is automatically excluded from both the computation and the reasoning graph, so you can disable metrics without deleting their config block.

Decision labels

After computing the final score, TrustifAI maps it to one of three human-readable decision labels using configurable thresholds:
Score range   Label                        Meaning
≥ 0.80        RELIABLE                     Response is well-grounded and consistent. Safe to use.
≥ 0.60        ACCEPTABLE (WITH CAUTION)    Response has some weaknesses. Review before acting on it.
< 0.60        UNRELIABLE                   Response has significant trust issues. Do not use without human review.
These thresholds are set under metrics[type=trust_score] in your config:
metrics:
  - type: "trust_score"
    params:
      RELIABLE_TRUST: 0.8
      ACCEPTABLE_TRUST: 0.6
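
The threshold logic reduces to a simple comparison. A minimal sketch for clarity (the label_for helper is hypothetical, not part of the TrustifAI API):

def label_for(score, reliable=0.8, acceptable=0.6):
    # Map a Trust Score in [0, 1] to a decision label using the thresholds above.
    if score >= reliable:
        return "RELIABLE"
    if score >= acceptable:
        return "ACCEPTABLE (WITH CAUTION)"
    return "UNRELIABLE"

print(label_for(0.84))  # RELIABLE
print(label_for(0.61))  # ACCEPTABLE (WITH CAUTION)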

Calling get_trust_score()

1. Build a MetricContext

Wrap your query, answer, and retrieved documents in a MetricContext object. TrustifAI accepts LangChain Document objects, LlamaIndex nodes, plain strings, dicts, or lists — it normalizes them automatically.
from trustifai import Trustifai, MetricContext
from langchain_core.documents import Document

context = MetricContext(
    query="What is the capital of India?",
    answer="The capital is New Delhi.",
    documents=[
        Document(
            page_content="New Delhi is the capital of India.",
            metadata={"source": "wiki.txt"}
        )
    ]
)
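
If you are not using LangChain or LlamaIndex, the same context can be built from plain strings. A minimal variant, assuming the string form mentioned above (normalization is handled for you):

# Illustrative: documents passed as plain strings instead of Document objects.
context = MetricContext(
    query="What is the capital of India?",
    answer="The capital is New Delhi.",
    documents=["New Delhi is the capital of India."]
)
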
2. Initialize the engine

Point Trustifai at your config file. The engine reads weights, thresholds, and LLM/embedding settings from YAML.
trust_engine = Trustifai(config_path="config_file.yaml")
3. Call get_trust_score()

Pass the context to get_trust_score(). All active metrics run in parallel and the results are aggregated automatically.
result = trust_engine.get_trust_score(context)
print(f"Trust Score: {result['score']} | Decision: {result['label']}")

Interpreting the result dict

get_trust_score() returns a dictionary with four keys:
{
    "score": 0.84,                     # Final weighted Trust Score [0, 1]
    "label": "RELIABLE",               # Decision label
    "details": {                       # Per-metric breakdown
        "evidence_coverage": {
            "score": 1.0,
            "label": "Strong Grounding",
            "details": { ... }
        },
        "semantic_drift": {
            "score": 0.91,
            "label": "Strong Alignment",
            "details": { ... }
        },
        "consistency": {
            "score": 0.87,
            "label": "Stable Consistency",
            "details": { ... }
        },
        "source_diversity": {
            "score": 0.80,
            "label": "High Trust",
            "details": { ... }
        }
    },
    "execution_metadata": {
        "total_cost_usd": 0.000412     # Cumulative LLM/embedding API cost
    }
}
Field                                Type     Description
score                                float    Final Trust Score, rounded to 2 decimal places
label                                str      One of RELIABLE, ACCEPTABLE (WITH CAUTION), or UNRELIABLE
details                              dict     Per-metric score, label, and diagnostic details
execution_metadata.total_cost_usd    float    Sum of all LLM and embedding API costs incurred
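
A minimal sketch of branching on the result in application code; the print statements are placeholders for your own handling logic, not part of TrustifAI:

result = trust_engine.get_trust_score(context)

if result["label"] == "RELIABLE":
    print("Safe to use:", result["score"])
elif result["label"] == "ACCEPTABLE (WITH CAUTION)":
    print("Review before acting:", result["score"])
else:
    # UNRELIABLE: inspect the per-metric breakdown before doing anything with the answer.
    for name, metric in result["details"].items():
        print(name, metric["score"], metric["label"])
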
If no documents are provided, get_trust_score() immediately returns a score of 0.0 with the label UNRELIABLE, without making any API calls. Always pass at least one retrieved document.

Async usage

For high-concurrency server deployments, use the native async variant to avoid blocking the event loop:
result = await trust_engine.a_get_trust_score(context)
The async path runs metric calculations concurrently with asyncio.gather and uses an async embedding pipeline, making it significantly faster under load.
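
For example, scoring several contexts concurrently might look like the sketch below (the score_batch helper is our own illustration, not a TrustifAI API):

import asyncio

async def score_batch(contexts):
    # Run one a_get_trust_score call per context concurrently on the same engine.
    return await asyncio.gather(
        *(trust_engine.a_get_trust_score(ctx) for ctx in contexts)
    )

results = asyncio.run(score_batch([context]))
print([r["label"] for r in results])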
