TrustifAI gives you a principled, multi-dimensional Trust Score for any LLM or RAG response — going far beyond a single correctness check. Instead of a black-box number, TrustifAI breaks trustworthiness into four orthogonal signals (evidence coverage, epistemic consistency, semantic drift, and source diversity), combines them with configurable weights, and renders an interactive reasoning graph that shows exactly why a response was deemed reliable or unreliable.

Documentation Index
Fetch the complete documentation index at: https://mintlify.com/TrustifAI/trustifai/llms.txt
Use this file to discover all available pages before exploring further.
Quickstart
Score your first RAG response in under five minutes with a working code example.
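As a preview, here is a minimal sketch of what a first evaluation might look like. The class names `Trustifai` and `MetricContext` come from the API Reference below; the constructor arguments, the `evaluate` entry point, and the result attributes are assumptions for illustration, not the library's confirmed API:

```python
# Hypothetical quickstart sketch; see the Quickstart page for the real code.
# Assumes the package was installed with `pip install trustifai` (name assumed)
# and provider credentials are set via environment variables.
from trustifai import Trustifai, MetricContext

client = Trustifai()

# Bundle the RAG inputs and output to be evaluated (field names assumed).
context = MetricContext(
    query="What is the boiling point of water at sea level?",
    response="Water boils at 100 °C (212 °F) at sea level.",
    documents=["At standard atmospheric pressure, water boils at 100 °C."],
)

result = client.evaluate(context)  # assumed entry point
print(result.trust_score)          # aggregated, weighted Trust Score
print(result.signals)              # per-metric scores behind the aggregate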
Installation
Install TrustifAI via pip and configure your environment variables.
Core Concepts
Understand how Trust Score is computed from four independent trust signals.
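A plausible shape for that computation, assuming a normalized weighted sum (the exact formula is specified on the Core Concepts page), where $s_i$ are the four signal scores and $w_i$ their configured weights:

$$
\text{TrustScore} = \frac{\sum_{i=1}^{4} w_i \, s_i}{\sum_{i=1}^{4} w_i}, \qquad s_i \in [0, 1]
$$

With equal weights this reduces to a simple average; raising the weight on evidence coverage makes hallucination detection dominate the aggregate.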
API Reference
Explore the full public API: Trustifai, AsyncTrustifai, MetricContext, and more.
What TrustifAI evaluates
TrustifAI computes trust across two evaluation modes.

Offline metrics — for already-generated RAG responses:

| Metric | What it detects |
|---|---|
| Evidence Coverage | Hallucinations — verifies every claim against retrieved documents |
| Epistemic Consistency | Model inconsistency — measures semantic stability across stochastic re-generations |
| Semantic Drift | Topic drift — ensures the answer stays within the document’s semantic envelope |
| Source Diversity | Over-reliance on a single source vs. synthesis across multiple sources |
Online metrics — for responses as they are being generated:

| Metric | What it detects |
|---|---|
| Confidence Score | Real-time certainty via token log probability analysis |
Why TrustifAI
Most evaluation frameworks return a single scalar. TrustifAI returns a structured, explainable result:

- Weighted aggregation — tune metric importance to your use case (e.g., weight evidence coverage higher for medical Q&A); see the sketch after this list
- Reasoning graph — a DAG visualization showing metric scores, aggregation, and the final decision
- Custom metrics — plug in your own evaluation logic without touching core library code
- Async-first — native async pipeline with concurrency control and rate limiting for large-scale batch evaluation
- LiteLLM-backed — works with OpenAI, Anthropic, Gemini, Azure, Mistral, Ollama, and more
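The sketch below illustrates the weighted-aggregation and async batch points together. `AsyncTrustifai` and `MetricContext` are named in the API Reference; the `weights` and `max_concurrency` parameters and the `evaluate_batch` method are assumptions chosen for illustration:

```python
# A hypothetical sketch of weighted aggregation plus async batch scoring.
# Parameter and method names are assumed, not the library's confirmed API.
import asyncio
from trustifai import AsyncTrustifai, MetricContext

# Up-weight evidence coverage for a medical Q&A deployment (keys assumed).
client = AsyncTrustifai(
    weights={
        "evidence_coverage": 0.4,
        "epistemic_consistency": 0.3,
        "semantic_drift": 0.2,
        "source_diversity": 0.1,
    },
    max_concurrency=8,  # assumed concurrency / rate-limiting knob
)

async def main() -> None:
    contexts = [
        MetricContext(
            query="Which sources support the claim?",
            response="Two independent reports confirm it.",
            documents=["Report A ...", "Report B ..."],
        )
    ]
    results = await client.evaluate_batch(contexts)  # assumed batch entry point
    for res in results:
        print(res.trust_score)

asyncio.run(main())
```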
Offline metrics explained
Deep dive into Evidence Coverage, Epistemic Consistency, Semantic Drift, and Source Diversity.
Online metrics explained
Learn how the Confidence Score uses token log probabilities for real-time trust signals.
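The underlying idea, sketched here independently of TrustifAI's actual implementation: average the per-token log probabilities the model reports during generation and map them back to a probability in [0, 1]. TrustifAI's real computation may differ:

```python
# Illustrative only: one common way to turn token log probabilities into a
# confidence signal (geometric-mean token probability).
import math

def confidence_from_logprobs(token_logprobs: list[float]) -> float:
    """Return exp(mean log prob), a value in [0, 1]."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# Example: confidently generated tokens yield a high score.
print(confidence_from_logprobs([-0.05, -0.10, -0.02]))  # ~0.94
```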
Reasoning graphs
Understand and customize the interactive DAG visualization of your evaluation logic.
Custom metrics
Extend TrustifAI with your own evaluation logic using the BaseMetric interface.
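`BaseMetric` is named above; the method and attribute names in this sketch are assumptions about what such an interface could look like, not the documented contract:

```python
# A hypothetical custom metric. `BaseMetric` and `MetricContext` are named in
# the docs, but the `name` attribute and `score()` method are assumed details.
from trustifai import BaseMetric, MetricContext

class ResponseLengthMetric(BaseMetric):
    """Penalizes answers that are suspiciously short relative to the query."""

    name = "response_length"

    def score(self, context: MetricContext) -> float:
        # Return a trust signal in [0, 1]: 1.0 once the response is at least
        # as long as the query, scaling down linearly below that.
        q, r = len(context.query), len(context.response)
        return min(1.0, r / q) if q else 1.0
```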