BatchResult is returned by evaluate_dataset after all contexts in a dataset have been processed. Successful evaluations are stored in order in .results, while any context that raised an unhandled exception is captured in .failed — a single failure never aborts the rest of the batch. Aggregate statistics are computed as properties on demand, so there is no overhead when you only need the raw results.
Fields
Successful trust score dicts, sorted in the original dataset order. Each dict is the output of Trustifai.get_trust_score() for one MetricContext. Suitable for direct conversion to a pandas DataFrame with pd.DataFrame(batch.results).
List of evaluation failures. Each entry contains:
Total number of MetricContext objects passed to evaluate_dataset.
Number of contexts that completed evaluation without raising an exception.
Wall-clock duration of the full batch run in seconds, measured from the first task dispatch to the last task completion.
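To illustrate how these fields relate to each other, here is a minimal sketch of a batch loop that isolates failures. It is not the library's implementation: SketchBatchResult, evaluate_all, and toy_score are hypothetical stand-ins, the succeeded and duration_seconds field names are assumptions, and the keys of each failure entry ("index", "error") are invented for illustration only.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class SketchBatchResult:
    """Hypothetical stand-in for BatchResult (not the real class)."""
    results: list[dict[str, Any]] = field(default_factory=list)
    failed: list[dict[str, Any]] = field(default_factory=list)
    total: int = 0
    succeeded: int = 0              # assumed field name
    duration_seconds: float = 0.0   # assumed field name


def evaluate_all(contexts: list[Any],
                 score_fn: Callable[[Any], dict[str, Any]]) -> SketchBatchResult:
    """Score every context; record failures instead of aborting the batch."""
    batch = SketchBatchResult(total=len(contexts))
    start = time.perf_counter()
    for index, ctx in enumerate(contexts):
        try:
            batch.results.append(score_fn(ctx))
        except Exception as exc:
            # Assumed entry shape: the real failure entries may differ.
            batch.failed.append({"index": index, "error": repr(exc)})
    batch.succeeded = len(batch.results)
    batch.duration_seconds = time.perf_counter() - start
    return batch


def toy_score(ctx):
    """Toy scorer: raises on None, otherwise echoes the number as a score."""
    if ctx is None:
        raise ValueError("empty context")
    return {"score": ctx, "label": "RELIABLE" if ctx >= 0.8 else "UNRELIABLE"}


batch = evaluate_all([0.9, None, 0.4], toy_score)
print(batch.total, batch.succeeded, len(batch.failed))
```

The second context raises, yet the third is still evaluated, so the failure lands in failed while results keeps only the successful dicts in input order.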
Computed properties
Mean trust score across all successful evaluations, rounded to 4 decimal places. Returns 0.0 if no results are available.
Min, median, and max trust scores across successful results.
Count of each trust label across successful results, keyed by label string. Labels reflect your configured thresholds; typical values are "RELIABLE", "ACCEPTABLE", and "UNRELIABLE".
Fraction of contexts that failed, rounded to 4 decimal places. Computed as len(failed) / total. Returns 0.0 if total is zero.
.summary() method
Returns a formatted multi-line text summary of the batch run, including throughput and aggregate score statistics. Useful for quick inspection in notebooks or CLI output.
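The aggregate statistics that feed such a summary can be approximated with the standard library. The sketch below is an assumption-laden illustration, not the library's code: aggregate and format_summary are hypothetical helpers, the "score" and "label" keys are assumed names for the result dict schema, the text layout is invented, and returning 0.0 for min, median, and max on an empty batch is a choice made here rather than documented behavior.

```python
from collections import Counter
from statistics import mean, median


def aggregate(results, failed_count, total):
    """Compute mean, spread, label counts, and failure rate."""
    scores = [r["score"] for r in results]          # "score" key is assumed
    labels = Counter(r["label"] for r in results)   # "label" key is assumed
    return {
        "mean": round(mean(scores), 4) if scores else 0.0,
        "min": min(scores) if scores else 0.0,
        "median": median(scores) if scores else 0.0,
        "max": max(scores) if scores else 0.0,
        "labels": dict(labels),
        "failure_rate": round(failed_count / total, 4) if total else 0.0,
    }


def format_summary(stats, total, duration_seconds):
    """Render the stats as multi-line text (hypothetical layout)."""
    throughput = total / duration_seconds if duration_seconds > 0 else 0.0
    lines = [
        f"contexts: {total} ({throughput:.1f}/s)",
        f"mean score: {stats['mean']}",
        f"min/median/max: {stats['min']}/{stats['median']}/{stats['max']}",
        f"labels: {stats['labels']}",
        f"failure rate: {stats['failure_rate']}",
    ]
    return "\n".join(lines)


results = [
    {"score": 0.9, "label": "RELIABLE"},
    {"score": 0.6, "label": "ACCEPTABLE"},
    {"score": 0.3, "label": "UNRELIABLE"},
]
stats = aggregate(results, failed_count=1, total=4)
print(format_summary(stats, total=4, duration_seconds=2.0))
```

Guarding each statistic with a check for an empty list mirrors the documented fallback of 0.0 for the mean and for the failure rate when total is zero.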
Example output
pandas integration
batch.results is a list of plain dicts with a consistent schema, making it a natural fit for DataFrame analysis:
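As a minimal sketch of that workflow, the snippet below builds a DataFrame from a hand-written list standing in for batch.results. The "score" and "label" keys are assumptions about the dict schema; inspect an actual result dict to confirm the real column names before relying on them.

```python
import pandas as pd

# Stand-in for batch.results; key names are assumed, not confirmed.
results = [
    {"score": 0.91, "label": "RELIABLE"},
    {"score": 0.62, "label": "ACCEPTABLE"},
    {"score": 0.35, "label": "UNRELIABLE"},
    {"score": 0.88, "label": "RELIABLE"},
]

# One row per successful evaluation, one column per dict key.
df = pd.DataFrame(results)

# How many results fall under each trust label.
label_counts = df["label"].value_counts()

# Average score within each label group.
mean_by_label = df.groupby("label")["score"].mean()

# Rows that may need manual review (0.5 is an arbitrary cutoff).
low_trust = df[df["score"] < 0.5]

print(df.describe())
```

With a real batch, pd.DataFrame(batch.results) replaces the hand-written list and the rest of the analysis stays the same.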
batch.results preserves the original dataset order — the async pipeline internally sorts by the input index before returning, so row i in the DataFrame corresponds to context i in your input list.
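The sort-by-input-index idea can be sketched with asyncio. This is an illustration of the general technique under stated assumptions, not the library's pipeline: evaluate_one and evaluate_in_order are hypothetical names, and the random sleep only simulates evaluations finishing out of order.

```python
import asyncio
import random


async def evaluate_one(index, ctx):
    """Pretend evaluation that returns its input index with the result."""
    # Variable latency so tasks complete in an unpredictable order.
    await asyncio.sleep(random.uniform(0, 0.01))
    return index, {"input": ctx, "score": len(ctx) / 10}


async def evaluate_in_order(contexts):
    """Run all evaluations concurrently, then restore input order."""
    tasks = [
        asyncio.create_task(evaluate_one(i, ctx))
        for i, ctx in enumerate(contexts)
    ]
    completed = []
    # as_completed yields awaitables in completion order, not input order.
    for finished in asyncio.as_completed(tasks):
        completed.append(await finished)
    # Sorting by the carried index makes row i match context i again.
    completed.sort(key=lambda pair: pair[0])
    return [result for _, result in completed]


contexts = ["a", "bbb", "cc", "dddd"]
ordered = asyncio.run(evaluate_in_order(contexts))
print([r["input"] for r in ordered])
```

Carrying the index alongside each result is what makes the final sort possible regardless of which task finishes first.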