Three metrics are provided: Precision for a single cutoff, AveragePrecision to account for ranking order, and MeanAveragePrecision to aggregate across multiple queries.
Concepts
Precision@K answers: “Of the top-K items I returned, what fraction were relevant?” It treats all positions equally and ignores items ranked below K. Average Precision (AP) answers: “How well did I rank the relevant items near the top?” It computes precision at each rank where a relevant item appears, then averages those values — so retrieving relevant items earlier earns a higher score than retrieving them later. Mean Average Precision (MAP) is the macro-average of AP across a set of queries. It provides a single number summarizing system-wide retrieval quality.
QueryResult
The QueryResult struct bundles the predicted ranking and relevance judgements for a single query, and is the input type for MeanAveragePrecision.
The ranked list of item identifiers returned by the system, ordered from most to least relevant.
A map of item identifier to relevance grade. Items with a grade of 1 or higher are considered relevant; items with a grade below 1 are treated as non-relevant.
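The two fields described above can be pictured as follows. This is a hedged sketch: the field names Predicted and Relevant are illustrative assumptions, not taken from the library itself.

```go
package main

import "fmt"

// QueryResult is a sketch of the struct described above; the field
// names Predicted and Relevant are assumptions for illustration.
type QueryResult struct {
	Predicted []string       // ranked item IDs, best first
	Relevant  map[string]int // item ID -> relevance grade; >= 1 means relevant
}

func main() {
	qr := QueryResult{
		Predicted: []string{"A", "B", "C"},
		Relevant:  map[string]int{"A": 3, "B": 2, "C": 0},
	}
	// Count the items whose grade meets the relevance threshold.
	relevantCount := 0
	for _, grade := range qr.Relevant {
		if grade >= 1 {
			relevantCount++
		}
	}
	fmt.Println(relevantCount) // prints 2
}
```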
Precision
Returns 0.0 when k is 0.
The ranked list of retrieved item identifiers.
A map of item identifier to relevance grade. Items with grade ≥ 1 are counted as hits.
The cutoff depth. Only the first k items in predicted are evaluated.
Returns float64 — the ratio of relevant hits in the top-K to K.
Example
Items “A” and “B” are relevant (grades 3 and 2), “C” is not (grade 0). With K=3, precision = 2/3 ≈ 0.667. Item “D” is never considered because it falls outside the K=3 cutoff.
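The worked example above can be reproduced with a minimal sketch of the metric. The function signature here is an assumption based on the parameters described, not the library’s confirmed API.

```go
package main

import "fmt"

// Precision is a minimal sketch of Precision@K: the fraction of the
// top-k predicted items whose relevance grade is >= 1, divided by k.
// The signature is assumed from the documented parameters.
func Precision(predicted []string, relevant map[string]int, k int) float64 {
	if k == 0 {
		return 0.0 // documented edge case
	}
	limit := k
	if limit > len(predicted) {
		limit = len(predicted)
	}
	hits := 0
	for _, id := range predicted[:limit] {
		if relevant[id] >= 1 { // grade >= 1 counts as a hit
			hits++
		}
	}
	return float64(hits) / float64(k)
}

func main() {
	relevant := map[string]int{"A": 3, "B": 2, "C": 0}
	// "D" falls outside the K=3 cutoff and is never considered.
	fmt.Printf("%.3f\n", Precision([]string{"A", "B", "C", "D"}, relevant, 3)) // prints 0.667
}
```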
AveragePrecision
Returns 0.0 if no relevant items are found.
The ranked list of retrieved item identifiers.
A map of item identifier to relevance grade. Items with grade ≥ 1 are treated as relevant.
The cutoff depth. Only the first k items in predicted are evaluated.
Returns float64 — the average of per-rank precision scores at each relevant hit.
Example
The first hit is “A” at rank 2 (precision = 1/2), the second hit is “B” at rank 3 (precision = 2/3). AP = (0.5 + 0.667) / 2 ≈ 0.583. “C” and “D” are not relevant so they contribute nothing.
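A sketch matching this example, with precision accumulated at each relevant rank and averaged over the number of hits; the signature is assumed from the documented parameters.

```go
package main

import "fmt"

// AveragePrecision computes precision at each rank that holds a
// relevant item (grade >= 1), then averages over the number of hits.
// A hedged sketch; the library's exact signature may differ.
func AveragePrecision(predicted []string, relevant map[string]int, k int) float64 {
	if k > len(predicted) {
		k = len(predicted)
	}
	hits := 0
	sum := 0.0
	for i, id := range predicted[:k] {
		if relevant[id] >= 1 {
			hits++
			sum += float64(hits) / float64(i+1) // precision at this rank
		}
	}
	if hits == 0 {
		return 0.0 // documented edge case: no relevant items found
	}
	return sum / float64(hits)
}

func main() {
	relevant := map[string]int{"A": 1, "B": 2, "C": 0, "D": 0}
	// "A" is hit at rank 2 (precision 1/2), "B" at rank 3 (precision 2/3).
	fmt.Printf("%.3f\n", AveragePrecision([]string{"C", "A", "B", "D"}, relevant, 4)) // prints 0.583
}
```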
MeanAveragePrecision
Computes the mean of AveragePrecision across all queries in results. MAP summarises retrieval quality over an entire test set, making it the standard offline evaluation metric for ranked retrieval systems. Returns 0.0 for an empty slice.
A slice of QueryResult values, one per query. Each entry contains the system’s ranked output and the corresponding relevance judgements.
The cutoff depth passed to AveragePrecision for every query.
Returns float64 — the mean Average Precision across all queries.
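Putting the pieces together, MAP is just the macro-average of per-query AP. The sketch below is self-contained under the same assumptions as before: the struct field names and function signatures are illustrative, not confirmed by the library.

```go
package main

import "fmt"

// Hedged sketches of QueryResult and the two functions; names and
// signatures are assumptions based on the descriptions above.
type QueryResult struct {
	Predicted []string       // ranked item IDs, best first
	Relevant  map[string]int // item ID -> relevance grade
}

// AveragePrecision averages precision at each relevant rank (grade >= 1).
func AveragePrecision(predicted []string, relevant map[string]int, k int) float64 {
	if k > len(predicted) {
		k = len(predicted)
	}
	hits, sum := 0, 0.0
	for i, id := range predicted[:k] {
		if relevant[id] >= 1 {
			hits++
			sum += float64(hits) / float64(i+1)
		}
	}
	if hits == 0 {
		return 0.0
	}
	return sum / float64(hits)
}

// MeanAveragePrecision macro-averages AveragePrecision over all queries.
func MeanAveragePrecision(results []QueryResult, k int) float64 {
	if len(results) == 0 {
		return 0.0 // documented edge case: empty slice
	}
	sum := 0.0
	for _, r := range results {
		sum += AveragePrecision(r.Predicted, r.Relevant, k)
	}
	return sum / float64(len(results))
}

func main() {
	results := []QueryResult{
		// Query 1: both relevant items at the top -> AP = 1.0.
		{Predicted: []string{"A", "B", "C"}, Relevant: map[string]int{"A": 2, "B": 1}},
		// Query 2: the single relevant item at rank 2 -> AP = 0.5.
		{Predicted: []string{"X", "Y", "Z"}, Relevant: map[string]int{"Y": 1}},
	}
	fmt.Printf("%.2f\n", MeanAveragePrecision(results, 3)) // prints 0.75
}
```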