Recall measures the completeness of a retrieval system: out of all relevant items that exist, how many did the system actually return in the top-K results? Where Precision asks “how many retrieved items were relevant?”, Recall asks “how many relevant items were retrieved?” — the two are complementary and are often combined into an F1 score.

Recall

func Recall(predicted []string, relevance map[string]int, k int) float64
Returns the fraction of all known relevant items that appear in the first k positions of predicted. A score of 1.0 means every relevant item was found within the top-K results; a score of 0.0 means none were. Returns 0.0 when the relevance map contains no relevant items.
predicted
[]string
required
The ranked list of retrieved item identifiers, ordered from highest to lowest predicted relevance (best result first).
relevance
map[string]int
required
A map of item identifier to relevance grade. Items with grade ≥ 1 are counted as relevant. The total number of such items forms the denominator of the recall calculation.
k
int
required
The cutoff depth. Only the first k items in predicted are examined when counting hits.
Returns float64 — the ratio of relevant items found in the top-K to the total number of relevant items.
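The semantics above can be sketched in a few lines of Go. This is a hypothetical stand-in (`recallAtK` is not the library's source), written under the assumptions stated in the parameter table: grade ≥ 1 marks an item relevant, the denominator is the total count of relevant items, and an empty relevant set yields 0.0:

```go
package main

import "fmt"

// recallAtK returns |relevant ∩ top-k of predicted| / |relevant|,
// where an item is relevant when its grade in relevance is >= 1.
func recallAtK(predicted []string, relevance map[string]int, k int) float64 {
	totalRelevant := 0
	for _, grade := range relevance {
		if grade >= 1 {
			totalRelevant++
		}
	}
	if totalRelevant == 0 {
		return 0.0 // no relevant items exist
	}
	if k > len(predicted) {
		k = len(predicted) // clamp cutoff to the list length
	}
	hits := 0
	for _, id := range predicted[:k] {
		if relevance[id] >= 1 {
			hits++
		}
	}
	return float64(hits) / float64(totalRelevant)
}

func main() {
	predicted := []string{"A", "B", "C", "D"}
	relevance := map[string]int{"A": 3, "B": 2, "C": 1, "D": 0, "E": 3}
	fmt.Println(recallAtK(predicted, relevance, 3)) // 3 of 4 relevant items found
}
```

Note that unretrieved relevant items (like "E" here) still count in the denominator — recall penalizes what the system failed to return, not just what it ranked poorly.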

Example

func ExampleRecall() {
	predicted := []string{"A", "B", "C", "D"}
	relevance := map[string]int{
		"A": 3,
		"B": 2,
		"C": 1,
		"D": 0,
		"E": 3,
	}

	s := reval.Recall(predicted, relevance, 3)
	fmt.Println("Recall@3:", s)

	// Output:
	// Recall@3: 0.75
}
There are 4 relevant items in total (“A”, “B”, “C”, “E” — all with grade ≥ 1). With K=3, the top-3 predicted items are “A”, “B”, and “C” — all three are relevant. Recall@3 = 3/4 = 0.75. Item “E” is relevant but was never retrieved within the cutoff, so the score is less than 1.0.

Precision vs. recall

Precision@K

Measures result quality. Answers: “Are the items I returned actually useful?” A high-precision system returns fewer but more reliable results.

Recall@K

Measures result completeness. Answers: “Did I surface all the useful items?” A high-recall system ensures few relevant items are missed within the top-K.
Precision and recall trade off against each other. Increasing K tends to improve recall (more relevant items are surfaced) but can reduce precision (more non-relevant items are also included). Use F1 or Average Precision to balance the two.
