Hybrid search combines multiple retrieval signals, typically a keyword score like BM25 and a semantic score from vector embeddings, into a single relevance score. Neither signal alone is best in all cases: BM25 is strong for exact keyword matches, while vector similarity captures semantic closeness. Fusing the two tends to outperform either in isolation. Flock provides five scalar fusion functions for this purpose. They fall into two categories:
- Rank-based (fusion_rrf): takes a rank position (1 = best) from each retrieval system
- Score-based (fusion_combsum, fusion_combmnz, fusion_combmed, fusion_combanz): takes a normalized score (0.0–1.0) from each retrieval system
Every function returns a single DOUBLE score per row. A higher combined score means a more relevant document.
Data preprocessing
Before calling a fusion function, you need to prepare your scores in the right format.
Ranks for fusion_rrf
Obtain integer ranks using DENSE_RANK(). Documents with the same score get the same rank. Rank 1 is the best-ranked document.
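The ranking rule above can be illustrated outside the database. The following is a minimal Python sketch of dense ranking, assuming that a higher raw score is better; the function name dense_rank and the sample scores are hypothetical and are not part of Flock.

```python
def dense_rank(scores):
    """Return 1-based dense ranks: the highest score gets rank 1, ties share a rank."""
    # Distinct scores, sorted best-first, define the rank positions.
    # Assumption: a higher raw score means a better match.
    distinct = sorted(set(scores), reverse=True)
    position = {score: rank for rank, score in enumerate(distinct, start=1)}
    return [position[s] for s in scores]


bm25_scores = [12.4, 9.1, 12.4, 3.0]  # made-up scores for four documents
ranks = dense_rank(bm25_scores)
print(ranks)  # [1, 2, 1, 3]
```

The two documents tied at 12.4 both receive rank 1, and the next distinct score receives rank 2 with no gap in the sequence.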
Normalized scores for score-based functions
Score-based functions require scores on a common scale. Min-max normalization, (score - min) / (max - min), maps each system’s scores to [0, 1]. When all scores are identical (min = max), the formula produces NaN, which the fusion functions treat as 0.
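As a rough illustration of the normalization step, here is a small Python sketch of min-max scaling. The helper name min_max_normalize is hypothetical. Python raises an error on 0.0 / 0.0, so the sketch returns NaN explicitly to mirror the behavior described above.

```python
import math


def min_max_normalize(scores):
    """Scale scores to [0, 1] with (score - min) / (max - min)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: the formula is 0/0, reported here as NaN.
        return [math.nan for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


print(min_max_normalize([2.0, 4.0, 10.0]))  # [0.0, 0.25, 1.0]
print(min_max_normalize([3.0, 3.0]))        # [nan, nan]
```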
fusion_rrf
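Reciprocal Rank Fusion (RRF), as introduced by Cormack et al. (2009). Each document’s combined score is the sum of reciprocal ranks across all retrieval systems:
$$\text{score}(d) = \sum_{i=1}^{n} \frac{1}{60 + \text{rank}_i(d)}$$
Here rank_i(d) is the rank of document d in retrieval system i, and n is the number of systems. The constant 60 dampens the impact of rank differences for high-ranked documents. A document ranked 1st contributes 1/61 ≈ 0.0164 and a document ranked 100th contributes 1/160 ≈ 0.0063.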
Return type: DOUBLE
Parameters: Two or more INTEGER rank values, one per retrieval system.
Examples
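To make the arithmetic concrete, the sketch below reimplements the RRF sum in Python. It assumes the per-system contribution is 1 / (60 + rank), which matches the 1/61 and 1/160 figures quoted above. The function name rrf and the keyword argument k are illustrative only and say nothing about Flock's actual signature; missing ranks are not handled.

```python
def rrf(*ranks, k=60):
    """Sum of reciprocal ranks, one integer rank (1 = best) per retrieval system."""
    # Assumption: every system supplies a rank, so there is no NULL handling.
    return sum(1.0 / (k + r) for r in ranks)


# A document ranked 1st by both systems.
print(round(rrf(1, 1), 4))    # 0.0328
# A document ranked 1st by one system and 100th by the other.
print(round(rrf(1, 100), 4))  # 0.0226
```

A document that both systems rank first scores higher than one that only a single system ranks highly, which is the behavior rank fusion is meant to produce.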
fusion_combsum
Sums normalized scores across all retrieval systems.
Return type: DOUBLE
Parameters: Two or more DOUBLE normalized scores (0.0–1.0), one per retrieval system.
Example
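The following Python sketch models the CombSUM behavior described above. It assumes that a missing score (None, standing in for SQL NULL) and NaN both count as 0. The name combsum is a hypothetical stand-in for the SQL function, not its implementation.

```python
import math


def combsum(*scores):
    """Sum of normalized scores, one per retrieval system."""
    total = 0.0
    for s in scores:
        # Assumption: None (SQL NULL) and NaN contribute 0 to the sum.
        if s is None or math.isnan(s):
            continue
        total += s
    return total


print(combsum(0.5, 0.25))            # 0.75
print(combsum(0.5, None, math.nan))  # 0.5
```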
fusion_combmnz
Extends CombSUM by multiplying the sum by the number of retrieval systems that returned a non-zero score for the document (the “hit count”). Documents found by more systems receive a boost. A hit is a non-zero score: NULL, NaN, and 0 do not count as hits.
Return type: DOUBLE
Parameters: Two or more DOUBLE normalized scores (0.0–1.0), one per retrieval system.
Example
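A minimal Python sketch of the hit-count rule follows, under the assumption that None represents SQL NULL. The name combmnz is a hypothetical stand-in that only models the description above.

```python
import math


def combmnz(*scores):
    """CombSUM multiplied by the number of systems with a non-zero score."""
    # None (SQL NULL) and NaN are mapped to 0, so they never count as hits.
    cleaned = [0.0 if s is None or math.isnan(s) else s for s in scores]
    hits = sum(1 for s in cleaned if s != 0.0)
    return sum(cleaned) * hits


print(combmnz(0.5, 0.25))  # 1.5  (sum 0.75, two hits)
print(combmnz(0.5, None))  # 0.5  (sum 0.5, one hit)
print(combmnz(0.0, None))  # 0.0  (no hits)
```

In the first call both systems found the document, so its summed score of 0.75 is doubled; in the second call only one system did, so the score stays at 0.5.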
fusion_combmed
Takes the median normalized score across all retrieval systems. NULL and NaN are treated as 0 and are included when calculating the median, so a document missing from one system is penalized. For example, the scores (NULL, NULL, 1.0) yield a median of 0.0.
Return type: DOUBLE
Parameters: Two or more DOUBLE normalized scores, one per retrieval system.
Example
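The median rule can be sketched in Python as shown below. The name combmed is hypothetical. With an even number of systems, this sketch averages the two middle values, which is the behavior of Python's statistics.median and an assumption about the SQL function.

```python
import math
import statistics


def combmed(*scores):
    """Median of normalized scores, with missing values counted as 0."""
    # None (SQL NULL) and NaN become 0 and stay in the list, pulling the median down.
    cleaned = [0.0 if s is None or math.isnan(s) else s for s in scores]
    return statistics.median(cleaned)


print(combmed(None, None, 1.0))  # 0.0
print(combmed(0.25, 0.75))       # 0.5
```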
fusion_combanz
Calculates the average (arithmetic mean) normalized score across all retrieval systems. NULL and NaN are treated as 0 and are included in the denominator, so a document missing from some systems is penalized. For example, the scores (NULL, NULL, 1.0) across three systems yield 1/3 ≈ 0.333.
Return type: DOUBLE
Parameters: Two or more DOUBLE normalized scores, one per retrieval system.
Example
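The averaging rule described above can be modeled with a short Python sketch. The name combanz is a hypothetical stand-in, and None represents SQL NULL. The denominator is the total number of systems, including those with no score.

```python
import math


def combanz(*scores):
    """Arithmetic mean of normalized scores, with missing values counted as 0."""
    # None (SQL NULL) and NaN become 0 but still count toward the denominator.
    cleaned = [0.0 if s is None or math.isnan(s) else s for s in scores]
    return sum(cleaned) / len(cleaned)


print(round(combanz(None, None, 1.0), 3))  # 0.333
print(combanz(0.5, 1.0))                   # 0.75
```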
End-to-end RAG pipeline example
The following query shows a complete hybrid search pipeline: generate embeddings with llm_embedding, compute BM25 ranks and embedding similarity scores, normalize both signals, fuse them with fusion_rrf, then return the top results for retrieval-augmented generation.