# Score Normalization
Different search backends produce scores in different ranges. QMD normalizes all scores to [0, 1] for consistency.

## BM25 (FTS5) Normalization
FTS5 BM25 scores are negative (lower = better match):

| Raw BM25 | Normalized | Interpretation |
|---|---|---|
| -10 | 0.91 | Strong match |
| -5 | 0.83 | Good match |
| -2 | 0.67 | Medium match |
| -0.5 | 0.33 | Weak match |
| 0 | 0.00 | No match |
- Monotonic: Preserves ranking order
- Query-independent: No per-query normalization needed
- Stable: Same raw score → same normalized score
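The table values are consistent with the transform `|score| / (|score| + 1)`. A minimal sketch; the exact formula is an assumption inferred from the table, not confirmed from QMD's source:

```python
def normalize_bm25(raw: float) -> float:
    """Map a raw (negative) FTS5 BM25 score into [0, 1]."""
    if raw >= 0:
        return 0.0  # FTS5 reports matches as negative scores; 0 means no match
    s = -raw
    return s / (s + 1.0)  # monotonic: more negative raw -> closer to 1.0

print(round(normalize_bm25(-10), 2))   # 0.91
print(round(normalize_bm25(-2), 2))    # 0.67
```

Because the transform depends only on the raw score, it needs no per-query state and always maps the same raw score to the same normalized value.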
## Vector (Cosine) Normalization
Vector search returns cosine distance (0 = identical, 1 = orthogonal):

| Cosine Distance | Normalized | Interpretation |
|---|---|---|
| 0.0 | 1.00 | Identical |
| 0.1 | 0.90 | Very similar |
| 0.3 | 0.70 | Similar |
| 0.5 | 0.50 | Somewhat similar |
| 0.7 | 0.30 | Different |
| 1.0 | 0.00 | Orthogonal |
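This table is the linear map `similarity = 1 - distance`. A one-line sketch; the clamp is an added safety assumption for out-of-range inputs:

```python
def normalize_cosine(distance: float) -> float:
    """Convert cosine distance (0 = identical) to a [0, 1] similarity score."""
    return max(0.0, min(1.0, 1.0 - distance))

print(normalize_cosine(0.3))  # 0.7
```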
## Reranker Normalization
Qwen3-Reranker returns scores already in the [0, 1] range, so no further normalization is needed.

# Reciprocal Rank Fusion (RRF)
## Formula
For each document appearing in one or more ranked lists:

    score(d) = Σ_i  weight_i / (k + rank_i + 1)

where:

- `weight_i` = weight for list `i` (default 1.0; the original query gets 2.0)
- `k` = constant (60, from the RRF literature)
- `rank_i` = 0-based rank of the document in list `i`
## Implementation
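A minimal sketch of the fusion step, assuming each input is a `(weight, doc_ids)` pair with documents ordered best-first. The names, and the choice to apply the top-rank bonus once per document based on its best rank across lists, are assumptions, not QMD's actual API:

```python
K = 60  # RRF constant from the literature

def rrf_fuse(ranked_lists: list[tuple[float, list[str]]]) -> dict[str, float]:
    """Fuse weighted ranked lists into a single score per document."""
    scores: dict[str, float] = {}
    best_rank: dict[str, int] = {}
    for weight, doc_ids in ranked_lists:
        for rank, doc_id in enumerate(doc_ids):  # rank is 0-based
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (K + rank + 1)
            best_rank[doc_id] = min(best_rank.get(doc_id, rank), rank)
    # Top-rank bonus: +0.05 for a #1 hit in any list, +0.02 for ranks #2-3
    for doc_id, rank in best_rank.items():
        if rank == 0:
            scores[doc_id] += 0.05
        elif rank <= 2:
            scores[doc_id] += 0.02
    return scores
```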
## Top-Rank Bonus
Documents that rank #1 in any list get a +0.05 bonus; ranks #2-3 get +0.02.

## Weight Configuration
Results for the original query get 2× weight:

| List | Weight | Rationale |
|---|---|---|
| Original query FTS | 2.0 | Preserve exact keyword matches |
| Original query vector | 2.0 | Preserve semantic intent |
| Expanded query 1 | 1.0 | Supporting evidence |
| Expanded query 2 | 1.0 | Supporting evidence |
## Example Calculation
Suppose a document appears in 3 of the lists (the ranks here are illustrative):

- Original query FTS, weight 2.0, rank 0: 2.0 / (60 + 0 + 1) ≈ 0.0328
- Original query vector, weight 2.0, rank 2: 2.0 / (60 + 2 + 1) ≈ 0.0317
- Expanded query 1, weight 1.0, rank 4: 1.0 / (60 + 4 + 1) ≈ 0.0154

Fused score ≈ 0.0328 + 0.0317 + 0.0154 + 0.05 (top-rank bonus for the #1 FTS hit) ≈ 0.130

# Position-Aware Blending
## Motivation
Pure reranker scores can contradict high-confidence retrieval results. For example:

- BM25 finds an exact keyword match (rank 1)
- Reranker gives it a low score (0.3) because it lacks semantic context
- The result gets buried despite matching the user's intent
## Algorithm
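A sketch of the blending step built from the weight table below. It assumes the RRF score has already been rescaled into [0, 1] so the two signals are comparable; that rescaling step and the function name are assumptions:

```python
def blend(rrf_rank: int, rrf_score: float, rerank_score: float) -> float:
    """Blend RRF and reranker scores; trust shifts with the RRF rank (1-based)."""
    if rrf_rank <= 3:
        w_rrf = 0.75   # preserve exact matches at the top
    elif rrf_rank <= 10:
        w_rrf = 0.60   # balanced trust
    else:
        w_rrf = 0.40   # trust the reranker for deep, semantic matches
    return w_rrf * rrf_score + (1.0 - w_rrf) * rerank_score
```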
## Weight Table
| RRF Rank | RRF Weight | Reranker Weight | Rationale |
|---|---|---|---|
| 1-3 | 75% | 25% | Preserve exact matches |
| 4-10 | 60% | 40% | Balanced trust |
| 11+ | 40% | 60% | Trust reranker for semantic matches |
## Example 1: Top Rank Preserved
Document at RRF rank 2 (illustrative scores: normalized RRF 0.90, reranker 0.30):

    blended = 0.75 × 0.90 + 0.25 × 0.30 = 0.675 + 0.075 = 0.75

The strong retrieval signal dominates, so the document stays near the top despite the weak reranker score.

## Example 2: Reranker Promotes
Document at RRF rank 15 (illustrative scores: normalized RRF 0.20, reranker 0.95):

    blended = 0.40 × 0.20 + 0.60 × 0.95 = 0.08 + 0.57 = 0.65

A strong reranker score can lift a deep result well above its retrieval rank.

## Example 3: Middle Ground
Document at RRF rank 7 (illustrative scores: normalized RRF 0.50, reranker 0.80):

    blended = 0.60 × 0.50 + 0.40 × 0.80 = 0.30 + 0.32 = 0.62

# Full Pipeline Example
User query: "machine learning algorithms"
## Step 1: Query Expansion

## Step 2: Multi-Backend Search

## Step 3: RRF Fusion

## Step 4: Reranking

## Step 5: Position-Aware Blending
## Result
- doc1 stays #1 (strong keyword + semantic match, preserved by position-aware blending)
- doc2 stays #2 (strong semantic match)
- doc4 promoted to #3 (reranker elevated it)
- doc3 drops to #4 (weak reranker score)
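The steps above can be sketched end-to-end. Everything here (doc IDs, list contents, reranker scores) is hypothetical; query expansion itself is elided and the top-rank bonus is omitted for brevity:

```python
K = 60

def rrf(ranked_lists):
    """Step 3: fuse weighted ranked lists (best-first) into one score map."""
    scores = {}
    for weight, doc_ids in ranked_lists:
        for rank, doc_id in enumerate(doc_ids):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (K + rank + 1)
    return scores

def blend(rrf_rank, rrf_score, rerank_score):
    """Step 5: position-aware blending per the weight table."""
    w = 0.75 if rrf_rank <= 3 else 0.60 if rrf_rank <= 10 else 0.40
    return w * rrf_score + (1 - w) * rerank_score

# Step 2: per-backend results (original query weighted 2.0, expansion 1.0)
fused = rrf([
    (2.0, ["doc1", "doc3"]),          # original query, FTS
    (2.0, ["doc1", "doc2", "doc4"]),  # original query, vector
    (1.0, ["doc2", "doc4", "doc3"]),  # expanded query
])
ranked = sorted(fused, key=fused.get, reverse=True)

# Step 4: hypothetical reranker scores in [0, 1]
rerank = {"doc1": 0.85, "doc2": 0.80, "doc3": 0.20, "doc4": 0.75}

# Step 5: rescale RRF scores by the max, then blend by RRF rank
top = max(fused.values())
final = {d: blend(i + 1, fused[d] / top, rerank[d]) for i, d in enumerate(ranked)}
print(sorted(final, key=final.get, reverse=True))  # ['doc1', 'doc2', 'doc4', 'doc3']
```

With these numbers, doc1 and doc2 hold the top two spots, the reranker lifts doc4 past doc3, and doc3 drops to last: the same shape of outcome the Result section describes.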