Overview
The Hybrid RRF module implements an advanced hybrid RAG pipeline that combines lexical retrieval (BM25) and semantic retrieval (ChromaDB), fuses both ranked lists using Reciprocal Rank Fusion (RRF), and then applies Maximal Marginal Relevance (MMR) for diversity.
Module: src.rag.hybrid_rrf
Source: src/rag/hybrid_rrf.py
Configuration
Retrieval Settings
Default Models
Retrievers
Core Functions
reciprocal_rank_fusion
List of ranked document lists from different retrievers
RRF constant. Higher values flatten the score difference between adjacent ranks, so top positions carry less weight
Number of top documents to return after fusion
Fused and re-ranked list of documents
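For each document, score = sum(1 / (k_constant + rank)), summed over every ranking in which the document appears, where rank is its 1-based position in that ranking. Documents are then sorted by descending score.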
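The scoring rule above can be illustrated with a minimal sketch. It is not the module's implementation: the function name `rrf_fuse`, the parameter name `top_n`, the default of 60 for `k_constant`, and the use of plain string IDs in place of document objects are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def rrf_fuse(
    rankings: Sequence[Sequence[Hashable]],
    k_constant: int = 60,  # assumed default; the module's actual value is not shown
    top_n: int = 10,
) -> List[Hashable]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        seen = set()
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            if doc_id in seen:
                continue  # count each document at most once per ranking
            seen.add(doc_id)
            scores[doc_id] += 1.0 / (k_constant + rank)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order.
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_n]


# Hypothetical rankings from the two retrievers.
bm25_ranking = ["a", "b", "c"]
semantic_ranking = ["c", "a", "d"]
fused = rrf_fuse([bm25_ranking, semantic_ranking], k_constant=60, top_n=3)
print(fused)  # ['a', 'c', 'b']
```

Because scores are accumulated per document ID, a document returned by both retrievers appears only once in the fused list, with the contributions from both rankings added together.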
mmr_select
The original user query
Pool of candidate documents to select from
Number of documents to select
Lambda parameter controlling relevance vs diversity tradeoff.
- 1.0 = pure relevance
- 0.0 = pure diversity
- 0.7 = 70% relevance, 30% diversity
Selected documents balancing relevance and diversity
MMR = λ * relevance(query, doc) - (1-λ) * max_similarity(doc, selected_docs)
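As a rough sketch of the greedy selection this formula implies, the example below works on precomputed embedding vectors and uses cosine similarity for both relevance and redundancy. The names `mmr_pick`, `lambda_mult`, and `cosine`, and the choice of cosine similarity, are assumptions for illustration; the module itself takes the query text and may compute similarity differently.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_pick(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    k: int = 5,
    lambda_mult: float = 0.7,  # hypothetical name for the lambda parameter
) -> List[str]:
    """Greedily pick k document IDs, trading relevance against redundancy."""
    selected: List[str] = []
    remaining = list(doc_vecs)
    while remaining and len(selected) < k:

        def mmr_score(doc_id: str) -> float:
            relevance = cosine(query_vec, doc_vecs[doc_id])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[doc_id], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy vectors: "b" is a near-duplicate of "a"; "c" is less relevant but different.
query = [1.0, 0.0]
docs = {"a": [1.0, 0.0], "b": [0.99, 0.14], "c": [0.6, 0.8]}
pure_relevance = mmr_pick(query, docs, k=2, lambda_mult=1.0)
diverse = mmr_pick(query, docs, k=2, lambda_mult=0.3)
```

With these toy vectors, `lambda_mult=1.0` selects the near-duplicate pair `a`, `b`, while `lambda_mult=0.3` penalizes the redundancy enough to select `a`, `c` instead.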
retrieve_hybrid_rrf
The user’s query
Final list of k_final documents (default: 5)
format_docs
List of documents to format
Formatted string with document contents and metadata
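The exact output layout of format_docs is not shown here, so the following is only one plausible shape for "contents and metadata" formatting. The `Doc` dataclass, the function name `format_docs_sketch`, and the numbered-header layout are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for the pipeline's document type."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def format_docs_sketch(docs: List[Doc]) -> str:
    """Render each document as a numbered header line followed by its text."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        # Sort metadata keys so the output is deterministic.
        meta = ", ".join(f"{key}={value}" for key, value in sorted(doc.metadata.items()))
        header = f"[{i}] {meta}" if meta else f"[{i}]"
        blocks.append(f"{header}\n{doc.page_content}")
    return "\n\n".join(blocks)


context = format_docs_sketch([
    Doc("Alpha text", {"source": "a.md", "page": 1}),
    Doc("Beta text"),
])
```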
process_hybrid_rrf_query
The user’s question
Custom language model for answer generation
Dictionary containing:
- answer (str): Generated answer
- contexts (List[str]): Document contents
- retrieved_documents (List[Document]): Full documents
- metrics (dict): Token usage and cost metrics
query_for_evaluation
The question to process
Model name to use. If None, uses default “gpt-4o”
Pre-configured language model. Takes precedence over llm_model
Dictionary containing:
- question (str): Original question
- answer (str): Generated answer
- contexts (List[str]): Retrieved document contents
- source_documents (List[Document]): Full retrieved documents
- metadata (dict): Comprehensive metadata including:
  - num_contexts (int): Number of contexts
  - retrieval_method (str): “hybrid_bm25_semantic_rrf_mmr”
  - rrf_k (int): RRF constant used
  - k_bm25_candidates (int): 15
  - k_semantic_candidates (int): 15
  - k_rrf_pool (int): 10
  - k_final (int): 5
  - mmr_lambda (float): 0.7
  - llm_model (str): Model name
  - provider (str): Provider name
  - model_id (str): Full model ID
  - embedding_model (str): “text-embedding-3-small”
  - execution_time (float): Execution time in seconds
  - input_tokens (int): Input tokens used
  - output_tokens (int): Output tokens generated
  - total_cost (float): Total cost in USD
  - tokens_used (int): Total tokens
  - usage_source (str): Usage data source
  - cost_source (str): Cost calculation source
Usage Example
Pipeline Flow
- BM25 Retrieval: Retrieves top 15 candidates using lexical search
- Semantic Retrieval: Retrieves top 15 candidates using vector similarity
- RRF Fusion: Fuses both ranked lists using Reciprocal Rank Fusion, creating a pool of 10 documents
- MMR Selection: Applies Maximal Marginal Relevance to select final 5 documents balancing relevance and diversity
- Format: Formats documents with metadata
- Generate: Uses LLM to generate answer
- Track: Captures comprehensive metrics
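The retrieval half of the flow above (15 + 15 candidates, a fused pool of 10, a final set of 5) can be sketched as a function that wires four stages together. This shows only the data flow, not the module's code: the stage callables and the function name `hybrid_retrieve_sketch` are hypothetical, and the stubs in the usage section are placeholders rather than real BM25, embedding, RRF, or MMR logic. Formatting, generation, and metric tracking are omitted.

```python
from typing import Callable, List, Sequence

Retriever = Callable[[str, int], List[str]]               # (query, k) -> ranked doc IDs
Fuser = Callable[[Sequence[List[str]], int], List[str]]   # (rankings, pool size) -> fused IDs
Selector = Callable[[str, List[str], int], List[str]]     # (query, pool, k) -> final IDs


def hybrid_retrieve_sketch(
    query: str,
    bm25: Retriever,
    semantic: Retriever,
    fuse: Fuser,
    select: Selector,
    k_bm25_candidates: int = 15,
    k_semantic_candidates: int = 15,
    k_rrf_pool: int = 10,
    k_final: int = 5,
) -> List[str]:
    """Run lexical and semantic retrieval, fuse the rankings, then select."""
    lexical = bm25(query, k_bm25_candidates)
    dense = semantic(query, k_semantic_candidates)
    pool = fuse([lexical, dense], k_rrf_pool)   # RRF stage in the real pipeline
    return select(query, pool, k_final)         # MMR stage in the real pipeline


# Placeholder stages, used only to show how candidate counts shrink.
def fake_bm25(query: str, k: int) -> List[str]:
    return [f"lex-{i}" for i in range(k)]


def fake_semantic(query: str, k: int) -> List[str]:
    return [f"sem-{i}" for i in range(k)]


def fake_fuse(rankings: Sequence[List[str]], pool_size: int) -> List[str]:
    merged: List[str] = []
    for group in zip(*rankings):  # interleave the lists, skipping duplicates
        for doc_id in group:
            if doc_id not in merged:
                merged.append(doc_id)
    return merged[:pool_size]


def fake_select(query: str, pool: List[str], k: int) -> List[str]:
    return pool[:k]


final_ids = hybrid_retrieve_sketch(
    "what is RRF?", fake_bm25, fake_semantic, fake_fuse, fake_select
)
```

Passing the stages in as callables keeps the candidate counts in one place, which makes it easy to see the effect of changing any single k value.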
Key Features
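- Advanced Fusion: Reciprocal Rank Fusion combines the BM25 and semantic rankings by rank position, so the two retrievers' scores do not need to be normalized against each other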
- Diversity: MMR ensures diverse document selection
- Deduplication: Automatically removes duplicate documents
- Tunable Parameters: Configurable k values and lambda for fine-tuning
- High Recall: Retrieves 15 candidates from each method
- Balanced Results: 70% relevance + 30% diversity by default
