Dense Vector Search (Default)
Dense vector search uses ChromaDB embeddings to find semantically similar content. This mode excels at understanding meaning and context, even when exact keywords don’t match.
How It Works
Dense search converts your query and documents into high-dimensional vectors (embeddings) and finds the closest matches using vector similarity:
- Indexing: Documents are chunked and embedded using ChromaDB’s default embedding model
- Query: Your search query is embedded into the same vector space
- Retrieval: ChromaDB finds the nearest neighbors using cosine distance
- Results: Items are ranked by distance (lower is better)
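The retrieval and ranking steps can be illustrated with a minimal sketch in plain Python. It assumes hand-made three-dimensional vectors in place of real embeddings; the names cosine_distance, nearest_neighbors, and DOC_VECS are hypothetical and are not part of ChromaDB or this project.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, doc_vecs, limit=2):
    """Return (doc_id, distance) pairs sorted by ascending distance."""
    scored = [(doc_id, cosine_distance(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1])  # lower distance is better
    return scored[:limit]

# Toy "embeddings" (hypothetical values, chosen only for illustration).
DOC_VECS = {
    "car_review.md": [0.9, 0.1, 0.0],
    "error_handling.md": [0.0, 0.2, 0.9],
    "road_trip.md": [0.7, 0.6, 0.1],
}

query_vec = [1.0, 0.2, 0.0]  # stands in for an embedded query such as "automobile"
results = nearest_neighbors(query_vec, DOC_VECS, limit=2)
for doc_id, distance in results:
    print(f"{doc_id}: {distance:.3f}")
```

In a real index the vectors come from the embedding model and the nearest-neighbor lookup is handled by ChromaDB; the sketch only shows why a smaller distance ranks higher.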
Dense search works well for:
- Conceptual queries (“error handling patterns”)
- Queries with synonyms (“automobile” matches “car”)
- Cross-lingual understanding
- Matching questions to their answers
src/db.py:445-454
BM25 Lexical Search
BM25 is a traditional keyword-based ranking algorithm that excels at exact term matching and technical queries.
How It Works
BM25 uses statistical analysis of term frequency and document frequency:
- Tokenization: Documents are tokenized with stemming using PyStemmer
- Indexing: The bm25s library builds an inverted index with term statistics
- Scoring: Query terms are matched and scored using the BM25 algorithm
- Caching: The BM25 index is cached in ./know_index/bm25/ for performance
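To make the scoring step concrete, here is a small self-contained sketch of Okapi BM25 in plain Python. It is not the bm25s implementation: it assumes whitespace tokenization without stemming, a Lucene-style IDF, and the commonly used parameters k1=1.5 and b=0.75. The class TinyBM25 and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenization (no stemming in this sketch)."""
    return text.lower().split()

class TinyBM25:
    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.term_freqs = [Counter(tokenize(d)) for d in docs]
        self.doc_lens = [sum(tf.values()) for tf in self.term_freqs]
        self.avg_len = sum(self.doc_lens) / len(docs)
        # Document frequency: number of documents containing each term.
        self.doc_freq = Counter(term for tf in self.term_freqs for term in tf)

    def idf(self, term):
        n, df = len(self.term_freqs), self.doc_freq.get(term, 0)
        return math.log(1 + (n - df + 0.5) / (df + 0.5))

    def score(self, query, doc_index):
        tf = self.term_freqs[doc_index]
        # Length normalization: longer-than-average documents are penalized.
        norm = 1 - self.b + self.b * self.doc_lens[doc_index] / self.avg_len
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            total += self.idf(term) * f * (self.k1 + 1) / (f + self.k1 * norm)
        return total

    def search(self, query, limit=3):
        scored = [(i, self.score(query, i)) for i in range(len(self.term_freqs))]
        scored = [(i, s) for i, s in scored if s > 0]  # drop non-matching docs
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]

DOCS = [
    "ValueError: invalid literal for int() with base 10",
    "FastAPI dependency injection with Depends",
    "handling errors gracefully in Python services",
]
index = TinyBM25(DOCS)
hits = index.search("invalid literal")
print(hits)
```

Only the first document contains the query terms, so it is the only hit. A conceptually related document with different wording scores zero, which is the trade-off against dense search.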
BM25 works well for:
- Exact keyword matches (“ValueError: invalid literal”)
- Technical terminology (“FastAPI dependency injection”)
- Code snippets and function names
- Queries where word frequency matters
src/db.py:375-431, src/bm25.py:25-42
Hybrid Search
Hybrid mode combines dense vector and BM25 search using Reciprocal Rank Fusion (RRF), so results benefit from both semantic and exact-term matching.
How It Works
Hybrid search runs both dense and BM25 searches in parallel, then fuses the results:
- Parallel Search: Retrieves limit * 3 results from each of dense and BM25
- RRF Fusion: Combines rankings using Reciprocal Rank Fusion with k=60
- Re-ranking: Final results are sorted by fused scores
- Top Results: Returns the top limit items
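The pipeline above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: hybrid_search, rrf_fuse, and the two stub retrievers are hypothetical names, each retriever is assumed to return document IDs ordered best first, and the two searches run one after the other here rather than in parallel.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Sum 1 / (k + rank) over every ranking an item appears in (rank starts at 1)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return dict(scores)

def hybrid_search(query, dense_search, bm25_search, limit=5, k=60):
    """Fetch limit * 3 candidates per source, fuse with RRF, keep the top `limit`."""
    candidates = limit * 3
    dense_ids = dense_search(query, candidates)
    bm25_ids = bm25_search(query, candidates)
    fused = rrf_fuse([dense_ids, bm25_ids], k=k)
    ranked = sorted(fused.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]

# Stub retrievers returning fixed, best-first ID lists (hypothetical data).
def fake_dense(query, n):
    return ["a", "b", "c", "d"][:n]

def fake_bm25(query, n):
    return ["c", "a", "e"][:n]

top = hybrid_search("error handling", fake_dense, fake_bm25, limit=2)
for doc_id, score in top:
    print(f"{doc_id}: {score:.4f}")
```

Items "a" and "c" appear in both lists, so their summed contributions put them ahead of items that only one retriever returned.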
Understanding RRF with k=60
Reciprocal Rank Fusion (RRF) is a simple but effective way to combine rankings from multiple sources:
- Each result’s contribution is 1 / (k + rank), where k=60 and rank starts at 1
- Higher k values reduce the importance of rank position
- Items appearing in both result sets get boosted scores
- The constant k=60 is a widely used default that keeps any single top-ranked result from dominating the fused score
Example:
- Item at rank 1: score = 1/(60+1) ≈ 0.0164
- Item at rank 5: score = 1/(60+5) ≈ 0.0154
- Item in both lists at ranks 1 and 3: score ≈ 0.0164 + 0.0159 = 0.0323
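The arithmetic above, and the effect of k on how much rank position matters, can be checked with a few lines of Python. The helper rrf_contribution is a hypothetical name used only for this check.

```python
def rrf_contribution(rank, k=60):
    """Score contributed by one appearance at a given 1-based rank."""
    return 1.0 / (k + rank)

rank1 = rrf_contribution(1)
rank5 = rrf_contribution(5)
both = rrf_contribution(1) + rrf_contribution(3)
print(f"{rank1:.4f} {rank5:.4f} {both:.4f}")  # 0.0164 0.0154 0.0323

# How much more rank 1 is worth than rank 5 for different k values.
ratios = {k: rrf_contribution(1, k) / rrf_contribution(5, k) for k in (1, 60, 1000)}
for k, ratio in ratios.items():
    print(f"k={k}: rank 1 is worth {ratio:.3f}x rank 5")
```

With k=1 a first-place result is worth about three times a fifth-place one; with k=60 the gap shrinks to roughly 7%, so appearing in both lists matters more than a small difference in position.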
Hybrid works well for:
- Complex natural language questions
- Queries that benefit from both semantic and keyword matching
- When you’re not sure which mode to use
- Production use cases requiring robust results
src/db.py:485-508, src/retrieval.py:19-32
Comparing Search Modes
Use the --benchmark flag to compare dense and BM25 results side by side.
Dense is best for:
- Semantic understanding
- Conceptual queries
- Natural language questions
- Cross-lingual matching
Example queries:
- “how to handle errors gracefully”
- “database optimization techniques”
- “user authentication best practices”
Performance Considerations
Caching
BM25 indexes are cached to ./know_index/bm25/ with metadata tracking.
Candidate Retrieval
Hybrid mode retrieves limit * 3 candidates from each source before fusion to ensure high-quality results after filtering.
Next Steps
- Filtering: Learn about glob patterns and time-based filtering
- Output Formats: Explore rich, plain, and JSON output options