## Overview
EchoVault uses a hybrid search architecture that combines:

- FTS5 keyword search - Fast, exact matching with BM25 ranking
- Semantic vector search - Meaning-based similarity using embeddings
## Search Modes

### FTS5 Keyword Search
FTS5 is SQLite’s built-in full-text search extension:

- Works immediately with zero configuration
- Porter stemming matches word variants (“run”, “running”, “ran”)
- Prefix matching finds partial words (“auth” matches “authentication”)
- BM25 ranking scores results by relevance
- Unicode normalization handles accents and special characters
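The features above can be reproduced with a minimal FTS5 table. This is a self-contained sketch; the actual schema and column names in db.py:81-87 may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema -- illustrative only, not EchoVault's real table.
conn.execute(
    "CREATE VIRTUAL TABLE memories_fts USING fts5("
    "  title, body, tokenize = 'porter unicode61'"  # Porter stemming + Unicode folding
    ")"
)
conn.execute("INSERT INTO memories_fts VALUES ('note', 'running the auth service')")

# Porter stemming: a search for "run" matches "running"; bm25() ranks by relevance.
rows = conn.execute(
    "SELECT title, bm25(memories_fts) FROM memories_fts "
    "WHERE memories_fts MATCH 'run' ORDER BY bm25(memories_fts)"
).fetchall()
print(rows[0][0])  # note
```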
From ~/workspace/source/src/memory/db.py:81-87, the FTS5 table is configured with these options. From ~/workspace/source/src/memory/search.py:398-400, queries are automatically enhanced with prefix matching:
For example, a query such as “auth token” becomes `"auth"* OR "token"*`, matching “authentication”, “authorization”, “tokens”, etc.
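A sketch of that enhancement; the real logic in search.py:398-400 may tokenize differently.

```python
import re

def enhance_query(raw: str) -> str:
    """Hypothetical sketch: quote each term and append a prefix wildcard."""
    terms = re.findall(r"\w+", raw)
    return " OR ".join(f'"{t}"*' for t in terms)

print(enhance_query("auth token"))  # "auth"* OR "token"*
```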
### Semantic Vector Search
Vector search finds memories by meaning, not just keywords:

- Embedding generation - Text is converted to high-dimensional vectors
- Cosine similarity - Vectors are compared for semantic similarity
- sqlite-vec - Fast vector search using SQLite extension
- Optional - Requires embedding provider configuration
Vector search is optional. Without an embedding provider configured, EchoVault falls back to FTS5-only search with no loss of core functionality.
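The similarity measure is standard cosine similarity over the embedding vectors; a minimal sketch (EchoVault delegates this to sqlite-vec in practice):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Compare two embedding vectors; 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```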
## Tiered Search Strategy

EchoVault uses a smart “tiered” approach to minimize embedding API latency: it runs the cheap FTS5 search first, then checks the results before deciding whether to embed the query.
If FTS returns at least 3 results, skip embedding entirely and return keyword results.
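The tier decision can be sketched as follows; function names and the dedup step are assumptions, not EchoVault's actual code.

```python
def tiered_search(query, fts_search, vector_search, min_fts_results=3):
    """Hypothetical tiered strategy: FTS first, embed only when results are thin."""
    fts_hits = fts_search(query)
    if len(fts_hits) >= min_fts_results:
        return fts_hits  # enough keyword matches: skip the embedding call entirely
    return fts_hits + [h for h in vector_search(query) if h not in fts_hits]

def never_called(query):
    raise AssertionError("vector search should have been skipped")

# With three keyword hits, the (slow) vector search is never invoked:
print(tiered_search("auth", lambda q: ["m1", "m2", "m3"], never_called))
# ['m1', 'm2', 'm3']
```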
## Hybrid Score Merging

When both FTS and vector results are available, they’re merged with weighted scoring:

- FTS: 30% (keyword precision)
- Vector: 70% (semantic recall)
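A minimal sketch of the weighted merge, assuming both score sets are normalized to [0, 1]; the real merging code may normalize and tie-break differently.

```python
FTS_WEIGHT, VECTOR_WEIGHT = 0.3, 0.7

def merge_scores(fts: dict[str, float], vec: dict[str, float]) -> dict[str, float]:
    """Hypothetical merge: weighted sum, with a missing score counting as 0."""
    return {
        mem_id: FTS_WEIGHT * fts.get(mem_id, 0.0) + VECTOR_WEIGHT * vec.get(mem_id, 0.0)
        for mem_id in set(fts) | set(vec)
    }

merged = merge_scores({"a": 1.0, "b": 0.5}, {"a": 0.8})
# "a" scores 0.3*1.0 + 0.7*0.8 = 0.86; "b" keeps only its keyword contribution (0.15).
```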
Search Filters
All search modes support optional filters:Project Filter
Limit search to a specific project:Source Filter
Limit search to memories created by a specific agent:Filters are applied at the database level for FTS searches, but post-processed for vector searches due to sqlite-vec limitations (from
~/workspace/source/src/memory/db.py:476-480).Context Retrieval
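Post-processing for vector results could look like this sketch; the field names (`project`, `source`) are assumptions based on the filters described above.

```python
def post_filter(hits, project=None, source=None):
    """Hypothetical post-processing filter applied to vector search results."""
    return [
        h for h in hits
        if (project is None or h["project"] == project)
        and (source is None or h["source"] == source)
    ]

hits = [{"id": 1, "project": "web", "source": "claude"},
        {"id": 2, "project": "api", "source": "claude"}]
print([h["id"] for h in post_filter(hits, project="web")])  # [1]
```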
## Context Retrieval

The `memory_context` MCP tool uses intelligent retrieval logic:
### Semantic Mode

Controlled by `context.semantic` in config.yaml:
- `auto` (default) - Use vectors if Ollama is warm, otherwise FTS only
- `always` - Always use vector search if available
- `never` - Always use FTS-only search
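The three modes reduce to a small decision function; this is a sketch, and the function name and parameters are hypothetical.

```python
def should_use_vectors(mode: str, ollama_warm: bool, provider_configured: bool) -> bool:
    """Hypothetical decision logic for context.semantic."""
    if not provider_configured or mode == "never":
        return False
    if mode == "always":
        return True
    return ollama_warm  # "auto": embed only when the model is already loaded

print(should_use_vectors("auto", ollama_warm=False, provider_configured=True))  # False
```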
### Recent Topup

Controlled by `context.topup_recent` in config.yaml:
- When `true` (default), context retrieval fills remaining slots with recent memories
- Ensures agents have fresh context even when semantic search returns few results
- Deduplicates to avoid returning the same memory twice
The topup logic is implemented in ~/workspace/source/src/memory/core.py:436-446.
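The behavior described above can be sketched like this; it is an illustrative approximation of core.py:436-446, not the actual implementation.

```python
def topup_recent(semantic_hits, recent, limit):
    """Hypothetical topup: fill spare slots with recent memories, deduped by id."""
    seen = {m["id"] for m in semantic_hits}
    out = list(semantic_hits)
    for m in recent:
        if len(out) >= limit:
            break
        if m["id"] not in seen:  # never return the same memory twice
            out.append(m)
            seen.add(m["id"])
    return out

ctx = topup_recent([{"id": 1}], [{"id": 1}, {"id": 2}, {"id": 3}], limit=3)
print([m["id"] for m in ctx])  # [1, 2, 3]
```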
## Vector Storage Details

### Dynamic Dimension

The vector table is created dynamically based on the embedding provider’s dimension.

### Store Dimension
The dimension is stored in the meta table:

```sql
INSERT INTO meta (key, value) VALUES ('embedding_dim', '768');
```
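Generating the table DDL from the provider's dimension might look like this sketch; the table and column names are assumptions, though the `vec0(... float[N])` form is standard sqlite-vec syntax.

```python
def vector_table_sql(dim: int) -> str:
    """Hypothetical DDL for the sqlite-vec table, sized to the provider's dimension."""
    return f"CREATE VIRTUAL TABLE memory_vec USING vec0(embedding float[{dim}])"

# A 768-dimension provider would yield:
print(vector_table_sql(768))
# CREATE VIRTUAL TABLE memory_vec USING vec0(embedding float[768])
```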
### Dimension Mismatch Handling

From ~/workspace/source/src/memory/db.py:169-181, if the embedding dimension changes, run `memory reindex` to rebuild the vector table with the new dimension.
## Performance Characteristics

### FTS5 Search
- Latency: less than 10ms for most queries
- Scaling: Handles 10,000+ memories efficiently
- No dependencies: Works with zero configuration
### Vector Search
- Latency: 5-20s for Ollama, 200-500ms for OpenAI
- Scaling: Handles 10,000+ memories efficiently
- Requires: Embedding provider configuration
## Search Result Format
Search returns compact memory pointers. Use `memory details <id>` to fetch the full details body when needed.
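A plausible pointer shape, shown for illustration only; the fields EchoVault actually returns are not documented here.

```python
# Hypothetical pointer -- field names are assumptions, not EchoVault's real format.
pointer = {"id": "m_42", "score": 0.86, "summary": "JWT refresh flow for the API"}

# Only the compact pointer travels in search results; the full body is fetched
# on demand, e.g. via `memory details m_42`.
print(pointer["id"])  # m_42
```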