Overview
The RAG module consists of two main components:
- rag-base: Core interfaces for document storage and retrieval
- vector-storage: Vector-based storage with semantic search capabilities

It provides:
- Grounded Responses: Agents answer questions based on your documents
- Semantic Search: Find documents by meaning, not just keywords
- Scalable Storage: Handle large document collections efficiently
- Ranked Retrieval: Get the most relevant documents first
- Flexible Backends: Swap storage implementations without code changes
Installation
Add the RAG dependencies to your Gradle build.
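For example, in a Gradle Kotlin DSL build. The group and version below are placeholders, not verified artifact coordinates; use the coordinates published for your setup:

```kotlin
// build.gradle.kts — <group> and <version> are placeholders
dependencies {
    implementation("<group>:rag-base:<version>")
    implementation("<group>:vector-storage:<version>")
}
```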
Core Interfaces
DocumentStorage
The foundational interface for document operations.
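The interface itself lives in the module source. As an illustration only, a minimal document-storage contract might look like this (names and signatures are hypothetical, and the real interface is likely suspend-based):

```kotlin
// Hypothetical sketch of a document-storage contract; the actual
// rag-base interface may use different names and suspend functions.
interface DocumentStorage<Document> {
    fun store(document: Document): String      // returns the new document's ID
    fun read(documentId: String): Document?
    fun delete(documentId: String): Boolean
    fun allDocuments(): Sequence<Document>
}
```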
DocumentStorageWithPayload
Extends document storage with metadata support.
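Illustratively, the payload type carries per-document metadata such as source or tags. This sketch is hypothetical, not the module's actual signature:

```kotlin
// Hypothetical sketch: Payload is an arbitrary metadata type stored
// alongside each document. Names are illustrative, not the real API.
interface DocumentStorageWithPayload<Document, Payload> {
    fun store(document: Document, payload: Payload): String
    fun read(documentId: String): Document?
    fun payload(documentId: String): Payload?
}
```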
RankedDocumentStorage
Extends document storage with ranking capabilities.
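As a sketch of the idea (names and the score type are illustrative, not the real interface), a ranked store returns documents together with a relevance score, most similar first:

```kotlin
// Hypothetical sketch of a ranking extension; the real rag-base
// interface may differ.
data class RankedDocument<Document>(val document: Document, val similarity: Double)

interface RankedDocumentStorage<Document> {
    fun store(document: Document): String
    fun mostRelevant(query: String, count: Int): List<RankedDocument<Document>>
}
```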
Quick Start
Basic Document Storage
Store and retrieve documents.
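As a runnable illustration of the store/read/delete cycle. `InMemoryDocumentStorage` here is a hypothetical class written for this example, not one shipped by the module:

```kotlin
// Illustrative in-memory storage; the module's own implementations
// and method names may differ.
class InMemoryDocumentStorage {
    private val documents = mutableMapOf<String, String>()
    private var nextId = 0

    fun store(text: String): String {
        val id = (nextId++).toString()
        documents[id] = text
        return id
    }

    fun read(id: String): String? = documents[id]
    fun delete(id: String): Boolean = documents.remove(id) != null
}

fun main() {
    val storage = InMemoryDocumentStorage()
    val id = storage.store("Retrieval-augmented generation grounds answers in documents.")
    println(storage.read(id))  // prints the stored text
    storage.delete(id)
    println(storage.read(id))  // prints null
}
```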
Document with Metadata
Store documents with associated metadata.
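A minimal sketch of the pattern, using hypothetical names (the payload type and storage class are written for this example):

```kotlin
// Illustrative metadata payload and storage; names are hypothetical.
data class DocumentMetadata(val source: String, val tags: List<String>)

class MetadataDocumentStorage {
    private val entries = mutableMapOf<String, Pair<String, DocumentMetadata>>()
    private var nextId = 0

    fun store(text: String, metadata: DocumentMetadata): String {
        val id = (nextId++).toString()
        entries[id] = text to metadata
        return id
    }

    fun read(id: String): String? = entries[id]?.first
    fun metadata(id: String): DocumentMetadata? = entries[id]?.second
}
```

Storing metadata next to the document lets later queries filter by source or tag without re-reading document bodies.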
Semantic Document Retrieval
Find relevant documents using semantic search.
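To make the ranking idea concrete, here is a toy version that scores documents by cosine similarity of word-count vectors. A real setup would plug in an embedding model instead of word counts; all names here are illustrative:

```kotlin
import kotlin.math.sqrt

// Toy semantic search: rank documents by cosine similarity of
// word-count vectors (a stand-in for real embeddings).
fun wordCounts(text: String): Map<String, Double> =
    text.lowercase().split(Regex("\\W+")).filter { it.isNotEmpty() }
        .groupingBy { it }.eachCount().mapValues { it.value.toDouble() }

fun cosine(a: Map<String, Double>, b: Map<String, Double>): Double {
    val dot = a.entries.sumOf { (word, v) -> v * (b[word] ?: 0.0) }
    val na = sqrt(a.values.sumOf { it * it })
    val nb = sqrt(b.values.sumOf { it * it })
    return if (na == 0.0 || nb == 0.0) 0.0 else dot / (na * nb)
}

fun mostRelevant(query: String, documents: List<String>, count: Int): List<String> =
    documents.sortedByDescending { cosine(wordCounts(query), wordCounts(it)) }.take(count)
```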
Vector Storage
Use vector embeddings for semantic search.
Creating Vector Storage
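Conceptually, a vector store keeps each document next to its embedding, with the embedder supplied at construction time. `VectorStore` below is a hypothetical name written for this sketch, not the module's class:

```kotlin
// Illustrative vector store: each document is stored alongside the
// vector produced by a pluggable embedder function.
class VectorStore(private val embed: (String) -> DoubleArray) {
    private val entries = mutableListOf<Pair<String, DoubleArray>>()

    fun store(text: String) {
        entries += text to embed(text)
    }

    fun size(): Int = entries.size
}
```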
Using Vector Storage
Query the storage for the documents most similar to a query vector.
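A self-contained sketch of the query side: rank stored (text, embedding) pairs by cosine similarity to the query vector, most similar first. Function names are illustrative:

```kotlin
import kotlin.math.sqrt

// Illustrative nearest-neighbour query over (text, embedding) pairs.
fun cosine(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "vectors must have the same dimension" }
    var dot = 0.0; var na = 0.0; var nb = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]
    }
    return if (na == 0.0 || nb == 0.0) 0.0 else dot / (sqrt(na) * sqrt(nb))
}

fun query(
    queryVector: DoubleArray,
    entries: List<Pair<String, DoubleArray>>,
    count: Int,
): List<String> =
    entries.sortedByDescending { (_, vector) -> cosine(queryVector, vector) }
        .take(count)
        .map { it.first }
```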
Use Cases
Question Answering Agent
Build an agent that answers questions from documents.
Document Ingestion Pipeline
Process and store documents in batches.
Hybrid Search
Combine semantic and keyword search.
Document Deduplication
Find and remove duplicate documents.
Text Document Reader
Transform documents into text for embedding.
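A small sketch combining the reader and ingestion ideas above: read files as plain text, split into fixed-size chunks, and hand each chunk to a store function. The `store` parameter stands in for whatever backend you use; the JVM-only `java.io.File` API is used for brevity:

```kotlin
import java.io.File

// Illustrative reader + batch ingestion; names are hypothetical.
fun readAsText(file: File): String = file.readText()

fun ingest(texts: List<String>, chunkSize: Int, store: (String) -> Unit) {
    for (text in texts) {
        text.chunked(chunkSize).forEach(store)
    }
}
```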
Document Embedder
Convert documents to vector embeddings.
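As a stand-in for a real embedding model, here is a toy embedder using the hashing trick: each word is hashed into a slot of a fixed-size vector. It is deterministic and dependency-free, but not semantically meaningful:

```kotlin
// Toy embedder (hashing trick); a real embedder would call an
// embedding model instead of hashing words.
fun embed(text: String, dims: Int = 64): DoubleArray {
    val vector = DoubleArray(dims)
    for (word in text.lowercase().split(Regex("\\W+")).filter { it.isNotEmpty() }) {
        vector[word.hashCode().mod(dims)] += 1.0
    }
    return vector
}
```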
Best Practices
- Batch Operations: Process documents in parallel
- Chunking: Split large documents into smaller chunks
- Metadata Indexing: Store searchable metadata
- Error Handling: Handle storage failures gracefully
- Relevance Tuning: Adjust similarity thresholds
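The chunking practice can be refined with overlapping windows so that sentences spanning a chunk boundary still appear intact in at least one chunk. A sketch, with illustrative sizes (tune them for your embedding model's context window):

```kotlin
// Sliding-window chunking with overlap; parameters are illustrative.
fun chunkWithOverlap(text: String, size: Int, overlap: Int): List<String> {
    require(size > overlap) { "size must exceed overlap" }
    val chunks = mutableListOf<String>()
    var start = 0
    while (start < text.length) {
        val end = minOf(start + size, text.length)
        chunks += text.substring(start, end)
        if (end == text.length) break
        start += size - overlap
    }
    return chunks
}
```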
Performance Considerations
- Indexing: Pre-compute embeddings for faster retrieval
- Caching: Cache frequently accessed documents
- Batch Queries: Process multiple queries in parallel
- Dimension Reduction: Use smaller embedding models for speed
- Pagination: Limit result counts for large datasets
Integration with Agents
Use RAG with agents for grounded responses.
Platform Support
- JVM: Full support
- JS: Full support
- Native: Planned
Common Use Cases
- Knowledge Bases: Company documentation search
- Customer Support: Answer questions from documentation
- Research: Scientific paper retrieval
- Legal: Case law and document search
- Code Search: Find relevant code examples
- E-commerce: Product recommendation based on descriptions
Next Steps
- Learn about embeddings for vector generation
- Explore vector storage implementations
- See RAG examples in action
- Check out agent patterns with RAG