This guide walks you through building a complete Retrieval-Augmented Generation (RAG) pipeline using VectorDB. You’ll learn how to index documents, configure search, add reranking, and generate answers with an LLM.
These credentials are used in configuration files via the ${VAR} syntax for secure credential management. See the environment variables reference for all supported variables.
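To make the ${VAR} syntax concrete, here is a minimal sketch of placeholder substitution using only the Python standard library. How VectorDB resolves placeholders internally is not shown in this guide, so the `expand_env` helper, the regex, and the `api_key` config key are all assumptions made for illustration.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} placeholders. The exact rules VectorDB applies are not shown
# in this guide, so this pattern is an assumption for illustration.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in text with its value from env (default: os.environ)."""
    source = os.environ if env is None else env

    def _lookup(match):
        name = match.group(1)
        if name not in source:
            # Failing loudly avoids silently sending an empty credential.
            raise KeyError(f"environment variable {name!r} is not set")
        return source[name]

    return _PLACEHOLDER.sub(_lookup, text)


# Hypothetical config fragment; the key name is illustrative only.
line = expand_env("api_key: ${PINECONE_API_KEY}", {"PINECONE_API_KEY": "pc-123"})
print(line)
```

The standard library's `os.path.expandvars` performs a similar expansion but leaves unknown variables untouched, which is why this sketch raises an error instead.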
3. Create your configuration file
Create a YAML configuration file that defines your pipeline. Here’s a complete example for semantic search with RAG:
Load the dataset and index documents into your vector database:
from vectordb.langchain.semantic_search import PineconeSemanticSearchPipeline

# Initialize pipeline with your config
pipeline = PineconeSemanticSearchPipeline(
    "config/my_rag_pipeline.yaml"
)

# Index documents
pipeline.index()
The indexing process:
Loads documents from the specified dataset
Generates embeddings using the configured model
Stores vectors and metadata in Pinecone
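The three indexing steps above can be sketched with a toy in-memory index. This is not how VectorDB or Pinecone implement indexing: `toy_embed` is a hypothetical hashed bag-of-words stand-in for a real embedding model, and a plain dictionary stands in for the vector database.

```python
import hashlib
import math

DIM = 64  # toy dimensionality; real embedding models use far more dimensions


def toy_embed(text):
    """Hash each token into one of DIM buckets, then L2-normalize the counts."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def build_index(documents):
    """Map each document id to its vector and metadata."""
    index = {}
    for doc in documents:                   # step 1: load each document
        vector = toy_embed(doc["content"])  # step 2: generate its embedding
        index[doc["id"]] = {                # step 3: store vector and metadata
            "vector": vector,
            "metadata": {"source": doc["source"], "content": doc["content"]},
        }
    return index


# Illustrative documents, not taken from any real dataset.
docs = [
    {"id": "d1", "source": "bio.txt", "content": "Plants convert light into chemical energy"},
    {"id": "d2", "source": "ml.txt", "content": "Neural networks learn by gradient descent"},
]
index = build_index(docs)
print(len(index), len(index["d1"]["vector"]))
```

Keeping the original text in the metadata means search results can show content without a second lookup.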
5. Perform semantic search
Search your indexed documents using natural language queries:
# Search for relevant documents
result = pipeline.search(
    "What is photosynthesis?",
    top_k=5
)

# Access retrieved documents
for doc in result["documents"]:
    print(f"Score: {doc.score}")
    print(f"Content: {doc.content}")
    print("---")
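For intuition about what a `top_k` search returns, the sketch below ranks stored vectors by cosine similarity to a query vector and keeps the best k. It is a brute-force illustration under assumed names (`cosine`, `top_k`, and the hand-written vectors are hypothetical); production vector databases typically use approximate nearest-neighbor indexes rather than scanning every vector.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, stored, k):
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Two-dimensional vectors keep the arithmetic easy to follow.
stored = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
hits = top_k([1.0, 0.1], stored, k=2)
for doc_id, score in hits:
    print(doc_id, round(score, 3))
```

The query points almost along the first axis, so document "a" ranks first, followed by "b".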
6. Add reranking for precision
Improve result quality by adding cross-encoder reranking. Update your configuration:
Reranking applies a more expensive cross-encoder model to the top candidates retrieved by the initial vector search, improving precision at the cost of increased latency.
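The retrieve-then-rerank pattern can be sketched independently of any particular model. In the example below, `rerank` and `overlap_score` are hypothetical names, and the token-overlap scorer is only a cheap placeholder for a cross-encoder, which would score each query and passage pair jointly with a neural model.

```python
def rerank(query, candidates, score_fn, top_n):
    """Re-score (doc_id, text) candidates with score_fn and keep the best top_n."""
    rescored = [(score_fn(query, text), doc_id) for doc_id, text in candidates]
    # Sort on the score only, so ties keep their first-stage order.
    rescored.sort(key=lambda item: item[0], reverse=True)
    return [(doc_id, score) for score, doc_id in rescored[:top_n]]


def overlap_score(query, text):
    """Placeholder for a cross-encoder: fraction of query tokens found in text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens) if query_tokens else 0.0


# Candidates in the order a first-stage vector search might have returned them.
candidates = [
    ("d1", "light energy from the sun"),
    ("d2", "how photosynthesis converts light into sugar"),
    ("d3", "photosynthesis happens in chloroplasts"),
]
best = rerank("how does photosynthesis work", candidates, overlap_score, top_n=2)
print(best)
```

Because the second-stage scorer runs once per candidate, its cost grows with the number of candidates passed in, which is the latency trade-off described above.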
7. Generate answers with RAG
With RAG enabled in your configuration, the pipeline automatically generates answers:
result = pipeline.search(
    "Explain how neural networks learn",
    top_k=10
)

# The answer is generated using retrieved context
print(result["answer"])

# Access the source documents
for doc in result["source_documents"]:
    print(f"Source: {doc.metadata['source']}")
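To illustrate what "generated using retrieved context" usually involves, here is a minimal sketch of assembling a prompt from retrieved passages. The prompt template VectorDB actually uses is not shown in this guide; `build_prompt`, its `max_chars` budget, and the wording of the instructions are assumptions.

```python
def build_prompt(question, documents, max_chars=2000):
    """Combine numbered context passages and the question into one prompt string."""
    parts = []
    used = 0
    for i, doc in enumerate(documents, start=1):
        passage = f"[{i}] ({doc['source']}) {doc['content']}"
        if used + len(passage) > max_chars:
            break  # stop before exceeding the context budget
        parts.append(passage)
        used += len(passage)
    context = "\n".join(parts)
    return (
        "Answer the question using only the context below. "
        "Cite passages by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Illustrative passages, not real retrieval output.
retrieved = [
    {"source": "nn.md", "content": "Networks adjust weights to reduce a loss function."},
    {"source": "opt.md", "content": "Backpropagation computes gradients layer by layer."},
]
prompt = build_prompt("Explain how neural networks learn", retrieved)
print(prompt)
```

Numbering the passages lets the generated answer point back to specific sources, mirroring the `source_documents` field shown above.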
8. Evaluate retrieval quality
Measure your pipeline’s performance using built-in evaluation metrics:
from vectordb.utils.evaluation import evaluate_retrieval, QueryResult

# test_queries is assumed to be a list of dicts, each with a "query" string
# and a "relevant_ids" list of ground-truth document IDs.

# Run evaluation on test queries
query_results = []
for query_data in test_queries:
    result = pipeline.search(query_data["query"], top_k=10)
    query_results.append(
        QueryResult(
            query=query_data["query"],
            retrieved_ids=[doc.id for doc in result["documents"]],
            relevant_ids=set(query_data["relevant_ids"])
        )
    )

# Compute metrics
metrics = evaluate_retrieval(query_results, k=10)
print(f"Recall@10: {metrics.recall_at_k:.3f}")
print(f"Precision@10: {metrics.precision_at_k:.3f}")
print(f"MRR: {metrics.mrr:.3f}")
print(f"NDCG@10: {metrics.ndcg_at_k:.3f}")
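For reference, the four metrics printed above can be computed for a single query as follows. These are the common textbook definitions with binary relevance, and they assume each retrieved ID appears only once; whether `evaluate_retrieval` uses exactly these conventions (for example, dividing precision by k rather than by the number of results returned) is an assumption. Averaging the reciprocal rank over all queries gives MRR.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top k results."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant) if relevant else 0.0


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k slots filled by relevant documents."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant document, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


# One illustrative query: two relevant documents, one of them found in the top 3.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(retrieved, relevant, 3))
print(precision_at_k(retrieved, relevant, 3))
print(reciprocal_rank(retrieved, relevant))
print(round(ndcg_at_k(retrieved, relevant, 3), 3))
```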