Chroma integrates seamlessly with LangChain, providing a powerful vector store for building LLM applications with retrieval-augmented generation (RAG).
Installation
```shell
pip install chromadb langchain langchain-chroma
```
Python integration
Basic usage
```python
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# Load and split documents
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.create_documents(["Your text here"])

# Create Chroma vector store
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=OpenAIEmbeddings(),
    collection_name="langchain_store",
    persist_directory="./chroma_langchain_db",
)

# Query the vector store
results = vectorstore.similarity_search("your query", k=3)
```
Using with a retriever
```python
from langchain.chains import RetrievalQA
from langchain_openai import OpenAI

# Create a retriever from the vector store
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Create a QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever,
)

# Ask questions
response = qa_chain.invoke("What is this document about?")
```
Advanced retrieval
```python
# Maximum Marginal Relevance (MMR) search
results = vectorstore.max_marginal_relevance_search(
    "query text",
    k=5,
    fetch_k=20,
    lambda_mult=0.5,
)

# Search with metadata filtering
results = vectorstore.similarity_search(
    "query text",
    filter={"source": "documentation"},
    k=3,
)
```
JavaScript integration
```javascript
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { OpenAIEmbeddings } from "@langchain/openai";

// Create vector store
const vectorStore = await Chroma.fromTexts(
  ["Text 1", "Text 2", "Text 3"],
  [{ source: "doc1" }, { source: "doc2" }, { source: "doc3" }],
  new OpenAIEmbeddings(),
  {
    collectionName: "langchain_collection",
    url: "http://localhost:8000",
  }
);

// Query
const results = await vectorStore.similaritySearch("query text", 3);
```
Configuration options
collection_name: Name of the Chroma collection
persist_directory: Directory for persistent storage (Python only)
client_settings: Chroma client settings for remote connections
embedding_function: Custom embedding function
Use cases
Document Q&A: Build question-answering systems over your documents
Semantic search: Implement semantic search with LangChain retrievers
RAG pipelines: Create retrieval-augmented generation workflows
Chat with data: Build conversational AI that references your data
The LangChain integration supports both persistent and in-memory Chroma instances.