Milvus: Vector Storage for Semantic Medical Retrieval

Milvus is the vector database that stores dense embeddings of medical document chunks. It powers the semantic retrieval channel of the agentic pipeline, enabling similarity-based search over unstructured medical text. According to the README, Milvus provides the dense vector side of hybrid search, and LightRAG stores its chunk embeddings in a vector index alongside the Neo4j knowledge graph, which makes Milvus an explicit dependency of the LightRAG backend.

Role in the System

Milvus participates in both the indexing and retrieval stages of the agentic pipeline when LightRAG is the active backend. During indexing, LightRAG embeds each document chunk into a dense vector representation and writes it into a vector index alongside the raw text. This happens in parallel with the entity and relationship extraction that populates Neo4j, so the semantic and relational indexes are built in a single ingestion pass. During retrieval, the semantic channel uses the vector index to identify the document chunks most relevant to each sub-query generated by the decomposition step. The top-ranked chunks are then filtered, summarised, and merged with the relational channel’s output before the final answer is synthesised.
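The indexing and retrieval roles described above can be illustrated with a toy in-memory index. This is only a minimal sketch: the `Chunk` class, the `top_k` helper, and the three-dimensional vectors are hypothetical stand-ins, and neither LightRAG's embedding model nor Milvus itself is involved. It shows the underlying idea of storing each vector alongside its raw text and ranking chunks by similarity to a sub-query vector.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """A document chunk stored with its dense vector (hypothetical structure)."""
    text: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(index: list[Chunk], query_vector: list[float], k: int = 2) -> list[Chunk]:
    """Return the k chunks most similar to the query vector."""
    ranked = sorted(index, key=lambda c: cosine(c.vector, query_vector), reverse=True)
    return ranked[:k]


# Indexing: each chunk is written with its vector and raw text.
# The vectors are made-up toy values, not real embeddings.
index = [
    Chunk("chunk about glucose control", [0.9, 0.1, 0.0]),
    Chunk("chunk about heart rate", [0.1, 0.9, 0.0]),
    Chunk("chunk about blood sugar markers", [0.8, 0.2, 0.1]),
]

# Retrieval: one sub-query vector, top-ranked chunks returned.
sub_query_vector = [1.0, 0.0, 0.0]
hits = top_k(index, sub_query_vector, k=2)
```

In a real deployment the brute-force ranking in `top_k` would be replaced by a search against the Milvus vector index, which performs the same kind of nearest-neighbour lookup at scale.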

The README explicitly associates Milvus with LightRAG’s hybrid retrieval mode. Other backends (MiniRAG, PathRAG, HyperGraphRAG) store their context locally and may not share the same vector storage arrangement.

Setup

The quickest way to run Milvus locally is in standalone mode via Docker.

Start the Milvus standalone container

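Pull and start the Milvus standalone image with the gRPC and HTTP ports exposed. The command below enables the embedded etcd and local storage so that the single container needs no external services:

docker run -d --name milvus-standalone \
  -e ETCD_USE_EMBED=true \
  -e ETCD_DATA_DIR=/var/lib/milvus/etcd \
  -e COMMON_STORAGETYPE=local \
  -p 19530:19530 -p 9091:9091 \
  milvusdb/milvus:latest milvus run standalone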

Port 19530 carries gRPC client traffic, and port 9091 serves the HTTP health endpoint.

Verify the instance

Once the container is running, confirm it is healthy by checking the HTTP health endpoint:

http://localhost:9091/healthz
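The same check can be scripted, for example as a readiness probe before the pipeline starts. The sketch below uses only the Python standard library. The function name `milvus_is_healthy` is a hypothetical helper, and treating an HTTP 200 response as "healthy" is an assumption about how the endpoint reports status.

```python
import urllib.request


def milvus_is_healthy(url: str = "http://localhost:9091/healthz",
                      timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200.

    Connection failures, timeouts, and HTTP error statuses all raise
    subclasses of OSError and are reported as "not healthy".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    print("Milvus healthy:", milvus_is_healthy())
```

Returning a boolean instead of raising keeps the helper easy to use in a retry loop while the container is still starting.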

The application connects over gRPC on port 19530. The table below summarises both ports:

Protocol	Port
gRPC (primary)	`19530`
HTTP / health	`9091`

For production deployments, Milvus can be deployed on Kubernetes using the official Milvus Helm chart, or as a fully managed service via Zilliz Cloud.

Configuration

The pipeline connects to Milvus using environment variables for the host and port. The table below shows typical configuration variable names. Set them in your shell or in a .env file rather than hardcoding them in source code.

Variable	Description	Example
`MILVUS_HOST`	Hostname or IP address of the Milvus instance	`localhost`
`MILVUS_PORT`	gRPC port for the Milvus instance	`19530`
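One way to read these variables in Python is sketched below. The `MilvusSettings` class and both helper functions are hypothetical, and falling back to `localhost` and `19530` when a variable is unset is an assumption based on the example values in the table. The `connect` helper assumes the `pymilvus` package is installed and uses its `MilvusClient`; the pipeline itself may connect differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MilvusSettings:
    """Connection settings read from the environment (hypothetical helper)."""
    host: str
    port: int

    @property
    def uri(self) -> str:
        # URI form accepted by pymilvus.MilvusClient for a self-hosted server.
        return f"http://{self.host}:{self.port}"


def load_milvus_settings(env=None) -> MilvusSettings:
    """Build settings from MILVUS_HOST and MILVUS_PORT.

    The fallback values are assumptions taken from the example column.
    """
    env = os.environ if env is None else env
    host = env.get("MILVUS_HOST", "localhost")
    port = int(env.get("MILVUS_PORT", "19530"))
    return MilvusSettings(host=host, port=port)


def connect(settings: MilvusSettings):
    """Open a client connection; assumes pymilvus is installed."""
    from pymilvus import MilvusClient  # imported lazily so the rest runs without it
    return MilvusClient(uri=settings.uri)


settings = load_milvus_settings({"MILVUS_HOST": "milvus", "MILVUS_PORT": "19530"})
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without touching the real process environment.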

Hybrid Search

LightRAG operates in hybrid mode, combining two complementary retrieval strategies in a single pass:

Keyword-based KG traversal — a graph-structured search that surfaces entities and relationships from the Neo4j knowledge graph that are lexically relevant to the query.
Dense vector search — a similarity search over the vector-indexed chunk embeddings that retrieves semantically related document passages even when there is no exact keyword overlap.

The results of both operations are merged by LightRAG’s context builder into a structured JSON response containing four sections: Knowledge Graph Data (Entity), Knowledge Graph Data (Relationship), Document Chunks, and a Reference Document List. The pipeline’s context_filter then splits this response, routing the document chunks and reference list to the semantic channel and the entity and relationship JSON to the relational channel.
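The routing step can be pictured with a short sketch. It assumes, purely for illustration, that the merged response is a dictionary keyed by the four section names listed above; the actual key names and the real context_filter implementation may differ, and `split_context` is a hypothetical function.

```python
# Section names taken from the description above; their use as dictionary
# keys is an assumption made for this sketch.
SEMANTIC_SECTIONS = ("Document Chunks", "Reference Document List")
RELATIONAL_SECTIONS = (
    "Knowledge Graph Data (Entity)",
    "Knowledge Graph Data (Relationship)",
)


def split_context(context: dict) -> tuple[dict, dict]:
    """Split a merged context into (semantic, relational) channel inputs.

    Missing sections default to an empty list so both channels always
    receive the same set of keys.
    """
    semantic = {name: context.get(name, []) for name in SEMANTIC_SECTIONS}
    relational = {name: context.get(name, []) for name in RELATIONAL_SECTIONS}
    return semantic, relational


# Toy response with placeholder values, not real pipeline output.
merged = {
    "Knowledge Graph Data (Entity)": [{"entity": "E1"}],
    "Knowledge Graph Data (Relationship)": [{"source": "E1", "target": "E2"}],
    "Document Chunks": [{"content": "chunk text"}],
    "Reference Document List": ["doc-1"],
}

semantic, relational = split_context(merged)
```

Keeping the two channels as separate dictionaries mirrors the description: document chunks and references feed the semantic channel, while entity and relationship data feed the relational channel.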

For further details on Milvus architecture, collection management, and production deployment options, refer to the official Milvus documentation at https://milvus.io/.
