Embedding models convert a piece of text (a word, sentence, paragraph, or entire article) into an array of floating-point numbers called an embedding vector. This vector is a numerical representation of the semantic meaning of the content in a high-dimensional space. By calculating the distance between two embedding vectors, you can determine how semantically similar two pieces of content are, even if they share no exact keywords.
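To make "distance between two embedding vectors" concrete, here is a minimal sketch that uses cosine similarity, one common way to compare embeddings. The three-dimensional vectors are invented for illustration; real embedding models return hundreds or thousands of dimensions, and the function name `cosine_similarity` is this example's own, not part of LLM.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, made up for illustration only (not real model output)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Related concepts should score higher than unrelated ones
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # prints True
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are unrelated.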
What embeddings enable
- Semantic search — users search by meaning rather than exact keywords; results match the intent of a query.
- Related content — find articles, documents, or records that are conceptually close to a given item.
- Image search — models like CLIP operate on binary inputs such as images, enabling image-to-image or text-to-image similarity queries.
- Clustering and classification — group or label content based on the geometric structure of its embedding space.
How LLM supports embeddings
LLM supports multiple embedding models through its plugin system. Once an embedding model plugin is installed, you can use it on the command line or through the Python API to:
- Calculate embedding vectors for individual strings or binary files.
- Store those vectors in named collections inside a SQLite database.
- Run similarity searches across a collection to find the nearest neighbors to a query.
Embeddings are stored as compact binary blobs (little-endian 32-bit floats) in SQLite. See the Storage Format page for technical details.
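The following sketch illustrates the three ideas above using only the Python standard library: packing a vector as little-endian 32-bit floats, storing it under a named collection in SQLite, and running a brute-force nearest-neighbor search. The table layout and the `store` and `similar` functions are hypothetical and chosen for illustration; they are not LLM's actual schema or API, and the `encode` and `decode` helpers here are this example's own.

```python
import math
import sqlite3
import struct


def encode(values):
    """Pack floats as little-endian 32-bit floats (4 bytes per value)."""
    return struct.pack("<" + "f" * len(values), *values)


def decode(blob):
    """Unpack a blob of little-endian 32-bit floats into a tuple."""
    return struct.unpack("<" + "f" * (len(blob) // 4), blob)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical table layout for illustration; not LLM's real schema
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE vectors ("
    "collection TEXT, id TEXT, embedding BLOB, "
    "PRIMARY KEY (collection, id))"
)


def store(collection, item_id, vector):
    """Save one vector as a binary blob under a named collection."""
    db.execute(
        "INSERT OR REPLACE INTO vectors VALUES (?, ?, ?)",
        (collection, item_id, encode(vector)),
    )


def similar(collection, query_vector, number=2):
    """Return the ids of the closest items, scanning every stored vector."""
    rows = db.execute(
        "SELECT id, embedding FROM vectors WHERE collection = ?", (collection,)
    )
    scored = [(cosine(query_vector, decode(blob)), item_id) for item_id, blob in rows]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:number]]


# Toy vectors, made up for illustration only
store("animals", "cat", [0.9, 0.1, 0.0])
store("animals", "kitten", [0.8, 0.2, 0.1])
store("animals", "invoice", [0.0, 0.1, 0.9])

print(similar("animals", [1.0, 0.0, 0.0]))  # prints ['cat', 'kitten']
```

Storing each value in 4 bytes keeps the blobs compact, at the cost of rounding values to 32-bit precision when they are written.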
Explore the embeddings section
- CLI Reference: Use llm embed, llm embed-multi, llm similar, and llm embed-models from the terminal to generate, store, and search embeddings.
- Python API: Work with the llm.Collection class to store and search embeddings programmatically from Python.
- Writing Plugins: Implement a custom embedding model by writing an LLM plugin that exposes an EmbeddingModel class.
- Storage Format: Understand the SQLite schema, the binary blob encoding, and the encode/decode helpers used internally.