LLM exposes a complete Python API for working with embeddings. You can load an embedding model, compute vectors for individual strings or binary data, and persist those vectors in named collections backed by SQLite. Collections support bulk ingestion, deduplication via content hashing, and cosine-similarity search — all without running the CLI.

Loading a model

Use llm.get_embedding_model() with a model ID or alias:
import llm

embedding_model = llm.get_embedding_model("3-small")

Embedding a string

Call .embed() on the model to compute a vector for a single string. This returns a Python list of floating-point numbers:
vector = embedding_model.embed("my happy hound")

Binary input

Some models — such as CLIP — can embed raw binary data (e.g. images). Check supports_binary before passing bytes:
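if embedding_model.supports_binary:
    # Read the image bytes inside a context manager so the file is closed promptly
    with open("my-image.jpg", "rb") as image_file:
        vector = embedding_model.embed(image_file.read())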
The embedding_model.supports_text property similarly indicates whether text input is supported.

Embedding multiple strings at once

Many models are more efficient when processing a batch of inputs. Use .embed_multi(), which returns a generator of vectors:
vectors = list(embedding_model.embed_multi(["my happy hound", "my dissatisfied cat"]))
Control the batch size with batch_size=N:
vectors = list(embedding_model.embed_multi(lines_from_file, batch_size=20))
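To make the effect of batch_size concrete, here is a minimal sketch of how an iterable can be consumed in fixed-size batches. The batched() helper is hypothetical and only illustrates the idea; it is not LLM's actual implementation.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


lines = [f"line {i}" for i in range(45)]
batch_sizes = [len(batch) for batch in batched(lines, batch_size=20)]
print(batch_sizes)  # [20, 20, 5]
```

A smaller batch size means more round trips to the model but less data held in memory at any one time.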

Working with collections

The llm.Collection class groups related embeddings in a SQLite database under a single name. Every entry in a collection has a unique string ID, and all entries in a collection are embedded with the same model.

Creating a collection

import sqlite_utils
import llm

# In-memory database — discarded when the process exits
collection = llm.Collection("entries", model_id="3-small")

# Persistent database on disk
db = sqlite_utils.Database("my-embeddings.db")
collection = llm.Collection("entries", db, model_id="3-small")

# Pass a model instance directly instead of an ID
embedding_model = llm.get_embedding_model("3-small")
collection = llm.Collection("entries", db, model=embedding_model)
If the named collection already exists in the database, you can omit model and model_id — the model ID is read from the collections table automatically.
If you pass create=False and the collection does not exist, a Collection.DoesNotExist exception is raised:
try:
    collection = llm.Collection("entries", db, create=False)
except llm.Collection.DoesNotExist:
    print("Collection not found")

Embedding a single item

collection.embed("hound", "my happy hound")
This stores the embedding for "my happy hound" under the key "hound". If an entry with the same content hash already exists, the call is a no-op (deduplication is automatic). Add store=True to also persist the original text in the content column:
collection.embed("hound", "my happy hound", store=True)
Attach arbitrary JSON-serialisable metadata with metadata=:
collection.embed(
    "hound",
    "my happy hound",
    metadata={"name": "Hound"},
    store=True,
)
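The deduplication check described above can be pictured as a hash lookup that happens before any model call. The sketch below is an illustration under stated assumptions, not LLM's code: it hashes the UTF-8 encoded content with MD5 (the schema section describes content_hash as an MD5 digest), while fake_embed() and the in-memory seen_hashes set are hypothetical stand-ins for the model and the database.

```python
import hashlib
from typing import List, Set, Union

seen_hashes: Set[bytes] = set()  # hypothetical stand-in for the content_hash column
model_calls: List[str] = []      # records which inputs actually reached the "model"


def content_hash(value: Union[str, bytes]) -> bytes:
    """MD5 digest of the content; text is UTF-8 encoded first (assumed)."""
    if isinstance(value, str):
        value = value.encode("utf-8")
    return hashlib.md5(value).digest()


def fake_embed(value: str) -> List[float]:
    """Hypothetical embedding function; a real model would be called here."""
    model_calls.append(value)
    return [float(len(value))]


def embed_once(value: str) -> bool:
    """Embed value unless identical content was seen before. Returns True if embedded."""
    digest = content_hash(value)
    if digest in seen_hashes:
        return False  # same content already stored: skip the model call
    fake_embed(value)
    seen_hashes.add(digest)
    return True


print(embed_once("my happy hound"))  # True
print(embed_once("my happy hound"))  # False
print(model_calls)                   # ['my happy hound']
```

The practical benefit is that re-running an ingestion job over unchanged content does not pay for the same embeddings twice.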

Storing embeddings in bulk

collection.embed_multi() accepts an iterable of (id, text) tuples and is more efficient than calling embed() in a loop:
collection.embed_multi(
    [
        ("hound", "my happy hound"),
        ("cat", "my dissatisfied cat"),
    ],
    store=True,
)
To include per-item metadata, use embed_multi_with_metadata() with (id, text, metadata) tuples:
collection.embed_multi_with_metadata(
    [
        ("hound", "my happy hound", {"name": "Hound"}),
        ("cat", "my dissatisfied cat", {"name": "Cat"}),
    ],
    store=True,
)
Both methods accept a batch_size parameter (default 100). Reduce it if you run into memory issues with large collections:
collection.embed_multi(
    ((str(i), line) for i, line in enumerate(lines_in_file)),
    batch_size=10,
)
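One way to build the (id, text) pairs for a file is a generator, so that lines are read lazily instead of being loaded into memory all at once. In this sketch, numbered_lines() is a hypothetical helper and io.StringIO stands in for an open text file; the result is the kind of iterable that embed_multi() is described as accepting.

```python
import io
from typing import Iterable, Iterator, Tuple


def numbered_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (id, text) pairs, skipping blank lines (hypothetical helper).

    IDs are the 1-based line numbers, converted to strings.
    """
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if text:
            yield str(number), text


# io.StringIO stands in for an open text file in this sketch
fake_file = io.StringIO("my happy hound\n\nmy dissatisfied cat\n")
pairs = list(numbered_lines(fake_file))
print(pairs)  # [('1', 'my happy hound'), ('3', 'my dissatisfied cat')]
```

With a real file, the generator could be passed directly as the first argument, for example collection.embed_multi(numbered_lines(f), batch_size=10).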

Retrieving similar items

Once a collection is populated, use similar() to find the entries whose embedding vectors are closest (by cosine similarity) to a query string:
for entry in collection.similar("hound"):
    print(entry.id, entry.score)
Each entry is an Entry dataclass with four fields:
| Field | Type | Description |
| --- | --- | --- |
| id | str | Unique string ID of the item |
| score | float | Cosine similarity score (higher = more similar) |
| content | str or None | Original text, if stored with store=True |
| metadata | dict or None | Metadata dict, if provided at embed time |
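As a rough mental model, the fields listed above correspond to a dataclass along these lines. This is a sketch reconstructed from the field list, not a copy of LLM's source; the defaults are assumptions, and the score value is invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Entry:
    """Sketch of a search result, mirroring the four documented fields."""

    id: str
    score: float
    content: Optional[str] = None              # set only if stored with store=True
    metadata: Optional[Dict[str, Any]] = None  # set only if metadata was provided


entry = Entry(id="hound", score=0.93, metadata={"name": "Hound"})
print(entry.content)           # None
print(entry.metadata["name"])  # Hound
```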
Return fewer or more results with number=:
for entry in collection.similar("hound", number=5):
    print(entry.id, entry.score)
Similarity search uses a brute-force approach, computing scores against every document in the collection. This is fine for smaller collections but will not scale well to very large ones. See issue 216 for plans to support scalable vector index plugins.
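The brute-force strategy can be sketched in a few lines of plain Python: score the query against every stored vector, sort, and keep the top results. The cosine_similarity() and top_matches() functions and the toy three-dimensional vectors are illustrative assumptions, not LLM's implementation.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product of a and b divided by the product of their magnitudes.

    Assumes neither vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(
    query: List[float], stored: Dict[str, List[float]], number: int = 10
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query, then keep the best ones."""
    scored = [(id_, cosine_similarity(query, vec)) for id_, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:number]


# Toy vectors; real embeddings have hundreds or thousands of dimensions
stored = {
    "hound": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.0],
    "puppy": [0.9, 0.1, 0.0],
}
results = top_matches([1.0, 0.0, 0.0], stored, number=2)
print([id_ for id_, _ in results])  # ['hound', 'puppy']
```

Every search touches every stored vector, so the cost grows linearly with the size of the collection, which is why this approach suits smaller collections.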

Searching by existing ID

similar_by_id() looks up the stored vector for an existing entry and returns the most similar neighbours, excluding the item itself:
for entry in collection.similar_by_id("cat"):
    print(entry.id, entry.score)

Searching by raw vector

similar_by_vector() accepts a raw list of floats, which is useful when you already have a pre-computed vector:
query_vector = embedding_model.embed("happy dog")

for entry in collection.similar_by_vector(query_vector, number=5):
    print(entry.id, entry.score)
Pass skip_id= to exclude a specific entry from the results.

Collection class reference

A Collection instance exposes the following properties and methods:
| Member | Description |
| --- | --- |
| id | Integer primary key of the collection row |
| name | Unique string name of the collection |
| model_id | String ID of the embedding model used |
| model() | Returns the live EmbeddingModel instance |
| count() | Returns the number of items in the collection |
| embed(id, value, metadata=None, store=False) | Embed and store a single item |
| embed_multi(entries, store=False, batch_size=100) | Embed and store multiple (id, text) pairs |
| embed_multi_with_metadata(entries, store=False, batch_size=100) | Embed and store multiple (id, text, metadata) triples |
| similar(value, number=10) | Find nearest neighbours by query string |
| similar_by_id(id, number=10) | Find nearest neighbours by stored ID |
| similar_by_vector(vector, number=10, skip_id=None) | Find nearest neighbours by raw float vector |
| delete() | Delete the collection and all its embeddings |
Use the Collection.exists() class method to check for a collection’s existence before working with it:
if llm.Collection.exists(db, "entries"):
    print("The entries collection exists")

SQL schema

The embeddings database contains two tables:
CREATE TABLE [collections] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [model] TEXT
)
CREATE TABLE "embeddings" (
   [collection_id] INTEGER REFERENCES [collections]([id]),
   [id] TEXT,
   [embedding] BLOB,
   [content] TEXT,
   [content_blob] BLOB,
   [content_hash] BLOB,
   [metadata] TEXT,
   [updated] INTEGER,
   PRIMARY KEY ([collection_id], [id])
)
  • collections.model stores the full model ID (e.g. text-embedding-3-small).
  • embeddings.embedding is a binary blob of little-endian 32-bit floats. See Storage Format for encode/decode details.
  • embeddings.content_hash is an MD5 digest of the original content, used to skip re-embedding identical inputs.
  • embeddings.metadata is a JSON string, or NULL if no metadata was provided.
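
Because the schema is plain SQLite, stored vectors can in principle be read without the llm package. The following sketch creates the two tables in memory with Python's built-in sqlite3 module and round-trips a vector through the little-endian 32-bit float layout described above. The encode() and decode() helpers are written from that description and are assumptions; the row values are made up for illustration.

```python
import sqlite3
import struct
from typing import List


def encode(values: List[float]) -> bytes:
    """Pack floats as consecutive little-endian 32-bit values."""
    return struct.pack(f"<{len(values)}f", *values)


def decode(blob: bytes) -> List[float]:
    """Unpack a blob of little-endian 32-bit floats (4 bytes each)."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT, model TEXT)")
db.execute(
    """CREATE TABLE embeddings (
        collection_id INTEGER REFERENCES collections(id),
        id TEXT, embedding BLOB, content TEXT, content_blob BLOB,
        content_hash BLOB, metadata TEXT, updated INTEGER,
        PRIMARY KEY (collection_id, id)
    )"""
)
db.execute(
    "INSERT INTO collections (id, name, model) VALUES (?, ?, ?)",
    (1, "entries", "text-embedding-3-small"),
)
db.execute(
    "INSERT INTO embeddings (collection_id, id, embedding) VALUES (?, ?, ?)",
    (1, "hound", encode([0.5, -0.25, 1.0])),
)

blob = db.execute(
    "SELECT embedding FROM embeddings WHERE collection_id = ? AND id = ?",
    (1, "hound"),
).fetchone()[0]
print(len(blob))     # 12
print(decode(blob))  # [0.5, -0.25, 1.0]
```

Each float occupies four bytes, so the length of the blob divided by four gives the dimensionality of the stored vector.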
