LLM Python API for Embeddings: Collection Class Guide

LLM exposes a complete Python API for working with embeddings. You can load an embedding model, compute vectors for individual strings or binary data, and persist those vectors in named collections backed by SQLite. Collections support bulk ingestion, deduplication via content hashing, and cosine-similarity search — all without running the CLI.

Loading a model

Use llm.get_embedding_model() with a model ID or alias:

import llm

embedding_model = llm.get_embedding_model("3-small")

Embedding a string

Call .embed() on the model to compute a vector for a single string. This returns a Python list of floating-point numbers:

vector = embedding_model.embed("my happy hound")

Binary input

Some models — such as CLIP — can embed raw binary data (e.g. images). Check supports_binary before passing bytes:

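if embedding_model.supports_binary:
    # Read the image inside a context manager so the file handle is closed promptly
    with open("my-image.jpg", "rb") as image_file:
        vector = embedding_model.embed(image_file.read())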

The embedding_model.supports_text property similarly indicates whether text input is supported.

Embedding multiple strings at once

Many models are more efficient when processing a batch of inputs. Use .embed_multi(), which returns a generator of vectors:

vectors = list(embedding_model.embed_multi(["my happy hound", "my dissatisfied cat"]))

Control the batch size with batch_size=N:

vectors = list(embedding_model.embed_multi(lines_from_file, batch_size=20))
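How the library splits its input internally is not described in this guide. As a rough picture of what a batch size means, the sketch below chunks any iterable into lists of at most batch_size items using only the standard library. The helper name batches is hypothetical and is not part of llm.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the input is exhausted
        yield batch


# 45 inputs with batch_size=20 give two full batches and one partial batch.
lines = [f"line {n}" for n in range(45)]
batch_lengths = [len(batch) for batch in batches(lines, batch_size=20)]
print(batch_lengths)  # [20, 20, 5]
```

Because the helper consumes its input lazily, it also works with generators and open files, which is the situation where a smaller batch size tends to matter.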

Working with collections

The llm.Collection class groups related embeddings in a SQLite database under a single name. Every entry in a collection has a unique string ID and was created with the same embedding model.

Creating a collection

import sqlite_utils
import llm

# In-memory database — discarded when the process exits
collection = llm.Collection("entries", model_id="3-small")

# Persistent database on disk
db = sqlite_utils.Database("my-embeddings.db")
collection = llm.Collection("entries", db, model_id="3-small")

# Pass a model instance directly instead of an ID
embedding_model = llm.get_embedding_model("3-small")
collection = llm.Collection("entries", db, model=embedding_model)

If the named collection already exists in the database, you can omit model and model_id — the model ID is read from the collections table automatically.

If you pass create=False and the collection does not exist, a Collection.DoesNotExist exception is raised:

try:
    collection = llm.Collection("entries", db, create=False)
except llm.Collection.DoesNotExist:
    print("Collection not found")

Embedding a single item

collection.embed("hound", "my happy hound")

This stores the embedding for "my happy hound" under the key "hound". If an entry with the same content hash already exists, the call is a no-op (deduplication is automatic). Add store=True to also persist the original text in the content column:

collection.embed("hound", "my happy hound", store=True)

Attach arbitrary JSON-serialisable metadata with metadata=:

collection.embed(
    "hound",
    "my happy hound",
    metadata={"name": "Hound"},
    store=True,
)
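The deduplication described above can be pictured with a small stand-in. This is a sketch, not the library's implementation: TinyStore and fake_embed are hypothetical names, the hash is assumed to be MD5 (as the schema notes in this guide state), and the "model" is a toy function that counts how often it is called.

```python
import hashlib
from typing import Callable, Dict, List, Set


class TinyStore:
    """Hypothetical stand-in for a collection; not part of llm."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self.embed_fn = embed_fn
        self.hashes: Set[bytes] = set()
        self.vectors: Dict[str, List[float]] = {}

    def embed(self, id: str, text: str) -> bool:
        """Store a vector for text; return False if the content was already seen."""
        digest = hashlib.md5(text.encode("utf-8")).digest()
        if digest in self.hashes:
            return False  # identical content: skip the model call entirely
        self.vectors[id] = self.embed_fn(text)
        self.hashes.add(digest)
        return True


model_calls = []


def fake_embed(text: str) -> List[float]:
    """Toy embedding function that records each call."""
    model_calls.append(text)
    return [float(len(text))]


store = TinyStore(fake_embed)
first = store.embed("hound", "my happy hound")
second = store.embed("hound", "my happy hound")
print(first, second, len(model_calls))  # True False 1
```

The point of hashing before embedding is that the expensive step, the model call, is skipped for content that has already been processed.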

Storing embeddings in bulk

collection.embed_multi() accepts an iterable of (id, text) tuples and is more efficient than calling embed() in a loop:

collection.embed_multi(
    [
        ("hound", "my happy hound"),
        ("cat", "my dissatisfied cat"),
    ],
    store=True,
)

To include per-item metadata, use embed_multi_with_metadata() with (id, text, metadata) tuples:

collection.embed_multi_with_metadata(
    [
        ("hound", "my happy hound", {"name": "Hound"}),
        ("cat", "my dissatisfied cat", {"name": "Cat"}),
    ],
    store=True,
)

Both methods accept a batch_size parameter (default 100). Reduce it if you run into memory issues with large collections:

collection.embed_multi(
    # Convert the line index to a string, since entry IDs are strings
    ((str(i), line) for i, line in enumerate(lines_in_file)),
    batch_size=10,
)
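When the entries come from a file, it can help to build the (id, text) pairs with a small generator so that nothing is loaded into memory up front. The sketch below uses a hypothetical helper, numbered_entries, with io.StringIO standing in for an open text file. Skipping blank lines and using 1-based line numbers as IDs are illustrative choices, not requirements of the library.

```python
import io
from typing import Iterable, Iterator, Tuple


def numbered_entries(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (id, text) pairs, using the 1-based line number as a string ID."""
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        if text:  # skip blank lines so they are never sent to the model
            yield str(line_number), text


# io.StringIO stands in for an open text file in this demonstration.
fake_file = io.StringIO("my happy hound\n\nmy dissatisfied cat\n")
pairs = list(numbered_entries(fake_file))
print(pairs)  # [('1', 'my happy hound'), ('3', 'my dissatisfied cat')]
```

A generator like this could be passed directly as the first argument to collection.embed_multi(), together with a batch_size that suits the available memory.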

Retrieving similar items

Once a collection is populated, use similar() to find the entries whose embedding vectors are closest (by cosine similarity) to a query string:

for entry in collection.similar("hound"):
    print(entry.id, entry.score)

Each entry is an Entry dataclass with four fields:

Field	Type	Description
`id`	`str`	Unique string ID of the item
`score`	`float`	Cosine similarity score (higher = more similar)
`content`	`str | None`	Original text, if stored with `store=True`
`metadata`	`dict | None`	Metadata dict, if provided at embed time
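To make the shape of a result concrete, the sketch below mirrors the four fields in the table with a stand-in dataclass and filters a list of results by score. EntrySketch and above_threshold are hypothetical names, and the scores are made-up sample values rather than output from a real model.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class EntrySketch:
    """Illustrative mirror of the four fields in the table; not llm's own class."""

    id: str
    score: float
    content: Optional[str] = None
    metadata: Optional[Dict[str, Any]] = None


def above_threshold(entries: List[EntrySketch], minimum: float) -> List[EntrySketch]:
    """Keep only the entries whose score is at least minimum."""
    return [entry for entry in entries if entry.score >= minimum]


# Made-up sample results; content and metadata stay None when nothing was stored.
sample = [
    EntrySketch("hound", 0.91, "my happy hound", {"name": "Hound"}),
    EntrySketch("cat", 0.42),
]
kept = above_threshold(sample, 0.5)
print([entry.id for entry in kept])  # ['hound']
```

Code that reads content or metadata should be prepared for None, since both fields are optional.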

Return fewer or more results with number=:

for entry in collection.similar("hound", number=5):
    print(entry.id, entry.score)

Similarity search uses a brute-force approach, computing scores against every document in the collection. This is fine for smaller collections but will not scale well to very large ones. See issue 216 for plans to support scalable vector index plugins.
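As an illustration of what a brute-force cosine-similarity search involves, the sketch below scores a query vector against every stored vector and sorts the results. It uses only the standard library, and the function names and toy two-dimensional vectors are assumptions made for the example rather than the library's own code.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between a and b (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_similar(
    query: List[float], vectors: Dict[str, List[float]], number: int = 10
) -> List[Tuple[str, float]]:
    """Score the query against every stored vector and return the top matches."""
    scored = [(id, cosine_similarity(query, vector)) for id, vector in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:number]


# Toy two-dimensional vectors; real embeddings have hundreds of dimensions or more.
toy_vectors = {"hound": [1.0, 0.0], "cat": [0.0, 1.0], "puppy": [0.9, 0.1]}
results = brute_force_similar([1.0, 0.0], toy_vectors, number=2)
print([id for id, _ in results])  # ['hound', 'puppy']
```

Every query touches every stored vector, so the cost grows linearly with the size of the collection. That is the scaling limit the paragraph above refers to.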

Searching by existing ID

similar_by_id() looks up the stored vector for an existing entry and returns the most similar neighbours, excluding the item itself:

for entry in collection.similar_by_id("cat"):
    print(entry.id, entry.score)

Searching by raw vector

similar_by_vector() accepts a raw list of floats, which is useful when you already have a pre-computed vector:

query_vector = embedding_model.embed("happy dog")

for entry in collection.similar_by_vector(query_vector, number=5):
    print(entry.id, entry.score)

Pass skip_id= to exclude a specific entry from the results.
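One way to see how skip_id relates the two search methods: a lookup by ID can be expressed as a vector search that excludes the item itself. The sketch below shows that relationship with toy data. The function names ending in _sketch are hypothetical, and nothing here is taken from the library's source.

```python
import heapq
import math
from typing import Dict, List, Optional, Tuple


def normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return vector if norm == 0.0 else [x / norm for x in vector]


def similar_by_vector_sketch(
    vectors: Dict[str, List[float]],
    query: List[float],
    number: int = 10,
    skip_id: Optional[str] = None,
) -> List[Tuple[str, float]]:
    """Return the top matches for query, leaving out skip_id if one is given."""
    unit_query = normalize(query)
    candidates = (
        (sum(a * b for a, b in zip(unit_query, normalize(vector))), id)
        for id, vector in vectors.items()
        if id != skip_id
    )
    return [(id, score) for score, id in heapq.nlargest(number, candidates)]


def similar_by_id_sketch(
    vectors: Dict[str, List[float]], id: str, number: int = 10
) -> List[Tuple[str, float]]:
    """Search with the stored vector for id, excluding that entry from the results."""
    return similar_by_vector_sketch(vectors, vectors[id], number, skip_id=id)


toy_vectors = {"hound": [1.0, 0.0], "puppy": [0.9, 0.1], "cat": [0.0, 1.0]}
neighbour_ids = [id for id, _ in similar_by_id_sketch(toy_vectors, "hound", number=2)]
print(neighbour_ids)  # ['puppy', 'cat']
```

Without the exclusion, the entry being looked up would always come first, because any non-zero vector has a cosine similarity of 1 with itself.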

Collection class reference

A Collection instance exposes the following properties and methods:

Member	Description
`id`	Integer primary key of the collection row
`name`	Unique string name of the collection
`model_id`	String ID of the embedding model used
`model()`	Returns the live `EmbeddingModel` instance
`count()`	Returns the number of items in the collection
`embed(id, value, metadata=None, store=False)`	Embed and store a single item
`embed_multi(entries, store=False, batch_size=100)`	Embed and store multiple `(id, text)` pairs
`embed_multi_with_metadata(entries, store=False, batch_size=100)`	Embed and store multiple `(id, text, metadata)` triples
`similar(value, number=10)`	Find nearest neighbours by query string
`similar_by_id(id, number=10)`	Find nearest neighbours by stored ID
`similar_by_vector(vector, number=10, skip_id=None)`	Find nearest neighbours by raw float vector
`delete()`	Delete the collection and all its embeddings

Use the Collection.exists() class method to check for a collection’s existence before working with it:

if llm.Collection.exists(db, "entries"):
    print("The entries collection exists")

SQL schema

The embeddings database contains two tables:

CREATE TABLE [collections] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [model] TEXT
)
CREATE TABLE "embeddings" (
   [collection_id] INTEGER REFERENCES [collections]([id]),
   [id] TEXT,
   [embedding] BLOB,
   [content] TEXT,
   [content_blob] BLOB,
   [content_hash] BLOB,
   [metadata] TEXT,
   [updated] INTEGER,
   PRIMARY KEY ([collection_id], [id])
)

collections.model stores the full model ID (e.g. text-embedding-3-small).
embeddings.embedding is a binary blob of little-endian 32-bit floats. See Storage Format for encode/decode details.
embeddings.content_hash is an MD5 digest of the original content, used to skip re-embedding identical inputs.
embeddings.metadata is a JSON string, or NULL if no metadata was provided.
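The notes above can be exercised end to end with the standard library alone. The sketch below creates the two tables in an in-memory SQLite database, writes one row by hand, and reads it back. The helpers encode_floats and decode_floats are hypothetical names written for this example, based on the little-endian 32-bit float layout described above, and the three-element vector is a made-up sample value.

```python
import hashlib
import json
import sqlite3
import struct
from typing import List

SCHEMA = """
CREATE TABLE [collections] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [model] TEXT
);
CREATE TABLE "embeddings" (
   [collection_id] INTEGER REFERENCES [collections]([id]),
   [id] TEXT,
   [embedding] BLOB,
   [content] TEXT,
   [content_blob] BLOB,
   [content_hash] BLOB,
   [metadata] TEXT,
   [updated] INTEGER,
   PRIMARY KEY ([collection_id], [id])
);
"""


def encode_floats(values: List[float]) -> bytes:
    """Pack floats as little-endian 32-bit values (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_floats(blob: bytes) -> List[float]:
    """Unpack a blob of little-endian 32-bit floats back into a list."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO collections (id, name, model) VALUES (?, ?, ?)",
    (1, "entries", "text-embedding-3-small"),
)

text = "my happy hound"
conn.execute(
    "INSERT INTO embeddings (collection_id, id, embedding, content, content_hash, metadata)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    (
        1,
        "hound",
        encode_floats([0.5, -1.25, 2.0]),  # made-up vector, exact in 32-bit floats
        text,
        hashlib.md5(text.encode("utf-8")).digest(),
        json.dumps({"name": "Hound"}),
    ),
)

# Read the row back by collection name and decode each column.
entry_id, blob, metadata_json = conn.execute(
    """
    SELECT embeddings.id, embeddings.embedding, embeddings.metadata
    FROM embeddings
    JOIN collections ON collections.id = embeddings.collection_id
    WHERE collections.name = ?
    """,
    ("entries",),
).fetchone()
vector = decode_floats(blob)
metadata = json.loads(metadata_json)
print(entry_id, len(blob), vector, metadata)
```

Each float occupies four bytes, so the length of the blob divided by four gives the dimensionality of the stored vector. Values that are not exactly representable in 32 bits will come back slightly rounded.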
