Overview

Generate text embeddings with any of LiteLLM’s supported embedding providers. Responses are returned in the OpenAI embeddings format, whichever provider handles the request.

Function Signature

def embedding(
    model: str,
    input: Union[str, List[str]] = [],
    # Optional params
    dimensions: Optional[int] = None,
    encoding_format: Optional[str] = None,
    timeout: float = 600,
    # API configuration
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,
    api_type: Optional[str] = None,
    # LiteLLM specific
    caching: bool = False,
    user: Optional[str] = None,
    custom_llm_provider: Optional[str] = None,
    **kwargs
) -> EmbeddingResponse

Parameters

Required Parameters

model
string
required
The embedding model to use. Examples:
  • text-embedding-3-small (OpenAI)
  • text-embedding-ada-002 (OpenAI)
  • amazon.titan-embed-text-v1 (Bedrock)
  • textembedding-gecko@003 (Vertex AI)
  • embed-english-v3.0 (Cohere)
input
Union[str, List[str]]
required
Input text to embed. Can be a single string or array of strings.
# Single string
input="The quick brown fox"

# Multiple strings
input=["First text", "Second text", "Third text"]

Optional Parameters

dimensions
int
Number of dimensions for the output embeddings. Only supported by some models (e.g., OpenAI's text-embedding-3 models and later).
dimensions=512  # Reduce from the 1536 default of text-embedding-3-small
encoding_format
string
default:"float"
Format to return embeddings in. Options:
  • "float": Array of floats
  • "base64": Base64 encoded string
timeout
float
default:"600"
Request timeout in seconds (default 10 minutes).
user
string
Unique identifier for your end-user, for abuse monitoring.
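
When encoding_format="base64" is requested, each embedding arrives as a string instead of a list of floats. The following is a minimal sketch of decoding such a string, assuming the provider packs the vector as little-endian 32-bit floats (as OpenAI's API does). The helper name decode_base64_embedding is hypothetical and not part of LiteLLM.

```python
import base64
import struct
from typing import List


def decode_base64_embedding(encoded: str) -> List[float]:
    """Decode a base64 string of packed little-endian float32 values.

    Assumption: the provider uses the float32 layout described above.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" = little-endian, "f" = 32-bit float, repeated `count` times.
    return list(struct.unpack(f"<{count}f", raw))
```

If a provider uses a different byte layout, the format string passed to struct.unpack would need to change accordingly.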

API Configuration

api_key
string
API key for the provider. If not provided, uses environment variables.
api_base
string
Base URL for the API endpoint.
api_version
string
API version to use (provider-specific).
api_type
string
API type (e.g., “azure” for Azure OpenAI).

LiteLLM Specific

caching
bool
default:"false"
Enable response caching.
custom_llm_provider
string
Override the provider detection. Example: custom_llm_provider="bedrock"
metadata
dict
Additional metadata to tag the request. It is not a named parameter in the signature above; it is passed through **kwargs.

Response

EmbeddingResponse

object
string
Object type, always “list”.
data
List[Embedding]
List of embedding objects.
model
string
Model used for embeddings.
usage
Usage
Token usage information.
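
The fields above can be read in a few lines. Depending on the LiteLLM version and provider, the items in data may be attribute-style objects or plain dicts, so this sketch accepts both. It assumes each item carries an embedding field and, optionally, an index field as in the OpenAI format. The helper names are hypothetical and not part of LiteLLM.

```python
from typing import Any, List


def _field(item: Any, name: str, default: Any = None) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(item, dict):
        return item.get(name, default)
    return getattr(item, name, default)


def extract_vectors(response: Any) -> List[List[float]]:
    """Return the embedding vectors from a response, ordered by `index`."""
    items = list(_field(response, "data", []))

    def sort_key(pair):
        position, item = pair
        index = _field(item, "index")
        # Fall back to arrival order when the item has no usable index.
        return (position if index is None else index, position)

    ordered = sorted(enumerate(items), key=sort_key)
    return [list(_field(item, "embedding")) for _, item in ordered]
```

Sorting by index is defensive: it keeps each vector aligned with its input text even if items arrive out of order.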

Usage Examples

Basic Embedding

import litellm

response = litellm.embedding(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog"
)

print(response.data[0].embedding)  # [0.123, -0.456, ...]
print(len(response.data[0].embedding))  # 1536

Batch Embeddings

import litellm

texts = [
    "First document to embed",
    "Second document to embed",
    "Third document to embed"
]

response = litellm.embedding(
    model="text-embedding-3-small",
    input=texts
)

for i, embedding_obj in enumerate(response.data):
    print(f"Document {i}: {len(embedding_obj.embedding)} dimensions")
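
For large corpora, a single request with every text may exceed a provider's per-request limits. The following sketch splits the input into smaller batches and concatenates the results. The function names and the default batch_size of 100 are assumptions for illustration; check your provider's limits. The embed_batch callable is a stand-in for whatever wraps litellm.embedding in your code.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 100,  # assumed value, not a LiteLLM default
) -> List[List[float]]:
    """Embed `texts` batch by batch, preserving the input order."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise RuntimeError("embedding count does not match batch size")
        vectors.extend(result)
    return vectors


# Possible wiring with LiteLLM (not executed here):
# embed_batch = lambda batch: [
#     item.embedding
#     for item in litellm.embedding(model="text-embedding-3-small", input=batch).data
# ]
```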

Async Embeddings

import litellm
import asyncio

async def main():
    response = await litellm.aembedding(
        model="text-embedding-3-small",
        input="Async embedding example"
    )
    print(response.data[0].embedding)

asyncio.run(main())
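
The async variant becomes more useful when several requests run at once. The following sketch issues requests concurrently while capping how many are in flight. The embed_one callable is a hypothetical wrapper around litellm.aembedding, and the max_concurrency default of 5 is an arbitrary assumption.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def embed_concurrently(
    texts: Sequence[str],
    embed_one: Callable[[str], Awaitable[List[float]]],
    max_concurrency: int = 5,  # assumed value; tune to your rate limits
) -> List[List[float]]:
    """Embed each text concurrently, with a cap on in-flight requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(text: str) -> List[float]:
        async with semaphore:
            return await embed_one(text)

    # gather() returns results in the same order as its arguments.
    return list(await asyncio.gather(*(guarded(text) for text in texts)))


# Possible wiring with LiteLLM (not executed here):
# async def embed_one(text):
#     response = await litellm.aembedding(model="text-embedding-3-small", input=text)
#     return response.data[0].embedding
```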

Custom Dimensions

import litellm

# Reduce embedding dimensions for smaller storage
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Sample text",
    dimensions=512  # Instead of default 1536
)

print(len(response.data[0].embedding))  # 512
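
To put the storage saving in numbers, the following sketch estimates raw vector storage. It assumes each value is stored as a 4-byte float32 and ignores index or metadata overhead. The function name is hypothetical.

```python
def embedding_storage_bytes(
    num_vectors: int,
    dimensions: int,
    bytes_per_value: int = 4,  # assumption: float32 storage
) -> int:
    """Estimate raw storage for a set of embeddings, excluding overhead."""
    return num_vectors * dimensions * bytes_per_value


# One million vectors at the two sizes used in this section.
full = embedding_storage_bytes(1_000_000, 1536)
reduced = embedding_storage_bytes(1_000_000, 512)
print(f"{full / 1e9:.3f} GB vs {reduced / 1e9:.3f} GB")
```

Under these assumptions, reducing from 1536 to 512 dimensions cuts raw storage to one third.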

Multiple Providers

import litellm

# OpenAI
response = litellm.embedding(
    model="text-embedding-3-small",
    input="Hello world"
)

# Cohere
response = litellm.embedding(
    model="embed-english-v3.0",
    input="Hello world"
)

# AWS Bedrock
response = litellm.embedding(
    model="amazon.titan-embed-text-v1",
    input="Hello world"
)

# Azure OpenAI
response = litellm.embedding(
    model="azure/text-embedding-ada-002",
    input="Hello world",
    api_key="your-azure-key",
    api_base="https://your-endpoint.openai.azure.com/",
    api_version="2024-02-01"
)

# Vertex AI
response = litellm.embedding(
    model="textembedding-gecko@003",
    input="Hello world"
)

Semantic Search Example

import litellm
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents to search
documents = [
    "Python is a programming language",
    "JavaScript is used for web development",
    "Machine learning is a subset of AI"
]

# Get embeddings for all documents
response = litellm.embedding(
    model="text-embedding-3-small",
    input=documents
)

doc_embeddings = [item.embedding for item in response.data]

# Query
query = "What is Python?"
query_response = litellm.embedding(
    model="text-embedding-3-small",
    input=query
)
query_embedding = query_response.data[0].embedding

# Find most similar document
similarities = [
    cosine_similarity(query_embedding, doc_emb)
    for doc_emb in doc_embeddings
]

most_similar_idx = np.argmax(similarities)
print(f"Most similar: {documents[most_similar_idx]}")
print(f"Similarity: {similarities[most_similar_idx]:.4f}")
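
The example above returns only the single best match. The following dependency-free sketch ranks all documents and returns the top k, using the same cosine similarity measure. The function names are hypothetical, and the vectors are expected to come from a call such as the ones shown above.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    docs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs for the k best matches."""
    scored = [(i, cosine(query, doc)) for i, doc in enumerate(docs)]
    # Highest similarity first; ties keep their original document order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

For larger collections, a vectorized NumPy version or a dedicated vector index would likely be a better fit than this linear scan.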

Provider-Specific Examples

Cohere with Input Type

import litellm

response = litellm.embedding(
    model="embed-english-v3.0",
    input="Sample text",
    input_type="search_document"  # or "search_query", "classification"
)
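
In a retrieval setting, documents and queries are typically embedded with different input_type values. The following sketch builds the keyword arguments for each role using the model and values shown above. The helper name and the role labels are hypothetical.

```python
from typing import Any, Dict, List, Union

# Hypothetical mapping from a caller-facing role to Cohere's input_type.
_INPUT_TYPES = {
    "document": "search_document",
    "query": "search_query",
}


def cohere_embedding_kwargs(
    texts: Union[str, List[str]],
    role: str,
    model: str = "embed-english-v3.0",
) -> Dict[str, Any]:
    """Build keyword arguments for an embedding call for the given role."""
    if role not in _INPUT_TYPES:
        raise ValueError(f"role must be one of {sorted(_INPUT_TYPES)}")
    return {"model": model, "input": texts, "input_type": _INPUT_TYPES[role]}


# Possible usage (not executed here):
# response = litellm.embedding(**cohere_embedding_kwargs(["doc one"], "document"))
```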

Vertex AI Multimodal Embeddings

import litellm

response = litellm.embedding(
    model="multimodalembedding@001",
    input="Sample text"
)

Error Handling

import litellm
from litellm import AuthenticationError, RateLimitError

try:
    response = litellm.embedding(
        model="text-embedding-3-small",
        input="Sample text"
    )
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
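
Rate limit errors are often transient, so retrying with a growing delay is a common pattern. The following is a generic sketch, not a LiteLLM feature. The function name, the default of three attempts, and the doubling delay are all assumptions. Pass the exception types to retry on, for example (RateLimitError,).

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,      # assumed default
    base_delay: float = 1.0,    # seconds before the first retry
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Possible usage (not executed here):
# response = call_with_retries(
#     lambda: litellm.embedding(model="text-embedding-3-small", input="Sample text"),
#     retry_on=(RateLimitError,),
# )
```

Authentication errors are deliberately left out of retry_on in the usage comment, since retrying with the same invalid key will not succeed.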

Supported Providers

LiteLLM supports embeddings from:
  • OpenAI: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
  • Azure OpenAI: All OpenAI embedding models
  • Cohere: embed-english-v3.0, embed-multilingual-v3.0
  • AWS Bedrock: amazon.titan-embed-text-v1, cohere.embed-*
  • Google Vertex AI: textembedding-gecko, text-embedding-004
  • Hugging Face: All embedding models
  • Voyage AI: voyage-2, voyage-code-2
  • Together AI: togethercomputer/m2-bert-80M-*
  • And many more!
See Embedding Providers for the complete list.
