Documentation Index
Fetch the complete documentation index at: https://mintlify.com/jundot/omlx/llms.txt
Use this file to discover all available pages before exploring further.
The /v1/embeddings endpoint generates dense vector representations of text using embedding models. It is fully compatible with the OpenAI Embeddings API, so any library using openai.embeddings.create(...) works by changing only the base URL. oMLX auto-detects embedding model families — BERT, BGE-M3, and ModernBERT are supported — and routes requests to the appropriate engine automatically.
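Because the endpoint speaks the OpenAI wire format, any HTTP client can call it. Below is a minimal stdlib-only sketch; the server address (`http://localhost:8000`) and the model alias (`bge-m3`) are assumptions for illustration, not defaults guaranteed by oMLX:

```python
import json
import urllib.request

# Assumed server address -- adjust for your deployment.
OMLX_EMBEDDINGS_URL = "http://localhost:8000/v1/embeddings"

def build_request(model: str, texts: list[str]) -> dict:
    """Assemble the JSON body for POST /v1/embeddings."""
    return {"model": model, "input": texts}

def embed(texts: list[str], model: str = "bge-m3") -> list[list[float]]:
    """Send texts to the server; returns one vector per input, in order."""
    body = json.dumps(build_request(model, texts)).encode("utf-8")
    req = urllib.request.Request(
        OMLX_EMBEDDINGS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # Results come back in the same order as the input strings.
    return [item["embedding"] for item in payload["data"]]
```

The same call also works through the official `openai` Python client by pointing its `base_url` at the server.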
Supported models
| Family | Examples |
|---|---|
| BERT | bert-base-uncased, bert-large-uncased |
| BGE-M3 | bge-m3, bge-large-en-v1.5 |
| ModernBERT | modernbert-base, modernbert-large |
Point --model-dir at a directory containing MLX-format embedding model subdirectories. The model type is detected automatically; no manual configuration is needed.
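For illustration, a layout of the following shape would be discovered automatically (the subdirectory names here are examples, not requirements):

```text
models/
├── bge-m3/              # MLX-format weights, tokenizer, config
└── modernbert-base/     # family detected automatically at startup
```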
Request
POST /v1/embeddings
Parameters
- `model` (string, required) — The embedding model name or alias to use. Must match an embedding model discovered in your model directory.
- `input` (string or array of strings) — The text to embed. Accepts a single string or an array of strings; each string is embedded independently. Either `input` or `items` must be provided, but not both.
- `items` (array of objects) — Structured input for multimodal embedding models. Each item is an object with at least one of `text` (string) or `image` (string). Mutually exclusive with `input`.
- `encoding_format` (string, optional) — Format of the returned embedding vector. `"float"` returns a JSON array of numbers; `"base64"` returns a base64-encoded string of little-endian 32-bit floats, suitable for compact storage.
- `dimensions` (integer, optional) — Truncate the output embedding to this many dimensions. Only supported by models that allow dimension reduction; if the model does not support it, the full-dimension vector is returned.
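When `encoding_format` is `"base64"`, the vector arrives as packed little-endian float32 bytes. A sketch of decoding it with only the standard library:

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 string of little-endian 32-bit floats into a list."""
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip check: pack three float32 values, then decode them back.
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode()
```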
Examples
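A representative request body (the model alias `bge-m3` is an assumption; use a model present in your model directory):

```json
{
  "model": "bge-m3",
  "input": ["The quick brown fox", "jumps over the lazy dog"],
  "encoding_format": "float"
}
```

Passing a single string in `input` is equally valid; the response's `data` array then contains one entry.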
Response
- `object` (string) — Always `"list"`.
- `data` (array) — List of embedding results, one per input string, in the same order as the input.
- `model` (string) — The embedding model used.
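Each entry in `data` carries an embedding vector. Once parsed, vectors are typically compared by cosine similarity; a dependency-free sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical vectors score 1.0; orthogonal vectors score 0.0.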