The /v1/embeddings endpoint converts text into dense numerical vectors (embeddings) that capture semantic meaning. These vectors are useful for semantic search, retrieval-augmented generation (RAG), clustering, and similarity ranking. MonoRelay routes the request to the appropriate provider based on your model routing configuration and returns the embeddings in the standard OpenAI format.

Method and path

POST /v1/embeddings

Authentication

Pass your access token as a Bearer token in the Authorization header:
Authorization: Bearer <your-access-token>
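
As a rough illustration of the method, path, and header described above, the following sketch builds the request with Python's standard library without sending it. The host `relay.example.com`, the token `my-token`, and the helper name `build_embeddings_request` are placeholders invented for this example, not values from MonoRelay.

```python
import json
import urllib.request


def build_embeddings_request(host, token, model, texts):
    """Build (but do not send) a POST /v1/embeddings request.

    `host` and `token` are caller-supplied placeholders.
    """
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/v1/embeddings",
        data=body,
        method="POST",
        headers={
            # The access token travels as a Bearer credential.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical host and token, for illustration only.
req = build_embeddings_request(
    "relay.example.com", "my-token", "text-embedding-3-small", ["hello"]
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can run without a live server.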

Request body

model (string, required)
The embedding model to use, such as text-embedding-3-small or text-embedding-ada-002. Accepts aliases and model@provider syntax.

input (string | string[], required)
The text to embed. Pass a single string for one embedding, or an array of strings to embed multiple texts in a single request. All strings are processed as a batch by the upstream provider.

encoding_format (string, default: "float")
The format of the returned embeddings. Use "float" for a list of floating-point numbers, or "base64" for a base64-encoded binary representation. Not all providers support "base64".
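
If you request "base64", the client has to unpack the binary payload itself. The sketch below assumes the bytes are little-endian 32-bit floats, which is the convention used by the OpenAI API. This page does not specify the layout, so verify it against your provider before relying on it. The helper name `decode_base64_embedding` is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats.

    Assumption: the payload is packed as little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip check using values that float32 represents exactly.
sample = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode("ascii")
print(decode_base64_embedding(sample))  # [0.5, -1.0, 2.0]
```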

Response

object (string)
Always "list".

data (object[])
Array of embedding objects, one per input string, in the same order as the input.

model (string)
The model name as returned by the upstream provider.

usage (object)
Token usage for the request.
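
To make the response shape concrete, here is a minimal sketch that reads a response dictionary and returns the vectors in input order. The sample payload is hand-written for illustration. Its per-item `object` field and the `usage` keys follow the standard OpenAI format and are assumptions here, since this page does not list them. The helper name `embeddings_in_input_order` is hypothetical.

```python
def embeddings_in_input_order(response):
    """Return the embedding vectors from a response dict, sorted by index."""
    if response.get("object") != "list":
        raise ValueError(f"unexpected object type: {response.get('object')!r}")
    # The data array should already be in input order; sorting by index
    # is a defensive step, not something the documentation requires.
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hand-written sample for illustration; not real model output.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 4, "total_tokens": 4},  # assumed key names
}

vectors = embeddings_in_input_order(sample_response)
print(len(vectors), "vectors for model", sample_response["model"])
```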

Examples

from openai import OpenAI

# Point the OpenAI client at your MonoRelay host.
client = OpenAI(
    base_url="https://<host>/v1",
    api_key="<your-access-token>",
)

# Embed two texts in a single batched request.
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["MonoRelay is an LLM relay server.", "It supports multiple providers."],
)

# Each item carries its input position and the embedding vector.
for item in response.data:
    print(f"Index {item.index}: {len(item.embedding)}-dimensional vector")
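
Once you have the vectors, similarity ranking is typically done with cosine similarity. The following is a minimal pure-Python sketch using toy two-dimensional vectors rather than real embeddings. The function names are hypothetical, and a production system would more likely use a vector library or database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for embeddings.
toy_query = [1.0, 0.0]
toy_candidates = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
print(rank_by_similarity(toy_query, toy_candidates))  # [1, 2, 0]
```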

Error responses

Errors are returned as a JSON object with HTTP status 503. The message field begins with the provider name in square brackets, which makes it easier to identify the failing upstream:
{
  "error": {
    "message": "[openai] No available keys for provider 'openai'",
    "type": "no_keys"
  }
}
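
Because the provider name is embedded in the message, a client can split it out for logging or retry decisions. The sketch below parses the example body shown above. The helper name `parse_relay_error` and the fallback for messages without a bracketed prefix are assumptions made for this example.

```python
import json
import re

# Matches a leading "[provider]" tag followed by the rest of the message.
_PROVIDER_PREFIX = re.compile(
    r"^\[(?P<provider>[^\]]+)\]\s*(?P<detail>.*)$", re.DOTALL
)


def parse_relay_error(body):
    """Split an error body into a (provider, detail, type) tuple.

    `provider` is None when the message has no bracketed prefix.
    """
    error = json.loads(body)["error"]
    message = error.get("message", "")
    match = _PROVIDER_PREFIX.match(message)
    if match is None:
        return None, message, error.get("type")
    return match.group("provider"), match.group("detail"), error.get("type")


sample_body = """
{
  "error": {
    "message": "[openai] No available keys for provider 'openai'",
    "type": "no_keys"
  }
}
"""
print(parse_relay_error(sample_body))
```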
