The /v1/completions endpoint provides legacy text completion behavior: you supply a raw text prompt and the model continues it. This differs from chat completions, which use a structured message list. The endpoint is maintained for compatibility with older OpenAI SDK versions and tools that target the original text-davinci-003-style interface.
For modern chat-oriented models such as GPT-4o, Claude, or Gemini, prefer /v1/chat/completions. The completions endpoint is best suited for base models or instruct-tuned models that expect a raw prompt rather than a conversation.
Method and path
Authentication
Include your Bearer token in the Authorization header on every request.
Request body
Model name, alias, or model@provider syntax. MonoRelay resolves the model through the same routing rules used by the chat endpoint.
The prompt text (or array of prompts) to complete. The model generates a continuation starting from where this text ends.
Maximum number of tokens to generate. When omitted, the upstream provider’s default limit applies.
Sampling temperature between 0 and 2. Lower values produce more focused, deterministic output.
When true, the response is delivered as SSE stream chunks ending with data: [DONE].
One or more sequences at which generation should stop. The stop sequence is not included in the output.
Number of completion choices to generate for the prompt.
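As a rough illustration of consuming a streamed response, the sketch below reads SSE lines until the data: [DONE] sentinel. It assumes each chunk is a JSON object shaped like an OpenAI legacy completions chunk, with the generated text under choices[0].text; that shape is not confirmed by this page, and iter_stream_text is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        # Skip blank separators, comments, and non-data SSE fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # Assumed chunk shape: {"choices": [{"text": "..."}]}
        for choice in chunk.get("choices", []):
            text = choice.get("text")
            if text:
                yield text
```

Any iterable of lines can be passed in, for example the response object returned by urllib.request.urlopen, which iterates over raw byte lines.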
Example
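The following is a minimal sketch of building a request, not an official client. It assumes the endpoint accepts POST with a JSON body and that the body fields use the OpenAI legacy completions names (model, prompt, max_tokens, temperature, stream, stop, n). The base URL, token, and model name in the usage note are placeholders, and build_completion_request is a hypothetical helper.

```python
import json
import urllib.request


def build_completion_request(base_url, token, model, prompt, **options):
    """Build (but do not send) a request for /v1/completions.

    Field names follow the OpenAI legacy completions API; this is an
    assumption, since the parameter names are not spelled out on this page.
    """
    body = {"model": model, "prompt": prompt}
    # Drop options left as None so the upstream provider's defaults apply.
    body.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumed: the endpoint takes a JSON request body
    )
```

To send it, pass the result to urllib.request.urlopen, for instance with build_completion_request("http://localhost:8787", "YOUR_TOKEN", "my-base-model", "Once upon a time", max_tokens=64), where every argument value is a placeholder.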
Error responses
Errors follow the same structure as all MonoRelay endpoints, returning a JSON body with an error object and HTTP 503 for upstream failures.
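One way a client might interpret such a response is sketched below. It assumes the error object carries a message field, as in OpenAI-style errors; the exact fields are not listed on this page, and describe_error is a hypothetical helper.

```python
import json


def describe_error(status, body):
    """Summarize an error response from its HTTP status and raw body."""
    try:
        error = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        error = {}
    # Assumed field: "message" inside the error object.
    if isinstance(error, dict):
        message = error.get("message", "unknown error")
    else:
        message = str(error)
    kind = "upstream failure" if status == 503 else "request error"
    return f"{kind} (HTTP {status}): {message}"
```

Treating 503 separately lets a caller decide whether to retry or switch providers, while other statuses usually point at the request itself.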