POST /v1/completions

Generates a text completion for a given prompt. This is the legacy completions endpoint, distinct from the chat completions endpoint; it is maintained for compatibility with older OpenAI integrations and with providers that expose a completions endpoint.
For most use cases, prefer chat completions. Modern models such as GPT-4o and Claude are optimized for the chat format.

Request headers

x-portkey-provider
string
The provider to route the request to (e.g. openai). Required when not using a config.
x-portkey-api-key
string
Your provider API key.
x-portkey-config
string
A JSON config object or config ID that defines routing, fallbacks, retries, and more.
x-portkey-virtual-key
string
A virtual key ID from Portkey Cloud that maps to a stored provider credential.
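As a sketch, an x-portkey-config value enabling fallback across providers with retries might look like the following (field names assume Portkey's config schema; the keys shown here are illustrative, so check your gateway version's config reference):

```json
{
  "strategy": { "mode": "fallback" },
  "retry": { "attempts": 3 },
  "targets": [
    { "provider": "openai", "api_key": "sk-..." },
    { "provider": "anthropic", "api_key": "sk-ant-..." }
  ]
}
```

Pass this object (or the ID of a saved config) in the x-portkey-config header instead of setting x-portkey-provider and x-portkey-api-key directly.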

Request body

model
string
required
The model to use for completion (e.g. gpt-3.5-turbo-instruct).
prompt
string | string[]
required
The prompt to generate a completion for. Accepts a string or an array of strings for batched input.
temperature
number
default:"1"
Sampling temperature between 0 and 2. Higher values produce more varied output.
max_tokens
integer
default:"100"
The maximum number of tokens to generate in the completion.
stream
boolean
default:"false"
When true, partial results are streamed as server-sent events and the stream ends with data: [DONE].
top_p
number
default:"1"
Nucleus sampling threshold. Only tokens whose cumulative probability falls within the top top_p mass are considered.
frequency_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values reduce repeated tokens.
presence_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values encourage new topics.
n
integer
default:"1"
The number of completions to generate for each prompt.
stop
string | string[]
One or more sequences at which to stop generating. The stop sequence is not included in the output.
logprobs
integer
Include the log probabilities of the logprobs most likely tokens at each output position, as well as the sampled token.
echo
boolean
default:"false"
When true, echo the prompt back in the response along with the completion.
best_of
integer
default:"1"
Generate best_of completions server-side and return the best. Incurs higher token cost.
logit_bias
object
A map of token IDs to bias values from -100 to 100. Adjusts the likelihood of specific tokens appearing.
seed
integer
A seed for deterministic sampling.
user
string
A unique identifier for the end user, used for monitoring.
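Since prompt accepts either a string or an array of strings, a batched request body can be assembled and serialized like this (a minimal sketch; the model name and parameter values are illustrative):

```python
import json

# One request, two prompts: the response will contain one choice per
# prompt (or n choices per prompt if n > 1).
payload = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": ["Portkey is", "An AI gateway does"],  # batched input
    "max_tokens": 64,
    "temperature": 0.7,
    "stop": ["\n\n"],  # generation halts before this sequence
    "n": 1,
}

body = json.dumps(payload)  # send as the JSON request body
print(json.loads(body)["prompt"][1])  # -> An AI gateway does
```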

Response

id
string
A unique identifier for this completion.
object
string
Always text_completion.
created
integer
Unix timestamp of when the completion was created.
model
string
The model used for the completion.
choices
object[]
An array of completion choices.
usage
object
Token usage for this request.
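A successful non-streaming response has roughly this shape (values are illustrative; the choice and usage fields follow the OpenAI completions schema):

```json
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1715000000,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": " a unified interface to many LLM providers.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 10,
    "total_tokens": 16
  }
}
```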

Code examples

curl http://localhost:8787/v1/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "The Portkey AI Gateway is",
    "max_tokens": 128,
    "temperature": 0.7
  }'
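When stream is true, each server-sent event carries a JSON chunk with a partial choice, and the stream ends with data: [DONE]. A minimal sketch of accumulating the streamed text in Python (the chunk values below are made up, not real gateway output):

```python
import json

def collect_stream_text(sse_lines):
    """Accumulate completion text from server-sent event lines.

    Each event looks like 'data: {...json chunk...}'; the stream
    ends with the sentinel 'data: [DONE]'.
    """
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # each chunk mirrors the non-streaming shape, with partial text
        text.append(chunk["choices"][0]["text"])
    return "".join(text)

# illustrative event lines
sample = [
    'data: {"choices": [{"text": "Hello", "index": 0}]}',
    'data: {"choices": [{"text": " world", "index": 0}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # -> Hello world
```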