POST /v1/completions

Generates a text completion for a given prompt. This is the legacy completions endpoint, distinct from the chat completions endpoint; it is maintained for compatibility with older OpenAI integrations and with providers that expose a completions endpoint.
For most use cases, prefer chat completions. Modern models such as GPT-4o and Claude are optimized for the chat format.

Request headers

x-portkey-provider
string
The provider to route the request to (e.g. openai). Required when not using a config.
x-portkey-api-key
string
Your provider API key.
x-portkey-config
string
A JSON config object or config ID that defines routing, fallbacks, retries, and more.
x-portkey-virtual-key
string
A virtual key ID from Portkey Cloud that maps to a stored provider credential.
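As a sketch, an x-portkey-config value enabling fallback across providers with retries might look like the following (field names assume Portkey's config schema; the keys shown here are illustrative, so check your gateway version's config reference):

```json
{
  "strategy": { "mode": "fallback" },
  "retry": { "attempts": 3 },
  "targets": [
    { "provider": "openai", "api_key": "sk-..." },
    { "provider": "anthropic", "api_key": "sk-ant-..." }
  ]
}
```

Pass this object (or the ID of a saved config) in the x-portkey-config header instead of setting x-portkey-provider and x-portkey-api-key directly.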

Request body

model
string
required
The model to use for completion (e.g. gpt-3.5-turbo-instruct).
prompt
string | string[]
required
The prompt to generate a completion for. Accepts a string or an array of strings for batched input.
temperature
number
default:"1"
Sampling temperature between 0 and 2. Higher values produce more varied output.
max_tokens
integer
default:"100"
The maximum number of tokens to generate in the completion.
stream
boolean
default:"false"
When true, partial results are streamed as server-sent events and the stream ends with data: [DONE].
top_p
number
default:"1"
Nucleus sampling threshold. Only tokens whose cumulative probability falls within the top top_p mass are considered.
frequency_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values reduce repeated tokens.
presence_penalty
number
default:"0"
Number between -2.0 and 2.0. Positive values encourage new topics.
n
integer
default:"1"
The number of completions to generate for each prompt.
stop
string | string[]
One or more sequences at which to stop generating. The stop sequence is not included in the output.
logprobs
integer
Include the log probabilities of the logprobs most likely tokens at each output position, as well as the sampled token.
echo
boolean
default:"false"
When true, echo the prompt back in the response along with the completion.
best_of
integer
default:"1"
Generate best_of completions server-side and return the best. Incurs higher token cost.
logit_bias
object
A map of token IDs to bias values from -100 to 100. Adjusts the likelihood of specific tokens appearing.
seed
integer
A seed for deterministic sampling.
user
string
A unique identifier for the end user, used for monitoring.
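Since prompt accepts either a string or an array of strings, a batched request body can be assembled and serialized like this (a minimal sketch; the model name and parameter values are illustrative):

```python
import json

# One request, two prompts: the response will contain one choice per
# prompt (or n choices per prompt if n > 1).
payload = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": ["Portkey is", "An AI gateway does"],  # batched input
    "max_tokens": 64,
    "temperature": 0.7,
    "stop": ["\n\n"],  # generation halts before this sequence
    "n": 1,
}

body = json.dumps(payload)  # send as the JSON request body
print(json.loads(body)["prompt"][1])  # -> An AI gateway does
```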

Response

id
string
A unique identifier for this completion.
object
string
Always text_completion.
created
integer
Unix timestamp of when the completion was created.
model
string
The model used for the completion.
choices
object[]
An array of completion choices.
usage
object
Token usage for this request.
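A successful non-streaming response has roughly this shape (values are illustrative; the choice and usage fields follow the OpenAI completions schema):

```json
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1715000000,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": " a unified interface to many LLM providers.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 10,
    "total_tokens": 16
  }
}
```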

Code examples

curl http://localhost:8787/v1/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "The Portkey AI Gateway is",
    "max_tokens": 128,
    "temperature": 0.7
  }'
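When stream is true, each server-sent event carries a JSON chunk with a partial choice, and the stream ends with data: [DONE]. A minimal sketch of accumulating the streamed text in Python (the chunk values below are made up, not real gateway output):

```python
import json

def collect_stream_text(sse_lines):
    """Accumulate completion text from server-sent event lines.

    Each event looks like 'data: {...json chunk...}'; the stream
    ends with the sentinel 'data: [DONE]'.
    """
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # each chunk mirrors the non-streaming shape, with partial text
        text.append(chunk["choices"][0]["text"])
    return "".join(text)

# illustrative event lines
sample = [
    'data: {"choices": [{"text": "Hello", "index": 0}]}',
    'data: {"choices": [{"text": " world", "index": 0}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # -> Hello world
```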