Method signature
Overview
Create a response using the OpenAI Responses API. This endpoint routes directly to OpenAI’s Responses API and supports only OpenAI models. The Responses API provides a stateful conversation interface with built-in support for function calling, file inputs, and conversation state management.
Parameters
Whether to run the model response in the background. Learn more in the OpenAI background guide.
Conversation that this response belongs to. Items from this conversation are prepended to the input items, and output items from this response are automatically added after completion.
Credentials for MCP server authentication. Each credential is matched to servers by connection name.
Penalizes new tokens based on their frequency in the text so far. Values between -2.0 and 2.0.
Specify additional output data to include in the model response. Supported values:
- web_search_call.action.sources: Include web search sources
- code_interpreter_call.outputs: Include code execution outputs
- computer_call_output.output.image_url: Include computer call images
- file_search_call.results: Include file search results
- message.input_image.image_url: Include input image URLs
- message.output_text.logprobs: Include logprobs with assistant messages
- reasoning.encrypted_content: Include encrypted reasoning tokens
Text, image, or file inputs to the model. Can be a string for simple text input, or a list of objects for multi-modal inputs.
A system (or developer) message inserted into the model’s context. When used with previous_response_id, instructions from the previous response are not carried over.
Maximum number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
Maximum number of tool calls that can be processed in a single request.
Configuration for Model Context Protocol (MCP) servers to enable tool execution.
Metadata to attach to the response. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
The OpenAI model to use. Only OpenAI models are supported for the Responses API (e.g., gpt-4, gpt-4-turbo, gpt-3.5-turbo).
Whether to enable parallel function calling during tool use.
Penalizes new tokens based on whether they appear in the text so far. Values between -2.0 and 2.0.
The ID of a previous response to continue the conversation from.
Structured prompt configuration for the model.
Key for caching the prompt to improve performance on repeated requests.
Configuration for reasoning behavior in models that support it.
Identifier for safety and content filtering settings.
Service tier to use for the request. auto will use the best available tier.
Whether to store the conversation state. Set to false for stateless usage.
Whether to stream the response. If true, returns a streaming response.
Additional options for streaming responses.
Sampling temperature between 0 and 2. Higher values make output more random.
Text-specific configuration options.
Controls which tool (if any) the model should use. Can be "none", "auto", "required", or a specific tool object.
List of tools available for the model to call. Each tool is defined with a function schema.
Number of most likely tokens to return at each token position (0-20).
Nucleus sampling parameter. Alternative to temperature sampling.
How to handle input truncation when it exceeds the model’s context window.
Unique identifier for the end-user, for abuse monitoring purposes.
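As a sketch of the tools and tool_choice parameters described above, a request that defines one function tool might be assembled like this. The tool schema shown follows OpenAI’s Responses API function-tool format, and the model name and field values are illustrative; verify the exact accepted fields against your deployment.

```python
import json

# Hypothetical function tool definition (schema follows the
# OpenAI Responses API function-tool format).
get_weather = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Request body letting the model decide whether to call the tool.
payload = {
    "model": "gpt-4-turbo",  # illustrative; any supported OpenAI model
    "input": "What is the weather in Paris?",
    "tools": [get_weather],
    "tool_choice": "auto",   # or "none", "required", or a specific tool object
}
print(json.dumps(payload, indent=2))
```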
Response
Returns a Response object containing:
Unique identifier for the response.
Object type, always "response".
Unix timestamp of when the response was created.
The model used to generate the response.
ID of the conversation this response belongs to.
List of output items from the model response.
Token usage information for the request.
Example
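A minimal request sketch, assuming the endpoint accepts the standard OpenAI Responses API request body serialized as JSON (the model name and parameter values here are illustrative, not prescriptive):

```python
import json

# Illustrative request body; POST this as JSON to the
# Responses API endpoint of your deployment.
payload = {
    "model": "gpt-4-turbo",  # only OpenAI models are supported
    "input": "Write a one-line summary of the Responses API.",
    "instructions": "You are a concise technical writer.",
    "temperature": 0.5,
    "max_output_tokens": 200,
    "store": True,           # keep conversation state server-side
}
body = json.dumps(payload)
print(body)
```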
Notes
The Responses API is an OpenAI-specific feature and only works with OpenAI models. For multi-provider support, use the Chat Completions API instead.
The Responses API provides built-in conversation state management. Use the store parameter to enable stateful conversations, or set it to false for stateless usage.
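As a sketch of both modes, a stateful follow-up turn can reference the previous response instead of resending the whole history, while a stateless request opts out of persistence entirely. The response ID below is a placeholder for the id returned by an earlier call.

```python
import json

# Stateful follow-up: continue from a stored response.
# "resp_123" is a placeholder for a real response id.
follow_up = {
    "model": "gpt-4-turbo",
    "previous_response_id": "resp_123",
    "input": "Now shorten that to ten words.",
    "store": True,
}

# Stateless usage: no conversation state is kept server-side.
stateless = {
    "model": "gpt-4-turbo",
    "input": "One-off question, no history kept.",
    "store": False,
}
print(json.dumps(follow_up))
print(json.dumps(stateless))
```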