Method signature
Overview
Create a response using the OpenAI Responses API. This endpoint routes directly to OpenAI’s Responses API and supports only OpenAI models. The Responses API provides a stateful conversation interface with built-in support for function calling, file inputs, and conversation state management.
Parameters
Whether to run the model response in the background. Learn more in the OpenAI background guide.
Conversation that this response belongs to. Items from this conversation are prepended to the input items, and output items from this response are automatically added after completion.
Credentials for MCP server authentication. Each credential is matched to servers by connection name.
Penalizes new tokens based on their frequency in the text so far. Values between -2.0 and 2.0.
Specify additional output data to include in the model response. Supported values:
- web_search_call.action.sources: Include web search sources
- code_interpreter_call.outputs: Include code execution outputs
- computer_call_output.output.image_url: Include computer call images
- file_search_call.results: Include file search results
- message.input_image.image_url: Include input image URLs
- message.output_text.logprobs: Include logprobs with assistant messages
- reasoning.encrypted_content: Include encrypted reasoning tokens
Text, image, or file inputs to the model. Can be a string for simple text input, or a list of objects for multi-modal inputs.
A system (or developer) message inserted into the model’s context. When used with previous_response_id, instructions from the previous response are not carried over.
Maximum number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
Maximum number of tool calls that can be processed in a single request.
Configuration for Model Context Protocol (MCP) servers to enable tool execution.
Metadata to attach to the response. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
The OpenAI model to use. Only OpenAI models are supported for the Responses API (e.g., gpt-4, gpt-4-turbo, gpt-3.5-turbo).
Whether to enable parallel function calling during tool use.
Penalizes new tokens based on whether they appear in the text so far. Values between -2.0 and 2.0.
The ID of a previous response to continue the conversation from.
Structured prompt configuration for the model.
Key for caching the prompt to improve performance on repeated requests.
Configuration for reasoning behavior in models that support it.
Identifier for safety and content filtering settings.
Service tier to use for the request. auto will use the best available tier.
Whether to store the conversation state. Set to false for stateless usage.
Whether to stream the response. If true, returns a streaming response.
Additional options for streaming responses.
Sampling temperature between 0 and 2. Higher values make output more random.
Text-specific configuration options.
Controls which tool (if any) the model should use. Can be "none", "auto", "required", or a specific tool object.
List of tools available for the model to call. Each tool is defined with a function schema.
Number of most likely tokens to return at each token position (0-20).
Nucleus sampling parameter. Alternative to temperature sampling.
How to handle input truncation when it exceeds the model’s context window.
Unique identifier for the end-user, for abuse monitoring purposes.
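As a sketch of the tools and tool_choice parameters described above, a request that defines one function tool might be assembled like this. The tool schema shown follows OpenAI’s Responses API function-tool format, and the model name and field values are illustrative; verify the exact accepted fields against your deployment.

```python
import json

# Hypothetical function tool definition (schema follows the
# OpenAI Responses API function-tool format).
get_weather = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Request body letting the model decide whether to call the tool.
payload = {
    "model": "gpt-4-turbo",  # illustrative; any supported OpenAI model
    "input": "What is the weather in Paris?",
    "tools": [get_weather],
    "tool_choice": "auto",   # or "none", "required", or a specific tool object
}
print(json.dumps(payload, indent=2))
```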
Response
Returns a Response object containing:
Unique identifier for the response.
Object type, always "response".
Unix timestamp of when the response was created.
The model used to generate the response.
ID of the conversation this response belongs to.
List of output items from the model response.
Token usage information for the request.
Example
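A minimal request sketch, assuming the endpoint accepts the standard OpenAI Responses API request body serialized as JSON (the model name and parameter values here are illustrative, not prescriptive):

```python
import json

# Illustrative request body; POST this as JSON to the
# Responses API endpoint of your deployment.
payload = {
    "model": "gpt-4-turbo",  # only OpenAI models are supported
    "input": "Write a one-line summary of the Responses API.",
    "instructions": "You are a concise technical writer.",
    "temperature": 0.5,
    "max_output_tokens": 200,
    "store": True,           # keep conversation state server-side
}
body = json.dumps(payload)
print(body)
```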
Notes
The Responses API is an OpenAI-specific feature and only works with OpenAI models. For multi-provider support, use the Chat Completions API instead.
The Responses API provides built-in conversation state management. Use the store parameter to enable stateful conversations, or set it to false for stateless usage.
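As a sketch of both modes, a stateful follow-up turn can reference the previous response instead of resending the whole history, while a stateless request opts out of persistence entirely. The response ID below is a placeholder for the id returned by an earlier call.

```python
import json

# Stateful follow-up: continue from a stored response.
# "resp_123" is a placeholder for a real response id.
follow_up = {
    "model": "gpt-4-turbo",
    "previous_response_id": "resp_123",
    "input": "Now shorten that to ten words.",
    "store": True,
}

# Stateless usage: no conversation state is kept server-side.
stateless = {
    "model": "gpt-4-turbo",
    "input": "One-off question, no history kept.",
    "store": False,
}
print(json.dumps(follow_up))
print(json.dumps(stateless))
```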