Conversations in the LLM Python API: Multi-Turn Prompts

LLM’s Conversation object tracks the full message history across multiple prompt turns, so the model always has context from earlier exchanges. It is the Python equivalent of the llm chat CLI command: you start a conversation once, then call conversation.prompt() for each follow-up message.

Starting a Conversation

Call model.conversation() to create a new Conversation object:

import llm

model = llm.get_model()
conversation = model.conversation()

You can pass a system= prompt to set the assistant’s persona for the entire conversation, as well as tools= to make tool functions available across all turns (see Conversations Using Tools below).

Sending Turns

Use conversation.prompt() exactly like model.prompt(). Each call automatically includes all prior turns in the context sent to the model:

# First message
response = conversation.prompt("Five fun facts about pelicans")
print(response.text())

# Follow-up message
response2 = conversation.prompt("Now do skunks")
print(response2.text())

Because the conversation object tracks history, the model can read “Now do skunks” as a request for five fun facts about skunks, so you do not need to repeat the original instruction.
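To make the history mechanism concrete, here is a minimal toy sketch in plain Python. It does not use or reproduce the llm library's internals: ToyConversation and echo_model are hypothetical names invented for illustration, and the fake model only reports how many messages it was sent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def echo_model(messages: List[str]) -> str:
    """Hypothetical stand-in for a real model: reports what it received."""
    return f"saw {len(messages)} message(s); latest: {messages[-1]!r}"


@dataclass
class ToyConversation:
    """Toy illustration only, not the llm library's Conversation class."""
    model: Callable[[List[str]], str]
    history: List[str] = field(default_factory=list)

    def prompt(self, text: str) -> str:
        # Send every earlier message plus the new one, then record both sides.
        reply = self.model(self.history + [text])
        self.history.extend([text, reply])
        return reply


toy = ToyConversation(model=echo_model)
first = toy.prompt("Five fun facts about pelicans")
second = toy.prompt("Now do skunks")
print(first)
print(second)
```

In this sketch the second call receives three messages (the first prompt, the first reply, and the new prompt), which is why a short follow-up carries enough context.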

Attachments in a turn

response3 = conversation.prompt(
    "Describe these birds",
    attachments=[
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg")
    ]
)
print(response3.text())

Accessing Prior Responses

conversation.responses is a list of every Response returned so far during the conversation:

model = llm.get_model()
conversation = model.conversation()

conversation.prompt("What is the capital of France?").text()
conversation.prompt("And Germany?").text()
conversation.prompt("And Italy?").text()

print(len(conversation.responses))  # 3

for r in conversation.responses:
    print(r.prompt.prompt, "→", r.text())

Each element is a full Response object, so you can inspect .usage(), .tool_calls(), .messages(), and everything else documented in the Python API reference.
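A small helper can turn that list into a readable transcript. The sketch below relies only on the .prompt.prompt attribute and .text() method used above; transcript and fake_response are hypothetical names, and the stand-in objects exist so the example runs without a model or API key.

```python
from types import SimpleNamespace


def transcript(responses):
    """Return one 'prompt -> text' line per response-like object."""
    return [f"{r.prompt.prompt} -> {r.text()}" for r in responses]


def fake_response(prompt_text, reply_text):
    # Hypothetical stand-in exposing only .prompt.prompt and .text().
    return SimpleNamespace(
        prompt=SimpleNamespace(prompt=prompt_text),
        text=lambda: reply_text,
    )


fakes = [
    fake_response("What is the capital of France?", "Paris"),
    fake_response("And Germany?", "Berlin"),
]
lines = transcript(fakes)
print("\n".join(lines))
```

With a real conversation, the same helper would be called as transcript(conversation.responses).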

The `response.conversation` Attribute

Every Response produced by conversation.prompt() carries a reference back to its parent Conversation:

response = conversation.prompt("Name a river")
print(response.conversation)  # prints the conversation object
assert response.conversation is conversation  # True — same object

This lets any code that only has a response object navigate back to the full conversation history.
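As an illustration of that navigation, the following sketch finds a response's position within its parent conversation. turn_number is a hypothetical helper, the objects are stand-ins rather than real llm classes, and the sketch assumes the response has already been recorded in conversation.responses.

```python
from types import SimpleNamespace


def turn_number(response):
    """Return the 1-based position of response in its parent conversation."""
    responses = response.conversation.responses
    for index, candidate in enumerate(responses, 1):
        # Identity check: we want this exact object, not an equal-looking one.
        if candidate is response:
            return index
    raise ValueError("response is not part of its conversation's history")


# Hypothetical stand-ins that mimic only the two attributes used above.
fake_conversation = SimpleNamespace(responses=[])
first = SimpleNamespace(conversation=fake_conversation)
second = SimpleNamespace(conversation=fake_conversation)
fake_conversation.responses.extend([first, second])

print(turn_number(second))
```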

Replying Directly from a Response

response.reply() lets you continue the conversation directly from a Response object, without holding a reference to a Conversation. This is useful when you want to branch from or replay an earlier exchange:

import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("What's 2+2?")
print(response.text())

follow_up = response.reply("Now multiply that by 10")
print(follow_up.text())

You can also persist a response to JSON and resume later:

import json
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Tell me about otters")
payload = json.dumps(response.to_dict())

# Later — rehydrate and continue
rebuilt = llm.Response.from_dict(json.loads(payload))
followup = rebuilt.reply("What do they eat?")
print(followup.text())

Conversations Using Tools

Pass a list of tool functions to model.conversation(tools=[...]) to make them available throughout the conversation. Use conversation.chain() instead of conversation.prompt() when you want the automatic tool-call loop:

import llm

def upper(text: str) -> str:
    "convert text to upper case"
    return text.upper()

def reverse(text: str) -> str:
    "reverse text"
    return text[::-1]

model = llm.get_model("gpt-4.1-mini")
conversation = model.conversation(tools=[upper, reverse])

print(conversation.chain(
    "Convert panda to uppercase and reverse it"
).text())

print(conversation.chain(
    "Same with pangolin"
).text())

The second conversation.chain() call inherits the tools from the conversation and still has context from the first exchange, so the model can read “Same with pangolin” as a request to apply the same uppercase and reverse operations.
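Since tools are ordinary Python functions, you can call them directly to work out what a correct answer looks like before involving a model. This sketch repeats the two tool functions from the example and composes them by hand; it says nothing about the order in which a model would actually choose to call them.

```python
def upper(text: str) -> str:
    "convert text to upper case"
    return text.upper()


def reverse(text: str) -> str:
    "reverse text"
    return text[::-1]


# The composition the prompts ask for: uppercase first, then reverse.
panda = reverse(upper("panda"))
pangolin = reverse(upper("pangolin"))
print(panda)     # ADNAP
print(pangolin)  # NILOGNAP
```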

Tool debugging hooks in conversations

Pass before_call= and after_call= to model.conversation() to apply those hooks to every chained prompt in the conversation:

import llm
from typing import Optional

def log_before(tool: Optional[llm.Tool], tool_call: llm.ToolCall):
    print(f"→ calling {tool_call.name}({tool_call.arguments})")

def log_after(tool: llm.Tool, tool_call: llm.ToolCall, result: llm.ToolResult):
    print(f"← {tool_call.name} returned {result.output!r}")

def upper(text: str) -> str:
    "convert text to upper case"
    return text.upper()

model = llm.get_model("gpt-4.1-mini")
conversation = model.conversation(
    tools=[upper],
    before_call=log_before,
    after_call=log_after,
)

print(conversation.chain("Convert panda to uppercase").text())
print(conversation.chain("Now do penguin").text())

You can also pass before_call= and after_call= directly to individual conversation.chain() calls to override the conversation-level hooks for a single turn.
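Hooks are plain callables, so they can collect data instead of printing it. The sketch below records each call in a list, using the same parameter order as log_before and log_after. record_before, record_after, and the stand-in objects are hypothetical, and the hooks are invoked by hand here rather than by the library.

```python
from types import SimpleNamespace

calls = []  # collected (event, tool name, payload) tuples


def record_before(tool, tool_call):
    # Same parameter order as log_before above.
    calls.append(("before", tool_call.name, tool_call.arguments))


def record_after(tool, tool_call, result):
    # Same parameter order as log_after above.
    calls.append(("after", tool_call.name, result.output))


# Hypothetical stand-ins carrying only the attributes the hooks read.
fake_call = SimpleNamespace(name="upper", arguments={"text": "panda"})
fake_result = SimpleNamespace(output="PANDA")

record_before(None, fake_call)
record_after(None, fake_call, fake_result)
print(calls)
```

A recording hook like this could be passed as before_call= or after_call= in place of the logging versions, leaving a list you can assert against in tests.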

Toolbox instances in conversations

llm.Toolbox instances maintain state between turns, making them ideal for stateful conversation tools like memory or session-scoped resources:

import llm

class Memory(llm.Toolbox):
    _memory = None

    def _get_memory(self):
        if self._memory is None:
            self._memory = {}
        return self._memory

    def set(self, key: str, value: str):
        "Set a value by key"
        self._get_memory()[key] = value

    def get(self, key: str):
        "Get a value by key"
        return self._get_memory().get(key) or ""

    def keys(self):
        "Return all keys"
        return list(self._get_memory().keys())

model = llm.get_model("gpt-4.1-mini")
memory = Memory()
conversation = model.conversation(tools=[memory])

print(conversation.chain("Set name to Simon").text())
print(memory._memory)          # {'name': 'Simon'}

print(conversation.chain("Set name to Penguin").text())
print(conversation.chain("What is the current name?").text())

Because memory is the same Python object across all turns, memory._memory accumulates values from every call the model makes to the set tool.
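One plausible reason the Memory example initializes _memory lazily, rather than writing _memory = {} at class level, is that a mutable class attribute would be shared by every instance. The source does not state this motivation, so treat it as an assumption. The sketch uses plain classes; SharedDict and LazyDict are hypothetical names and are not part of llm.

```python
class SharedDict:
    store = {}  # one dict object shared by every instance

    def set(self, key, value):
        self.store[key] = value


class LazyDict:
    _store = None  # placeholder; each instance creates its own dict

    def _get_store(self):
        if self._store is None:
            self._store = {}  # assignment binds a new dict on this instance
        return self._store

    def set(self, key, value):
        self._get_store()[key] = value


a, b = SharedDict(), SharedDict()
a.set("name", "Simon")
print(b.store)  # b sees a's value because the dict is shared

c, d = LazyDict(), LazyDict()
c.set("name", "Simon")
print(d._get_store())  # d has its own, still empty, dict
```

With the lazy pattern, two separate instances used in two separate conversations would not leak values into each other.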

Full Multi-Turn Example

import llm

def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

model = llm.get_model("gpt-4.1-mini")
conversation = model.conversation(tools=[celsius_to_fahrenheit])

# Turn 1 — tool use
r1 = conversation.chain("What is 100°C in Fahrenheit?")
print(r1.text())

# Turn 2 — follow-up without repeating context
r2 = conversation.chain("And 0°C?")
print(r2.text())

# Turn 3 — no tool needed
r3 = conversation.chain("Which temperature would you recommend for baking bread?")
print(r3.text())

# Inspect the full history
for i, response in enumerate(conversation.responses, 1):
    print(f"\n--- Turn {i} ---")
    print("Prompt :", response.prompt.prompt)
    print("Usage  :", response.usage())
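To judge whether the model relayed the tool results correctly in the first two turns, you can compute the reference values by calling the tool function yourself. This sketch repeats celsius_to_fahrenheit from the example and needs no model.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


boiling = celsius_to_fahrenheit(100)
freezing = celsius_to_fahrenheit(0)
print(boiling)   # 212.0
print(freezing)  # 32.0
```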
