LLM’s Conversation object tracks the full message history across multiple prompt turns so the model always has context from earlier exchanges. This is the Python equivalent of the llm chat CLI command — you start a conversation once and keep calling conversation.prompt() to send follow-up messages.
Starting a Conversation
Call model.conversation() to create a new Conversation object:
import llm
model = llm.get_model()
conversation = model.conversation()
You can pass a system= prompt to set the assistant’s persona for the entire conversation, as well as tools= to make tool functions available across all turns (see Conversations Using Tools below).
Sending Turns
Use conversation.prompt() exactly like model.prompt(). Each call automatically includes all prior turns in the context sent to the model:
First message
response = conversation.prompt("Five fun facts about pelicans")
print(response.text())
Follow-up message
response2 = conversation.prompt("Now do skunks")
print(response2.text())
Because the conversation object tracks history, the model knows “now do skunks” means “give me five fun facts about skunks”, so there is no need to repeat the original instruction.
Attachments in a turn
response3 = conversation.prompt(
    "Describe these birds",
    attachments=[
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg")
    ]
)
print(response3.text())
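To picture what "includes all prior turns" means, here is a minimal sketch in plain Python. `ToyConversation` and `echo_model` are hypothetical stand-ins invented for this illustration; they are not part of LLM, and the real library builds provider-specific payloads that this sketch does not try to reproduce.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


def echo_model(messages: List[Message]) -> str:
    """Hypothetical model: reports how many messages it was given."""
    return f"saw {len(messages)} message(s)"


class ToyConversation:
    """Illustrative only: stores (role, text) pairs and resends them each turn."""

    def __init__(self, model: Callable[[List[Message]], str]):
        self.model = model
        self.history: List[Message] = []

    def prompt(self, text: str) -> str:
        # The new user message is appended to everything said so far.
        self.history.append(("user", text))
        # The whole history, not just the new message, goes to the model.
        reply = self.model(list(self.history))
        self.history.append(("assistant", reply))
        return reply


toy = ToyConversation(echo_model)
first = toy.prompt("Five fun facts about pelicans")
second = toy.prompt("Now do skunks")
print(first)   # saw 1 message(s)
print(second)  # saw 3 message(s)
```

The second call sees three messages (the first prompt, the first reply, and the new prompt), which is why a short follow-up can rely on earlier context.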
Accessing Prior Responses
conversation.responses is a list of every Response returned so far during the conversation:
model = llm.get_model()
conversation = model.conversation()
conversation.prompt("What is the capital of France?").text()
conversation.prompt("And Germany?").text()
conversation.prompt("And Italy?").text()
print(len(conversation.responses)) # 3
for r in conversation.responses:
    print(r.prompt.prompt, "→", r.text())
Each element is a full Response object, so you can inspect .usage(), .tool_calls(), .messages(), and everything else documented in the Python API reference.
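As a small sketch of working with that list, the helper below collects prompt and reply pairs using only the `r.prompt.prompt` and `r.text()` accessors shown above. The name `transcript` and the `SimpleNamespace` stand-ins are assumptions made for this example so it can run without calling a model.

```python
from types import SimpleNamespace


def transcript(conversation) -> list:
    """Return (prompt_text, response_text) pairs for every turn so far."""
    return [(r.prompt.prompt, r.text()) for r in conversation.responses]


def _fake_response(prompt_text: str, reply_text: str):
    # Stand-in that mimics only the attributes transcript() reads.
    return SimpleNamespace(
        prompt=SimpleNamespace(prompt=prompt_text),
        text=lambda: reply_text,
    )


fake_conversation = SimpleNamespace(
    responses=[
        _fake_response("What is the capital of France?", "Paris"),
        _fake_response("And Germany?", "Berlin"),
    ]
)

pairs = transcript(fake_conversation)
for question, answer in pairs:
    print(f"{question} -> {answer}")
```

With a real conversation you would pass the `conversation` object itself in place of `fake_conversation`.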
The response.conversation Attribute
Every Response produced by conversation.prompt() carries a reference back to its parent Conversation:
response = conversation.prompt("Name a river")
print(response.conversation) # prints the conversation object
assert response.conversation is conversation # True — same object
This lets any code that only has a response object navigate back to the full conversation history.
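One way to use that back-reference is to work out which turn a response belongs to. The sketch below assumes the response already appears in `conversation.responses`; whether that is true before the response text has been fully read is worth checking against your LLM version. `turn_number` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the example runs without a model.

```python
from types import SimpleNamespace


def turn_number(response) -> int:
    """Return the 1-based position of `response` in its parent conversation."""
    for index, candidate in enumerate(response.conversation.responses, 1):
        if candidate is response:  # compare by identity, not equality
            return index
    raise ValueError("response is not recorded in its conversation")


# Stand-ins that carry only the attributes turn_number() reads.
fake_conversation = SimpleNamespace(responses=[])
first = SimpleNamespace(conversation=fake_conversation)
second = SimpleNamespace(conversation=fake_conversation)
fake_conversation.responses.extend([first, second])

print(turn_number(first))   # 1
print(turn_number(second))  # 2
```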
Replying Directly from a Response
response.reply() lets you continue the conversation directly from a Response object without a Conversation — useful when you want to branch or replay:
import llm
model = llm.get_model("gpt-4o-mini")
response = model.prompt("What's 2+2?")
print(response.text())
follow_up = response.reply("Now multiply that by 10")
print(follow_up.text())
You can also persist a response to JSON and resume later:
import json, llm
model = llm.get_model("gpt-4o-mini")
response = model.prompt("Tell me about otters")
payload = json.dumps(response.to_dict())
# Later — rehydrate and continue
rebuilt = llm.Response.from_dict(json.loads(payload))
followup = rebuilt.reply("What do they eat?")
print(followup.text())
Conversations Using Tools
Pass a list of tool functions to model.conversation(tools=[...]) to make them available throughout the conversation. Use conversation.chain() instead of conversation.prompt() when you want the automatic tool-call loop:
import llm
def upper(text: str) -> str:
    "convert text to upper case"
    return text.upper()
def reverse(text: str) -> str:
    "reverse text"
    return text[::-1]
model = llm.get_model("gpt-4.1-mini")
conversation = model.conversation(tools=[upper, reverse])
print(conversation.chain(
    "Convert panda to uppercase and reverse it"
).text())
print(conversation.chain(
    "Same with pangolin"
).text())
The second conversation.chain() call inherits the tools from the conversation and still has context from the first exchange. The model knows “same with pangolin” means apply the same uppercase + reverse operations.
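Because the tools are ordinary Python functions, you can check them locally before handing them to a model. The sketch below repeats the two functions from the example and computes what a correct sequence of tool calls should produce; the `expected_*` names are introduced here only for illustration.

```python
def upper(text: str) -> str:
    "convert text to upper case"
    return text.upper()


def reverse(text: str) -> str:
    "reverse text"
    return text[::-1]


# What a correct uppercase-then-reverse chain should yield for each prompt.
expected_panda = reverse(upper("panda"))
expected_pangolin = reverse(upper("pangolin"))

print(expected_panda)     # ADNAP
print(expected_pangolin)  # NILOGNAP
```

Comparing these values with the model's final text is a quick sanity check that the tools were actually called rather than approximated by the model.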
Pass before_call= and after_call= to model.conversation() to apply those hooks to every chained prompt in the conversation:
import llm
from typing import Optional
def log_before(tool: Optional[llm.Tool], tool_call: llm.ToolCall):
    print(f"→ calling {tool_call.name}({tool_call.arguments})")
def log_after(tool: llm.Tool, tool_call: llm.ToolCall, result: llm.ToolResult):
    print(f"← {tool_call.name} returned {result.output!r}")
def upper(text: str) -> str:
    "convert text to upper case"
    return text.upper()
model = llm.get_model("gpt-4.1-mini")
conversation = model.conversation(
    tools=[upper],
    before_call=log_before,
    after_call=log_after,
)
print(conversation.chain("Convert panda to uppercase").text())
print(conversation.chain("Now do penguin").text())
You can also pass before_call= and after_call= directly to individual conversation.chain() calls to override the conversation-level hooks for a single turn.
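Hooks do not have to print. As a hedged sketch, the factory below returns a pair of hooks that record events in a shared list, which can be easier to inspect in tests. `make_recorder` is a hypothetical helper, and the `SimpleNamespace` objects only imitate the `name`, `arguments`, and `output` attributes used in the logging example above.

```python
from types import SimpleNamespace


def make_recorder():
    """Return (before_call, after_call, events) that share one event list."""
    events = []

    def before_call(tool, tool_call):
        events.append(("before", tool_call.name, tool_call.arguments))

    def after_call(tool, tool_call, result):
        events.append(("after", tool_call.name, result.output))

    return before_call, after_call, events


before, after, events = make_recorder()

# Simulate a single tool call with stand-in objects instead of a real model.
fake_call = SimpleNamespace(name="upper", arguments={"text": "panda"})
fake_result = SimpleNamespace(output="PANDA")
before(None, fake_call)
after(None, fake_call, fake_result)

print(events)
```

With a real model, `before` and `after` would be passed as the `before_call=` and `after_call=` arguments described above.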
llm.Toolbox instances maintain state between turns, making them ideal for stateful conversation tools like memory or session-scoped resources:
import llm
class Memory(llm.Toolbox):
    _memory = None
    def _get_memory(self):
        if self._memory is None:
            self._memory = {}
        return self._memory
    def set(self, key: str, value: str):
        "Set a value by key"
        self._get_memory()[key] = value
    def get(self, key: str):
        "Get a value by key"
        return self._get_memory().get(key) or ""
    def keys(self):
        "Return all keys"
        return list(self._get_memory().keys())
model = llm.get_model("gpt-4.1-mini")
memory = Memory()
conversation = model.conversation(tools=[memory])
print(conversation.chain("Set name to Simon").text())
print(memory._memory) # {'name': 'Simon'}
print(conversation.chain("Set name to Penguin").text())
print(conversation.chain("What is the current name?").text())
Because memory is the same Python object across all turns, memory._memory accumulates values from every call the model makes to the set tool.
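The `_memory = None` plus `_get_memory()` pattern is worth a closer look. One plausible reason for it, shown here with plain classes rather than `llm.Toolbox`, is that a dictionary defined directly on the class would be shared by every instance, while lazy creation gives each instance its own. `SharedStore` and `LazyStore` are hypothetical names used only in this sketch.

```python
class SharedStore:
    data = {}  # class-level dict: a single object shared by all instances

    def set(self, key, value):
        self.data[key] = value


class LazyStore:
    _data = None  # class-level placeholder, replaced per instance on first use

    def _get_data(self):
        if self._data is None:
            self._data = {}  # assignment creates an instance attribute
        return self._data

    def set(self, key, value):
        self._get_data()[key] = value


a, b = SharedStore(), SharedStore()
a.set("name", "Simon")
print(b.data)  # {'name': 'Simon'} (state leaked between instances)

c, d = LazyStore(), LazyStore()
c.set("name", "Simon")
print(d._get_data())  # {} (each instance keeps its own state)
```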
Full Multi-Turn Example
import llm
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32
model = llm.get_model("gpt-4.1-mini")
conversation = model.conversation(tools=[celsius_to_fahrenheit])
# Turn 1 — tool use
r1 = conversation.chain("What is 100°C in Fahrenheit?")
print(r1.text())
# Turn 2 — follow-up without repeating context
r2 = conversation.chain("And 0°C?")
print(r2.text())
# Turn 3 — no tool needed
r3 = conversation.chain("Which temperature would you recommend for baking bread?")
print(r3.text())
# Inspect the full history
for i, response in enumerate(conversation.responses, 1):
    print(f"\n--- Turn {i} ---")
    print("Prompt :", response.prompt.prompt)
    print("Usage :", response.usage())
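If you want a single figure for the whole history rather than per-response usage, a sketch like the following may help. It assumes the object returned by `response.usage()` exposes numeric `input` and `output` attributes that may be `None`; that shape is an assumption to verify against your LLM version. `total_tokens` is a hypothetical helper, and the stand-in objects exist only so the example runs without a model.

```python
from types import SimpleNamespace


def total_tokens(conversation) -> dict:
    """Sum input and output token counts over every recorded response."""
    totals = {"input": 0, "output": 0}
    for response in conversation.responses:
        usage = response.usage()
        # Treat a missing or None count as zero (assumed usage shape).
        totals["input"] += getattr(usage, "input", None) or 0
        totals["output"] += getattr(usage, "output", None) or 0
    return totals


def _fake_response(input_tokens, output_tokens):
    # Stand-in that mimics only the usage() call.
    usage = SimpleNamespace(input=input_tokens, output=output_tokens)
    return SimpleNamespace(usage=lambda: usage)


fake_conversation = SimpleNamespace(
    responses=[_fake_response(12, 30), _fake_response(40, None)]
)

totals = total_tokens(fake_conversation)
print(totals)  # {'input': 52, 'output': 30}
```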