Naive AI agents are surprisingly fragile in production. A single LLM API rate-limit error, a transient network error, or a worker crash mid-execution means restarting the entire agent from scratch, re-running every LLM call and burning API credits. Durable AI agents solve this by wrapping every model inference and tool call in a Temporal activity, giving you automatic retries, crash recovery, and a complete audit trail, all without changing a line of agent logic. This page explains the pattern and shows the exact code used in Exercise 3 and Exercise 4.
The Problem: Fragile Agents in Production
LLM API rate limits and transient failures
LLM APIs enforce rate limits and occasionally return 5xx errors. A plain Runner.run() call will surface these as exceptions that crash your agent. You would need to write retry loops around every call, which is error-prone boilerplate that lives outside the agent’s core logic.
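To make that boilerplate concrete, here is a minimal plain-asyncio sketch of the retry loop you would otherwise hand-write around each call. The names TransientError, call_with_retries, and make_flaky_llm_call are hypothetical stand-ins for illustration; they are not part of the OpenAI Agents SDK or Temporal.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit or 5xx error."""


async def call_with_retries(fn, *, max_attempts=5, base_delay=0.01):
    """Retry an async callable with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Double the wait after each failure; jitter avoids synchronized retries.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay))


def make_flaky_llm_call(failures):
    """Build a fake LLM call that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def flaky_llm_call():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("simulated 429")
        return "final answer"

    return flaky_llm_call, state


flaky_call, call_state = make_flaky_llm_call(failures=2)
answer = asyncio.run(call_with_retries(flaky_call))
```

Every call site in a multi-step agent would need this wrapper, along with decisions about which exceptions count as transient. That is the code the durable pattern moves out of the agent.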
Crashes mid-execution
Multi-step agent flows (query → tool call → second LLM call → tool call → final answer) can take seconds or minutes. If the worker process crashes between steps, the entire run is lost. You cannot resume from the middle of a plain Python async function.
No audit trail
In production you need to know which tool calls the agent made, what their inputs and outputs were, and which LLM calls were retried. A bare Runner.run() provides none of this without custom instrumentation.
Long-running operations
Some agent tasks, such as waiting for a human approval or polling an external system, can take minutes or hours. Keeping an asyncio coroutine alive for that long is impractical and unreliable.
The Solution: OpenAIAgentsPlugin
The temporalio.contrib.openai_agents module ships a Temporal client plugin that intercepts every OpenAI Agents SDK model call and transparently runs it as a Temporal activity. Your agent code does not change at all.
The agent code stays identical. You define Agent, @function_tool, and Runner.run() exactly as you did in Exercise 1. OpenAIAgentsPlugin wraps the infrastructure around those calls at the Temporal worker and client level; the agent itself has no knowledge of Temporal.
How the Integration Works
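Conceptually, the interception resembles a proxy placed in front of the model client. The sketch below is a plain-Python analogy and not the plugin's actual implementation: EchoModel, ActivityBackedModel, recording_runner, and agent are all hypothetical names. It shows how agent logic can stay untouched while every model call is routed through infrastructure that records it.

```python
import asyncio


class EchoModel:
    """Hypothetical model client with a single async `generate` method."""

    async def generate(self, prompt):
        return f"echo: {prompt}"


class ActivityBackedModel:
    """Proxy with the same interface that routes each call through a runner."""

    def __init__(self, inner, run_as_activity):
        self._inner = inner
        self._run_as_activity = run_as_activity

    async def generate(self, prompt):
        return await self._run_as_activity(
            "model.generate", self._inner.generate, prompt
        )


events = []  # toy stand-in for an event history


async def recording_runner(name, fn, *args):
    """Stand-in for durable infrastructure: record start, execute, record completion."""
    events.append(("started", name))
    result = await fn(*args)
    events.append(("completed", name))
    return result


async def agent(model, question):
    """Agent logic: unaware of whether `model` is wrapped."""
    return await model.generate(question)


plain = asyncio.run(agent(EchoModel(), "hi"))
durable = asyncio.run(
    agent(ActivityBackedModel(EchoModel(), recording_runner), "hi")
)
```

Both runs return the same answer; only the wrapped run leaves a record of the call. The real plugin does this wiring for you when it is registered, as the steps below describe.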
Step 1: Register the Plugin on the Client
The plugin must be registered when you connect both the worker and the starter client. Here is the worker from Exercise 4 (solutions/04_agent_routing/worker.py):
Step 2: Register the Plugin on the Starter Client
The starter also needs the plugin so it can correctly serialize/deserialize agent payloads:
Step 3: Write the Workflow as Normal
Inside the workflow you call Runner.run() exactly as you would without Temporal. The plugin intercepts model calls at the SDK level:
Step 4: Use activity_as_tool() for Custom Activities (Exercise 3)
Exercise 3 adds a custom Temporal activity (a weather API call) as a tool available to the agent. The activity_as_tool() helper bridges the two systems:
ModelActivityParameters
ModelActivityParameters configures the Temporal activity that wraps each LLM inference call:
start_to_close_timeout sets the maximum time for a single LLM call attempt. If the model API does not respond within this window, Temporal marks the activity as failed and retries it automatically. Set this generously for GPT-4 (30–60 seconds); GPT-4o-mini is faster and can use a shorter timeout.
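The per-attempt meaning of start_to_close_timeout can be pictured with asyncio.wait_for. This is a plain-Python analogy, not how Temporal enforces timeouts, and run_with_attempt_timeout and slow_then_fast_model are hypothetical names.

```python
import asyncio


async def run_with_attempt_timeout(fn, *, attempt_timeout, max_attempts):
    """Give each attempt its own time budget; retry when an attempt times out."""
    for attempt in range(1, max_attempts + 1):
        try:
            # The timeout covers one attempt only, not the whole series of retries.
            return await asyncio.wait_for(fn(attempt), timeout=attempt_timeout)
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")


async def slow_then_fast_model(attempt):
    """Hypothetical model call: the first attempt hangs, later ones respond quickly."""
    await asyncio.sleep(1.0 if attempt == 1 else 0.01)
    return f"response from attempt {attempt}"


result = asyncio.run(
    run_with_attempt_timeout(
        slow_then_fast_model, attempt_timeout=0.1, max_attempts=3
    )
)
```

Because the budget resets on every attempt, a timeout that is too short for the model fails every attempt in the same way, which is why a generous value suits slower models.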
Architecture: Every LLM Call as an Activity
What You Gain
Automatic retries
Every LLM API call and tool execution retries on transient failure using Temporal’s configurable retry policy—no manual retry loops needed.
Crash recovery
If the worker process crashes mid-agent-execution, Temporal resumes from the last completed activity when the worker restarts. Completed LLM calls are never re-run.
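The recovery behavior can be illustrated with a toy event history. The sketch below is a simplified model of replay, not Temporal's implementation; run_agent, history, WorkerCrash, and the step functions are hypothetical. Each completed step's result is recorded, and a restarted run reads the recorded results instead of executing those steps again.

```python
import asyncio


class WorkerCrash(Exception):
    """Simulates the worker process dying between steps."""


async def run_agent(steps, history, crash_before=None):
    """Run steps in order, recording each result in `history` (a dict).

    Steps already present in `history` are replayed from the record
    rather than executed again.
    """
    result = None
    for name, fn in steps:
        if name == crash_before:
            raise WorkerCrash(name)
        if name not in history:
            history[name] = await fn(result)  # executed once, then recorded
        result = history[name]
    return result


llm_calls = []  # tracks which "paid" steps actually executed


async def first_llm_call(_):
    llm_calls.append("first_llm_call")
    return "call get_weather(Paris)"


async def get_weather(_):
    return "18C and sunny"


async def second_llm_call(weather):
    llm_calls.append("second_llm_call")
    return f"It is {weather} in Paris."


steps = [
    ("first_llm_call", first_llm_call),
    ("get_weather", get_weather),
    ("second_llm_call", second_llm_call),
]
history = {}

try:
    asyncio.run(run_agent(steps, history, crash_before="second_llm_call"))
except WorkerCrash:
    pass  # the history survives the crash, as it would in a durable store

final = asyncio.run(run_agent(steps, history))
```

After the simulated crash, the second run executes only second_llm_call; the first LLM call and the weather lookup come from the recorded history.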
Full execution history
Every activity start, completion, failure, and retry is recorded in Temporal’s event history. Inspect it in the Temporal Web UI at port 8233.
OpenAI trace correlation
Wrap Runner.run() with trace("name") and every LLM call appears in the OpenAI platform under Traces, correlated with the workflow execution ID.
The No-Code-Change Guarantee
To make Exercise 1’s agent durable, you do not rewrite the agent. The durable version adds only a @workflow.defn class wrapper, activity_as_tool() for the weather tool, and the OpenAIAgentsPlugin on the worker, all of which are infrastructure-level changes.
Observing Durable Execution in Practice
Exercise 3 includes a built-in failure simulation. To experience crash recovery:
Enable the simulated bug
In the Exercise 3 notebook, in the activities.py cell, comment out data = await make_api_call(state) and uncomment data = await make_api_call_bug(state). This makes the activity always raise an exception.
Run the workflow
Execute the starter cell. The workflow starts and the get_weather activity begins failing.
Watch retries in the Temporal UI
Open port 8233 in the Codespaces Ports tab. Navigate to the running workflow and observe the activity retry events accumulating with exponential backoff.
Fix the bug
Re-comment make_api_call_bug and uncomment make_api_call. This simulates a “service coming back online” or a deployed bug fix.
Restart the worker
Run the restart cell in the notebook. The new worker picks up the pending activity task and completes it successfully.
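The spacing of the retry events you see in the UI follows the activity's retry policy. The sketch below computes the wait before each retry, assuming Temporal's default policy values (initial interval of 1 second, backoff coefficient of 2.0, maximum interval of 100 seconds). The workshop code may configure different values, and retry_intervals is a hypothetical helper.

```python
def retry_intervals(retries, initial=1.0, coefficient=2.0, maximum=100.0):
    """Return the wait in seconds before each of the first `retries` retries."""
    return [min(initial * coefficient ** n, maximum) for n in range(retries)]


schedule = retry_intervals(9)
print(schedule)
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 100.0, 100.0]
```

Under these assumptions the gaps double until they reach the cap, so a long outage produces steady retries roughly every 100 seconds rather than ever-growing waits.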
Required Imports
Workshop Exercises Using This Pattern
Exercise 1: Agent Hello World
Build the plain agent that you’ll later make durable — establishes the baseline agent code that stays unchanged.
Exercise 2: Temporal Hello World
Learn workflows and activities in isolation before combining them with agent execution.
Exercise 3: Durable Agent
THE key exercise — wrap the weather agent in Temporal, trigger failures, and watch automatic recovery in the UI.
Exercise 4: Agent Routing
Production file structure with a triage agent, three language specialists, handoffs, and a separate worker process.