Environment variables: API keys and runtime controls

auto-harness reads environment variables for secrets and runtime overrides that should not be checked into version control. The canonical list lives in .env.example at the repo root. Copy it to .env before your first run, then fill in the values you need:

cp .env.example .env

Most variables are optional and only apply to specific benchmarks. The sections below explain which variables are required for each benchmark and what they control.

LLM API keys

The correct key is determined automatically from the agent_model value in experiment_config.yaml. If the model name starts with "gemini", GEMINI_API_KEY is required. If it starts with "claude", ANTHROPIC_API_KEY is required. All other models fall back to OPENAI_API_KEY.

OPENAI_API_KEY

string

OpenAI API key. Required when agent_model is an OpenAI model (anything that does not start with "gemini" or "claude"). Also forwarded as LITELLM_API_KEY inside BIRD-Interact runs when LITELLM_API_KEY is not set separately.

ANTHROPIC_API_KEY

string

Anthropic API key. Required when agent_model starts with "claude" (e.g., "anthropic/claude-sonnet-4-20250514").

GEMINI_API_KEY

string

Google Gemini API key. Required when agent_model starts with "gemini".
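
The selection rule in this section can be sketched as a small helper. This is an illustration under stated assumptions, not the harness's own code: the function names `required_key_name` and `check_key` are hypothetical, and the sketch assumes that a provider prefix such as "anthropic/" is ignored before the prefix test, which the "anthropic/claude-sonnet-4-20250514" example suggests but this page does not state.

```python
import os
from typing import Mapping, Optional


def required_key_name(agent_model: str) -> str:
    """Return the API key variable that the documented rule would require."""
    # Assumption: drop a provider prefix such as "anthropic/" before matching.
    bare_name = agent_model.rsplit("/", 1)[-1]
    if bare_name.startswith("gemini"):
        return "GEMINI_API_KEY"
    if bare_name.startswith("claude"):
        return "ANTHROPIC_API_KEY"
    # Every other model name falls back to the OpenAI key.
    return "OPENAI_API_KEY"


def check_key(agent_model: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Raise if the key needed for agent_model is missing or empty."""
    env = os.environ if env is None else env
    name = required_key_name(agent_model)
    if not env.get(name):
        raise RuntimeError(f"{name} must be set for agent_model={agent_model!r}")
    return name
```

Calling `check_key` with the `agent_model` value from experiment_config.yaml before a long run is one way to fail fast on a missing key.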

Sandbox provider keys

These variables are only required for Terminal-Bench when env_provider is set to "e2b" or "daytona". The "docker" provider requires no API key.

E2B_API_KEY

string

E2B sandbox API key. Required when env_provider: "e2b" in experiment_config.yaml.

DAYTONA_API_KEY

string

Daytona API key. Required when env_provider: "daytona" in experiment_config.yaml.

Only one sandbox provider key is needed per experiment. Set E2B_API_KEY for E2B, DAYTONA_API_KEY for Daytona, or neither if you are using the local Docker provider.
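
As a rough sketch, the provider-to-key mapping above can be checked before a Terminal-Bench run. The helper name `missing_sandbox_key` is hypothetical, and rejecting unknown provider names with a `ValueError` is an assumption of this example rather than documented harness behavior.

```python
import os
from typing import Mapping, Optional

# Mapping taken from this section; the local "docker" provider needs no key.
PROVIDER_KEY = {"e2b": "E2B_API_KEY", "daytona": "DAYTONA_API_KEY", "docker": None}


def missing_sandbox_key(
    env_provider: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the name of the key that still needs to be set, or None."""
    env = os.environ if env is None else env
    if env_provider not in PROVIDER_KEY:
        # Assumption: treat unknown providers as a configuration error.
        raise ValueError(f"unknown env_provider: {env_provider!r}")
    key = PROVIDER_KEY[env_provider]
    if key is None or env.get(key):
        return None
    return key
```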

Runtime control

These variables let you override experiment configuration at runtime without editing experiment_config.yaml. They are primarily used by the benchmark runner internally, but you can also set them manually for ad-hoc runs.

AGENT_MODEL

string

Agent model used when agent_model is not set in experiment_config.yaml. TauBenchRunner and TerminalBenchRunner read it as a fallback, so a value in the config file takes precedence. If neither is set, the model defaults to "gpt-5.4".
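
The lookup order described for AGENT_MODEL might look like the following. This is a minimal sketch: `resolve_agent_model` is a hypothetical name, `config` stands for the parsed experiment_config.yaml as a plain dict, and treating empty strings as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_agent_model(
    config: Mapping[str, str], env: Optional[Mapping[str, str]] = None
) -> str:
    """Config value first, then AGENT_MODEL, then the documented default."""
    env = os.environ if env is None else env
    # Assumption: an empty string counts as "not set" at every level.
    return config.get("agent_model") or env.get("AGENT_MODEL") or "gpt-5.4"
```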

AGENT_REASONING_EFFORT

string

Override the reasoning effort level. Set automatically from reasoning_effort in experiment_config.yaml before each run. Accepted values: "low", "medium", "high".

HARNESS_SAVE_TRACE

string

Controls whether TerminalBenchRunner copies agent traces to workspace/traces/. Set to "0" to disable trace saving. The runner sets this automatically to "0" for non-train splits (test and baseline all-tasks runs), preventing the coding agent from reading test-split traces.
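
The trace gating described for HARNESS_SAVE_TRACE can be sketched as two small functions. Both names are hypothetical. The page only documents that "0" disables saving, so using "1" for the train split and treating an unset variable as enabled are assumptions of this example.

```python
import os
from typing import Mapping, Optional


def trace_flag_for_split(split: str) -> str:
    """Value a runner could export before a run on the given split."""
    # Only the train split keeps traces; test and all-tasks runs get "0".
    # Assumption: "1" is used as the enabling value.
    return "1" if split == "train" else "0"


def should_save_trace(env: Optional[Mapping[str, str]] = None) -> bool:
    """True unless HARNESS_SAVE_TRACE is explicitly set to "0"."""
    env = os.environ if env is None else env
    # Assumption: an unset variable means traces are saved.
    return env.get("HARNESS_SAVE_TRACE", "1") != "0"
```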

tau-bench data directory

TAU2_DATA_DIR

string

Path to the directory where tau-bench data is stored. Defaults to ./tau2_data (relative to the auto-harness repo root). prepare.py clones the tau2-bench repository into this directory on first run if it is not already present. Override this to share the data directory across multiple experiments.
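
For illustration, resolving the data directory with the documented default could look like this. The function name `tau2_data_dir` and the `repo_root` parameter are hypothetical; the clone step that prepare.py performs is not reproduced here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def tau2_data_dir(repo_root: Path, env: Optional[Mapping[str, str]] = None) -> Path:
    """Return TAU2_DATA_DIR if set, else <repo_root>/tau2_data."""
    env = os.environ if env is None else env
    override = env.get("TAU2_DATA_DIR")
    return Path(override) if override else repo_root / "tau2_data"
```

Pointing several experiment checkouts at the same TAU2_DATA_DIR value means the data only has to be cloned once.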

BIRD-Interact overrides

These variables mirror the advanced override keys in experiment_config.yaml. Set them as environment variables if you prefer not to hard-code paths in the config file.

BIRD_REPO

string

Absolute path to an existing BIRD-Interact repo root or BIRD-Interact-ADK directory. Takes effect when bird_repo is not set in experiment_config.yaml.

BIRD_PYTHON_BIN

string

Absolute path to a Python interpreter with BIRD-Interact-ADK dependencies installed. Takes effect when bird_python_bin is not set in experiment_config.yaml.

BIRD-Interact service ports

BirdInteractRunner sets these variables automatically from the values in experiment_config.yaml. Besides the three service ports, the group includes the DATASET and PATIENCE run parameters. You can also set them directly in your shell to override defaults without editing the config.

SYSTEM_AGENT_PORT

string

Port for the BIRD-Interact system agent service. Defaults to 6100.

USER_SIM_PORT

string

Port for the user simulator service. Defaults to 6101.

DB_ENV_PORT

string

Port for the database environment service. Defaults to 6102.

DATASET

string

BIRD-Interact dataset size passed to the orchestrator. Set automatically from dataset in experiment_config.yaml. Accepted values: "lite" (300 tasks) or "full" (600 tasks).

PATIENCE

string

Maximum clarification turns per task in c-interact mode. Set automatically from patience in experiment_config.yaml. Defaults to 3.
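
Because environment variables are always strings, a consumer has to convert the ports and PATIENCE to integers. The sketch below shows one way to do that with the documented defaults. The `BirdSettings` class and `load_bird_settings` function are hypothetical names; since no default for DATASET is documented here, the sketch leaves it as `None` when unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BirdSettings:
    system_agent_port: int
    user_sim_port: int
    db_env_port: int
    dataset: Optional[str]
    patience: int


def load_bird_settings(env: Optional[Mapping[str, str]] = None) -> BirdSettings:
    """Read the BIRD-Interact variables, applying the documented defaults."""
    env = os.environ if env is None else env
    dataset = env.get("DATASET")  # no default is documented, so None if unset
    if dataset is not None and dataset not in ("lite", "full"):
        raise ValueError(f"DATASET must be 'lite' or 'full', got {dataset!r}")
    return BirdSettings(
        system_agent_port=int(env.get("SYSTEM_AGENT_PORT", "6100")),
        user_sim_port=int(env.get("USER_SIM_PORT", "6101")),
        db_env_port=int(env.get("DB_ENV_PORT", "6102")),
        dataset=dataset,
        patience=int(env.get("PATIENCE", "3")),
    )
```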

Postgres connection

These variables configure the Postgres connection used by BIRD-Interact. prepare.py provisions a Docker container with these defaults on first run. Override them only when pointing at an existing Postgres instance.

PG_HOST

string

Postgres host. Defaults to 127.0.0.1.

PG_PORT

string

Postgres port. Defaults to 5432.

PG_USER

string

Postgres username. Defaults to root.

PG_PASSWORD

string

Postgres password. Defaults to 123123.
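
A minimal sketch of collecting these four variables with their documented defaults follows. The function name `pg_settings` is hypothetical, and the database name is left out because this page does not document one.

```python
import os
from typing import Any, Dict, Mapping, Optional


def pg_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect Postgres connection parameters, using the documented defaults."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PG_HOST", "127.0.0.1"),
        "port": int(env.get("PG_PORT", "5432")),  # env values are strings
        "user": env.get("PG_USER", "root"),
        "password": env.get("PG_PASSWORD", "123123"),
    }
```

The dictionary keys match the usual libpq keyword names, so a client such as psycopg2 could take them as keyword arguments alongside a database name of your choosing.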

Required variables by benchmark

Terminal-Bench

| Variable | Required | Notes |
| --- | --- | --- |
| `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` or `GEMINI_API_KEY` | Yes | Match to your `agent_model` |
| `E2B_API_KEY` | When `env_provider: "e2b"` | |
| `DAYTONA_API_KEY` | When `env_provider: "daytona"` | |

tau-bench

| Variable | Required | Notes |
| --- | --- | --- |
| `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` or `GEMINI_API_KEY` | Yes | Match to your `agent_model` |
| `TAU2_DATA_DIR` | No | Defaults to `./tau2_data`; data is auto-cloned |

BIRD-Interact

| Variable | Required | Notes |
| --- | --- | --- |
| `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` or `GEMINI_API_KEY` | Yes | Match to your `agent_model` |
| `BIRD_REPO` | No | Auto-provisioned into `./bird_interact_adk/` |
| `BIRD_PYTHON_BIN` | No | Auto-resolved from venv inside ADK |
| `PG_HOST`, `PG_PORT`, `PG_USER`, `PG_PASSWORD` | No | Auto-provisioned Docker container |

Example `.env` file

# LLM API keys — set whichever your agent_model needs
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...

# Terminal-Bench sandbox provider (set one)
E2B_API_KEY=e2b_...
DAYTONA_API_KEY=...
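
To sanity-check a file like the one above before a run, a small standalone parser is enough. This page does not say how the harness itself loads `.env`, so the following is only an illustrative sketch with a hypothetical function name, `parse_env_text`. It handles comments, blank lines, and simple `KEY=VALUE` pairs; dedicated libraries such as python-dotenv support more syntax.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping comments and blank lines."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE pair; ignore it
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

For example, `parse_env_text(open(".env").read())` returns a dictionary you can compare against the variables your benchmark requires.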
