BaseEnvironment is the abstract base class that all Foundation scenario environments extend. It instantiates the world, agents, and component objects, and exposes a Gym-style API for resetting, stepping, and seeding the environment.
from ai_economist import foundation

ScenarioClass = foundation.scenarios.get("uniform/simple_wood_and_stone")

env = ScenarioClass(
    components=[
        ("Build", {"payment": 20}),
        ("Gather", {"move_labor": 1.0, "collect_labor": 2.0}),
    ],
    n_agents=4,
    world_size=[25, 25],
)

obs = env.reset()
# Replace ... with each agent's chosen action (see step() for the expected format).
actions = {agent.idx: ... for agent in env.all_agents}
obs, rew, done, info = env.step(actions)
BaseEnvironment is abstract and cannot be instantiated directly. Create a concrete scenario by subclassing it and implementing all required abstract methods, or use a scenario from foundation.scenarios.

Class attributes

Subclasses must declare the following class-level attributes before any instance is created.
name
str
required
Unique string identifier for the scenario. Used to register and retrieve the class from the scenario registry (foundation.scenarios). Must be set to a non-empty string by every subclass.
agent_subclasses
list
required
List of agent subclass name strings that this scenario applies to. Must contain at least one entry. When multiple subclasses are listed, none may be a subclass of another (no inheritance conflicts).
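The inheritance rule above can be pictured with a small check. This is a minimal sketch, not Foundation's own validation: `check_no_inheritance_conflicts` and the three toy classes are hypothetical names, and it assumes the name strings have already been resolved to classes.

```python
from itertools import permutations


def check_no_inheritance_conflicts(classes):
    """Return True if no class in `classes` is a subclass of another.

    Hypothetical helper for illustration; not part of ai_economist.
    """
    for candidate, other in permutations(classes, 2):
        if issubclass(candidate, other):
            return False
    return True


# Toy stand-ins for agent classes (assumed names, for illustration only).
class MobileAgent: ...
class Planner: ...
class FastMobileAgent(MobileAgent): ...


# Two unrelated classes are fine; a class and its own subclass conflict.
ok = check_no_inheritance_conflicts([MobileAgent, Planner])
conflict = check_no_inheritance_conflicts([MobileAgent, FastMobileAgent])
```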
required_entities
list
required
List of entity names (resources, landmarks, endogenous variables) that the scenario requires. May be an empty list. Coin and Labor are always included automatically.

Constructor

BaseEnvironment(
    components,
    n_agents,
    world_size,
    episode_length=1000,
    multi_action_mode_agents=False,
    multi_action_mode_planner=True,
    flatten_observations=True,
    flatten_masks=True,
    allow_observation_scaling=True,
    dense_log_frequency=None,
    world_dense_log_frequency=50,
    collate_agent_step_and_reset_data=False,
    seed=None,
)

Parameters

components
list
required
A list specifying which components to include in the environment. Each element must be either:
  • A tuple ("Component Name", {component_kwargs})
  • A dict {"Component Name": {component_kwargs}}
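The two accepted forms can be reduced to a common (name, kwargs) pair while preserving list order. The sketch below is an assumption about how such normalization might look: `normalize_component_spec` is a hypothetical helper, and the single-key requirement for the dict form is inferred from the format shown above rather than confirmed.

```python
def normalize_component_spec(spec):
    """Turn one component spec into a (name, kwargs) pair.

    Hypothetical helper for illustration; not part of ai_economist.
    """
    if isinstance(spec, (tuple, list)):
        # Tuple form: ("Component Name", {component_kwargs})
        name, kwargs = spec
    elif isinstance(spec, dict):
        # Dict form: {"Component Name": {component_kwargs}} (assumed single key)
        if len(spec) != 1:
            raise ValueError("dict spec must contain exactly one component name")
        ((name, kwargs),) = spec.items()
    else:
        raise TypeError("component spec must be a tuple, list, or dict")
    return name, dict(kwargs)


components = [
    ("Build", {"payment": 20}),
    {"Gather": {"move_labor": 1.0, "collect_labor": 2.0}},
]

# List order is kept, since order determines execution order in the environment.
normalized = [normalize_component_spec(spec) for spec in components]
```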
"Component Name" must match a name registered in the component registry. Reset, step, and observation generation are executed in the order components appear in this list — reordering can affect environment dynamics.
n_agents
int
required
Number of mobile agents. Does not include the planner. Must be an integer greater than or equal to 2.
world_size
list
required
A length-2 list [height, width] specifying the dimensions of the 2D world map.
episode_length
int
default:"1000"
Number of timesteps in a single episode. Must be at least 1.
multi_action_mode_agents
bool
default:"False"
When True, mobile agents may choose one action per action subspace (defined by each component) per timestep. When False, all subspaces are concatenated into a single action space and agents choose one action from the aggregate.
multi_action_mode_planner
bool
default:"True"
Same as multi_action_mode_agents but for the planner agent.
flatten_observations
bool
default:"True"
When True, all scalar and vector observation subfields are concatenated into a single "flat" observation field before being returned. When False, observations are returned as minimally processed dictionaries.
flatten_masks
bool
default:"True"
When True, action masks are concatenated into a single flat array. When False, masks are returned as a {"action_subspace_name": mask} dictionary. Set to True for deep RL action masking — flattened masks have the same semantics as policy logits.
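To illustrate why a flat mask is convenient, the sketch below concatenates a per-subspace mask dictionary and applies it to a vector of logits. It is a plain-Python illustration under assumptions: the subspace names, the concatenation order, and the helper names `flatten_mask_dict` and `masked_softmax` are hypothetical, and the environment's actual layout (for example any NO-OP slot) may differ.

```python
import math


def flatten_mask_dict(mask_dict, subspace_order):
    """Concatenate per-subspace masks in a fixed order (assumed order)."""
    flat = []
    for name in subspace_order:
        flat.extend(mask_dict[name])
    return flat


def masked_softmax(logits, flat_mask):
    """Softmax over logits where masked-out actions get probability 0.

    Assumes at least one action is allowed.
    """
    masked = [
        logit if allowed else -math.inf
        for logit, allowed in zip(logits, flat_mask)
    ]
    peak = max(masked)
    exps = [math.exp(value - peak) for value in masked]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical masks: 1 = allowed, 0 = disallowed.
mask_dict = {"Build": [1], "Gather": [1, 0, 1, 1]}
flat_mask = flatten_mask_dict(mask_dict, ["Build", "Gather"])

# One logit per entry of the flat mask.
logits = [0.0] * len(flat_mask)
probs = masked_softmax(logits, flat_mask)
```

Because the flat mask has one entry per logit, masking reduces to an elementwise operation, which is the practical benefit of flatten_masks=True.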
allow_observation_scaling
bool
default:"True"
When True, certain observation fields (e.g. inventory) are scaled to a range better suited for deep RL training.
dense_log_frequency
int | None
default:"None"
How often (in completed episodes) to create a dense log. None disables automatic dense logging. Set to e.g. 20 to log every 20th episode. Dense logs record agent states, actions, and rewards at each timestep, and world map snapshots at a coarser cadence.
Dense logging is time-consuming, especially with many agents. Use it sparingly in production training runs.
world_dense_log_frequency
int
default:"50"
When dense logging is active, how often (in timesteps) to snapshot the world state. More frequent snapshots increase memory usage.
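The interaction between the two logging frequencies can be sketched as two modulo checks. This is a hedged illustration, not the library's code: `should_dense_log` and `should_snapshot_world` are hypothetical helpers, and the exact offsets (for example whether episode 0 or timestep 0 is logged) are assumptions.

```python
def should_dense_log(completed_episodes, dense_log_frequency, force=False):
    """Decide whether the next episode gets a dense log (assumed rule)."""
    if force:
        return True
    if dense_log_frequency is None:
        # None disables automatic dense logging.
        return False
    return completed_episodes % dense_log_frequency == 0


def should_snapshot_world(timestep, world_dense_log_frequency):
    """Decide whether to snapshot the world map at this timestep (assumed rule)."""
    return timestep % world_dense_log_frequency == 0


# With dense_log_frequency=20, every 20th episode is logged.
logged_episodes = [e for e in range(60) if should_dense_log(e, 20)]

# With world_dense_log_frequency=50, snapshots are taken every 50 timesteps.
snapshot_steps = [t for t in range(201) if should_snapshot_world(t, 50)]
```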
collate_agent_step_and_reset_data
bool
default:"False"
When True, observations, rewards, and info dictionaries from all mobile agents ("0", "1", …) are collated into a single entry keyed by "a". Useful for GPU-accelerated training with WarpDrive.
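The collation step can be pictured as grouping the numbered mobile-agent entries under a single "a" key. The sketch below uses plain lists and a hypothetical helper `collate_mobile_agents`; the container type and layout the environment actually produces are not specified here and may differ.

```python
def collate_mobile_agents(per_agent, n_agents):
    """Group entries keyed "0", "1", ... under "a"; keep other keys as-is.

    Hypothetical helper for illustration; not part of ai_economist.
    """
    mobile_keys = [str(i) for i in range(n_agents)]
    collated = {"a": [per_agent[key] for key in mobile_keys]}
    for key, value in per_agent.items():
        if key not in mobile_keys:
            # Non-mobile entries such as the planner ("p") pass through unchanged.
            collated[key] = value
    return collated


# Example reward dictionary for two mobile agents plus the planner.
rew = {"0": 1.0, "1": 0.5, "p": 2.0}
collated_rew = collate_mobile_agents(rew, n_agents=2)
```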
seed
int | float
default:"None"
If provided, sets both the numpy and Python built-in RNG seeds at construction time. Must be greater than 0. Can also be set later via env.seed(seed).

Properties

world

env.world  # -> World
The World object that holds the world map and all agent objects. Constructed during __init__, after the entities required by the scenario and its components have been registered.

all_agents

env.all_agents  # -> list[BaseAgent]
List of all agent objects: all mobile agents followed by the planner agent. Equivalent to env.world.agents + [env.world.planner].

episode_length

env.episode_length  # -> int
Number of timesteps per episode, as passed to the constructor.

n_agents

env.n_agents  # -> int
Number of mobile agents (does not include the planner).

components

env.components  # -> list[BaseComponent]
Ordered list of component objects associated with this scenario instance.

resources

env.resources  # -> list[str]
Sorted list of resource names managed by this environment instance. Always includes "Coin".

landmarks

env.landmarks  # -> list[str]
Sorted list of landmark names managed by this environment instance.

endogenous

env.endogenous  # -> list[str]
Sorted list of endogenous variable names managed by this environment instance. Always includes "Labor".

metrics

env.metrics  # -> dict
Combined metrics from both the scenario (scenario_metrics()) and all components (component.get_metrics()). Component metrics are namespaced as "<component_shorthand>/<metric_key>".
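The namespacing described above can be sketched as a simple merge. `combine_metrics` is a hypothetical helper, and the metric names and shorthands in the example are made up for illustration.

```python
def combine_metrics(scenario_metrics, component_metrics):
    """Merge scenario metrics with per-component metrics.

    `component_metrics` maps a component shorthand to its metrics dict
    (or None). Hypothetical helper; not part of ai_economist.
    """
    combined = dict(scenario_metrics or {})
    for shorthand, metrics in component_metrics.items():
        for key, value in (metrics or {}).items():
            # Component metrics are prefixed with "<component_shorthand>/".
            combined[f"{shorthand}/{key}"] = value
    return combined


# Illustrative inputs; these metric names are assumptions.
scenario = {"social/productivity": 120.0}
components = {"Build": {"total_builds": 7}, "Gather": None}

metrics = combine_metrics(scenario, components)
```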

previous_episode_metrics

env.previous_episode_metrics  # -> dict | None
Metrics captured at the end of the last completed episode. None before the first episode finishes.

dense_log

env.dense_log  # -> dict
Contents of the current (potentially incomplete) dense log for the running episode. Has keys "world", "states", "actions", "rewards", plus optional per-component keys.

previous_episode_dense_log

env.previous_episode_dense_log  # -> dict
Finalised dense log from the last completed episode that had dense logging active.

previous_episode_replay_log

env.previous_episode_replay_log  # -> dict
Compact replay log from the last completed episode. Contains "reset" and "step" keys that together allow the exact episode to be reproduced.
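replay_log = env.previous_episode_replay_log

# Re-run the episode from the saved reset arguments, with dense logging forced on.
_ = env.reset(force_dense_logging=True, **replay_log["reset"])
for replay_step in replay_log["step"]:
    _ = env.step(**replay_step)

# The replayed episode's dense log and metrics are now available.
dense_log = env.previous_episode_dense_log
metrics = env.previous_episode_metrics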

inv_scale

env.inv_scale  # -> float
Scale factor applied to inventory-related observations. Returns 0.01 when allow_observation_scaling=True, otherwise 1.

Methods

reset()

env.reset(seed_state=None, force_dense_logging=False)
Reset the environment to begin a new episode. Calls reset_starting_layout(), reset_agent_states(), each component’s reset(), and additional_reset_steps() in order. Returns initial observations.

Parameters

seed_state
tuple | list
default:"None"
Optional numpy RNG state to restore before resetting. Must be length 5, in the format expected by np.random.set_state(). Used for deterministic episode replay.
force_dense_logging
bool
default:"False"
When True, forces dense logging to be active for this episode regardless of dense_log_frequency.

Returns

obs
dict
A dictionary {"agent_idx": agent_obs} with one entry per agent. Keys match each agent’s agent.idx property. Values are observation dictionaries (flattened to a "flat" key when flatten_observations=True). Includes an "action_mask" field for each agent.

step()

env.step(actions=None, seed_state=None)
Advance the environment by one timestep. Executes each component’s step, then the scenario step, then collects observations, rewards, done flags, and info.

Parameters

actions
dict
default:"None"
Dictionary {agent_idx: action} mapping each acting agent’s index to its chosen action.
  • When agent.multi_action_mode is True: action must be a list of integers, one per action subspace.
  • When agent.multi_action_mode is False: action must be a single integer selecting from the concatenated action space.
If None, all agents take the NO-OP action.
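The two action formats can be checked before calling step(). The sketch below is an assumption-laden illustration: `is_valid_action` is a hypothetical helper, the number of subspaces per agent is made up, and the check only accepts built-in Python ints.

```python
def is_valid_action(action, multi_action_mode, n_subspaces):
    """Check an action against the format described above.

    Hypothetical helper for illustration; not part of ai_economist.
    """
    if multi_action_mode:
        # One integer per action subspace.
        return (
            isinstance(action, (list, tuple))
            and len(action) == n_subspaces
            and all(isinstance(a, int) for a in action)
        )
    # A single integer selecting from the concatenated action space.
    return isinstance(action, int)


# Mobile agents in single-action mode, planner in multi-action mode
# (matching the constructor defaults); two subspaces each is an assumption.
actions = {"0": 3, "1": 0, "p": [0, 2]}
multi_action_mode = {"0": False, "1": False, "p": True}

all_valid = all(
    is_valid_action(actions[idx], multi_action_mode[idx], n_subspaces=2)
    for idx in actions
)
```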
seed_state
tuple | list
default:"None"
Optional numpy RNG state to restore before stepping. Must be length 5. Used for deterministic episode replay.

Returns

obs
dict
Observation dictionary with the same structure as returned by reset().
rew
dict
Dictionary {"agent_idx": reward} with one scalar reward per agent. Keys match those in obs.
done
dict
Dictionary with a single key "__all__". Value is False while world.timestep < episode_length and becomes True once world.timestep reaches episode_length, which ends the episode.
info
dict
Placeholder dictionary {"agent_idx": {}} with the same keys as obs and rew.
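A typical rollout loop over these return values might look like the sketch below. To keep it runnable without Foundation installed, it uses `StubEnv`, a hypothetical stand-in that only mimics the return shapes described above; its observation contents and zero rewards are placeholders, not real environment output.

```python
class StubEnv:
    """Hypothetical stand-in that mimics the reset()/step() return shapes."""

    def __init__(self, n_agents=2, episode_length=5):
        self.agent_ids = [str(i) for i in range(n_agents)] + ["p"]
        self.episode_length = episode_length
        self.timestep = 0

    def _obs(self):
        # Placeholder observations with a "flat" field and an action mask.
        return {idx: {"flat": [0.0], "action_mask": [1]} for idx in self.agent_ids}

    def reset(self):
        self.timestep = 0
        return self._obs()

    def step(self, actions=None):
        self.timestep += 1
        obs = self._obs()
        rew = {idx: 0.0 for idx in obs}
        done = {"__all__": self.timestep >= self.episode_length}
        info = {idx: {} for idx in obs}
        return obs, rew, done, info


env = StubEnv()
obs = env.reset()
done = {"__all__": False}
returns = {idx: 0.0 for idx in obs}
n_steps = 0

# Step until the single "__all__" flag reports the end of the episode.
while not done["__all__"]:
    obs, rew, done, info = env.step(None)  # None: every agent takes NO-OP
    for idx, reward in rew.items():
        returns[idx] += reward
    n_steps += 1
```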

seed()

BaseEnvironment.seed(seed)
Set the numpy and Python built-in random number generator seeds. This is a static method.

Parameters

seed
int | float
required
Seed value. Must be greater than 0. Float values are cast to int internally.
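The constraints on the seed value can be sketched as follows. `seed_everything` is a hypothetical helper that only seeds Python's built-in generator so the example stays dependency-free; the real method also seeds numpy, and truncation toward zero for floats is an assumption based on a plain int() cast.

```python
import random


def seed_everything(seed):
    """Validate a seed and apply it to Python's built-in RNG.

    Hypothetical helper for illustration; not part of ai_economist.
    """
    if not isinstance(seed, (int, float)):
        raise TypeError("seed must be an int or float")
    seed = int(seed)  # floats are cast to int (assumed truncation)
    if seed <= 0:
        raise ValueError("seed must be greater than 0")
    random.seed(seed)
    # The environment's seed() would also seed numpy's generator here.
    return seed


# Seeding with 7.9 and with 7 yields the same random stream.
used_seed = seed_everything(7.9)
first_draws = [random.random() for _ in range(3)]

seed_everything(7)
second_draws = [random.random() for _ in range(3)]
```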

get_component()

env.get_component(component_name)
Retrieve a component object by its full name or shorthand name.

Parameters

component_name
str
required
Full name or shorthand name of the component to retrieve. Must correspond to a component registered in this environment instance. Raises KeyError if no match is found.

Returns

component
BaseComponent
The component object instance wrapped in the environment.

get_agent()

env.get_agent(agent_idx)
Retrieve an agent object by its index.

Parameters

agent_idx
int | str
required
Identifier matching the idx property of the desired agent. The planner’s index is "p". Raises ValueError if no agent with the given index exists.

Returns

agent
BaseAgent
The agent object with the corresponding index.

set_agent_component_action()

env.set_agent_component_action(agent_idx, component_name, action)
Directly set the action for a specific agent and action subspace without going through parse_actions().

Parameters

agent_idx
int | str
required
Index of the agent whose action to set.
component_name
str
required
Name of the action subspace to set the action for.
action
int
required
Integer index of the chosen action within the named subspace.

parse_actions()

env.parse_actions(action_dictionary)
Parse an {agent_idx: action} dictionary and load the actions into each agent’s action buffer.

Parameters

action_dictionary
dict
required
Dictionary mapping agent indices to their chosen actions. Same format as the actions argument of step().

Abstract methods

Every concrete scenario subclass must implement the following methods.

reset_starting_layout()

def reset_starting_layout(self) -> None:
Part 1 of the scenario reset. Handles resetting the environment state managed by the scenario — resource and landmark layout on the world map.

reset_agent_states()

def reset_agent_states(self) -> None:
Part 2 of the scenario reset. Handles resetting the state of the agents themselves — inventory, locations, and so on.

scenario_step()

def scenario_step(self) -> None:
Update the world state according to this scenario’s passive dynamics. Called inside step() after all component steps and before observation/reward generation. Implement resource regeneration, income redistribution, and similar rules here.

generate_observations()

def generate_observations(self) -> dict:
Generate the scenario’s own observations. A scenario may supply observations for none, some, or all agent types, but must be consistent: if it yields an observation for an agent type, it must always do so with the same structure.

Returns

obs
dict
Dictionary {agent.idx: agent_obs_dict} with one entry for each agent type this scenario provides observations for.

compute_reward()

def compute_reward(self) -> dict:
Apply the scenario’s reward function(s) to compute per-agent rewards for the current timestep.

Returns

rew
dict
Dictionary {agent.idx: scalar_reward} with one float reward per agent, including the planner.

Optional override methods

additional_reset_steps()

def additional_reset_steps(self) -> None:
Called at the end of the reset cycle, after reset_starting_layout(), reset_agent_states(), and each component’s reset(). Override to perform any final scenario-specific initialization.

scenario_metrics()

def scenario_metrics(self) -> dict | None:
Return a flat {metric_key: scalar_value} dictionary of scenario-specific metrics. These are merged with component metrics and exposed via the metrics property. Return None (the default) to contribute no metrics.

Scenario registry

The scenario_registry is a Registry object that maps scenario names to their classes. Decorate a subclass with @scenario_registry.add to register it.
from ai_economist.foundation.base.base_env import BaseEnvironment, scenario_registry

@scenario_registry.add
class ExampleScenario(BaseEnvironment):
    name = "example/my_scenario"
    agent_subclasses = ["BasicMobileAgent"]
    required_entities = ["Wood"]

    def reset_starting_layout(self): ...
    def reset_agent_states(self): ...
    def scenario_step(self): ...
    def generate_observations(self): ...
    def compute_reward(self): ...

assert scenario_registry.has("example/my_scenario")
The foundation package exposes the scenario registry as foundation.scenarios. A scenario registered in the above way is only visible through foundation.scenarios if its module is imported in ai_economist/foundation/scenarios/__init__.py.
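The decorator-based registration pattern can be illustrated with a simplified stand-in. `SimpleRegistry` and `ToyScenario` below are hypothetical and much smaller than Foundation's Registry class; the sketch only shows how a decorator can store a class under its name attribute and return it unchanged.

```python
class SimpleRegistry:
    """Simplified, hypothetical stand-in for a name-to-class registry."""

    def __init__(self, base_class=object):
        self._base_class = base_class
        self._entries = {}

    def add(self, cls):
        """Register `cls` under its `name` attribute; usable as a decorator."""
        if not issubclass(cls, self._base_class):
            raise TypeError("class does not extend the registry's base class")
        name = getattr(cls, "name", "")
        if not name:
            raise ValueError("class must define a non-empty name")
        self._entries[name] = cls
        return cls  # return the class so the decorated name stays bound to it

    def get(self, name):
        return self._entries[name]

    def has(self, name):
        return name in self._entries


registry = SimpleRegistry()


@registry.add
class ToyScenario:
    name = "example/toy"


retrieved = registry.get("example/toy")
```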
