BaseEnvironment - AI Economist / Foundation

BaseEnvironment is the abstract base class that all Foundation scenario environments extend. It instantiates the world, agents, and component objects, and exposes a Gym-style API for resetting, stepping, and seeding the environment.

from ai_economist import foundation

ScenarioClass = foundation.scenarios.get("uniform/simple_wood_and_stone")

env = ScenarioClass(
    components=[
        ("Build", {"payment": 20}),
        ("Gather", {"move_labor": 1.0, "collect_labor": 2.0}),
    ],
    n_agents=4,
    world_size=[25, 25],
)

obs = env.reset()
actions = {agent.idx: ... for agent in env.all_agents}
obs, rew, done, info = env.step(actions)

BaseEnvironment is abstract and cannot be instantiated directly. Create a concrete scenario by subclassing it and implementing all required abstract methods, or use a scenario from foundation.scenarios.

Class attributes

Subclasses must declare the following class-level attributes before any instance is created.

name

str

required

Unique string identifier for the scenario. Used to register and retrieve the class from the scenario registry (foundation.scenarios). Must be set to a non-empty string by every subclass.

agent_subclasses

list

required

List of agent subclass name strings that this scenario applies to. Must contain at least one entry. When multiple subclasses are listed, none may be a subclass of another (no inheritance conflicts).
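
The no-inheritance-conflict rule can be pictured with a small check. This is a minimal sketch rather than Foundation's own validation code: the classes `MobileAgent`, `Planner`, and `FancyMobileAgent` and the helper `has_inheritance_conflict` are hypothetical stand-ins, and the sketch compares classes directly instead of resolving registered names.

```python
from itertools import permutations


class MobileAgent:  # hypothetical stand-in for a registered agent class
    pass


class Planner:  # hypothetical stand-in, unrelated to MobileAgent
    pass


class FancyMobileAgent(MobileAgent):  # hypothetical subclass of MobileAgent
    pass


def has_inheritance_conflict(agent_classes):
    """Return True if any listed class is a subclass of another listed class."""
    return any(issubclass(a, b) for a, b in permutations(agent_classes, 2))


ok_pair = has_inheritance_conflict([MobileAgent, Planner])
bad_pair = has_inheritance_conflict([MobileAgent, FancyMobileAgent])
print(ok_pair, bad_pair)
```

Under this reading, listing a class together with one of its own subclasses would be rejected, while unrelated classes can be listed side by side.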

required_entities

list

required

List of entity names (resources, landmarks, endogenous variables) that the scenario requires. May be an empty list. Coin and Labor are always included automatically.

Constructor

BaseEnvironment(
    components,
    n_agents,
    world_size,
    episode_length=1000,
    multi_action_mode_agents=False,
    multi_action_mode_planner=True,
    flatten_observations=True,
    flatten_masks=True,
    allow_observation_scaling=True,
    dense_log_frequency=None,
    world_dense_log_frequency=50,
    collate_agent_step_and_reset_data=False,
    seed=None,
)

Parameters

components

list

required

A list specifying which components to include in the environment. Each element must be either:

A tuple ("Component Name", {component_kwargs})
A dict {"Component Name": {component_kwargs}}

"Component Name" must match a name registered in the component registry. Reset, step, and observation generation are executed in the order components appear in this list — reordering can affect environment dynamics.

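The two accepted spec forms can be reduced to one ordered list of (name, kwargs) pairs. The helper below, `normalize_component_specs`, is a hypothetical illustration and not part of Foundation; it assumes each dict spec holds exactly one component name, and it does not check names against the component registry.

```python
def normalize_component_specs(components):
    """Turn tuple/dict component specs into ordered (name, kwargs) pairs."""
    normalized = []
    for spec in components:
        if isinstance(spec, tuple):
            if len(spec) != 2:
                raise ValueError(f"Expected (name, kwargs), got {spec!r}")
            name, kwargs = spec
        elif isinstance(spec, dict):
            # Assumption: one component per dict spec.
            if len(spec) != 1:
                raise ValueError(f"Expected a single-key dict, got {spec!r}")
            name, kwargs = next(iter(spec.items()))
        else:
            raise TypeError(f"Unsupported component spec: {spec!r}")
        normalized.append((name, dict(kwargs)))
    return normalized


specs = normalize_component_specs(
    [
        ("Build", {"payment": 20}),
        {"Gather": {"move_labor": 1.0, "collect_labor": 2.0}},
    ]
)
print([name for name, _ in specs])
```

Because the result is a list, the order given by the caller is preserved, which matters given that components are reset and stepped in list order.
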
n_agents

int

required

Number of mobile agents. Does not include the planner. Must be an integer greater than or equal to 2.

world_size

list

required

A length-2 list [height, width] specifying the dimensions of the 2D world map.

episode_length

int

default: 1000

Number of timesteps in a single episode. Must be at least 1.

multi_action_mode_agents

bool

default: False

When True, mobile agents may choose one action per action subspace (defined by each component) per timestep. When False, all subspaces are concatenated into a single action space and agents choose one action from the aggregate.

multi_action_mode_planner

bool

default: True

Same as multi_action_mode_agents but for the planner agent.

flatten_observations

bool

default: True

When True, all scalar and vector observation subfields are concatenated into a single "flat" observation field before being returned. When False, observations are returned as minimally processed dictionaries.

flatten_masks

bool

default: True

When True, action masks are concatenated into a single flat array. When False, masks are returned as a {"action_subspace_name": mask} dictionary. Set to True for deep RL action masking — flattened masks have the same semantics as policy logits.
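
Since a flattened mask lines up element by element with policy logits, it can be applied directly before choosing an action. The sketch below is illustrative only: `masked_argmax` is a hypothetical helper, the logits are made-up numbers, and it assumes a mask value of 1 means "allowed" and 0 means "disallowed".

```python
import math


def masked_argmax(logits, mask):
    """Pick the highest-scoring action among those the mask allows."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    # Disallowed actions get -inf so they can never win the argmax.
    masked = [
        logit if allowed else -math.inf
        for logit, allowed in zip(logits, mask)
    ]
    best = max(range(len(masked)), key=masked.__getitem__)
    if masked[best] == -math.inf:
        raise ValueError("mask disallows every action")
    return best


logits = [0.1, 2.5, 1.7, -0.3]  # made-up policy outputs
flat_mask = [1, 0, 1, 1]        # action 1 is disallowed
choice = masked_argmax(logits, flat_mask)
print(choice)
```

The unmasked argmax would be action 1; with the mask applied, the best remaining action is selected instead.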

allow_observation_scaling

bool

default: True

When True, certain observation fields (e.g. inventory) are scaled to a range better suited for deep RL training.

dense_log_frequency

int | None

default: None

How often (in completed episodes) to create a dense log. None disables automatic dense logging. Set to e.g. 20 to log every 20th episode. Dense logs record agent states, actions, and rewards at each timestep, and world map snapshots at a coarser cadence.

Dense logging is time-consuming, especially with many agents. Use it sparingly in production training runs.

world_dense_log_frequency

int

default: 50

When dense logging is active, how often (in timesteps) to snapshot the world state. More frequent snapshots increase memory usage.

collate_agent_step_and_reset_data

bool

default: False

When True, observations, rewards, and info dictionaries from all mobile agents ("0", "1", …) are collated into a single entry keyed by "a". Useful for GPU-accelerated training with WarpDrive.
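
The effect of collation can be sketched on a plain reward dictionary. This is a rough illustration under stated assumptions, not the library's implementation: `collate_mobile_agents` is a hypothetical helper, the planner entry is assumed to stay under its own key "p", and values are gathered into a Python list, whereas the real array layout is not specified here.

```python
def collate_mobile_agents(per_agent, planner_key="p", collated_key="a"):
    """Group entries keyed "0", "1", ... under one key, in index order."""
    mobile_keys = sorted(
        (k for k in per_agent if k != planner_key), key=int
    )
    collated = {collated_key: [per_agent[k] for k in mobile_keys]}
    if planner_key in per_agent:
        # Assumption: the planner entry is passed through unchanged.
        collated[planner_key] = per_agent[planner_key]
    return collated


rew = {"1": 0.5, "0": 1.0, "2": -0.25, "p": 3.0}  # made-up rewards
collated_rew = collate_mobile_agents(rew)
print(collated_rew)
```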

seed

int | float

default: None

If provided, sets both the numpy and Python built-in RNG seeds at construction time. Must be greater than 0. Can also be set later via env.seed(seed).

Properties

`world`

env.world  # -> World

The World object that holds the world map and all agent objects. Constructed during __init__ once all entities and components have been registered.

`all_agents`

env.all_agents  # -> list[BaseAgent]

List of all agent objects: all mobile agents followed by the planner agent. Equivalent to env.world.agents + [env.world.planner].

`episode_length`

env.episode_length  # -> int

Number of timesteps per episode, as passed to the constructor.

`n_agents`

env.n_agents  # -> int

Number of mobile agents (does not include the planner).

`components`

env.components  # -> list[BaseComponent]

Ordered list of component objects associated with this scenario instance.

`resources`

env.resources  # -> list[str]

Sorted list of resource names managed by this environment instance. Always includes "Coin".

`landmarks`

env.landmarks  # -> list[str]

Sorted list of landmark names managed by this environment instance.

`endogenous`

env.endogenous  # -> list[str]

Sorted list of endogenous variable names managed by this environment instance. Always includes "Labor".

`metrics`

env.metrics  # -> dict

Combined metrics from both the scenario (scenario_metrics()) and all components (component.get_metrics()). Component metrics are namespaced as "<component_shorthand>/<metric_key>".
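
The namespacing scheme can be shown with a small merge function. This is a sketch of the described behaviour only: `merge_metrics` is a hypothetical helper, and the metric keys and the shorthands "Bld" and "Gth" are invented for illustration.

```python
def merge_metrics(scenario_metrics, component_metrics_by_shorthand):
    """Merge scenario metrics with component metrics prefixed by shorthand."""
    merged = dict(scenario_metrics or {})
    for shorthand, metrics in component_metrics_by_shorthand.items():
        # A component that reports nothing contributes no keys.
        for key, value in (metrics or {}).items():
            merged[f"{shorthand}/{key}"] = value
    return merged


merged = merge_metrics(
    {"productivity": 120.0},                   # hypothetical scenario metric
    {"Bld": {"total_builds": 7}, "Gth": None},  # hypothetical component metrics
)
print(merged)
```

Prefixing by shorthand keeps two components from overwriting each other when they report a metric under the same key.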

`previous_episode_metrics`

env.previous_episode_metrics  # -> dict | None

Metrics captured at the end of the last completed episode. None before the first episode finishes.

`dense_log`

env.dense_log  # -> dict

Contents of the current (potentially incomplete) dense log for the running episode. Has keys "world", "states", "actions", "rewards", plus optional per-component keys.

`previous_episode_dense_log`

env.previous_episode_dense_log  # -> dict

Finalised dense log from the last completed episode that had dense logging active.

`previous_episode_replay_log`

env.previous_episode_replay_log  # -> dict

Compact replay log from the last completed episode. Contains "reset" and "step" keys that together allow the exact episode to be reproduced.

replay_log = env.previous_episode_replay_log

_ = env.reset(force_dense_logging=True, **replay_log["reset"])
for replay_step in replay_log["step"]:
    _ = env.step(**replay_step)

dense_log = env.previous_episode_dense_log
metrics = env.previous_episode_metrics

`inv_scale`

env.inv_scale  # -> float

Scale factor applied to inventory-related observations. Returns 0.01 when allow_observation_scaling=True, otherwise 1.
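
A short sketch of how such a factor might be used on raw inventory counts. The function mirrors only the two documented return values; the inventory contents and the assumption that the factor is applied by multiplication are illustrative.

```python
def inv_scale(allow_observation_scaling):
    """Mirror the documented values: 0.01 when scaling is on, otherwise 1."""
    return 0.01 if allow_observation_scaling else 1


inventory = {"Coin": 250, "Wood": 4}  # hypothetical raw inventory counts

# Assumption: the scale factor multiplies each raw inventory value.
scaled = {k: v * inv_scale(True) for k, v in inventory.items()}
unscaled = {k: v * inv_scale(False) for k, v in inventory.items()}
print(scaled, unscaled)
```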

Methods

`reset()`

env.reset(seed_state=None, force_dense_logging=False)

Reset the environment to begin a new episode. Calls reset_starting_layout(), reset_agent_states(), each component’s reset(), and additional_reset_steps() in order. Returns initial observations.
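
The call order described above can be made concrete with a stand-in class that simply records which hook runs when. `ResetOrderSketch` is hypothetical and leaves out everything else a real reset does (seeding, logging, observation generation).

```python
class ResetOrderSketch:
    """Hypothetical stand-in that records the order of reset hooks."""

    def __init__(self, component_names):
        self.component_names = list(component_names)
        self.calls = []

    def reset_starting_layout(self):
        self.calls.append("reset_starting_layout")

    def reset_agent_states(self):
        self.calls.append("reset_agent_states")

    def additional_reset_steps(self):
        self.calls.append("additional_reset_steps")

    def reset(self):
        self.calls.clear()
        self.reset_starting_layout()
        self.reset_agent_states()
        for name in self.component_names:  # components reset in list order
            self.calls.append(f"{name}.reset")
        self.additional_reset_steps()
        return list(self.calls)


order = ResetOrderSketch(["Build", "Gather"]).reset()
print(order)
```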

Parameters

seed_state

tuple | list

default: None

Optional numpy RNG state to restore before resetting. Must be length 5, in the format expected by np.random.set_state(). Used for deterministic episode replay.

force_dense_logging

bool

default: False

When True, forces dense logging to be active for this episode regardless of dense_log_frequency.

Returns

obs

dict

A dictionary {"agent_idx": agent_obs} with one entry per agent. Keys match each agent’s agent.idx property. Values are observation dictionaries (flattened to a "flat" key when flatten_observations=True). Includes an "action_mask" field for each agent.

`step()`

env.step(actions=None, seed_state=None)

Advance the environment by one timestep. Executes each component’s step, then the scenario step, then collects observations, rewards, done flags, and info.

Parameters

actions

dict

default: None

Dictionary {agent_idx: action} mapping each acting agent’s index to its chosen action.

When agent.multi_action_mode is True: action must be a list of integers, one per action subspace.
When agent.multi_action_mode is False: action must be a single integer selecting from the concatenated action space.

If None, all agents take the NO-OP action.
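
The two action formats can be told apart with a simple shape check. `check_action_format` is a hypothetical helper written for this page; it only accepts plain Python ints (not NumPy integer types) and does not verify that indices fall within the size of any action space.

```python
def _is_plain_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_action_format(action, multi_action_mode, n_subspaces):
    """Return True if `action` has the shape described for the given mode."""
    if multi_action_mode:
        return (
            isinstance(action, (list, tuple))
            and len(action) == n_subspaces
            and all(_is_plain_int(a) for a in action)
        )
    return _is_plain_int(action)


multi_ok = check_action_format([0, 3], multi_action_mode=True, n_subspaces=2)
single_ok = check_action_format(5, multi_action_mode=False, n_subspaces=2)
mismatch = check_action_format(5, multi_action_mode=True, n_subspaces=2)
print(multi_ok, single_ok, mismatch)
```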

seed_state

tuple | list

default: None

Optional numpy RNG state to restore before stepping. Must be length 5. Used for deterministic episode replay.

Returns

obs

dict

Observation dictionary with the same structure as returned by reset().

rew

dict

Dictionary {"agent_idx": reward} with one scalar reward per agent. Keys match those in obs.

done

dict

Dictionary with a single key "__all__". Value is False while world.timestep < episode_length, and True when the episode ends.

info

dict

Placeholder dictionary {"agent_idx": {}} with the same keys as obs and rew.
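
A rollout loop built on these return values might look like the following. To stay runnable without Foundation installed, the sketch uses `StubEnv`, a hypothetical stand-in that only mimics the documented return shapes; its observation contents and zero rewards are placeholders.

```python
class StubEnv:
    """Hypothetical stand-in mimicking the reset()/step() return shapes."""

    def __init__(self, agent_ids=("0", "1", "p"), episode_length=3):
        self.agent_ids = list(agent_ids)
        self.episode_length = episode_length
        self.timestep = 0

    def _obs(self):
        return {idx: {"flat": [float(self.timestep)]} for idx in self.agent_ids}

    def reset(self):
        self.timestep = 0
        return self._obs()

    def step(self, actions=None):
        self.timestep += 1
        rew = {idx: 0.0 for idx in self.agent_ids}
        done = {"__all__": self.timestep >= self.episode_length}
        info = {idx: {} for idx in self.agent_ids}
        return self._obs(), rew, done, info


env = StubEnv()
obs = env.reset()
done = {"__all__": False}
steps = 0
while not done["__all__"]:
    # actions=None stands for "every agent takes the NO-OP action".
    obs, rew, done, info = env.step(actions=None)
    steps += 1
print(steps)
```

The loop terminates on the single "__all__" flag rather than on per-agent done values.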

`seed()`

BaseEnvironment.seed(seed)

Set the numpy and Python built-in random number generator seeds. This is a static method.

Parameters

seed

int | float

required

Seed value. Must be greater than 0. Float values are cast to int internally.
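
The validation rules can be sketched with the standard library alone. `seed_sketch` is a hypothetical function, not the real method: it seeds only Python's built-in generator, whereas the method documented here is described as seeding NumPy as well, and the exception types are a choice made for this example.

```python
import random


def seed_sketch(seed):
    """Validate a seed as described above and seed Python's built-in RNG."""
    if not isinstance(seed, (int, float)):
        raise TypeError("seed must be an int or a float")
    seed = int(seed)  # floats are truncated toward zero
    if seed <= 0:
        raise ValueError("seed must be greater than 0")
    random.seed(seed)
    # NumPy seeding is omitted here to keep the sketch stdlib-only.
    return seed


used = seed_sketch(42.9)
first = random.random()
seed_sketch(42)
second = random.random()
print(used, first == second)
```

Because of the integer cast, a float such as 0.5 becomes 0 and would be rejected under these rules.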

`get_component()`

env.get_component(component_name)

Retrieve a component object by its full name or shorthand name.

Parameters

component_name

str

required

Full name or shorthand name of the component to retrieve. Must correspond to a component registered in this environment instance. Raises KeyError if no match is found.

Returns

component

BaseComponent

The component object instance wrapped in the environment.
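
Lookup by either name can be pictured as one dictionary holding two keys per component. Everything below is a hypothetical sketch: `ComponentStub`, `build_component_lookup`, and the shorthands "Bld" and "Gth" are invented, and only the KeyError behaviour is taken from the description above.

```python
class ComponentStub:
    """Hypothetical stand-in carrying only a name and a shorthand."""

    def __init__(self, name, shorthand):
        self.name = name
        self.shorthand = shorthand


def build_component_lookup(components):
    """Index each component under both its full name and its shorthand."""
    lookup = {}
    for component in components:
        lookup[component.name] = component
        lookup[component.shorthand] = component
    return lookup


def get_component(lookup, component_name):
    if component_name not in lookup:
        raise KeyError(f"No component named {component_name!r}")
    return lookup[component_name]


lookup = build_component_lookup(
    [ComponentStub("Build", "Bld"), ComponentStub("Gather", "Gth")]
)
same_object = get_component(lookup, "Build") is get_component(lookup, "Bld")
try:
    get_component(lookup, "Missing")
    missing_raises = False
except KeyError:
    missing_raises = True
print(same_object, missing_raises)
```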

`get_agent()`

env.get_agent(agent_idx)

Retrieve an agent object by its index.

Parameters

agent_idx

int | str

required

Identifier matching the idx property of the desired agent. The planner’s index is "p". Raises ValueError if no agent with the given index exists.

Returns

agent

BaseAgent

The agent object with the corresponding index.

`set_agent_component_action()`

env.set_agent_component_action(agent_idx, component_name, action)

Directly set the action for a specific agent and action subspace without going through parse_actions().

Parameters

agent_idx

int | str

required

Index of the agent whose action to set.

component_name

str

required

Name of the action subspace to set the action for.

action

int

required

Integer index of the chosen action within the named subspace.

`parse_actions()`

env.parse_actions(action_dictionary)

Parse an {agent_idx: action} dictionary and load the actions into each agent’s action buffer.

Parameters

action_dictionary

dict

required

Dictionary mapping agent indices to their chosen actions. Same format as the actions argument of step().

Abstract methods

Every concrete scenario subclass must implement the following methods.

`reset_starting_layout()`

def reset_starting_layout(self) -> None:

Part 1 of the scenario reset. Resets the environment state managed by the scenario, i.e. the resource and landmark layout on the world map.

`reset_agent_states()`

def reset_agent_states(self) -> None:

Part 2 of the scenario reset. Resets the state of the agents themselves (inventory, locations, and so on).

`scenario_step()`

def scenario_step(self) -> None:

Update the world state according to this scenario’s passive dynamics. Called inside step() after all component steps and before observation/reward generation. Implement resource regeneration, income redistribution, and similar rules here.

`generate_observations()`

def generate_observations(self) -> dict:

Generate the scenario’s own observations. A scenario may supply observations for none, some, or all agent types, but must be consistent: if it yields an observation for an agent type, it must always do so with the same structure.

Returns

obs

dict

Dictionary {agent.idx: agent_obs_dict} with one entry for each agent type this scenario provides observations for.
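
One possible shape for such a method, supplying observations for mobile agents only, is sketched below. `AgentStub`, the `own_coin` field, and the choice to key the result by `str(agent.idx)` are assumptions made for this example; a real implementation would be a method on the scenario class and read agent state from the world.

```python
class AgentStub:
    """Hypothetical stand-in exposing only what this sketch reads."""

    def __init__(self, idx, coin):
        self.idx = idx
        self.coin = coin


def generate_observations_sketch(mobile_agents):
    """Return one observation dict per mobile agent, always with the same keys."""
    return {
        str(agent.idx): {"own_coin": float(agent.coin)}
        for agent in mobile_agents
    }


agents = [AgentStub(0, 10), AgentStub(1, 25)]
scenario_obs = generate_observations_sketch(agents)
print(scenario_obs)
```

The planner gets no entry here, which is consistent with supplying observations for only some agent types, as long as that choice is the same on every call.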

`compute_reward()`

def compute_reward(self) -> dict:

Apply the scenario’s reward function(s) to compute per-agent rewards for the current timestep.

Returns

rew

dict

Dictionary {agent.idx: scalar_reward} with one float reward per agent, including the planner.

Optional override methods

`additional_reset_steps()`

def additional_reset_steps(self) -> None:

Called at the end of the reset cycle, after reset_starting_layout(), reset_agent_states(), and each component’s reset(). Override to perform any final scenario-specific initialization.

`scenario_metrics()`

def scenario_metrics(self) -> dict | None:

Return a flat {metric_key: scalar_value} dictionary of scenario-specific metrics. These are merged with component metrics and exposed via the metrics property. Return None (the default) to contribute no metrics.

Scenario registry

The scenario_registry is a Registry object that maps scenario names to their classes. Decorate a subclass with @scenario_registry.add to register it.

from ai_economist.foundation.base.base_env import BaseEnvironment, scenario_registry

@scenario_registry.add
class ExampleScenario(BaseEnvironment):
    name = "example/my_scenario"
    agent_subclasses = ["BasicMobileAgent"]
    required_entities = ["Wood"]

    def reset_starting_layout(self): ...
    def reset_agent_states(self): ...
    def scenario_step(self): ...
    def generate_observations(self): ...
    def compute_reward(self): ...

assert scenario_registry.has("example/my_scenario")

The foundation package exposes the scenario registry as foundation.scenarios. A scenario registered in the above way is only visible through foundation.scenarios if its module is imported in ai_economist/foundation/scenarios/__init__.py.
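
The decorator-based registration pattern can be illustrated with a toy registry. `RegistrySketch` is a hypothetical, simplified stand-in and not Foundation's Registry class: it keys entries by each class's `name` attribute, and its rejection of duplicate names is an assumption of this sketch.

```python
class RegistrySketch:
    """Hypothetical minimal registry keyed by each class's `name` attribute."""

    def __init__(self):
        self._entries = {}

    def add(self, cls):
        name = getattr(cls, "name", "")
        if not isinstance(name, str) or not name:
            raise ValueError("class must define a non-empty string `name`")
        if name in self._entries:
            raise KeyError(f"{name!r} is already registered")
        self._entries[name] = cls
        return cls  # returning cls lets `add` be used as a decorator

    def has(self, name):
        return name in self._entries

    def get(self, name):
        if name not in self._entries:
            raise KeyError(f"no entry registered under {name!r}")
        return self._entries[name]


registry = RegistrySketch()


@registry.add
class ToyScenario:
    name = "example/toy_scenario"


print(registry.has("example/toy_scenario"))
```

Registration happens when the decorated class body is executed, which is why a scenario only appears in a registry once the module defining it has been imported.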
