Multi-Agent Systems

The MultiAgent class coordinates multiple agents that interact in a shared environment. Use it for multi-agent reinforcement learning, competitive or cooperative scenarios, and simulations with several interacting agents.

MultiAgent Class

The MultiAgent class is defined in neurenix/agent/multi_agent.py.

Constructor

from neurenix.agent import Agent, Environment, MultiAgent

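# Create agents and environment
# MyAgent and MyEnvironment are your own Agent and Environment subclasses
# (see Best Practices below for a minimal MyAgent)
agents = [MyAgent(f"agent-{i}") for i in range(3)]
env = MyEnvironment()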

# Create multi-agent system
mas = MultiAgent(agents, env)

Parameters:

agents (List[Agent]): List of agents in the system
environment (Environment): Shared environment for the agents

Properties

agents

The list of agents in the system.

print(f"Number of agents: {len(mas.agents)}")
for agent in mas.agents:
    print(agent.name)

environment

The shared environment for the agents.

env_state = mas.environment.state

step_count

Get the number of steps taken in the current episode.

print(f"Current step: {mas.step_count}")

Returns: int - Number of steps taken

Core Methods

step()

Perform a single step of the multi-agent system. This method:

1. Gets observations for each agent from the environment
2. Has each agent select an action based on its observation
3. Applies all actions to the environment
4. Returns the results

results = mas.step()

print(results["observations"])  # Observations for each agent
print(results["actions"])       # Actions taken by each agent
print(results["rewards"])       # Rewards for each agent
print(results["done"])          # Whether episode is complete
print(results["info"])          # Additional information

Returns: Dict[str, Any] - Dictionary containing:

observations (dict): Observations for each agent (keyed by agent.id)
actions (dict): Actions taken by each agent (keyed by agent.id)
rewards (dict): Rewards for each agent (keyed by agent.id)
done (bool): Whether the episode is complete
info (dict): Additional information from the environment
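
The four stages listed for step() can be sketched as a standalone function. This illustrates the described flow and is not the library's implementation: `StubAgent`, `StubEnvironment`, and `step_once` are hypothetical names, and the environment is assumed to expose `observe(agent)` and `step(actions)` as in the examples on this page.

```python
from typing import Any, Dict, List


class StubAgent:
    """Hypothetical stand-in for an Agent subclass with an `id` attribute."""

    def __init__(self, agent_id: str) -> None:
        self.id = agent_id

    def act(self, observation: Dict[str, Any]) -> str:
        # Always choose the same action; a real agent would use the observation.
        return "forward"


class StubEnvironment:
    """Hypothetical stand-in for a multi-agent Environment."""

    def observe(self, agent: StubAgent) -> Dict[str, Any]:
        return {"position": [0, 0]}

    def step(self, actions: Dict[str, Any]) -> Dict[str, Any]:
        # One reward per acting agent, keyed by agent id.
        rewards = {agent_id: 1.0 for agent_id in actions}
        return {"rewards": rewards, "done": False, "info": {}}


def step_once(agents: List[StubAgent], env: StubEnvironment) -> Dict[str, Any]:
    """Mirror the four stages described for MultiAgent.step()."""
    # 1. Get an observation for each agent.
    observations = {agent.id: env.observe(agent) for agent in agents}
    # 2. Let each agent choose an action from its own observation.
    actions = {agent.id: agent.act(observations[agent.id]) for agent in agents}
    # 3. Apply all actions to the environment in one call.
    outcome = env.step(actions)
    # 4. Bundle everything into a single result dictionary.
    return {
        "observations": observations,
        "actions": actions,
        "rewards": outcome["rewards"],
        "done": outcome["done"],
        "info": outcome["info"],
    }


results = step_once([StubAgent("agent-0"), StubAgent("agent-1")], StubEnvironment())
```

Because every per-agent dictionary is keyed by `agent.id`, two agents sharing an id would overwrite each other's entries in this sketch.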

reset()

Reset the multi-agent system. This method:

1. Resets the environment
2. Resets each agent
3. Returns the initial observations

initial_observations = mas.reset()

for agent_id, obs in initial_observations.items():
    print(f"Agent {agent_id} initial observation: {obs}")

Returns: Dict[str, Any] - Dictionary containing initial observations for each agent (keyed by agent.id)
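
The reset order described above can be sketched the same way. This is a rough illustration, assuming agents and the environment each expose a `reset()` method; `StubAgent`, `StubEnvironment`, and `reset_all` are hypothetical names, not part of neurenix.

```python
from typing import Any, Dict, List


class StubAgent:
    """Hypothetical agent that counts how often it has been reset."""

    def __init__(self, agent_id: str) -> None:
        self.id = agent_id
        self.resets = 0

    def reset(self) -> None:
        self.resets += 1


class StubEnvironment:
    """Hypothetical environment whose state is a per-episode step counter."""

    def __init__(self) -> None:
        self.steps = 5  # pretend an earlier episode already ran

    def reset(self) -> None:
        self.steps = 0

    def observe(self, agent: StubAgent) -> Dict[str, Any]:
        return {"agent": agent.id, "steps": self.steps}


def reset_all(agents: List[StubAgent], env: StubEnvironment) -> Dict[str, Any]:
    """Mirror the three stages described for MultiAgent.reset()."""
    env.reset()              # 1. reset the environment first
    for agent in agents:     # 2. then reset every agent
        agent.reset()
    # 3. collect the initial observations, keyed by agent id
    return {agent.id: env.observe(agent) for agent in agents}


agents = [StubAgent("agent-0"), StubAgent("agent-1")]
initial = reset_all(agents, StubEnvironment())
```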

add_agent(agent)

Add a new agent to the system.

new_agent = MyAgent("agent-new")
mas.add_agent(new_agent)

Parameters:

agent (Agent): The agent to add

Returns: None

remove_agent(agent_id)

Remove an agent from the system.

removed_agent = mas.remove_agent("agent-1")
if removed_agent:
    print(f"Removed agent: {removed_agent.name}")
else:
    print("Agent not found")

Parameters:

agent_id (str): ID of the agent to remove

Returns: Optional[Agent] - The removed agent, or None if not found
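
The Optional[Agent] return contract can be illustrated with a plain list. This is a sketch of the lookup-by-id semantics only; `StubAgent` and `remove_by_id` are hypothetical, and how MultiAgent stores its agents internally is not stated here.

```python
from typing import List, Optional


class StubAgent:
    """Hypothetical stand-in for an Agent subclass with `id` and `name`."""

    def __init__(self, agent_id: str) -> None:
        self.id = agent_id
        self.name = agent_id


def remove_by_id(agents: List[StubAgent], agent_id: str) -> Optional[StubAgent]:
    """Remove and return the first agent whose id matches, else None."""
    for index, agent in enumerate(agents):
        if agent.id == agent_id:
            return agents.pop(index)
    return None


roster = [StubAgent("agent-0"), StubAgent("agent-1")]
removed = remove_by_id(roster, "agent-1")   # found: the agent is returned
missing = remove_by_id(roster, "agent-9")   # not found: None is returned
```

When checking the result, `if removed is not None:` is slightly safer than `if removed:`, since an agent class that defines `__len__` or `__bool__` could evaluate as false.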

Examples

Basic Multi-Agent Simulation

Run a simple multi-agent simulation:

from neurenix.agent import Agent, Environment, MultiAgent

# Define custom agent
class SimpleAgent(Agent):
    def __init__(self, name):
        super().__init__(name)
        self.id = name  # Add id attribute for MultiAgent
    
    def act(self, observation):
        # Simple action selection
        return {"move": "forward"}
    
    def learn(self, experience):
        pass

# Define custom environment
class GridEnvironment(Environment):
    def __init__(self, size=10):
        super().__init__()
        self.size = size
    
    def step(self, actions):
        # Process all agent actions
        rewards = {}
        for agent_id, action in actions.items():
            rewards[agent_id] = self._compute_reward(agent_id, action)
        
        return {
            "rewards": rewards,
            "done": self._is_done(),
            "info": {}
        }
    
    def observe(self, agent):
        # Return observation for specific agent
        return {
            "position": self._state.get(f"{agent.id}_pos", [0, 0]),
            "grid": self._get_visible_grid(agent)
        }
    
    def _compute_reward(self, agent_id, action):
        return 0.0
    
    def _is_done(self):
        return False
    
    def _get_visible_grid(self, agent):
        return []

# Create and run multi-agent system
agents = [SimpleAgent(f"agent-{i}") for i in range(5)]
env = GridEnvironment(size=20)
mas = MultiAgent(agents, env)

# Run simulation
observations = mas.reset()
for step in range(100):
    results = mas.step()
    
    print(f"Step {mas.step_count}:")
    for agent_id, reward in results["rewards"].items():
        print(f"  {agent_id}: reward={reward}")
    
    if results["done"]:
        print("Episode complete!")
        break
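
The reset-then-step loop above can be wrapped in a small helper that also accumulates rewards. This is a sketch under the assumption that the system object exposes `reset()` and `step()` with the return shapes documented on this page; `run_episode` and `FakeSystem` are hypothetical names used so the block runs without neurenix installed.

```python
from typing import Any, Dict


def run_episode(system: Any, max_steps: int = 100) -> Dict[str, float]:
    """Run one episode and return the total reward per agent id."""
    totals: Dict[str, float] = {}
    system.reset()
    for _ in range(max_steps):
        results = system.step()
        for agent_id, reward in results["rewards"].items():
            totals[agent_id] = totals.get(agent_id, 0.0) + reward
        if results["done"]:
            break
    return totals


class FakeSystem:
    """Hypothetical stand-in that finishes after three steps."""

    def __init__(self) -> None:
        self.step_count = 0

    def reset(self) -> Dict[str, Any]:
        self.step_count = 0
        return {"agent-0": {}, "agent-1": {}}

    def step(self) -> Dict[str, Any]:
        self.step_count += 1
        return {
            "rewards": {"agent-0": 1.0, "agent-1": -0.5},
            "done": self.step_count >= 3,
            "info": {},
        }


totals = run_episode(FakeSystem(), max_steps=10)
```

The `max_steps` cap keeps the loop finite even when an environment never reports `done`, as the `GridEnvironment` above does.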

Competitive Multi-Agent RL

Create a competitive multi-agent reinforcement learning scenario:

from neurenix.agent import Agent, Environment, MultiAgent
from neurenix.nn import Sequential, Linear, ReLU
from neurenix.tensor import Tensor
import numpy as np

class CompetitiveAgent(Agent):
    def __init__(self, name, state_dim, action_dim):
        super().__init__(name)
        self.id = name
        
        # Policy network
        self.policy = Sequential(
            Linear(state_dim, 64),
            ReLU(),
            Linear(64, action_dim)
        )
        
        self.experience_buffer = []
    
    def act(self, observation):
        state = Tensor(observation["state"])
        action_logits = self.policy.forward(state)
        
        # Epsilon-greedy exploration
        if np.random.random() < 0.1:
            return np.random.randint(0, len(action_logits.data))
        else:
            return np.argmax(action_logits.data)
    
    def learn(self, experience):
        self.experience_buffer.append(experience)
        
        # Batch learning every 32 experiences
        if len(self.experience_buffer) >= 32:
            self._update_policy()
            self.experience_buffer = []
    
    def _update_policy(self):
        # Implement policy update (e.g., PPO, DQN)
        pass

class CompetitiveEnvironment(Environment):
    def __init__(self, num_agents):
        super().__init__()
        self.num_agents = num_agents
        self.agent_positions = {}
        self.resources = []
    
    def reset(self):
        self._state = {
            "positions": {},
            "resources": self._generate_resources(),
            "scores": {agent_id: 0 for agent_id in self._agents.keys()}
        }
        return self._state
    
    def step(self, actions):
        rewards = {}
        
        # Process each agent's action
        for agent_id, action in actions.items():
            old_pos = self._state["positions"].get(agent_id, [0, 0])
            new_pos = self._compute_new_position(old_pos, action)
            self._state["positions"][agent_id] = new_pos
            
            # Check for resource collection
            reward = 0
            if new_pos in self._state["resources"]:
                reward = 10
                self._state["resources"].remove(new_pos)
                self._state["scores"][agent_id] += 1
            
            # Penalty for collision with other agents
            for other_id, other_pos in self._state["positions"].items():
                if other_id != agent_id and new_pos == other_pos:
                    reward -= 5
            
            rewards[agent_id] = reward
        
        # Episode ends when all resources collected
        done = len(self._state["resources"]) == 0
        
        return {
            "rewards": rewards,
            "done": done,
            "info": {"scores": self._state["scores"]}
        }
    
    def observe(self, agent):
        pos = self._state["positions"].get(agent.id, [0, 0])
        return {
            "state": self._create_observation_vector(agent.id, pos),
            "position": pos,
            "score": self._state["scores"].get(agent.id, 0)
        }
    
    def _generate_resources(self):
        return [[np.random.randint(0, 10), np.random.randint(0, 10)] for _ in range(20)]
    
    def _compute_new_position(self, pos, action):
        # 4 actions: up, down, left, right
        moves = [[0, 1], [0, -1], [-1, 0], [1, 0]]
        move = moves[action % 4]
        return [pos[0] + move[0], pos[1] + move[1]]
    
    def _create_observation_vector(self, agent_id, pos):
        # Create observation vector (position + nearby resources + other agents)
        obs = np.zeros(10)
        obs[0:2] = pos
        # Add more features...
        return obs

# Create competitive multi-agent system
num_agents = 4
agents = [CompetitiveAgent(f"agent-{i}", state_dim=10, action_dim=4) 
          for i in range(num_agents)]
env = CompetitiveEnvironment(num_agents)

# Register agents with environment
for agent in agents:
    env.register_agent(agent)

mas = MultiAgent(agents, env)

# Training loop
for episode in range(1000):
    observations = mas.reset()
    episode_rewards = {agent.id: 0 for agent in agents}
    
    for _ in range(500):  # Cap episode length (500 is an arbitrary choice); resources may never all be collected
        results = mas.step()
        
        # Update episode rewards
        for agent_id, reward in results["rewards"].items():
            episode_rewards[agent_id] += reward
        
        # Have agents learn from experience
        for agent in agents:
            experience = {
                "observation": results["observations"][agent.id],
                "action": results["actions"][agent.id],
                "reward": results["rewards"][agent.id],
                "done": results["done"]
            }
            agent.learn(experience)
        
        if results["done"]:
            break
    
    # Print episode statistics
    if episode % 100 == 0:
        print(f"Episode {episode}:")
        for agent_id, total_reward in episode_rewards.items():
            print(f"  {agent_id}: {total_reward}")
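
The epsilon-greedy rule used in `CompetitiveAgent.act` can be isolated into a small function, which makes it easier to test and to vary the exploration rate. This sketch uses only the standard library rather than NumPy; `epsilon_greedy` and its `rng` parameter are hypothetical and not part of neurenix.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(values: Sequence[float], epsilon: float = 0.1,
                   rng: Optional[random.Random] = None) -> int:
    """Return a random index with probability epsilon, else the argmax index."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        # Explore: pick any action index uniformly at random.
        return rng.randrange(len(values))
    # Exploit: index of the largest value; ties resolve to the lowest index.
    return max(range(len(values)), key=lambda i: values[i])


logits = [0.2, 1.5, -0.3, 0.9]
greedy_choice = epsilon_greedy(logits, epsilon=0.0)                         # never explores
random_choice = epsilon_greedy(logits, epsilon=1.0, rng=random.Random(0))   # always explores
```

Passing a seeded `random.Random` keeps exploration reproducible across runs, which can help when comparing agents in a competitive setting.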

Cooperative Multi-Agent Task

Create agents that cooperate to achieve a shared goal:

from neurenix.agent import Agent, MultiAgent

class CooperativeAgent(Agent):
    def __init__(self, name, role):
        super().__init__(name)
        self.id = name
        self.role = role  # "explorer", "collector", "builder"
    
    def act(self, observation):
        # Role-specific behavior
        if self.role == "explorer":
            return self._explore_action(observation)
        elif self.role == "collector":
            return self._collect_action(observation)
        else:
            return self._build_action(observation)
    
    def _explore_action(self, obs):
        # Explore unknown areas
        return {"type": "move", "direction": "random"}
    
    def _collect_action(self, obs):
        # Collect discovered resources
        return {"type": "collect", "target": obs.get("nearest_resource")}
    
    def _build_action(self, obs):
        # Build structures with collected resources
        return {"type": "build", "structure": "base"}
    
    def learn(self, experience):
        # Learn to coordinate better
        pass

# Create cooperative team
team = [
    CooperativeAgent("explorer-1", "explorer"),
    CooperativeAgent("explorer-2", "explorer"),
    CooperativeAgent("collector-1", "collector"),
    CooperativeAgent("builder-1", "builder")
]

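# CooperativeEnvironment is not defined on this page; supply your own
# Environment subclass that sets info["goal_achieved"] when the team succeeds
env = CooperativeEnvironment()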
mas = MultiAgent(team, env)

# Run cooperative task
observations = mas.reset()
while True:
    results = mas.step()
    
    # Check if team goal is achieved
    if results["info"].get("goal_achieved"):
        print("Team succeeded!")
        break
    
    if results["done"]:
        print("Team failed.")
        break

Dynamic Agent Management

Add and remove agents dynamically during simulation:

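# env is an existing Environment instance; MyAgent is defined under Best Practices
mas = MultiAgent([], env)

# Start with one agent
initial_agent = MyAgent("agent-0")
mas.add_agent(initial_agent)

# Reset before stepping, as in the other examples
mas.reset()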

for step in range(1000):
    results = mas.step()
    
    # Add new agent every 100 steps
    if step % 100 == 0 and len(mas) < 10:
        new_agent = MyAgent(f"agent-{step // 100 + 1}")  # Stays unique even after removals
        mas.add_agent(new_agent)
        print(f"Added {new_agent.name}. Total agents: {len(mas)}")
    
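    # Remove poorly performing agents. Iterate over a copy, since removal
    # may change mas.agents; agents added this step have no reward yet.
    for agent in list(mas.agents):
        if results["rewards"].get(agent.id, 0) < -10:
            mas.remove_agent(agent.id)
            print(f"Removed {agent.name}")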
    
    if results["done"]:
        break
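
Removing items from a Python list while iterating over that same list skips elements, and whether `mas.agents` returns a live list or a copy is not stated here. A cautious pattern is to decide which agents to remove first and mutate afterwards. The sketch below shows this with plain lists of ids; `ids_to_remove` is a hypothetical helper.

```python
from typing import Dict, List


def ids_to_remove(agent_ids: List[str], rewards: Dict[str, float],
                  threshold: float = -10.0) -> List[str]:
    """Return ids whose reward fell below the threshold.

    Agents missing from `rewards` (for example, ones added after the
    step ran) are treated as having reward 0.0 and are kept.
    """
    return [agent_id for agent_id in agent_ids
            if rewards.get(agent_id, 0.0) < threshold]


roster = ["agent-0", "agent-1", "agent-2", "agent-3"]
rewards = {"agent-0": -12.0, "agent-1": -15.0, "agent-2": 3.0}  # agent-3 is new

doomed = ids_to_remove(roster, rewards)
for agent_id in doomed:  # mutate only after the scan has finished
    roster.remove(agent_id)
```

With a real system, the final loop would call `mas.remove_agent(agent_id)` instead of `roster.remove(agent_id)`.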

Best Practices

1. Agents Need an `id` Attribute

The MultiAgent class uses agent.id to identify agents. Make sure your agents have this attribute:

class MyAgent(Agent):
    def __init__(self, name):
        super().__init__(name)
        self.id = name  # Important for MultiAgent!
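
Since step() and reset() key their results by `agent.id`, a missing or duplicated id is worth catching before constructing the system. The check below is a sketch; `check_agent_ids` and `StubAgent` are hypothetical names, and MultiAgent may or may not perform a similar check itself.

```python
from typing import Any, Iterable, List


class StubAgent:
    """Hypothetical stand-in for an Agent subclass that sets `id`."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.id = name


def check_agent_ids(agents: Iterable[Any]) -> List[str]:
    """Return the agent ids, raising ValueError on a missing or duplicate id."""
    seen: List[str] = []
    for agent in agents:
        agent_id = getattr(agent, "id", None)
        if agent_id is None:
            raise ValueError(f"{agent!r} has no 'id' attribute")
        if agent_id in seen:
            raise ValueError(f"duplicate agent id: {agent_id!r}")
        seen.append(agent_id)
    return seen


ids = check_agent_ids([StubAgent("agent-0"), StubAgent("agent-1")])

try:
    check_agent_ids([StubAgent("agent-0"), StubAgent("agent-0")])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
```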

2. Environment Must Support Multi-Agent

Your environment should handle actions from multiple agents:

class MultiAgentEnvironment(Environment):
    def step(self, actions):
        # actions is a dict: {agent_id: action}
        rewards = {}
        for agent_id, action in actions.items():
            rewards[agent_id] = self._process_action(agent_id, action)
        
        return {
            "rewards": rewards,  # Dict of rewards per agent
            "done": False,
            "info": {}
        }
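
While developing an environment, it can help to check that `step` returns the shape shown above. The helper below is a sketch based on the `rewards`, `done`, and `info` keys used throughout this page; `check_step_result` is a hypothetical name, not a neurenix function.

```python
from typing import Any, Dict, List


def check_step_result(actions: Dict[str, Any], result: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an environment step result."""
    problems = []
    # Every documented top-level key should be present.
    for key in ("rewards", "done", "info"):
        if key not in result:
            problems.append(f"missing key: {key}")
    # Every agent that acted should receive a reward entry.
    rewards = result.get("rewards", {})
    for agent_id in actions:
        if agent_id not in rewards:
            problems.append(f"no reward for agent: {agent_id}")
    return problems


actions = {"agent-0": "up", "agent-1": "down"}
good = check_step_result(
    actions,
    {"rewards": {"agent-0": 0.0, "agent-1": 1.0}, "done": False, "info": {}},
)
bad = check_step_result(actions, {"rewards": {"agent-0": 0.0}, "done": False})
```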

3. Register Agents with Environment

If your environment needs to track agents, register them:

for agent in agents:
    env.register_agent(agent)

mas = MultiAgent(agents, env)

4. Handle Episode Termination

Decide when episodes end based on your use case:

def step(self, actions):
    # Episode ends when:
    # - All agents reach goal (cooperative)
    # - One agent wins (competitive)
    # - Maximum steps reached
    # - All resources depleted
    
    rewards = self._compute_rewards(actions)  # Hypothetical helper returning {agent_id: reward}
    done = self._check_termination_condition()
    
    return {
        "rewards": rewards,
        "done": done,
        "info": {}
    }
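
One way to structure such a termination check is a function that returns the reason the episode ended, or None to continue. The sketch below covers three of the conditions listed above (all agents at goal, resources depleted, maximum steps); the function name, parameters, and reason strings are hypothetical.

```python
from typing import Optional


def termination_reason(step_count: int, max_steps: int,
                       agents_at_goal: int, total_agents: int,
                       resources_left: int) -> Optional[str]:
    """Return why the episode should end, or None if it should continue."""
    if total_agents > 0 and agents_at_goal == total_agents:
        return "all agents reached goal"
    if resources_left == 0:
        return "resources depleted"
    if step_count >= max_steps:
        return "max steps reached"
    return None


reason = termination_reason(step_count=200, max_steps=200,
                            agents_at_goal=1, total_agents=4,
                            resources_left=3)
done = reason is not None
```

Placing the reason in the returned `info` dictionary, for example under a `"reason"` key, would let callers tell a success from a timeout without changing the `done` flag.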

API Reference

MultiAgent

Source: neurenix/agent/multi_agent.py:10

class MultiAgent:
    def __init__(self, agents: List[Agent], environment: Environment)
    
    @property
    def step_count(self) -> int
    
    def step(self) -> Dict[str, Any]
    def reset(self) -> Dict[str, Any]
    def add_agent(self, agent: Agent) -> None
    def remove_agent(self, agent_id: str) -> Optional[Agent]
    def __len__(self) -> int
