
Overview

RepoMaster implements a multi-agent system in which specialized agents collaborate to solve complex programming tasks. It uses the AutoGen framework for agent orchestration and communication.

Agent Hierarchy

Primary Agents

1. Scheduler Agent

The scheduler_agent coordinates the system: it analyzes each task and selects the mode that should handle it.
System Message (agent_scheduler.py:19-71):
scheduler_system_message = dedent("""Role: Enhanced Task Scheduler

Primary Responsibility:
Analyze user input, create structured task plan, and select appropriate 
tools from: Web Search, Repository Mode, or General Code Assistant Mode.

Mode Selection Strategy:
- Prioritize Web Search: For real-time data, definitions, general knowledge
- Repository Mode: For tasks involving code repositories (GitHub/local)
- General Code Assistant Mode: For general programming questions
""")
Capabilities:
  • Task requirement analysis
  • Tool selection and orchestration
  • Multi-step planning
  • Result evaluation and fallback handling
Termination Logic: Replies with “TERMINATE” when the task is successfully completed.

2. User Proxy Agent

The user_proxy acts as an execution proxy that interfaces between the scheduler and tool execution.
System Message (agent_scheduler.py:73-81):
user_proxy_system_message = dedent("""Role: Execution Proxy (User Proxy)

Primary Rules:
- Do not provide user-facing answers
- Never paraphrase or repeat content from scheduler_agent
- Summarize tool outputs succinctly for scheduler_agent only
- Respond with "TERMINATE" after scheduler delivers final answer
""")
Configuration:
self.user_proxy = ExtendedUserProxyAgent(
    name="user_proxy",
    system_message=user_proxy_system_message,
    llm_config=self.llm_config,
    is_termination_msg=lambda x: x.get("content", "").endswith("TERMINATE"),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config=self.code_execution_config
)
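The is_termination_msg lambda above assumes that "content" is always a string. In AutoGen conversations a message can carry "content" set to None (for example, some tool-call messages), and a trailing newline after the marker would also defeat a plain endswith check. A slightly more defensive predicate might look like the sketch below; the function name is_terminate_message is hypothetical and not part of RepoMaster.

```python
def is_terminate_message(message: dict) -> bool:
    """Return True when the message text ends with the TERMINATE marker."""
    # "content" may be missing or None, so fall back to an empty string.
    content = message.get("content") or ""
    # Ignore trailing whitespace or newlines after the marker.
    return str(content).rstrip().endswith("TERMINATE")
```

A function like this could be passed as is_termination_msg in place of the lambda.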

Repository Exploration Agents

Code_Explorer Agent

The Code_Explorer agent is the assistant that plans repository exploration and code analysis.
Initialization (agent_code_explore.py:180-185):
self.explore = ExtendedAssistantAgent(
    name="Code_Explorer",
    system_message=explorer_system_message,
    llm_config=self.llm_config,
    is_termination_msg=self.token_limit_termination
)
System Message Components (prompt.py):
  • Current timestamp for context
  • Repository path information
  • Available tools description
  • Task-specific instructions (e.g., Kaggle competition handling)
Responsibilities:
  • Analyze task requirements
  • Plan exploration strategy
  • Call appropriate analysis tools
  • Generate code solutions
  • Synthesize final results

Coder_Executor Agent

The Coder_Executor agent runs code and tool operations. Note that its registered name is spelled "Coder_Excuter" in the configuration shown here.
Configuration (agent_code_explore.py:188-203):
self.executor = ExtendedUserProxyAgent(
    name="Coder_Excuter",
    system_message="""You are the assistant to the code analyzer, 
    responsible for executing code analysis and viewing operations.""",
    human_input_mode="NEVER",
    llm_config=self.llm_config,
    code_execution_config=self.code_execution_config,
    is_termination_msg=self.token_limit_termination,
    remote_repo_path=self.remote_repo_path,
    local_repo_path=self.local_repo_path,
    work_dir=self.work_dir
)
Execution Modes:
Docker mode uses EnhancedDockerCommandLineCodeExecutor for containerized execution:
executor = EnhancedDockerCommandLineCodeExecutor(
    image="whc_docker",
    timeout=7200,  # 2 hours
    work_dir=self.work_dir,
    keep_same_path=True,
    network_mode="host"
)
Local mode uses LocalCommandLineCodeExecutor with an isolated virtual environment:
local_executor = LocalCommandLineCodeExecutor(
    work_dir=self.work_dir,
    timeout=7200,
    virtual_env_context=self.venv_context
)

Agent Communication Protocol

Message Flow

Tool Registration

Agents communicate through registered tool functions (agent_scheduler.py:334-347):
def register_tools(self):
    register_toolkits(
        [
            self.web_search,                  # Web search capability
            self.run_repository_agent,        # Repository processing
            self.run_general_code_assistant,  # General assistance
            self.github_repo_search,          # Repository search
        ],
        self.scheduler,    # Assistant agent
        self.user_proxy,   # User proxy agent
    )
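The internals of register_toolkits are not shown here. Conceptually, the pairing works like this: the assistant side sees tool names and descriptions and proposes calls, while the proxy side looks up the function and runs it. The sketch below illustrates that split with plain Python. ToolRegistry and its methods are hypothetical stand-ins, and the call shape (a name plus JSON-encoded arguments) is an assumption modeled on common function-calling formats.

```python
import json
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical stand-in for the caller/executor pairing; not RepoMaster code."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, func: Callable[..., str]) -> Callable[..., str]:
        # Store the function under its own name so calls can be dispatched by name.
        self._tools[func.__name__] = func
        return func

    def schemas(self) -> List[dict]:
        # What the calling (LLM) side would be shown: tool name and docstring.
        return [
            {"name": name, "description": (func.__doc__ or "").strip()}
            for name, func in self._tools.items()
        ]

    def execute(self, call: dict) -> str:
        # What the executing (proxy) side does with a proposed call.
        name = call["name"]
        if name not in self._tools:
            return f"Error: unknown tool {name!r}"
        kwargs = json.loads(call.get("arguments") or "{}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.register
def web_search(query: str) -> str:
    """Search the web for real-time information."""
    return f"results for {query}"  # placeholder body for illustration
```

Returning an error string for an unknown tool, rather than raising, keeps the failure visible to the calling agent so it can pick a different tool.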

Conversation Management

Token Limit Handling

The system monitors conversation length and summarizes history when needed (agent_code_explore.py:97-142):
def token_limit_termination(self, msg):
    """Check if token limit is reached to decide termination"""
    messages = self.executor.chat_messages.get(self.explore, [])
    total_tokens = sum(
        get_code_abs_token(str(m.get("content", "")))
        for m in messages
    )
    
    if total_tokens > self.limit_restart_tokens:
        self.is_restart = True
        self.chat_turns += len(messages) - 1
        return True
    return False
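The core of this check is a sum over message contents compared against a threshold. The standalone sketch below reproduces that logic without the agent state. count_tokens_approx is a hypothetical stand-in for get_code_abs_token (whose implementation is not shown here) and simply counts whitespace-separated words.

```python
from typing import Callable, Iterable


def count_tokens_approx(text: str) -> int:
    """Hypothetical stand-in for get_code_abs_token: counts whitespace-separated words."""
    return len(text.split())


def exceeds_token_limit(
    messages: Iterable[dict],
    limit: int,
    count_tokens: Callable[[str], int] = count_tokens_approx,
) -> bool:
    """Return True when the summed token count of all messages is above the limit."""
    total = sum(
        # Treat a missing or None "content" as empty text.
        count_tokens(str(m.get("content") or ""))
        for m in messages
    )
    return total > limit
```

Passing the counter as a parameter makes it easy to swap in a real tokenizer when one is available.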

Chat History Summarization

When conversations exceed token limits, the system summarizes history (base_code_explorer.py:213-304).
Summary Structure:
{
    "history_summary": [
        {
            "subtask_goal": "Goal description",
            "tool_calls": [
                {
                    "function_name": "tool_name",
                    "arguments": "arguments",
                    "response_summary": "key information"
                }
            ],
            "code_executions": [
                {
                    "intention": "purpose",
                    "code": "code snippet",
                    "execution_result_analysis": "analysis"
                }
            ],
            "reflection": "lessons learned"
        }
    ]
}
Smart Restart: When the token limit is reached, the system summarizes the conversation and restarts with the summary as context, so that complex tasks can continue past the limit.
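One way to picture the restart step is as a function that folds the summary structure into the first message of a fresh conversation. The sketch below assumes the summary follows the JSON structure shown above. build_restart_message and the message layout are hypothetical, not taken from base_code_explorer.py.

```python
import json


def build_restart_message(task: str, summary: dict) -> str:
    """Combine the original task with a history summary to seed a fresh chat."""
    lines = [f"Task: {task}", "", "Progress so far:"]
    for index, subtask in enumerate(summary.get("history_summary", []), start=1):
        lines.append(f"{index}. {subtask.get('subtask_goal', 'unknown goal')}")
        reflection = subtask.get("reflection")
        if reflection:
            lines.append(f"   Reflection: {reflection}")
    # Keep the full structure available so no tool or code detail is lost.
    lines += ["", "Full summary (JSON):", json.dumps(summary, indent=2)]
    return "\n".join(lines)
```

The short outline gives the restarted agent a quick view of progress, while the appended JSON preserves the tool calls and code executions in full.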

Deep Search Agent

AutogenDeepSearchAgent

Specialized agent for web search and information retrieval.
Integration (agent_scheduler.py:96-99):
self.repo_searcher = AutogenDeepSearchAgent(
    llm_config=self.llm_config,
    code_execution_config=self.code_execution_config
)
Usage Patterns:

Web Search

web_search(query) - General internet search for real-time information

Repository Search

github_repo_search(task) - Find relevant GitHub repositories

Issue Solutions

issue_solution_search(issue_description) - Find programming solutions

Deep Research

deep_search(query) - Comprehensive research queries

Agent Lifecycle

Initialization Sequence

  1. Environment Setup
    • Load configuration (llm_config, code_execution_config)
    • Create working directories
    • Initialize virtual environments (if enabled)
  2. Agent Creation
    • Instantiate scheduler and user_proxy agents
    • Setup system messages and termination conditions
    • Configure execution environments
  3. Tool Registration
    • Register available tools with agents
    • Setup function signatures and descriptions
    • Enable tool calling capabilities
  4. Conversation Initiation
    chat_result = self.user_proxy.initiate_chat(
        self.scheduler,
        message=initial_message,
        max_turns=12,
        summary_method="reflection_with_llm"
    )
    

Execution Modes

Sequential execution: tools are called one at a time, and each call is chosen based on the previous results:
  1. Scheduler analyzes task
  2. Calls first tool
  3. Evaluates result
  4. Decides next action
  5. Continues until completion
Fallback execution: in repository mode, the system works through candidate repositories until one succeeds:
  1. Search for repositories
  2. Try most promising repository
  3. Evaluate results
  4. If unsuccessful, try next repository
  5. Continue until success or exhaustion
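The repository fallback steps above amount to a loop over ranked candidates that stops at the first success. A minimal sketch follows, assuming candidates are already ordered from most to least promising. try_repositories and run_on_repo are hypothetical names, and "success" is modeled simply as a non-None result.

```python
from typing import Callable, Iterable, Optional, Tuple


def try_repositories(
    candidates: Iterable[str],
    run_on_repo: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (repo, result) for the first repository that succeeds, else None."""
    for repo in candidates:
        try:
            result = run_on_repo(repo)
        except Exception as exc:
            # A failed attempt should not stop the search; move to the next candidate.
            print(f"[skip] {repo}: {exc}")
            continue
        if result is not None:
            return repo, result
    # Every candidate was tried without success.
    return None
```

Catching exceptions per candidate keeps one broken repository from aborting the whole search. A real implementation would likely also record why each attempt failed.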

Best Practices

For Agent Configuration

  • Set appropriate max_turns based on task complexity (typically 10-20)
  • Configure timeout values for long-running operations (default: 2 hours)
  • Use summary_method="reflection_with_llm" for better final summaries

For Tool Design

  • Provide clear, descriptive function names
  • Use type hints with Annotated for parameter descriptions
  • Return structured, parseable results
  • Include error handling and helpful error messages
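As an illustration of these guidelines, the sketch below defines a small tool with Annotated parameter descriptions, a JSON result, and errors reported in the same structured form. The tool word_count and the helper describe_parameters are hypothetical examples, not RepoMaster tools. Annotated requires Python 3.9 or later.

```python
import json
from typing import Annotated, get_type_hints


def word_count(
    text: Annotated[str, "Text to analyze"],
    min_length: Annotated[int, "Ignore words shorter than this"] = 1,
) -> str:
    """Count the words in a text and return the result as a JSON string."""
    try:
        if min_length < 1:
            raise ValueError("min_length must be at least 1")
        words = [w for w in text.split() if len(w) >= min_length]
        return json.dumps({"ok": True, "count": len(words)})
    except (ValueError, AttributeError) as exc:
        # Report failures in the same structured form as successes.
        return json.dumps({"ok": False, "error": str(exc)})


def describe_parameters(func) -> dict:
    """Collect the first Annotated description string for each parameter."""
    hints = get_type_hints(func, include_extras=True)
    hints.pop("return", None)
    return {
        name: hint.__metadata__[0]
        for name, hint in hints.items()
        if hasattr(hint, "__metadata__")
    }
```

Because the descriptions live in the type hints, a framework (or a helper like describe_parameters) can build the tool schema without a separate documentation file.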

For Conversation Management

  • Monitor token usage with token_limit_termination
  • Enable chat history summarization for long tasks
  • Use conversation context from ConversationManager

Next Steps

Task Routing

Learn how tasks are routed to appropriate agents

Repository Exploration

Understand code analysis capabilities
