## Overview
The ExperimentController is the central orchestrator that manages the complete ML experiment lifecycle. It coordinates data profiling, baseline modeling, iterative experiment design, code generation, execution, analysis, and reporting.
## Architecture

The controller implements a multi-phase state machine:

- **Data Profiling** - Analyze dataset characteristics
- **Baseline Modeling** - Establish a performance baseline
- **Experiment Loop** - Iteratively design, execute, and analyze experiments
- **Finalization** - Generate the report and visualizations
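As a rough sketch, the phase progression could be modeled as a small enum. The names below are illustrative assumptions, not the controller's actual values (only `experiment_design` is confirmed, by the `get_summary()` output shown later in this document):

```python
from enum import Enum

class Phase(str, Enum):
    # Hypothetical phase names; the controller's real enum may differ.
    PROFILING = "profiling"
    BASELINE = "baseline"
    EXPERIMENT_DESIGN = "experiment_design"
    EXECUTION = "execution"
    ANALYSIS = "analysis"
    FINALIZATION = "finalization"

# Profiling and baseline run once; design -> execution -> analysis
# then cycles until a stopping condition fires.
```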
## Class Definition

### ExperimentController

```python
from pathlib import Path

from src.orchestration.controller import ExperimentController

controller = ExperimentController(
    data_path=Path("housing.csv"),
    target_column="price",
    task_type="regression",
    constraints="Focus on interpretable models",
    max_iterations=20,
    time_budget=3600,
    output_dir=Path("outputs/"),
    verbose=True,
    resume_path=None,
)
```
**Parameters:**

- `data_path` - Path to the dataset file (CSV or Parquet)
- `target_column` - Name of the target column for prediction
- `task_type` - Type of ML task: `'classification'` or `'regression'`
- `constraints` - Natural language constraints for the AI, e.g.:
  - "Only use tree-based models"
  - "Prioritize interpretability over accuracy"
  - "Must achieve R² > 0.85"
  - "Avoid deep learning methods"
- `max_iterations` - Maximum number of experiment iterations. Default: 20
- `time_budget` - Time budget in seconds. Default: 3600 (1 hour)
- `output_dir` - Output directory for results. Defaults to the project `outputs/` folder
- `verbose` - Whether to show detailed reasoning and analysis. Default: False
- `resume_path` - Path to a state JSON file to resume a previous session
## Methods

### run()

Run the complete experiment loop from start to finish.
This method:
- Profiles the dataset (if not already done)
- Runs baseline model (if not already done)
- Iteratively designs and executes experiments
- Analyzes results after each iteration
- Generates hypotheses for next iteration
- Terminates based on stopping conditions
- Generates final report and visualizations
Stopping conditions:
- Maximum iterations reached
- Time budget exhausted
- Performance plateau detected (3 iterations without improvement)
- Target metric achieved
- AI agent recommends stopping
The method handles all phases automatically. If execution fails, state is saved for potential resume.
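As a minimal sketch of how the stopping conditions above might be combined (the function, dictionary keys, and defaults here are hypothetical, not the controller's actual API):

```python
import time

def should_stop(state, max_iterations=20, time_budget=3600,
                plateau_window=3, target_metric=None):
    """Return the stopping reason, or None to keep iterating.

    Mirrors the documented conditions; names are illustrative.
    """
    if state["current_iteration"] >= max_iterations:
        return "max iterations reached"
    if time.time() - state["start_time"] >= time_budget:
        return "time budget exhausted"
    if state["iterations_without_improvement"] >= plateau_window:
        return "performance plateau detected"
    if target_metric is not None and state["best_metric"] >= target_metric:
        return "target metric achieved"
    if state.get("agent_recommends_stop"):
        return "AI agent recommends stopping"
    return None
```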
### save_state()

Save the current experiment state to disk.

Saves to: `{output_dir}/state_{session_id}.json`
Saved information:
- All experiment results
- Data profile
- Best metric tracking
- Current phase
- Gemini conversation history
- Termination status
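A minimal sketch of the save-path convention (a hypothetical helper; the real method serializes the full `ExperimentState` model, including all of the fields listed above):

```python
import json
from pathlib import Path

def save_state(state: dict, output_dir: Path) -> Path:
    # Hypothetical helper mirroring save_state(): writes
    # {output_dir}/state_{session_id}.json
    path = output_dir / f"state_{state['session_id']}.json"
    path.write_text(json.dumps(state, indent=2))
    return path
```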
## Experiment Phases

The controller tracks progress through these phases:

- Initial state before any work
- Analyzing the dataset with DataProfiler
- Running the baseline model for comparison
- Using Gemini to design the next experiment
- Generating a Python script from the specification
- Running the generated script in a subprocess
- Analyzing experiment results with Gemini
- Generating hypotheses for the next iteration
- Creating the final report with Gemini
- Successfully finished all work
## Components

The controller initializes and coordinates these components:

### Cognitive Layer (Gemini-powered)

- `GeminiClient` - API client for Gemini
- `ExperimentDesigner` - Designs experiments based on data and history
- `ResultsAnalyzer` - Analyzes results and identifies patterns
- `HypothesisGenerator` - Generates testable hypotheses
- `ReportGenerator` - Creates the final markdown report

### Execution Layer

- `DataProfiler` - Profiles dataset characteristics
- `CodeGenerator` - Generates Python experiment scripts
- `ExperimentRunner` - Executes scripts in subprocesses
- `VisualizationGenerator` - Creates matplotlib plots

### Persistence Layer

- `MLflowTracker` - Logs experiments to MLflow
- `ExperimentState` - Pydantic model for state management
## Complete Example

```python
from pathlib import Path

from src.orchestration.controller import ExperimentController

# Create controller
controller = ExperimentController(
    data_path=Path("data/titanic.csv"),
    target_column="survived",
    task_type="classification",
    constraints="""
    - Prioritize interpretable models
    - Must achieve F1 > 0.80
    - Avoid ensemble methods with >100 trees
    """,
    max_iterations=15,
    time_budget=1800,  # 30 minutes
    output_dir=Path("outputs/titanic_run"),
    verbose=True,
)

# Run complete experiment loop
try:
    controller.run()
    print("✓ Experiment completed successfully")

    # Access results
    state = controller.state
    print(f"Best experiment: {state.best_experiment}")
    print(f"Best {state.config.primary_metric}: {state.best_metric:.4f}")
    print(f"Total iterations: {state.current_iteration}")
    print(f"Elapsed time: {state.get_elapsed_time():.0f}s")
except Exception as e:
    print(f"✗ Experiment failed: {e}")
    # State is automatically saved for resume
```
## Resuming Experiments

Resume from a saved state file:

```python
from pathlib import Path

from src.orchestration.controller import ExperimentController

# Resume from previous session
controller = ExperimentController(
    data_path=Path("data/housing.csv"),
    target_column="price",
    task_type="regression",
    resume_path=Path("outputs/state_abc123.json"),
)
controller.run()
```

When resuming, all initialization parameters except `resume_path` are loaded from the state file.
## State Management

The controller maintains state using the `ExperimentState` Pydantic model:

```python
# Access current state
state = controller.state

# Get summary
summary = state.get_summary()
print(summary)
# Output:
# {
#     'session_id': 'abc123',
#     'phase': 'experiment_design',
#     'current_iteration': 5,
#     'max_iterations': 20,
#     'elapsed_time': 450.3,
#     'best_metric': 0.876,
#     'best_experiment': 'rf_tuned_depth',
#     'total_experiments': 6,
#     'successful_experiments': 5
# }
```
## MLflow Integration

The controller automatically logs to MLflow:

```python
# MLflow experiment created as:
experiment_name = f"autopilot_{dataset_name}_{session_id}"

# Logged information:
# - Data profile (parameters and JSON artifact)
# - Each experiment run (parameters, metrics, code)
# - Final summary (metrics, state JSON)
# - Visualizations (PNG files)
```

View in the MLflow UI:

```bash
mlflow ui --backend-store-uri ./mlruns
# Open http://localhost:5000
```
## Iteration Loop Details

Each iteration follows this sequence:

```python
def _run_iteration(self):
    # 1. Design experiment using Gemini
    spec = self._design_experiment()

    # 2. Generate Python script
    script_path = self.code_generator.generate(spec, ...)

    # 3. Execute in subprocess
    result = self.runner.run(script_path, spec, iteration)

    # 4. Update state
    self.state.add_experiment(result)

    # 5. Log to MLflow
    self.tracker.log_experiment(result)

    # 6. Analyze results with Gemini
    analysis = self._analyze_results(result)

    # 7. Generate hypotheses for next iteration
    hypotheses = self._generate_hypotheses(analysis)

    # 8. Save state
    self.save_state()
```
## Constraint Parsing

The controller parses natural language constraints:

```python
constraints = """
- Only tree-based models (RandomForest, XGBoost)
- Target RMSE < 5000
- Prefer models with <500 trees for speed
"""

# Gemini extracts:
# - Model restrictions
# - Target metric value
# - Performance vs speed tradeoffs
```
## Hypothesis-Driven Design

The controller maintains cross-iteration context:

```python
# After each iteration:
# 1. ResultsAnalyzer creates AnalysisResult
#    - Compares to baseline and previous best
#    - Identifies trend patterns
#    - Notes key observations
# 2. HypothesisGenerator creates HypothesisSet
#    - Multiple testable hypotheses
#    - Confidence scores and priorities
#    - Suggested models and parameters
# 3. ExperimentDesigner uses top hypothesis
#    - Incorporates into next experiment design
#    - Balances exploration vs exploitation
```
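The hypothesis-ranking step can be sketched like this (the field names and selection rule are assumptions for illustration; the actual `HypothesisSet` is a richer Pydantic model):

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    # Illustrative shape only; not the real HypothesisSet schema.
    statement: str
    confidence: float   # 0-1 score from the generator
    priority: int       # lower = try first
    suggested_model: str

def top_hypothesis(hypotheses):
    # Pick the highest-priority idea, breaking ties by confidence.
    return min(hypotheses, key=lambda h: (h.priority, -h.confidence))
```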
## Output Files

The controller generates these outputs:

```text
outputs/
├── state_{session_id}.json            # Experiment state
├── plots/
│   ├── metric_progression.png         # Metric over time
│   ├── model_comparison.png           # Model type comparison
│   └── improvement_over_baseline.png  # Baseline vs best
├── report_{session_id}.md             # AI-generated report
└── experiments/
    └── {session_id}/
        ├── baseline_*.py              # Baseline script
        ├── experiment_1_*.py          # Generated scripts
        └── ...
```
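Given that layout, a small helper can locate the most recent state file for resuming (a hypothetical convenience function, assuming the `state_*.json` naming shown above):

```python
from pathlib import Path
from typing import Optional

def latest_state_file(output_dir: Path) -> Optional[Path]:
    # Pick the most recently modified state_{session_id}.json, if any.
    candidates = sorted(output_dir.glob("state_*.json"),
                        key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None
```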
## Error Recovery

```python
try:
    controller.run()
except Exception as e:
    # State is saved automatically
    print(f"Error: {e}")
    print(f"State saved to: {controller.output_dir}/state_*.json")

    # Can resume later:
    # controller = ExperimentController(
    #     data_path=...,
    #     resume_path=Path("state_abc123.json"),
    # )
```
## Verbose Mode

Enable verbose output to see detailed reasoning:

```python
controller = ExperimentController(
    ...,
    verbose=True,
)

# Shows:
# - Gemini's reasoning for each experiment design
# - Detailed analysis after each iteration
# - Hypothesis generation process
# - Conversation history length
```
## Source Location

`~/workspace/source/src/orchestration/controller.py`