## Overview

This guide covers common issues you may encounter when using ML Experiment Autopilot, along with solutions and debugging strategies.

## Installation & Setup Issues
### ImportError: No module named 'src'

**Error:** `ImportError: No module named 'src'`

**Cause:** Running the script directly instead of as a module.

**Solution:** Always run as a module from the project root, e.g. `python -m src.main`. *(From the README troubleshooting section, line 512.)*
### GEMINI_API_KEY not found

**Error:** `GEMINI_API_KEY not found`

**Cause:** Missing or misconfigured API key.

**Solution:** Get a free API key from Google AI Studio and store it in a `.env` file in the project root. *(From the README troubleshooting section, line 514.)*
### MLflow UI shows no experiments

**Error:** The MLflow UI opens but shows "No experiments".

**Cause:** Incorrect `--backend-store-uri`.

**Solution:** Verify the URI points to `outputs/mlruns`, e.g. `--backend-store-uri file:./outputs/mlruns`. *(From the README troubleshooting section, line 515.)*

## Data & Configuration Issues
### Target column not found

**Error:** Target column not found

**Cause:** The target column name doesn't match the dataset; column names are case-sensitive.

**Solution:** Check the exact spelling and capitalization of the column name against the dataset's header. *(From the README troubleshooting section, line 519.)*
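To see the exact column names the tool will read, pandas can print the header. A minimal sketch (the inline CSV stands in for your dataset file):

```python
import io

import pandas as pd

# Inline CSV standing in for your dataset; in practice: pd.read_csv("data.csv")
csv = io.StringIO("Age,income,Target_Label\n34,52000,1\n29,48000,0\n")
df = pd.read_csv(csv)

# Print the exact, case-sensitive column names.
print(list(df.columns))  # ['Age', 'income', 'Target_Label']
```

Copy the target name verbatim from this output, including capitalization.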
### Unsupported file format

**Error:** Unsupported file format

**Cause:** The dataset is not CSV or Parquet.

**Solution:** Convert the dataset to CSV, then run with the CSV file.
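A conversion sketch using pandas (file names are illustrative; `read_excel` and the other pandas readers work the same way):

```python
import pandas as pd

# Example records standing in for a dataset in an unsupported format (JSON here).
records = [{"age": 34, "income": 52000}, {"age": 29, "income": 48000}]
pd.DataFrame(records).to_json("data.json", orient="records")

# Load the JSON and re-save as CSV.
df = pd.read_json("data.json")
df.to_csv("data.csv", index=False)  # index=False keeps the header clean

# The converted file round-trips to the same table.
print(pd.read_csv("data.csv").shape)  # (2, 2)
```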
### Dataset too large (memory error)

**Error:** Memory error while loading the dataset

**Cause:** The dataset exceeds available memory.

**Solution:** Sample the dataset down to a manageable size, or use Parquet for better memory efficiency.
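A sampling sketch with pandas (the generated frame stands in for your large dataset):

```python
import numpy as np
import pandas as pd

# Stand-in for a large dataset; in practice: df = pd.read_csv("big.csv")
df = pd.DataFrame({"x": np.arange(100_000), "y": np.arange(100_000) % 7})

# Take a reproducible 10% sample so the data fits in memory.
sample = df.sample(frac=0.1, random_state=42)
sample.to_csv("sample.csv", index=False)
print(len(sample))  # 10000
```

For the Parquet route, `df.to_parquet("sample.parquet")` (with `pyarrow` installed) is the analogous step.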
## Experiment Execution Issues
### Experiment timeout

**Error:** Experiment timed out

**Cause:** The experiment exceeded the 300-second timeout (defined in `src/config.py:54`).

**Solution:** Increase the timeout in the configuration, or use `--time-budget` to allow more total time. *(From the README troubleshooting section, line 516.)*

### Generated code syntax error
**Error:** Syntax error in generated code

**Cause:** A code-generation template error or corrupted output.

**Solution:** Inspect the generated code under `outputs/experiments/`. Code is validated with `ast.parse()` before execution (in `src/execution/code_generator.py`), so if this error occurs, it is a bug; please report it. *(From the README troubleshooting section, line 518.)*
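You can reproduce the validation step yourself to pinpoint the offending line. A sketch (the real logic in `code_generator.py` may differ):

```python
import ast
from typing import Optional

def check_syntax(source: str) -> Optional[str]:
    """Return None if the code parses, else a human-readable error location."""
    try:
        ast.parse(source)
        return None
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"

# A well-formed script and one with a deliberate syntax error.
good = "def train():\n    pass\n"
bad = "def train(:\n    pass\n"

print(check_syntax(good))  # None
print(check_syntax(bad))   # reports the line of the error
```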
### Import errors in generated code

**Error:** ImportError raised by a generated experiment script

**Cause:** A missing optional dependency.

**Solution:** Install the required packages.
### Multiple experiments fail consecutively

**Observation:** Three or more experiments fail in a row with errors.

**Causes:**
- Data quality issues (e.g., too many missing values)
- Incompatible hyperparameters
- Gemini generating invalid configurations

**Solutions:**
- Add constraints to guide Gemini toward working approaches
- Simplify the dataset (remove problematic features)
- Manually test a simple model to verify data integrity
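When several runs fail in a row, a quick missing-value audit often identifies the culprit columns. A sketch (column names and the 50% cutoff are illustrative):

```python
import numpy as np
import pandas as pd

# Toy dataset with one mostly-missing column.
df = pd.DataFrame({
    "age": [34, 29, np.nan, 41],
    "income": [np.nan, np.nan, np.nan, 52000],
    "target": [1, 0, 0, 1],
})

# Fraction of missing values per column, worst first.
missing = df.isna().mean().sort_values(ascending=False)
print(missing)

# Flag columns that are more than half missing as removal candidates.
problematic = missing[missing > 0.5].index.tolist()
print(problematic)  # ['income']
```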
## Gemini API Issues
### Gemini rate limit (429)

**Error:** `429 Resource exhausted`

**Cause:** Exceeded Gemini API rate limits.

**How it's handled:** Automatic retry with exponential backoff (max 3 retries, defined in `src/config.py:36`).

**If retries fail:**
- Reduce iteration frequency (increase `--time-budget`)
- Upgrade your API tier at Google AI Studio
- Wait and resume later with `--resume`
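The retry behavior can be approximated as follows. This is a sketch mirroring the documented limits, not the tool's actual client code:

```python
import time

MAX_RETRIES = 3    # mirrors the documented retry limit
BASE_DELAY = 1.0   # seconds; doubles on each retry

def call_with_backoff(call, sleep=time.sleep):
    """Retry `call` on rate-limit errors with exponential backoff."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            return call()
        except RuntimeError:                      # stand-in for a 429 error
            if attempt == MAX_RETRIES:
                raise
            sleep(BASE_DELAY * 2 ** attempt)      # 1s, 2s, 4s

# Simulate an API that fails twice with a 429 and then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Resource exhausted")
    return "ok"

delays = []
result = call_with_backoff(flaky, sleep=delays.append)
print(result)  # ok
print(delays)  # [1.0, 2.0]
```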
### Gemini invalid response

**Error:** Invalid response from Gemini

**Cause:** Gemini returned a non-JSON or incomplete response.

**How it's handled:** Falls back to basic analysis (in `ResultsAnalyzer._get_fallback_analysis()`).

**If persistent:**
- Check API key validity
- Verify network connectivity
- Try again later (it may be a temporary API issue)
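The fallback pattern is roughly this. A sketch only; the internals of `_get_fallback_analysis()` and its return shape are not shown in this guide:

```python
import json

# Hypothetical minimal fallback payload.
FALLBACK = {"observations": [], "suggestion": "baseline", "source": "fallback"}

def parse_analysis(raw: str) -> dict:
    """Parse a JSON reply, falling back to a basic analysis on failure."""
    try:
        parsed = json.loads(raw)
        if not isinstance(parsed, dict):
            raise ValueError("expected a JSON object")
        return parsed
    except (json.JSONDecodeError, ValueError):
        return FALLBACK

print(parse_analysis('{"observations": ["rmse improved"]}'))
print(parse_analysis("Sure! Here is the analysis..."))  # falls back
```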
### Gemini context length exceeded

**Error:** Context length exceeded

**Cause:** The conversation history plus the data profile is too long for Gemini's context window.

**Solution:** Reduce the experiment history size:
- Use fewer iterations (`--max-iterations 10`)
- Simplify the constraints file
- Start a fresh session instead of one very long run
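Keeping only the most recent experiments in the prompt is one way to stay under the window. A sketch, not the tool's actual prompt builder:

```python
def trim_history(history: list, keep_last: int = 5) -> list:
    """Keep the baseline (first entry) plus the most recent experiments."""
    if len(history) <= keep_last + 1:
        return history
    return [history[0]] + history[-keep_last:]

# Twenty fake experiment records.
history = [{"iter": i, "rmse": 1.0 - 0.01 * i} for i in range(20)]
trimmed = trim_history(history)
print([h["iter"] for h in trimmed])  # [0, 15, 16, 17, 18, 19]
```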
## Performance & Quality Issues
### No improvement after baseline

**Observation:** Iterations 1-5 show no improvement over the baseline.

**Causes:**
- The dataset is simple (the baseline is already near-optimal)
- Insufficient feature information
- Poor data quality

**Solutions:**
- Add feature engineering via preprocessing
- Use constraints to suggest specific strategies
- Verify the dataset has predictive features
### Plateau detected prematurely

**Observation:** The experiment stops after 3 iterations with "Performance plateau detected".

**Cause:** Improvement was below 0.5% for 3 consecutive iterations (defined in `src/config.py:52-53`).

**Solution:** Adjust the plateau threshold, or use constraints to encourage further exploration.
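The plateau rule amounts to the following check. A sketch mirroring the documented thresholds; the constants in `src/config.py:52-53` are authoritative:

```python
PLATEAU_THRESHOLD = 0.005   # < 0.5% relative improvement
PLATEAU_PATIENCE = 3        # consecutive iterations

def is_plateau(scores: list) -> bool:
    """True if the last PLATEAU_PATIENCE relative gains are all below threshold.

    Assumes positive, higher-is-better scores.
    """
    if len(scores) < PLATEAU_PATIENCE + 1:
        return False
    recent = scores[-(PLATEAU_PATIENCE + 1):]
    gains = [(b - a) / abs(a) for a, b in zip(recent, recent[1:])]
    return all(g < PLATEAU_THRESHOLD for g in gains)

print(is_plateau([0.70, 0.80, 0.85, 0.86]))    # False: early gains are large
print(is_plateau([0.85, 0.851, 0.851, 0.852])) # True: three tiny gains in a row
```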
### Gemini ignores constraints

**Observation:** Experiments don't follow the specified constraints.

**Debugging:** Run with `--verbose` to see Gemini's reasoning for each experiment.

**Solutions:**
- Make constraints more explicit and specific
- Avoid contradictory requirements
- Add a rationale: "Prefer tree-based models because…"
- Check that the constraint file path is correct
### High variance in metrics across iterations

**Observation:** Metrics fluctuate wildly (e.g., RMSE: 0.2 → 0.8 → 0.3).

**Causes:**
- Inconsistent preprocessing across experiments
- High-variance models (e.g., deep trees without regularization)
- A small dataset with train/test split instability

**Solutions:**
- Add constraints to standardize preprocessing
- Increase the dataset size if possible
- Use cross-validation (future feature)
- Guide toward stable model families via constraints
## Output & Reporting Issues
### Visualization generation fails

**Warning:** Visualization generation failed

**Cause:** A Matplotlib backend issue or missing data.

**How it's handled:** Degrades gracefully; the report still generates without plots.

**If you need visualizations:** Try a non-interactive Matplotlib backend (e.g. set `MPLBACKEND=Agg`) and re-run.
### Report generation fails

**Warning:** Report generation failed

**Cause:** A Gemini API error or insufficient experiment data.

**Debugging:** Re-run with `--verbose` and check the full traceback.

**Workaround:** Generate the report manually from the state file (`outputs/state_<session_id>.json`).
### Output directory permissions error

**Error:** Permission denied when writing to `outputs/`

**Solution:** Make sure the `outputs/` directory exists and is writable by your user, or run from a directory where you have write access.
## Resume & State Issues
### State file corrupted

**Error:** State file corrupted

**Cause:** The state file was corrupted or truncated (e.g., by an interrupted write).

**Solution:** Restore the state file from a backup, or start a new session.
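A quick way to confirm the state file is the problem and fall back to a backup. File names are illustrative, and the tool's state schema is not shown here:

```python
import json
from pathlib import Path
from typing import Optional

def load_state(path: str, backup: Optional[str] = None) -> dict:
    """Load session state, falling back to a backup if the JSON is corrupt."""
    try:
        return json.loads(Path(path).read_text())
    except (json.JSONDecodeError, FileNotFoundError):
        if backup is not None:
            return json.loads(Path(backup).read_text())
        raise

# Simulate a truncated state file and an intact backup.
Path("state.json").write_text('{"session_id": "abc", "experi')
Path("state.json.bak").write_text('{"session_id": "abc", "experiments": []}')

state = load_state("state.json", backup="state.json.bak")
print(state["session_id"])  # abc
```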
### Cannot resume: data path mismatch

**Error:** Data path mismatch on resume

**Cause:** The `--data` argument differs from the original session.

**Solution:** Resume with exactly the same arguments as the original run, in particular the same `--data` path.

## Debugging Strategies
### Enable Verbose Mode

Always run with `--verbose` when debugging. It shows:
- Gemini’s reasoning for each experiment
- Hypothesis generation process
- Analysis observations
- Full error tracebacks
### Inspect Generated Code

Generated scripts are saved in `outputs/experiments/{session_id}/`.
### Check MLflow Artifacts

MLflow stores detailed artifacts for each run; review them in the MLflow UI.

### Review State File

The state file contains the complete session history.

### Test Components Individually

Isolate issues by testing components one at a time.

## Getting Help
If you encounter an issue not covered here:

### Gather Context

Collect:
- The error message and traceback
- The state file: `outputs/state_<session_id>.json`
- The generated code: `outputs/experiments/<session_id>/`
- MLflow error artifacts
## Common Error Messages Reference

| Error Message | Section | Quick Fix |
|---|---|---|
| `No module named 'src'` | Installation | Run as module: `python -m src.main` |
| `GEMINI_API_KEY not found` | Setup | Create `.env` with API key |
| MLflow UI shows no experiments | MLflow | Use `--backend-store-uri file:./outputs/mlruns` |
| Target column not found | Data | Check column name (case-sensitive) |
| Experiment timed out | Execution | Increase timeout in `src/config.py` |
| `429 Resource exhausted` | Gemini | Wait for retry or upgrade API tier |
| Syntax error in generated code | Execution | Inspect `outputs/experiments/`, report bug |
| State file corrupted | Resume | Restore from backup or restart |
## Next Steps

- **Running Experiments:** return to the main guide
- **Understanding Results:** learn to interpret outputs