Constraints files allow you to guide the autopilot’s experimentation strategy using natural language. They are written in Markdown format and organized into sections that control different aspects of the ML workflow.

## Overview

Constraints files provide a way to:

- Specify preferred metrics and evaluation criteria
- Guide model selection and algorithm choices
- Control preprocessing and feature engineering
- Define termination conditions
- Enforce business rules and requirements

The autopilot uses Gemini to interpret your constraints and apply them throughout the experimentation process.

## File Format

Constraints are written in Markdown, with section headers (`##`) defining different constraint categories. Each section contains bullet points describing specific constraints:

```markdown
# Experiment Constraints

## Section Name
- Constraint 1
- Constraint 2
```

## Standard Sections

While you can create custom sections, the autopilot recognizes the following standard sections:

### Metrics

Define which metrics to optimize and track:

```markdown
## Metrics
- Primary metric: RMSE
- Also track: MAE, R²
- Optimize for: lowest RMSE
```

Common metrics:

- Regression: RMSE, MAE, R², MAPE
- Classification: Accuracy, F1, Precision, Recall, ROC-AUC

### Models

Specify model preferences and restrictions:

```markdown
## Models
- Prefer tree-based models
- Prefer boosting methods
- Avoid neural networks (interpretability required)
- Must use ensemble methods
```

Common model families:

- Tree-based: Decision Trees, Random Forest, Extra Trees
- Boosting: XGBoost, LightGBM, CatBoost, AdaBoost
- Linear: Linear/Logistic Regression, Ridge, Lasso, ElasticNet
- Neural networks: MLP, deep learning models
- Other: SVM, KNN, Naive Bayes

### Preprocessing

Control data preprocessing and feature engineering:

```markdown
## Preprocessing
- Log-transform the target variable
- Use median imputation for missing values
- Scale features using StandardScaler
- Create polynomial features up to degree 2
- Do not remove outliers
```

Common preprocessing steps:

- Scaling: StandardScaler, MinMaxScaler, RobustScaler
- Imputation: Mean, median, mode, forward-fill, KNN imputation
- Encoding: One-hot, label encoding, target encoding
- Transformations: Log, square root, Box-Cox
- Feature engineering: Polynomial features, interactions, binning

### Termination

Define when experiments should stop:

```markdown
## Termination
- Stop if no improvement for 3 iterations
- Stop if RMSE < 0.1
- Stop if validation score plateaus
```

### Features

Constrain feature selection and engineering:

```markdown
## Features
- Must include: age, income, location
- Exclude: customer_id, timestamp
- Create interaction between age and income
- Use feature selection to keep top 20 features
```

### Validation

Specify the validation strategy:

```markdown
## Validation
- Use 5-fold cross-validation
- Use time-based split (no shuffle)
- Stratify by target variable
- Hold out 20% for final test set
```

### Performance

Set resource and runtime constraints:

```markdown
## Performance
- Training time per model < 5 minutes
- Use max 8 CPU cores
- Prioritize faster models
```
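Because Gemini interprets the file, custom sections beyond the standard set work the same way: any `##` heading with bullet points underneath. For illustration, here is a hypothetical `## Fairness` section (the section name and constraints are invented, not part of the standard set):

```markdown
## Fairness
- Exclude protected attributes from the feature set
- Report model performance separately per customer segment
```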

## Complete Example

Here's a real constraints file from the sample data:

```markdown
# Experiment Constraints

## Metrics
- Primary metric: RMSE

## Models
- Prefer tree-based models
- Prefer boosting methods

## Preprocessing
- Log-transform the target variable
- Use median imputation for missing values

## Termination
- Stop if no improvement for 3 iterations
```

## Example Use Cases

### Financial Modeling

For a loan default prediction model:

```markdown
# Loan Default Prediction Constraints

## Metrics
- Primary metric: F1 score
- Minimize false negatives (missing actual defaults)
- Track precision and recall separately

## Models
- Must be interpretable (regulatory requirement)
- Prefer: Logistic Regression, Decision Trees, or rule-based models
- Avoid: Neural networks, complex ensembles

## Features
- Must include: credit_score, income, debt_ratio
- Exclude: race, gender (fairness requirements)
- Create debt-to-income ratio feature

## Validation
- Use time-based split (no data leakage)
- Test on most recent 6 months of data

## Preprocessing
- Impute missing values with median
- Scale numerical features
- Cap outliers at 99th percentile
```

### E-commerce Recommendations

For predicting customer purchase amounts:

```markdown
# Purchase Amount Prediction Constraints

## Metrics
- Primary metric: MAPE (Mean Absolute Percentage Error)
- Secondary: RMSE

## Models
- Prefer: XGBoost, LightGBM, CatBoost
- Use ensemble of top 3 models for final predictions

## Preprocessing
- Log-transform target (purchase_amount)
- One-hot encode categorical features
- Create recency, frequency, monetary (RFM) features
- Normalize all numeric features

## Features
- Must include: customer_lifetime_value, days_since_last_purchase
- Create: average_order_value, purchase_frequency
- Use feature importance to select top 30 features

## Termination
- Stop if MAPE < 15%
- Stop if no improvement for 5 iterations

## Performance
- Training time per model < 10 minutes
- Prioritize inference speed for real-time predictions
```

### Medical Diagnosis

For disease prediction with strict accuracy requirements:

```markdown
# Disease Diagnosis Constraints

## Metrics
- Primary metric: ROC-AUC
- Minimum recall: 0.95 (catch all positive cases)
- Track sensitivity and specificity

## Models
- Prefer interpretable models (clinical review requirement)
- Primary: Logistic Regression, Random Forest
- Ensemble acceptable if individual models are interpretable

## Preprocessing
- Handle missing values with domain knowledge:
  - Lab values: use reference range midpoint
  - Demographics: use mode
- Standardize all lab values to z-scores
- No data augmentation (maintain clinical validity)

## Features
- Must include all biomarkers: hemoglobin, glucose, cholesterol
- Feature engineering: create risk score combinations
- Document feature importance for clinical interpretation

## Validation
- Stratified 5-fold cross-validation
- Ensure balanced classes in each fold
- Validate on external dataset from different hospital

## Termination
- Minimum 20 iterations (thorough exploration required)
- Stop if ROC-AUC > 0.90 and recall > 0.95
```

## Usage

Pass your constraints file using the `--constraints` (or `-c`) flag:

```bash
autopilot run \
  --data train.csv \
  --target price \
  --task regression \
  --constraints constraints.md
```

## How Constraints Are Applied

The autopilot interprets constraints at different stages:

1. **Experiment planning** - Gemini reads the constraints and creates an initial experiment plan
2. **Model selection** - Filters and prioritizes models based on your preferences
3. **Preprocessing** - Applies the specified preprocessing steps
4. **Evaluation** - Uses the specified metrics and validation strategies
5. **Iteration** - Checks termination conditions after each experiment
6. **Reporting** - Highlights compliance with constraints in the final report

## Natural Language Flexibility

Constraints are interpreted by Gemini, so you can write them in natural language:

```markdown
## Models
- I need fast predictions, so avoid slow models like SVM
- The model needs to be explainable to non-technical stakeholders
- Tree-based models have worked well in the past for similar datasets
```

This is equivalent to:

```markdown
## Models
- Prioritize fast inference time
- Require interpretability
- Prefer tree-based models
```

Best Practices

Vague constraints like “use good models” are less helpful than specific ones like “prefer XGBoost and LightGBM for tabular data”.
Including reasoning helps Gemini make better decisions:
## Models
- Avoid neural networks (need interpretability for regulatory compliance)
Use language like “must”, “prefer”, “avoid” to indicate importance:
  • Must = Hard requirement
  • Prefer = Strong preference but flexible
  • Avoid = Try to avoid but acceptable if needed
Too many restrictions can limit the autopilot’s ability to find good solutions. Start with a few key constraints and add more as needed.
Run one experiment without constraints to see the baseline, then add constraints to guide improvements.
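As an illustration, here is a hypothetical section combining all three levels of importance (the specific constraints are invented for this example):

```markdown
## Models
- Must run inference on CPU-only hardware (hard requirement)
- Prefer gradient boosting methods such as XGBoost or LightGBM
- Avoid SVM if training time becomes an issue
```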

## Constraint Validation

The autopilot will:

- Parse and validate your constraints file
- Warn about conflicting constraints
- Show how constraints are being applied (use `--verbose`)
- Report constraint compliance in the final results
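For example, to watch constraints being applied during a run, add `--verbose` to the same invocation shown in the usage section (assuming the flags compose as documented):

```bash
autopilot run \
  --data train.csv \
  --target price \
  --task regression \
  --constraints constraints.md \
  --verbose
```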

## Examples Directory

For more examples, see the sample constraints file in the repository: `data/sample/constraints.md`
