Constraints are natural language instructions that guide Gemini’s experiment design. This guide shows advanced patterns for complex scenarios.

## Constraint File Structure

Constraints are written in Markdown with specific section headers:

```markdown
# Experiment Constraints

## Metrics
[Metric preferences and thresholds]

## Models
[Model family preferences and exclusions]

## Preprocessing
[Feature engineering and transformation rules]

## Hyperparameters
[Specific hyperparameter preferences]

## Class Imbalance
[Strategies for handling imbalanced data]

## Feature Engineering
[Custom feature creation rules]

## Termination
[Custom stopping criteria]

## Domain Knowledge
[Business rules and domain-specific guidance]
```
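The autopilot's loader is internal, but the file structure is simple enough to sketch. The following hypothetical parser (illustrative only, the real implementation may differ) shows how a constraints file decomposes into `{section: [bullets]}`:

```python
def parse_constraints(markdown_text: str) -> dict:
    """Split a constraints file into {section_name: [bullet, ...]}.

    Hypothetical helper for illustration; the autopilot's real
    parser is internal and may differ.
    """
    sections, current = {}, None
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections

example = "# Experiment Constraints\n\n## Metrics\n- Primary metric: F1-score\n"
print(parse_constraints(example))  # {'Metrics': ['Primary metric: F1-score']}
```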

## Metrics Configuration

```markdown
## Metrics
- Primary metric: F1-score
- Secondary metric: ROC-AUC
- Monitor precision (should be > 0.60)
- Monitor recall (should be > 0.40)
- Report confusion matrix for each iteration
```
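In effect, these bullets ask each iteration to pass a metric gate. A minimal sketch of such a gate (the function name is hypothetical; the 0.60/0.40 thresholds come from the example above):

```python
def metrics_gate(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Score one iteration against the metric constraints above.

    Hypothetical helper; thresholds mirror the example file
    (precision > 0.60, recall > 0.40).
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "f1": round(f1, 3),
        "precision_ok": precision > 0.60,
        "recall_ok": recall > 0.40,
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }

print(metrics_gate(tp=42, fp=18, fn=28, tn=312))
# {'f1': 0.646, 'precision_ok': True, 'recall_ok': True, 'confusion_matrix': [[312, 18], [28, 42]]}
```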

## Model Selection

```markdown
## Models
- Prefer gradient boosting models (XGBoost, LightGBM)
- Avoid neural networks (interpretability required)
- Test CatBoost for categorical features
- Prioritize models with feature importance
```

## Preprocessing Strategies

```markdown
## Preprocessing
- Log-transform the target variable (highly skewed)
- Apply Box-Cox transformation if log fails
- Keep original target for interpretability
- Document transformation in final report
```
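The log-then-Box-Cox fallback described above can be sketched as follows. This is illustrative only: the 0.75 skewness cutoff and the small lambda grid are assumptions, not part of the constraint format, and a real run would use `scipy.stats.boxcox` rather than this crude grid search.

```python
import math

def transform_target(y: list) -> tuple:
    """Target handling per the bullets above: log1p first, falling back
    to a crude Box-Cox search if the logged target is still heavily
    skewed. The 0.75 cutoff and lambda grid are illustrative assumptions."""
    def skew(a):
        n = len(a)
        mean = sum(a) / n
        var = sum((x - mean) ** 2 for x in a) / n
        if var == 0:
            return 0.0
        return sum((x - mean) ** 3 for x in a) / n / var ** 1.5

    y_log = [math.log1p(v) for v in y]
    if abs(skew(y_log)) <= 0.75:
        return y_log, "log1p"
    # crude Box-Cox (positive values only): pick lambda minimizing |skew|
    lam, y_bc = min(
        ((lmb, [(v ** lmb - 1) / lmb for v in y]) for lmb in (0.25, 0.5, 0.75)),
        key=lambda pair: abs(skew(pair[1])),
    )
    return y_bc, f"box-cox(lambda={lam})"
```

Keeping the untransformed target around (as the constraints request) lets the final report state results in the original units.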

## Hyperparameter Preferences

```markdown
## Hyperparameters
- Random Forest: n_estimators in [100, 200, 300]
- Random Forest: max_depth in [10, 20, 30, None]
- Random Forest: min_samples_split in [2, 5, 10]
- Random Forest: class_weight='balanced'
```
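These bullets translate directly into a scikit-learn search space. The `param_grid` below is copied from the constraints; the `scoring` and `cv` choices are illustrative assumptions:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Search space copied from the constraint bullets above; scoring and
# cv are illustrative assumptions, not part of the constraint format.
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [10, 20, 30, None],
    "min_samples_split": [2, 5, 10],
    "class_weight": ["balanced"],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    scoring="f1",  # primary metric from the Metrics section
    cv=5,
    n_jobs=-1,
)
# search.fit(X_train, y_train); print(search.best_params_)
```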

## Class Imbalance Strategies

```markdown
## Class Imbalance
- Class ratio: 86% negative, 14% positive (about 6.1:1)
- Strategy 1: Use class_weight='balanced' (iterations 1-2)
- Strategy 2: Apply SMOTE oversampling (iteration 3)
- Strategy 3: Use scale_pos_weight for XGBoost (iteration 4)
- Strategy 4: Threshold optimization (iteration 5)
- Prioritize recall > 0.40 for the positive class
- Accept a precision tradeoff
```
## Termination Criteria

```markdown
## Termination
- Stop if RMSE < 0.12 achieved
- Stop if no improvement for 3 consecutive iterations
- Stop if improvement < 1% for 2 iterations
- Maximum 10 iterations regardless
```
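The rules above combine into a single stopping check. A sketch for a lower-is-better metric such as RMSE (the function and its defaults are illustrative; the autopilot's actual logic is internal):

```python
def should_stop(history: list, target: float = 0.12, patience: int = 3,
                min_rel_improvement: float = 0.01, max_iterations: int = 10) -> bool:
    """Sketch of the termination rules above for a lower-is-better
    metric such as RMSE. `history` holds the score after each iteration."""
    if not history:
        return False
    if history[-1] < target:            # target achieved
        return True
    if len(history) >= max_iterations:  # hard cap
        return True
    # no improvement for `patience` consecutive iterations
    if len(history) > patience and min(history[-patience:]) >= min(history[:-patience]):
        return True
    # improvement below 1% for two iterations in a row
    if len(history) >= 3:
        a, b, c = history[-3:]
        if (a - b) / a < min_rel_improvement and (b - c) / b < min_rel_improvement:
            return True
    return False

print(should_stop([0.30, 0.20, 0.119]))  # True (RMSE target reached)
```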

## Domain Knowledge Integration

```markdown
## Domain Knowledge
- "duration" feature is NOT available at prediction time
- Exclude "duration" from all models
- "pdays" = -1 means customer was not contacted
- Consider creating "was_previously_contacted" binary feature
- "month" and "day" may have seasonal patterns
```
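The first four rules amount to a small preprocessing step. A pandas sketch (the helper name is hypothetical; the autopilot generates equivalent code itself):

```python
import pandas as pd

def apply_domain_rules(df: pd.DataFrame) -> pd.DataFrame:
    """Encode the rules above: drop the leaky 'duration' column and
    turn the pdays == -1 sentinel into an explicit binary flag."""
    out = df.drop(columns=["duration"], errors="ignore")
    out = out.assign(was_previously_contacted=(out["pdays"] != -1).astype(int))
    return out

df = pd.DataFrame({"duration": [210, 95], "pdays": [-1, 12], "month": ["may", "jul"]})
print(apply_domain_rules(df))
```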

## Complete Example: Production-Ready Constraints

```markdown
# E-commerce Churn Prediction Constraints

## Metrics
- Primary metric: F1-score (balance precision and recall)
- Target F1-score: > 0.55
- Monitor ROC-AUC (should be > 0.75)
- False Negative cost: $200 (lost customer LTV)
- False Positive cost: $10 (retention campaign cost)

## Models
- Prefer interpretable models (regulatory requirement)
- Test: Logistic Regression, Random Forest, XGBoost
- Avoid: Neural networks, SVM
- Generate SHAP values for top model

## Preprocessing
- Log-transform monetary features (purchase_amount, cart_value)
- One-hot encode categorical (country, device_type)
- Create recency-frequency-monetary (RFM) features
- Scale numerical features with StandardScaler

## Feature Engineering
- Create "days_since_last_purchase" feature
- Create "purchase_frequency" (purchases / days_active)
- Create "avg_order_value" (total_spent / num_purchases)
- Interaction: age × total_spent
- Bin tenure into quartiles

## Class Imbalance
- Churn rate: 23% (imbalanced)
- Use class_weight='balanced'
- Apply SMOTE if F1-score < 0.45 after iteration 2
- Optimize threshold for maximum F1-score

## Hyperparameters
- XGBoost: learning_rate in [0.05, 0.1]
- XGBoost: max_depth in [3, 5, 7]
- XGBoost: n_estimators in [100, 200]
- Random Forest: n_estimators in [100, 200]
- Random Forest: max_depth in [10, 20, None]

## Validation
- Use stratified 5-fold cross-validation
- Report mean ± std for all metrics
- Hold out 20% for final test set
- Ensure no data leakage from temporal features

## Termination
- Stop if F1-score > 0.55 achieved
- Stop if no improvement for 3 iterations
- Maximum 8 iterations
- Time budget: 90 minutes

## Domain Knowledge
- Customers with purchase_count = 1 are high churn risk
- Abandoned cart (cart_value > 0, purchase = 0) is strong signal
- Seasonality matters: Q4 has higher retention
- Mobile users have different behavior than desktop

## Deliverables
- Feature importance plot
- SHAP summary plot for top model
- Threshold optimization curve
- Segment analysis (high-value vs low-value customers)
- Deployment-ready model artifact
```
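The error costs in the Metrics section ($200 per false negative, $10 per false positive) make threshold optimization cost-sensitive rather than F1-driven. A sketch of such a sweep (illustrative; the autopilot's actual threshold search is internal):

```python
def best_threshold(y_true, y_prob, fn_cost=200.0, fp_cost=10.0):
    """Sweep decision thresholds and keep the one minimizing expected
    cost, using the error costs from the Metrics section above."""
    best_t, best_cost = 0.5, float("inf")
    for t in (i / 100 for i in range(1, 100)):
        cost = sum(
            fn_cost if (y == 1 and p < t) else
            fp_cost if (y == 0 and p >= t) else 0.0
            for y, p in zip(y_true, y_prob)
        )
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```

With a 20:1 cost ratio, the optimal threshold lands well below the default 0.5, which is why the constraints ask for a threshold optimization curve among the deliverables.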

## Running with Advanced Constraints

```bash
python -m src.main run \
  --data data/custom/churn_data.csv \
  --target churn \
  --task classification \
  --constraints constraints/production_churn.md \
  --max-iterations 8 \
  --time-budget 5400 \
  --verbose
```

The `--time-budget` value is in seconds; 5400 seconds matches the 90-minute budget declared in the constraints file.

## Best Practices

### Start simple, then add complexity

Begin with basic constraints and layer in detail across runs:

1. First run: No constraints (baseline)
2. Second run: Add metric and model preferences
3. Third run: Add preprocessing and termination rules
4. Fourth run: Add domain knowledge and advanced features

### Be specific, but stay flexible

- ✅ “Prefer tree-based models” (flexible)
- ❌ “Use Random Forest with exactly 100 trees” (too rigid)
- ✅ “RMSE should be < 0.15” (clear target)
- ❌ “Make the model better” (too vague)

### Provide business context

Gemini benefits from understanding:

- Feature meanings and relationships
- Business constraints and requirements
- Known patterns or anomalies in the data
- Regulatory or compliance requirements
- Cost implications of errors

### Explain your reasoning

Write constraints as if explaining to a data scientist:

- “Log-transform the target because it’s skewed”
- “Avoid SVM because the dataset is too large”
- “Prioritize recall because false negatives are costly”
- “Stop early if we hit 90% recall”

## Constraint Validation

The autopilot validates constraints during execution and echoes what it parsed:

```
✓ Constraints loaded successfully
✓ Primary metric: F1-score
✓ Model preferences: XGBoost, LightGBM, Random Forest
✓ Preprocessing: log-transform target, standardize features
✓ Termination: F1 > 0.55 OR no improvement for 3 iterations
⚠ Warning: scale_pos_weight=6.3 may be too aggressive
```

## Next Steps

- **Regression Example**: See constraints in action for regression tasks
- **Classification Example**: See constraints for classification tasks
- **CLI Reference**: All command-line options
- **Concepts**: How Gemini interprets constraints
