Drift Monitoring


Model Drift Status

Monitors feature distribution drift and prediction rate shifts to detect when model retraining is needed.

Endpoints

GET /monitoring/drift
GET /monitoring/retraining_trigger
Both endpoints return identical responses. The /monitoring/retraining_trigger path is an alias for semantic clarity in retraining workflows.

Response

Returns comprehensive drift analysis based on accumulated prediction data.
samples_observed
integer
required
Total number of predictions made since service startup.
drift_score_max_abs_z
number
required
Maximum absolute z-score across all monitored features. Measures how many standard deviations the current feature means are from the training baseline.
drifted_features
array
required
List of feature names that have drifted beyond the z-score threshold. Empty array if no features have drifted significantly.
predicted_positive_rate
number
required
Current rate of positive predictions (predicted_purchase=1) in production. Value between 0.0 and 1.0.
training_positive_rate
number
required
Historical positive rate from the training dataset baseline. Used to detect prediction rate shift.
should_retrain
boolean
required
Whether retraining is recommended based on drift detection rules.
reason
string
required
Human-readable explanation for the retraining recommendation. Possible values:
  • baseline_not_loaded - Drift baseline file missing
  • no_predictions_observed - No predictions made yet
  • insufficient_samples - Need more samples for reliable drift detection
  • below_threshold - No significant drift detected
  • feature_distribution_drift - Features have drifted significantly
  • prediction_rate_shift - Prediction rate has changed significantly
recommended_action
string
required
Specific action to take based on drift status. Possible values:
  • run_training_to_generate_baseline - Run training pipeline to create baseline
  • collect_inference_samples - Continue collecting prediction data
  • collect_more_samples - Need more data for drift analysis
  • continue_monitoring - Keep monitoring, no action needed
  • trigger_retraining_pipeline - Initiate model retraining
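
A client usually only needs to decide between three outcomes: retrain, keep waiting, or bootstrap the baseline. The sketch below groups the documented recommended_action values that way. The function name classify_action and the three outcome labels are hypothetical and are not part of the API.

```python
from typing import Mapping

# Groupings of the documented recommended_action values.
# The outcome labels ("retrain", "wait", "bootstrap") are illustrative only.
RETRAIN_ACTIONS = {"trigger_retraining_pipeline"}
WAIT_ACTIONS = {
    "collect_inference_samples",
    "collect_more_samples",
    "continue_monitoring",
}
BOOTSTRAP_ACTIONS = {"run_training_to_generate_baseline"}


def classify_action(status: Mapping[str, object]) -> str:
    """Map a drift status payload to a coarse client-side outcome."""
    action = status["recommended_action"]
    if action in RETRAIN_ACTIONS:
        return "retrain"
    if action in WAIT_ACTIONS:
        return "wait"
    if action in BOOTSTRAP_ACTIONS:
        return "bootstrap"
    # Unknown values are surfaced instead of being silently ignored.
    raise ValueError(f"Unrecognized recommended_action: {action!r}")
```

Raising on unknown values is a design choice for this sketch: it makes a newly added action visible to the caller rather than treating it as "no action needed".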

Status Codes

  • 200 OK - Drift status computed successfully

Example Request

cURL
curl -X GET "http://localhost:8000/monitoring/drift" \
  -H "accept: application/json"

Example Responses

No Drift Detected

200 OK
{
  "samples_observed": 1247,
  "drift_score_max_abs_z": 1.8342,
  "drifted_features": [],
  "predicted_positive_rate": 0.3456,
  "training_positive_rate": 0.3512,
  "should_retrain": false,
  "reason": "below_threshold",
  "recommended_action": "continue_monitoring"
}

Feature Drift Detected

200 OK
{
  "samples_observed": 2834,
  "drift_score_max_abs_z": 4.2187,
  "drifted_features": [
    "minutes_watched",
    "practice_exams_started"
  ],
  "predicted_positive_rate": 0.2891,
  "training_positive_rate": 0.3512,
  "should_retrain": true,
  "reason": "feature_distribution_drift",
  "recommended_action": "trigger_retraining_pipeline"
}

Prediction Rate Shift

200 OK
{
  "samples_observed": 1523,
  "drift_score_max_abs_z": 2.1456,
  "drifted_features": [],
  "predicted_positive_rate": 0.4789,
  "training_positive_rate": 0.3512,
  "should_retrain": true,
  "reason": "prediction_rate_shift",
  "recommended_action": "trigger_retraining_pipeline"
}

Insufficient Data

200 OK
{
  "samples_observed": 42,
  "drift_score_max_abs_z": 2.8921,
  "drifted_features": ["courses_started"],
  "predicted_positive_rate": 0.4286,
  "training_positive_rate": 0.3512,
  "should_retrain": false,
  "reason": "insufficient_samples",
  "recommended_action": "collect_more_samples"
}

Baseline Not Loaded

200 OK
{
  "samples_observed": 0,
  "drift_score_max_abs_z": 0.0,
  "drifted_features": [],
  "predicted_positive_rate": 0.0,
  "training_positive_rate": 0.0,
  "should_retrain": false,
  "reason": "baseline_not_loaded",
  "recommended_action": "run_training_to_generate_baseline"
}

Implementation Details

Defined in src/api.py:300-302
Response Model: DriftStatusResponse (src/api.py:56-64)
from typing import List

from pydantic import BaseModel


class DriftStatusResponse(BaseModel):
    samples_observed: int
    drift_score_max_abs_z: float
    drifted_features: List[str]
    predicted_positive_rate: float
    training_positive_rate: float
    should_retrain: bool
    reason: str
    recommended_action: str

Drift Detection Algorithm

Implemented in _compute_drift_status() (src/api.py:91-172)

Configuration Parameters

Defined in config.yaml under the monitoring section:
monitoring:
  drift_min_samples: 100              # Minimum samples needed for drift analysis
  drift_zscore_threshold: 3.0         # Z-score threshold for feature drift
  drift_min_features: 2               # Minimum drifted features to trigger retraining
  class_rate_shift_threshold: 0.10    # Absolute prediction rate shift threshold (0.10 = 10 percentage points)

Detection Logic

  1. Feature Drift Detection
    • For each numeric feature, compute the running mean of its values across observed prediction requests
    • Calculate z-score: (current_mean - baseline_mean) / baseline_std
    • Flag feature as drifted if |z-score| >= drift_zscore_threshold
    • Trigger retraining if the number of drifted features >= drift_min_features
  2. Prediction Rate Shift Detection
    • Track ratio of positive predictions in production
    • Compare against training dataset positive rate
    • Trigger retraining if |predicted_rate - training_rate| >= class_rate_shift_threshold
  3. Sample Size Validation
    • Require at least drift_min_samples before analyzing drift
    • Prevents false positives from small sample noise
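
The three rules above can be sketched as a single pure function. This is a minimal illustration, not the actual _compute_drift_status() source. The names evaluate_drift and DriftConfig are hypothetical, and several details are assumptions: features with a missing or zero baseline std are skipped, feature drift is checked before prediction rate shift, values are rounded to four decimals to match the example responses, and no_predictions_observed is paired with collect_inference_samples.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DriftConfig:
    # Defaults mirror the monitoring section of config.yaml.
    drift_min_samples: int = 100
    drift_zscore_threshold: float = 3.0
    drift_min_features: int = 2
    class_rate_shift_threshold: float = 0.10


def evaluate_drift(
    feature_sums: Dict[str, float],
    samples: int,
    positives: int,
    baseline_mean: Dict[str, float],
    baseline_std: Dict[str, float],
    training_positive_rate: float,
    cfg: DriftConfig = DriftConfig(),
) -> dict:
    """Hypothetical sketch of the detection rules; not the real implementation."""

    def status(max_abs_z, drifted, rate, retrain, reason, action):
        return {
            "samples_observed": samples,
            "drift_score_max_abs_z": round(max_abs_z, 4),
            "drifted_features": drifted,
            "predicted_positive_rate": round(rate, 4),
            "training_positive_rate": round(training_positive_rate, 4),
            "should_retrain": retrain,
            "reason": reason,
            "recommended_action": action,
        }

    if samples == 0:
        # Assumed pairing of reason and action for the empty case.
        return status(0.0, [], 0.0, False,
                      "no_predictions_observed", "collect_inference_samples")

    # Rule 1: z-score of each running mean against the training baseline.
    z_scores = {}
    for name, total in feature_sums.items():
        std = baseline_std.get(name, 0.0)
        if name not in baseline_mean or std <= 0:
            continue  # assumption: skip features without a usable baseline
        z_scores[name] = (total / samples - baseline_mean[name]) / std

    drifted = sorted(n for n, z in z_scores.items()
                     if abs(z) >= cfg.drift_zscore_threshold)
    max_abs_z = max((abs(z) for z in z_scores.values()), default=0.0)
    rate = positives / samples

    # Rule 3: scores are still reported, but no retraining on small samples.
    if samples < cfg.drift_min_samples:
        return status(max_abs_z, drifted, rate, False,
                      "insufficient_samples", "collect_more_samples")
    if len(drifted) >= cfg.drift_min_features:
        return status(max_abs_z, drifted, rate, True,
                      "feature_distribution_drift", "trigger_retraining_pipeline")
    # Rule 2: absolute shift in the positive prediction rate.
    if abs(rate - training_positive_rate) >= cfg.class_rate_shift_threshold:
        return status(max_abs_z, drifted, rate, True,
                      "prediction_rate_shift", "trigger_retraining_pipeline")
    return status(max_abs_z, drifted, rate, False,
                  "below_threshold", "continue_monitoring")
```

As a check against the Prediction Rate Shift example: |0.4789 - 0.3512| = 0.1277, which exceeds the 0.10 threshold, so retraining is recommended even though no feature crossed the z-score threshold.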

Thread Safety

Monitoring state is protected by a threading lock (_LOCK) to ensure atomic updates:
  • Feature sums accumulated after each prediction
  • Sample counts incremented atomically
  • Positive prediction counts tracked accurately
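
A minimal sketch of lock-protected accumulation follows. Only the _LOCK name comes from the text above. The _STATE layout and the record_prediction and snapshot helpers are assumptions made for illustration.

```python
import threading
from typing import Dict

_LOCK = threading.Lock()

# Assumed in-memory layout of the monitoring state.
_STATE = {"samples": 0, "positives": 0, "feature_sums": {}}


def record_prediction(features: Dict[str, float], predicted_purchase: int) -> None:
    """Update all counters for one prediction under a single lock acquisition."""
    with _LOCK:
        _STATE["samples"] += 1
        _STATE["positives"] += int(predicted_purchase == 1)
        sums = _STATE["feature_sums"]
        for name, value in features.items():
            sums[name] = sums.get(name, 0.0) + float(value)


def snapshot() -> dict:
    """Return a consistent copy of the state for drift computation."""
    with _LOCK:
        return {
            "samples": _STATE["samples"],
            "positives": _STATE["positives"],
            "feature_sums": dict(_STATE["feature_sums"]),
        }
```

Holding the lock across all three updates means a reader never sees a sample count that disagrees with the feature sums, which would otherwise distort the running means.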

Baseline Generation

The drift baseline is generated during model training (src/train.py):
  1. Compute statistics (mean, std) for all numeric features in training data
  2. Calculate training set positive rate
  3. Save to artifacts/drift_baseline.json
  4. The API loads the file at startup from the path configured under artifacts.drift_baseline_file
Baseline File Location: artifacts/drift_baseline.json
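
The baseline steps can be sketched with the standard library alone. This is not the code in src/train.py. The helper names build_baseline and save_baseline, the JSON layout, and the use of population standard deviation are all assumptions.

```python
import json
import statistics
from pathlib import Path
from typing import Dict, List


def build_baseline(rows: List[Dict[str, float]], labels: List[int]) -> dict:
    """Compute per-feature mean/std and the positive rate of the training set."""
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and the same length")
    features = {}
    for name in rows[0]:
        values = [float(row[name]) for row in rows]
        features[name] = {
            "mean": statistics.fmean(values),
            # Assumption: population std; the real pipeline may use sample std.
            "std": statistics.pstdev(values),
        }
    return {
        "features": features,
        "training_positive_rate": sum(labels) / len(labels),
    }


def save_baseline(baseline: dict, path: str = "artifacts/drift_baseline.json") -> None:
    """Write the baseline as JSON, creating the parent directory if needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(baseline, indent=2))
```

Whichever std convention is used, it should match what the drift check assumes, because the z-score divides by this value.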

Integration with MLOps Pipeline

Automated Retraining Workflow

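import time

import requests

DRIFT_URL = "http://api:8000/monitoring/drift"


def trigger_training_job():
    """Placeholder: replace with a call to your own training pipeline."""
    raise NotImplementedError


while True:
    # Fail fast on network errors and non-2xx responses
    response = requests.get(DRIFT_URL, timeout=10)
    response.raise_for_status()
    status = response.json()

    if status["should_retrain"]:
        print(f"Drift detected: {status['reason']}")
        print(f"Action: {status['recommended_action']}")
        # Trigger retraining pipeline
        trigger_training_job()
        time.sleep(86400)  # After triggering, wait a day before checking again
    else:
        time.sleep(3600)  # Otherwise check hourly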

Monitoring Dashboard Metrics

Recommended dashboard visualizations:
  • Time series of drift_score_max_abs_z
  • Prediction rate vs training rate comparison
  • Count of drifted features over time
  • should_retrain flag alerts

Related Endpoints

  • Health Check - Check if drift baseline is loaded
  • Predict - Single predictions that contribute to drift tracking
  • Batch Predict - Batch predictions that contribute to drift tracking

Best Practices

  1. Monitor regularly - Check drift status hourly or daily depending on prediction volume
  2. Adjust thresholds - Tune drift_zscore_threshold and class_rate_shift_threshold based on your model’s stability
  3. Validate retraining - Always validate retrained models before deployment
  4. Reset monitoring - Clear monitoring state after retraining by restarting the service
  5. Log drift events - Integrate drift alerts with your observability platform
