The homework is graded out of 20 points across 5 sections. This page documents what graders check for each point. Read it before you start implementing so you know what is expected at every stage.
Total Points Summary
| Component | Points |
| --- | --- |
| Data Pipeline | 6 |
| Model Serving | 5 |
| Drift Monitoring | 3 |
| Testing & CI/CD | 4 |
| Documentation | 2 |
| TOTAL | 20 |
Rubric by Section
Stage 1 — Data Pipeline (6 pts)
1.1 Dataset Integrity (1 pt)

| Criterion | Points |
| --- | --- |
| `songs.csv` MD5 matches the expected hash (`dvc status songs.csv.dvc` is clean) | 0.5 |
| `dvc repro` produces `data/raw.csv` with the correct column count | 0.5 |
1.2 Process Script (1.5 pts)

| Criterion | Points |
| --- | --- |
| Temporal split is correct: year ≤ 2010 → train, year > 2010 → prod_sim (exact boundary matters) | 0.5 |
| Both `data/train.csv` and `data/prod_sim.csv` are produced | 0.5 |
| Audio features and the `genre` column are present in both outputs | 0.5 |
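The boundary condition in the split is easy to get wrong by one year, so a minimal sketch may help. It assumes the raw data has a column named `year` and uses pandas; the `temporal_split` helper, the `SPLIT_YEAR` constant, and the `energy` column are illustrative names, not part of the homework template.

```python
import pandas as pd

SPLIT_YEAR = 2010  # boundary from the rubric: <= 2010 is train, > 2010 is prod_sim


def temporal_split(df: pd.DataFrame, year_col: str = "year"):
    """Return (train, prod_sim) split on the year boundary (hypothetical helper)."""
    train = df[df[year_col] <= SPLIT_YEAR].reset_index(drop=True)
    prod_sim = df[df[year_col] > SPLIT_YEAR].reset_index(drop=True)
    return train, prod_sim


# Tiny illustrative frame; the real input would be data/raw.csv.
raw = pd.DataFrame(
    {
        "year": [2009, 2010, 2011, 2015],
        "energy": [0.4, 0.7, 0.9, 0.2],  # assumed audio feature column
        "genre": ["rock", "pop", "pop", "jazz"],
    }
)
train, prod_sim = temporal_split(raw)
```

Rows from 2010 itself land in `train`, and every input row ends up in exactly one of the two outputs.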
1.3 Train Script (2 pts)

| Criterion | Points |
| --- | --- |
| Loads training data and target (`genre`) correctly | 0.5 |
| Trains 2+ different models (Logistic Regression + at least one other) | 0.5 |
| Logs parameters and metrics (accuracy) to MLflow for each model | 0.5 |
| All runs appear in MLflow UI with proper naming and artifacts | 0.5 |
1.4 Evaluate Script (1 pt)

| Criterion | Points |
| --- | --- |
| Finds best model by accuracy metric | 0.5 |
| Registers best model in MLflow Model Registry with `champion` alias | 0.5 |
1.5 DVC Pipeline (0.5 pts)

| Criterion | Points |
| --- | --- |
| `dvc repro` runs without errors and produces all expected outputs | 0.5 |
Stage 2 — Model Serving (5 pts)
2.1 API Implementation (3 pts)

| Criterion | Points |
| --- | --- |
| `GET /health` endpoint returns the correct response | 1.0 |
| `POST /predict` accepts a valid `SpotifyFeatures` payload and returns a prediction | 1.0 |
| Request logging is implemented and writes to `logs/api_requests.jsonl` | 1.0 |
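For the request-logging criterion, one possible shape is a small helper that appends one JSON object per line. This is a sketch using only the standard library; the `log_request` name and the record layout (`timestamp`, `features`, `prediction`) are assumptions, so follow the template's format if it defines one.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_request(log_path: Path, features: dict, prediction: str) -> dict:
    """Append one request/response record to a JSONL file and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "features": features,  # hypothetical record layout
        "prediction": prediction,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier requests; one JSON object per line.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# Demo against a temporary directory instead of the real logs/ folder.
demo_path = Path(tempfile.mkdtemp()) / "logs" / "api_requests.jsonl"
log_request(demo_path, {"energy": 0.8, "tempo": 120.0}, "rock")
log_request(demo_path, {"energy": 0.3, "tempo": 90.0}, "jazz")
logged_lines = demo_path.read_text(encoding="utf-8").splitlines()
```

In the API, a helper like this would be called from the `POST /predict` handler after the prediction is computed.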
2.2 Pydantic Models (1 pt)

| Criterion | Points |
| --- | --- |
| `SpotifyFeatures` includes all audio feature fields with correct types | 1.0 |
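As a rough illustration of what "correct types" means here, the sketch below declares a Pydantic model with typed fields and shows that a non-numeric value is rejected. The four field names are assumptions chosen for the example; the real model should list every audio feature column present in the training data.

```python
from pydantic import BaseModel, ValidationError


class SpotifyFeatures(BaseModel):
    # Field names are assumptions; use the columns your train.csv actually has.
    danceability: float
    energy: float
    loudness: float
    tempo: float


valid = SpotifyFeatures(danceability=0.7, energy=0.8, loudness=-5.2, tempo=120.0)

# A value that cannot be parsed as a float fails validation.
try:
    SpotifyFeatures(danceability="not-a-number", energy=0.8, loudness=-5.2, tempo=120.0)
    rejected = False
except ValidationError:
    rejected = True
```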
2.3 Dockerfile (1 pt)

| Criterion | Points |
| --- | --- |
| Dockerfile builds without errors | 0.5 |
| Includes a step to download the `@champion` model from MLflow at build time | 0.5 |
Stage 3 — Drift Monitoring (3 pts)
3.1 Batch Mode (1.5 pts)

| Criterion | Points |
| --- | --- |
| Loads `data/train.csv` and `data/prod_sim.csv` correctly in `--mode batch` | 0.5 |
| Kolmogorov-Smirnov test runs for each audio feature (uses `scipy.stats.ks_2samp`) | 0.5 |
| `drift_report.json` contains per-feature `ks_statistic`, `p_value`, `drift_detected`, and an overall `status` | 0.5 |
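The per-feature KS analysis and report structure can be sketched as follows. The function name `run_ks_analysis` comes from the rubric, but its signature, the 0.05 threshold, the nesting under a `features` key, and the `status` values are all assumptions made for illustration.

```python
import json

import numpy as np
import pandas as pd
from scipy.stats import ks_2samp

ALPHA = 0.05  # assumed significance threshold


def run_ks_analysis(reference: pd.DataFrame, current: pd.DataFrame, features: list) -> dict:
    """Run a two-sample KS test per feature and summarise the results."""
    report = {"features": {}}
    for name in features:
        result = ks_2samp(reference[name].dropna(), current[name].dropna())
        report["features"][name] = {
            "ks_statistic": float(result.statistic),
            "p_value": float(result.pvalue),
            "drift_detected": bool(result.pvalue < ALPHA),
        }
    any_drift = any(f["drift_detected"] for f in report["features"].values())
    report["status"] = "drift_detected" if any_drift else "no_drift"  # assumed labels
    return report


# Synthetic data: energy is shifted in "production", tempo is unchanged.
rng = np.random.default_rng(0)
train_df = pd.DataFrame(
    {"energy": rng.normal(0.5, 0.1, 500), "tempo": rng.normal(120, 10, 500)}
)
prod_df = pd.DataFrame(
    {"energy": rng.normal(0.8, 0.1, 500), "tempo": train_df["tempo"].to_numpy()}
)
report = run_ks_analysis(train_df, prod_df, ["energy", "tempo"])
report_json = json.dumps(report, indent=2)  # what would be written to drift_report.json
```

Casting to built-in `float` and `bool` keeps the report serialisable with `json.dumps`, which NumPy scalar types are not guaranteed to be.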
3.2 Online Mode (1.5 pts)

| Criterion | Points |
| --- | --- |
| Loads `data/train.csv` and `logs/api_requests.jsonl` correctly in `--mode online` | 0.5 |
| Parses JSONL line-by-line and builds a DataFrame of production features | 0.5 |
| Reuses the same KS analysis logic as batch mode (`run_ks_analysis`) | 0.5 |
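Line-by-line JSONL parsing might look like the sketch below. It assumes each log line is a JSON object with the request features nested under a `features` key; that layout and the `load_logged_features` name are hypothetical, so adapt them to whatever your API actually writes.

```python
import io
import json

import pandas as pd


def load_logged_features(lines) -> pd.DataFrame:
    """Build a DataFrame of production features from an iterable of JSONL lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a malformed or truncated line rather than abort
        features = record.get("features") if isinstance(record, dict) else None
        if isinstance(features, dict):
            rows.append(features)
    return pd.DataFrame(rows)


# In-memory stand-in for logs/api_requests.jsonl, including bad lines.
sample_log = io.StringIO(
    '{"features": {"energy": 0.8, "tempo": 120.0}, "prediction": "rock"}\n'
    "\n"
    "{not valid json\n"
    '{"features": {"energy": 0.3, "tempo": 90.0}, "prediction": "jazz"}\n'
)
prod_features = load_logged_features(sample_log)
```

An open file handle can be passed in place of the `StringIO` object, since both yield one line per iteration. The resulting frame can then go through the same KS routine used in batch mode.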
Stage 4 — Testing & CI/CD (4 pts)
4.1 Unit Tests (2 pts)

| Criterion | Points |
| --- | --- |
| `pytest data_pipeline/tests` passes (all assertions pass) | 1.0 |
| `pytest model_serving/tests` passes (all assertions pass) | 1.0 |
4.2 Code Quality (1 pt)

| Criterion | Points |
| --- | --- |
| `flake8 .` shows no major style violations (warnings OK, errors not OK) | 1.0 |
4.3 GitHub Actions (1 pt)

| Criterion | Points |
| --- | --- |
| CI pipeline passes on PR (linter + all tests pass; green checkmark in Actions tab) | 1.0 |
Stage 5 — Documentation & Code Quality (2 pts)
5.1 Code Quality (1 pt)

| Criterion | Points |
| --- | --- |
| All TODO comments are addressed; code follows Python style guidelines | 1.0 |
5.2 README & Setup (1 pt)

| Criterion | Points |
| --- | --- |
| README is clear and instructions are followable | 0.5 |
| Setup works end-to-end (download → process → train → evaluate) | 0.5 |