This guide covers everything you need to set up a robust development environment for machine learning in production.
Python Environment
Requirements
All modules in this course require Python 3.10 or higher. Some modules may require Python 3.12+ for specific features.
Verify your Python version
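A minimal sketch of the version check, written in Python so it behaves the same on every platform. The 3.10 floor comes from the requirement above; the helper name `python_is_supported` is illustrative and not part of the course repository.

```python
import sys

# Minimum version stated in the requirements above.
MIN_VERSION = (3, 10)


def python_is_supported(version=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


print(
    f"Python {sys.version_info.major}.{sys.version_info.minor}:",
    "OK" if python_is_supported() else "too old, 3.10+ required",
)
```

For a module that needs the newer interpreter, the same helper can be called with `minimum=(3, 12)`.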
Virtual Environment Setup
Each module is self-contained and should use its own virtual environment to avoid dependency conflicts.
- Using venv (Standard)
- Using uv (Recommended)
- Using conda
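The standard-library option can also be scripted. The sketch below uses Python's built-in `venv` module; the helper name `create_module_env` and the `.venv` directory name are assumptions made for illustration, not conventions taken from the course repository.

```python
import venv
from pathlib import Path


def create_module_env(module_dir, name=".venv", with_pip=True):
    """Create an isolated virtual environment inside one module's directory."""
    env_dir = Path(module_dir) / name
    # EnvBuilder is the same machinery that `python -m venv` uses.
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    return env_dir
```

Calling `create_module_env("my-module")` on a hypothetical `my-module` directory would produce `my-module/.venv`, which can then be activated with `source my-module/.venv/bin/activate` on macOS or Linux.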
Core Dependencies
Module 3: Training Workflows
The classic example module requires transformer libraries and experiment tracking tools, listed in its requirements.txt.
What each package does
- transformers: Hugging Face library for BERT, GPT, and other transformer models
- datasets: Easy loading and processing of ML datasets
- accelerate: Simplifies distributed training across GPUs
- typer: Build CLI applications with type hints
- wandb: Experiment tracking and model registry
- ruff: Fast Python linter and formatter (replaces flake8, black, isort)
- great-expectations: Data validation and testing
- pytest-cov: Code coverage reporting for tests
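One way to confirm that these packages landed in the active environment is to query their installed versions. This is a sketch using the standard-library `importlib.metadata`; the distribution names are copied from the list above, and the helper names are hypothetical.

```python
from importlib import metadata

# Distribution names taken from the package list above.
REQUIRED = [
    "transformers", "datasets", "accelerate", "typer",
    "wandb", "ruff", "great-expectations", "pytest-cov",
]


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_packages(packages=REQUIRED):
    """Return the names that are not installed in the active environment."""
    found = installed_versions(packages)
    return [name for name, version in found.items() if version is None]
```

An empty list from `missing_packages()` suggests the module's dependencies are in place; any names it returns still need to be installed.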
Module 5: Model Serving
For deploying models as APIs:
- fastapi: Modern web framework for building APIs
- uvicorn: ASGI server for running FastAPI apps
- pydantic: Data validation using Python type hints
- streamlit: Build interactive web UIs for models
Development Tools
Code Formatting and Linting
This repository uses Ruff for code quality. Ruff's documentation describes it as 10-100x faster than existing linters and formatters such as Flake8 and Black.
Testing
Tests are run with pytest.
Experiment Tracking
Weights & Biases Setup
Most training examples integrate with W&B for tracking experiments, metrics, and model artifacts.
Create W&B account
Sign up for free at wandb.ai
Get your API key
Visit wandb.ai/authorize to get your API key
W&B is optional for local development but highly recommended for tracking experiments across multiple runs.
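Because W&B is optional locally, a training script can fall back to a disabled mode when no API key is configured. The sketch below relies on the `WANDB_API_KEY` and `WANDB_MODE` environment variables that the wandb client reads; the helper `wandb_mode` is a hypothetical name, not something the course code defines.

```python
import os


def wandb_mode(env=None):
    """Pick a W&B mode: 'online' when an API key is present, else 'disabled'."""
    env = os.environ if env is None else env
    # Respect an explicit WANDB_MODE if the user already set one.
    if env.get("WANDB_MODE"):
        return env["WANDB_MODE"]
    return "online" if env.get("WANDB_API_KEY") else "disabled"
```

Setting `os.environ["WANDB_MODE"] = wandb_mode()` before the training code initializes wandb would keep local runs from prompting for a login.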
Container Tools (Module 1)
Docker Installation
Required for containerization and Kubernetes modules.
- macOS
- Linux
- Windows
Kubernetes Local Setup
For running Kubernetes examples locally, install kind (Kubernetes in Docker).
Serverless Platforms (Optional)
Modal Setup
Modal provides serverless GPU compute for training and inference.
Modal offers free tier credits for experimentation, which makes it a good fit for running GPU-intensive training jobs without local hardware.
GPU Setup (Optional)
CUDA for Local GPU Training
If you have an NVIDIA GPU and want to train locally:
Install PyTorch with CUDA
Visit pytorch.org for the correct command, or use:
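After installation, it helps to confirm that PyTorch can actually see the GPU. The following is a small sketch, assuming only that the package is importable as `torch`; the helper name `cuda_status` is illustrative, and the import is done lazily so the check also runs in environments without PyTorch.

```python
def cuda_status():
    """Describe whether PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} installed, but no CUDA device is visible"
    return f"PyTorch {torch.__version__} sees {torch.cuda.get_device_name(0)}"


print(cuda_status())
```

If the second message appears on a machine with an NVIDIA GPU, the installed build is probably CPU-only or the driver is not set up.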
Editor Setup
VS Code (Recommended)
Install these extensions for the best experience:
- Python (ms-python.python) - IntelliSense, debugging, linting
- Ruff (charliermarsh.ruff) - Fast linting and formatting
- Jupyter (ms-toolsai.jupyter) - Notebook support
- Docker (ms-azuretools.vscode-docker) - Container management
- YAML (redhat.vscode-yaml) - Kubernetes YAML validation
settings.json
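As a hypothetical starting point, and not the repository's own file, the script below generates a workspace settings file that enables format-on-save through the Ruff extension. The keys are standard VS Code and Ruff extension options; which ones to enable is an assumption, and `write_vscode_settings` is an illustrative helper name.

```python
import json
from pathlib import Path

# Assumed settings: the keys are standard VS Code / Ruff extension options,
# but the specific choices are illustrative, not the course's configuration.
SETTINGS = {
    "[python]": {
        "editor.defaultFormatter": "charliermarsh.ruff",
        "editor.formatOnSave": True,
        "editor.codeActionsOnSave": {
            "source.fixAll": "explicit",
            "source.organizeImports": "explicit",
        },
    },
}


def write_vscode_settings(project_dir):
    """Write .vscode/settings.json under project_dir and return its path."""
    path = Path(project_dir) / ".vscode" / "settings.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(SETTINGS, indent=2) + "\n", encoding="utf-8")
    return path
```

Note that this sketch overwrites an existing `.vscode/settings.json`, so merge by hand if the workspace already has one.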
Verify Your Setup
Run this checklist to ensure everything is configured correctly:
Setup verification checklist
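The checklist can be approximated with a short script. This sketch checks the interpreter version and looks for the command-line tools named in this guide on `PATH`; the tool list and the function name `check_setup` are assumptions, and most of the tools are only needed for the modules that use them.

```python
import shutil
import sys

# Command-line tools named in this guide; apart from Python itself, each is
# only required by the modules that use it.
TOOLS = ["docker", "kind", "ruff", "pytest", "modal", "wandb"]


def check_setup(tools=TOOLS):
    """Return {check name: bool} for the interpreter and each CLI tool."""
    results = {"python>=3.10": sys.version_info >= (3, 10)}
    for tool in tools:
        # shutil.which returns the executable's path, or None if not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


for name, ok in check_setup().items():
    print(f"[{'x' if ok else ' '}] {name}")
```

A missing mark only means the executable was not found on `PATH`; it does not verify that the tool is correctly configured.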
Common Issues
ImportError: No module named 'transformers'
Make sure your virtual environment is activated:
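To tell whether the interpreter you are running belongs to a virtual environment, you can compare `sys.prefix` with `sys.base_prefix`. This is a minimal sketch; the helper name `in_virtualenv` is illustrative.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtualenv())
```

This check covers environments created with venv-compatible tools. Conda environments are not detected this way, so for those inspect the `sys.executable` path instead.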
CUDA out of memory
Reduce batch size or maximum sequence length in your training config:
conf/example.json
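The adjustment can be sketched as a small function over the loaded config. The key names `per_device_train_batch_size` and `max_seq_length` are assumptions about how the JSON is laid out, as is the 128-token floor; substitute whatever names and limits your config actually uses.

```python
import json


def shrink_for_memory(config, batch_key="per_device_train_batch_size",
                      seq_key="max_seq_length", min_seq_length=128):
    """Return a copy with the batch size halved, or the sequence length
    halved when the batch size is already 1."""
    updated = dict(config)
    if updated.get(batch_key, 1) > 1:
        # Halving the batch size is usually the cheapest first step.
        updated[batch_key] = updated[batch_key] // 2
    elif updated.get(seq_key, 0) > min_seq_length:
        # Batch size is already 1, so shorten sequences instead.
        updated[seq_key] = max(min_seq_length, updated[seq_key] // 2)
    return updated


# Hypothetical values, for illustration only.
example = {"per_device_train_batch_size": 32, "max_seq_length": 512}
print(json.dumps(shrink_for_memory(example), indent=2))
```

Repeating the call after each out-of-memory failure steps the settings down gradually rather than jumping straight to the smallest values.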
Docker permission denied
Add your user to the docker group:
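Before or after changing group membership, you can check whether the current process actually has the group. This sketch applies to Linux and macOS only, uses the standard-library `grp` module, and the helper name `in_docker_group` is hypothetical.

```python
import os


def in_docker_group(group="docker"):
    """Return True/False for group membership, or None if it cannot be checked."""
    try:
        import grp  # Unix-only module
    except ImportError:
        return None
    try:
        entry = grp.getgrnam(group)
    except KeyError:
        return None  # no such group on this machine
    # Compare against the groups of the *current process*, which is what
    # decides access to the Docker socket.
    return entry.gr_gid in os.getgroups() or entry.gr_gid == os.getegid()
```

A `False` result right after adding yourself to the group is expected: group changes normally take effect only in a new login session.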
Port 8080 already in use
Find and kill the process using the port:
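To confirm that the port really is occupied before hunting for the process, a plain TCP connection attempt is enough. This is a minimal sketch using the standard-library `socket` module; `port_in_use` is an illustrative name.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


print("port 8080 in use:", port_in_use(8080))
```

This only reports whether the port is taken. Identifying the owning process still requires an operating-system tool such as `lsof`.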
Next Steps
Quickstart Guide
Train and deploy your first model in 10 minutes
Module 1: Infrastructure
Start with containerization and Kubernetes basics
Module 3: Training
Learn training workflows and experiment tracking
Browse All Modules
Explore all 8 course modules