The model_serving Dockerfile produces a self-contained container image with the champion model artifact baked directly into the image layers, so no MLflow server is required at runtime. Because the model is downloaded at build time, the resulting image can be shipped to any environment and will serve predictions immediately, with no dependency on the MLflow tracking server that registered the model.
Current Dockerfile
The skeleton Dockerfile sets up the Python environment and copies the application code, but leaves the model download step as a TODO for students to implement.
What to Implement (TODO)
To make the container self-contained, replace the TODO comment block with two instructions, an ARG declaration and a RUN step:
- ARG MLFLOW_TRACKING_URI: Declares a build-time variable. Docker exposes build arguments as environment variables to the RUN steps that follow, so the MLflow CLI reads it as MLFLOW_TRACKING_URI during the download. The default value (http://localhost:5000) can be overridden with --build-arg at build time. The variable is not set in the environment of containers started from the final image. Build arguments can still show up in the image history (docker history), so avoid putting credentials in the URI.
- RUN mlflow models download: Invokes the MLflow CLI during the image build to fetch the model registered under the @champion alias. The -d ./models flag writes the artifact into /app/models/ inside the container. The --no-directory flag prevents MLflow from creating an extra subdirectory, so predict_genre() can load the model directly from ./models.
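For comparison, the same build-time download can be sketched with the MLflow Python API rather than the CLI. This is only an illustration under stated assumptions, not the course's solution: the model name genre-classifier is a hypothetical placeholder (this page does not state the registered name), and the helper names champion_uri and download_champion are invented for the example. The sketch returns whatever local path MLflow reports instead of assuming a directory layout.

```python
def champion_uri(model_name: str, alias: str = "champion") -> str:
    """Build a Model Registry URI of the form models:/<name>@<alias>."""
    return f"models:/{model_name}@{alias}"


def download_champion(model_name: str, dst_path: str = "./models") -> str:
    """Download the aliased model version and return the local path MLflow reports."""
    # Imported lazily so champion_uri() works even where MLflow is not installed.
    from mlflow.artifacts import download_artifacts

    # MLflow reads MLFLOW_TRACKING_URI from the environment, which is where
    # the Dockerfile ARG places it during a RUN step.
    return download_artifacts(
        artifact_uri=champion_uri(model_name),
        dst_path=dst_path,
    )


# Example call (hypothetical model name; replace with the registered one):
#   download_champion("genre-classifier")
```

A script like this would have to run during the image build, with the tracking server reachable from the build environment, to keep the image self-contained.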
Complete Dockerfile
After implementing the TODO, the full Dockerfile is the skeleton with the TODO block replaced by the ARG and RUN instructions described above.
Build Commands
The --build-arg MLFLOW_TRACKING_URI value must point to your actual MLflow tracking server. If MLflow is running on a different host or port, substitute that address. The value is used only at build time and is not set as an environment variable in the final image.
Loading the Model in predict_genre()
Once the build step has downloaded the artifact to ./models/, the predict_genre() function in app/main.py loads it at inference time using the MLflow sklearn loader (mlflow.sklearn.load_model) pointed at ./models.
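A minimal sketch of what such a loader might look like follows. It is not the actual app/main.py: the input shape (one feature row per call), the in-process caching, and the optional loader parameter are assumptions made for illustration. The loader parameter exists only so the sketch can be exercised without MLflow installed.

```python
from pathlib import Path

# Relative to the working directory; /app/models when the working directory is /app.
MODEL_DIR = Path("./models")

_model = None  # populated on first use so later calls skip the disk read


def get_model(loader=None):
    """Load the model from MODEL_DIR once and reuse it afterwards."""
    global _model
    if _model is None:
        if loader is None:
            # Real MLflow API: loads a scikit-learn flavor model from a path or URI.
            import mlflow.sklearn
            loader = mlflow.sklearn.load_model
        _model = loader(str(MODEL_DIR))
    return _model


def predict_genre(features, loader=None):
    """Predict a genre for a single feature row (assumed input shape)."""
    model = get_model(loader)
    # scikit-learn estimators expect a 2D input, so wrap the single row.
    return model.predict([features])[0]
```

Caching the loaded model is a design choice of this sketch. Loading on every request, as the text describes, also works because the artifact is read from local disk.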
./models resolves to /app/models/ inside the container, which is exactly where the RUN mlflow models download -d ./models step placed the artifact. Because the model is already on disk, the load call is fast: there is no network round-trip to MLflow at prediction time.
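The path resolution and the "already on disk" assumption can be checked with a short sketch like the one below. The helper names are hypothetical. The check relies on the fact that an MLflow model directory contains a metadata file named MLmodel at its root, so finding it directly under ./models suggests that no extra subdirectory was created.

```python
from pathlib import Path, PurePosixPath


def resolve_in_container(relative: str = "./models", workdir: str = "/app") -> str:
    """Show how a relative path resolves against the container's working directory."""
    return str(PurePosixPath(workdir) / relative)


def has_mlmodel_file(model_dir: str = "./models") -> bool:
    """Return True if model_dir directly contains the MLmodel metadata file."""
    return (Path(model_dir) / "MLmodel").is_file()
```

Running has_mlmodel_file() at application startup is one way to fail fast if the image was built without the download step.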