Overview

Both Phase 2 (CLI) and Phase 3 (API) use Docker containers for consistent, reproducible deployments. This page covers Docker setup, configuration, and best practices for the diabetes prediction system.
Docker Version: The project uses standard Docker (Docker Engine 20.10+)
Base Image: Python 3.12 official image

Why Docker?

Consistency

Same environment on development, testing, and production machines

Isolation

Dependencies don’t conflict with other projects on the same machine

Portability

Run anywhere Docker is installed - Linux, Mac, Windows, cloud

Reproducibility

The Dockerfile lets anyone rebuild the same environment from the same instructions

Prerequisites

1. Install Docker

Install Docker Desktop (Mac/Windows) or Docker Engine (Linux). The steps below cover Windows:
  1. Download Docker Desktop for Windows
  2. Run installer
  3. Enable WSL 2 backend when prompted
  4. Restart computer
  5. Verify:
docker --version
2. Verify Installation

Test Docker with a simple container:
docker run hello-world
Expected output:
Hello from Docker!
This message shows that your installation appears to be working correctly.

Phase 2: CLI Docker Setup

Dockerfile Analysis

# fase-2/Dockerfile

# Select Python base image
FROM python:3.12

# Set working directory
WORKDIR /app

# Copy necessary files to application directory
ADD train.py /app
ADD predict.py /app
ADD requirements.txt /app

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt
FROM python:3.12
  • Base image: Official Python 3.12 from Docker Hub
  • Includes Python, pip, and standard library
  • Based on Debian Linux
WORKDIR /app
  • Sets /app as the working directory
  • All subsequent commands run in /app
  • Directory is created if it doesn’t exist
ADD train.py /app
  • Copies train.py from build context to /app/train.py
  • ADD can also extract archives (though not used here)
ADD predict.py /app
  • Copies predict.py to container
ADD requirements.txt /app
  • Copies dependencies list
RUN pip install --no-cache-dir -r requirements.txt
  • Installs Python packages
  • --no-cache-dir: Reduces image size by not caching packages
  • Packages: scikit-learn, pandas, imbalanced-learn, loguru, argparse (argparse already ships with Python's standard library, so that entry is redundant)

Dependencies (requirements.txt)

argparse
scikit-learn
loguru
pandas
imbalanced-learn

Building the Image

# Navigate to fase-2 directory
cd ~/workspace/source/fase-2

# Build image with tag 'ai-proyecto-sustituto'
docker build -t ai-proyecto-sustituto .
Build Process (illustrative legacy-builder output; IDs will differ, and BuildKit prints a different format):
Step 1/6 : FROM python:3.12
 ---> Pulling python:3.12
Step 2/6 : WORKDIR /app
 ---> Running in abc123def456
Step 3/6 : ADD train.py /app
 ---> 9f8e7d6c5b4a
Step 4/6 : ADD predict.py /app
 ---> 3a2b1c0d9e8f
Step 5/6 : ADD requirements.txt /app
 ---> <layer id>
Step 6/6 : RUN pip install --no-cache-dir -r requirements.txt
 ---> Running in def456ghi789
Successfully built 7f6e5d4c3b2a
Successfully tagged ai-proyecto-sustituto:latest
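Build Time: The first build typically takes 5-10 minutes because it downloads the base image and installs packages; the exact time depends on network speed. Subsequent builds are faster because Docker reuses cached layers for instructions whose inputs have not changed.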

Running the Container

docker run -it --name ai-container ai-proyecto-sustituto /bin/bash
Flags:
  • -i: Interactive (keep STDIN open)
  • -t: Allocate pseudo-TTY (terminal)
  • --name ai-container: Name the container
  • /bin/bash: Command to run (bash shell)
Result: Opens a bash shell inside the container
root@abc123def456:/app#

Phase 3: API Docker Setup

Dockerfile Analysis

# fase-3/Dockerfile

# Select Python base image
FROM python:3.12

# Set working directory
WORKDIR /app

# Copy necessary files to application directory
ADD .. /app

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Run the application
CMD ["fastapi", "run", "apirest.py", "--port", "80"]
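ADD .. /app
  • Intended to copy the whole fase-3 folder into /app
  • Docker can only read files inside the build context (here, the fase-3 directory passed to docker build), so the conventional spelling is ADD . /app or COPY . /app
  • Less selective than Phase 2 (which copied specific files)
  • Includes all Python files, requirements.txt, etc.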
CMD ["fastapi", "run", "apirest.py", "--port", "80"]
  • Default command when container starts
  • Launches FastAPI application on port 80
  • Unlike Phase 2, container runs automatically (no need for /bin/bash)

Dependencies (requirements.txt)

fastapi==0.111.0
scikit-learn==1.4.1.post1
loguru==0.7.2
pandas==2.2.1
imbalanced-learn==0.12.0
Phase 3 pins exact dependency versions so every build installs the same packages. Phase 2's requirements.txt is unpinned, so each build installs whatever versions are newest at build time.
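To see which entries in a requirements file would float to the newest release at build time, a small check like the one below can help. This is a minimal sketch, not part of the project: `unpinned_requirements` is a hypothetical helper, and it treats only `==` as a pin.

```python
import re


def unpinned_requirements(text: str) -> list[str]:
    """Return package names from requirements text that lack an exact '==' pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        if "==" not in line:
            # Keep only the leading name, before any specifier, extra, or marker
            name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
            unpinned.append(name)
    return unpinned


# Contents of the two requirements.txt files shown on this page
PHASE_2 = "argparse\nscikit-learn\nloguru\npandas\nimbalanced-learn\n"
PHASE_3 = (
    "fastapi==0.111.0\n"
    "scikit-learn==1.4.1.post1\n"
    "loguru==0.7.2\n"
    "pandas==2.2.1\n"
    "imbalanced-learn==0.12.0\n"
)
```

Calling `unpinned_requirements(PHASE_2)` lists all five Phase 2 packages, while `unpinned_requirements(PHASE_3)` returns an empty list.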

Building the Image

cd ~/workspace/source/fase-3
docker build -t apirest .

Running the API Container

docker run -d --name apirest-container -p 80:80 apirest
Flags:
  • -d: Detached mode (background)
  • --name apirest-container: Container name
  • -p 80:80: Port mapping (host:container)
  • apirest: Image name
Port Mapping Explanation:
-p 80:80
   ↑   ↑
   |   Container port (FastAPI listens on 80)
   Host port (access at localhost:80)
Use a different host port if 80 is occupied:
docker run -d --name apirest-container -p 8080:80 apirest
# Access at localhost:8080
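Before choosing the host side of a `-p` mapping, it can be useful to test whether a port is already taken. The following is a rough sketch using only the standard library; `is_port_free` and `pick_host_port` are hypothetical helper names, and the candidate ports are just examples.

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers "address already in use" and also "permission denied":
            # on Linux, unprivileged users cannot bind ports below 1024, so
            # port 80 may be reported as unavailable even when Docker
            # (running as root) could publish it.
            return False
        return True


def pick_host_port(candidates=(80, 8080, 8000), host: str = "0.0.0.0"):
    """Return the first candidate port that is free, or None if none are."""
    for port in candidates:
        if is_port_free(port, host):
            return port
    return None
```

The returned port would go on the left side of the mapping, for example `-p 8080:80` when `pick_host_port()` returns 8080.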

Data Management

Copying Files to Containers

# Syntax: docker cp <source> <container>:<destination>

# Copy training data
docker cp train.csv ai-container:/app

# Copy test data
docker cp test.csv ai-container:/app

# Copy from another host directory into a container subdirectory
# (/app/data must already exist in the container)
docker cp ~/data/train.csv ai-container:/app/data/

Volume Mounting

For persistent data storage and easier file access:
docker run -it \
  --name ai-container \
  -v "$(pwd)/data":/app/data \
  -v "$(pwd)/models":/app/models \
  ai-proyecto-sustituto /bin/bash
Inside container:
python train.py --data_file /app/data/train.csv --model_file /app/models/model.pkl
Result: Files persist on host even after container is removed
Windows Users: Use absolute paths:
docker run -v C:\Users\YourName\data:/app/data ...
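Because the host side of a bind mount should be an absolute path on every platform, one option is to build the command from Python and let `pathlib` resolve the paths. This is only a sketch under that assumption; `volume_args` and `docker_run_command` are hypothetical helpers, and the image and container names are the ones used above.

```python
from pathlib import Path


def volume_args(mounts: dict[str, str]) -> list[str]:
    """Build -v arguments for docker run from {host_dir: container_dir}."""
    args = []
    for host_dir, container_dir in mounts.items():
        # resolve() yields an absolute path on Linux, macOS, and Windows
        host_path = Path(host_dir).expanduser().resolve()
        args += ["-v", f"{host_path}:{container_dir}"]
    return args


def docker_run_command(image, name, mounts, command=("/bin/bash",)):
    """Return the docker run argument list; nothing is executed here."""
    return ["docker", "run", "-it", "--name", name,
            *volume_args(mounts), image, *command]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems when a host path contains spaces.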

Container Management

Essential Commands

# Start stopped container
docker start ai-container

# Stop running container
docker stop ai-container

# Restart container
docker restart ai-container

# Remove container
docker rm ai-container

# Force remove running container
docker rm -f ai-container

Image Management

# List images
docker images

# Remove image
docker rmi apirest

# Remove dangling (untagged) images
docker image prune

# Remove all images not used by any container
docker image prune -a

# Image layer history
docker history apirest

# Image size
docker images apirest --format "{{.Size}}"

Advanced Docker Configurations

Environment Variables

# Pass environment variables
docker run -d \
  -e MODEL_FILE=/app/models/model.pkl \
  -e DATA_FILE=/app/data/train.csv \
  -e LOG_LEVEL=DEBUG \
  apirest
In Python code:
import os

model_file = os.getenv('MODEL_FILE', 'model.pkl')
data_file = os.getenv('DATA_FILE', 'train.csv')
log_level = os.getenv('LOG_LEVEL', 'INFO')
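Reading each variable in one place and validating it at startup makes a mistyped `-e` value fail fast instead of surfacing later. The sketch below assumes the three variables shown above; `Settings` and `load_settings` are hypothetical names, and the accepted levels are Loguru's standard level names.

```python
import os
from dataclasses import dataclass

# Loguru's built-in level names
VALID_LOG_LEVELS = {"TRACE", "DEBUG", "INFO", "SUCCESS", "WARNING", "ERROR", "CRITICAL"}


@dataclass(frozen=True)
class Settings:
    model_file: str
    data_file: str
    log_level: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    log_level = env.get("LOG_LEVEL", "INFO").upper()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {log_level!r}")
    return Settings(
        model_file=env.get("MODEL_FILE", "model.pkl"),
        data_file=env.get("DATA_FILE", "train.csv"),
        log_level=log_level,
    )
```

Accepting the mapping as a parameter keeps the function easy to test without touching the real environment.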

Resource Limits

# Limit memory
docker run -d \
  --memory="2g" \
  --memory-swap="2g" \
  apirest

# Limit CPU
docker run -d \
  --cpus="1.5" \
  apirest

# Combined
docker run -d \
  --memory="2g" \
  --cpus="2" \
  --name apirest-container \
  -p 80:80 \
  apirest
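To confirm from inside a container that a `--memory` limit took effect, a process can read the cgroup files that Linux exposes. This is a sketch that assumes the usual `/sys/fs/cgroup` mount point; `container_memory_limit` is a hypothetical helper, and the threshold used to detect "unlimited" on cgroup v1 is a heuristic.

```python
from pathlib import Path

# cgroup v1 reports a very large sentinel value when no limit is set
UNLIMITED_THRESHOLD = 1 << 60


def container_memory_limit(cgroup_root="/sys/fs/cgroup"):
    """Return the memory limit in bytes, or None if unlimited or unknown."""
    root = Path(cgroup_root)
    candidates = (
        root / "memory.max",                        # cgroup v2
        root / "memory" / "memory.limit_in_bytes",  # cgroup v1
    )
    for path in candidates:
        if not path.is_file():
            continue
        raw = path.read_text().strip()
        if raw == "max":  # cgroup v2 spelling of "no limit"
            return None
        value = int(raw)
        return None if value >= UNLIMITED_THRESHOLD else value
    return None
```

With `--memory="2g"` the function should return 2147483648, since Docker interprets `g` as 1024³ bytes.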

Health Checks

Add to Dockerfile (this assumes the API serves a /health endpoint and that curl is installed in the image):
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:80/health || exit 1
Or in docker run:
docker run -d \
  --health-cmd="curl -f http://localhost:80/health || exit 1" \
  --health-interval=30s \
  --health-timeout=3s \
  --health-retries=3 \
  apirest
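The curl-based checks above need curl inside the image, which slim base images do not include. One alternative is a probe written with Python's standard library. This is a sketch, assuming the API serves `/health` as the commands above do; `check_health` and `main` are hypothetical names.

```python
import urllib.error
import urllib.request

# Ignore proxy settings from the environment: the probe targets the container itself
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check_health(url="http://localhost:80/health", timeout=3.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, 4xx/5xx responses, or a malformed URL
        return False


def main() -> int:
    """Exit code for a health check: 0 means healthy, 1 means unhealthy."""
    return 0 if check_health() else 1
```

Saved as a script that ends with `sys.exit(main())`, this could replace the curl command in a `HEALTHCHECK` instruction or a `--health-cmd` flag.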

Multi-stage Builds

Optimize image size:
# Build stage
FROM python:3.12 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Runtime stage
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["fastapi", "run", "apirest.py", "--port", "80"]
Benefits:
  • Smaller final image (uses slim base)
  • Faster pulls and deployments
  • Reduced attack surface

Docker Compose (Optional)

For more complex setups:
# docker-compose.yml
version: '3.8'

services:
  api:
    build: ./fase-3
    container_name: diabetes-api
    ports:
      - "80:80"
    volumes:
      - ./data:/app/data
      - ./models:/app/models
    environment:
      - MODEL_FILE=/app/models/model.pkl
      - DATA_FILE=/app/data/train.csv
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/health"]
      interval: 30s
      timeout: 3s
      retries: 3
Usage (with Compose V2, the command is docker compose instead of docker-compose):
# Start services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

# Rebuild and start
docker-compose up -d --build

Best Practices

1. Use Specific Base Image Versions

# Good
FROM python:3.12.1

# Avoid
FROM python:latest
Ensures reproducibility across builds.
2. Minimize Layers

# Good - One RUN layer
RUN apt-get update && apt-get install -y \
    package1 \
    package2 \
 && rm -rf /var/lib/apt/lists/*

# Avoid - Multiple RUN layers
RUN apt-get update
RUN apt-get install -y package1
RUN apt-get install -y package2
3. Use .dockerignore

# .dockerignore
__pycache__
*.pyc
.git
.env
venv/
*.pkl
*.csv
Prevents unnecessary files from being copied into the image.
4. Don't Run as Root

FROM python:3.12

# Create non-root user
RUN useradd -m -u 1000 appuser
USER appuser

WORKDIR /home/appuser/app
COPY --chown=appuser:appuser . .

RUN pip install --user -r requirements.txt
5. Use ENV for Configuration

ENV MODEL_FILE=model.pkl \
    DATA_FILE=train.csv \
    LOG_LEVEL=INFO

Troubleshooting

Error: Bind for 0.0.0.0:80 failed: port is already allocated
Solutions:
  1. Use different port:
docker run -p 8080:80 apirest
  2. Stop conflicting container:
docker ps | grep :80
docker stop <container_id>
  3. Kill process using port:
# Linux/Mac
sudo lsof -i :80
sudo kill <PID>

# Windows
netstat -ano | findstr :80
taskkill /PID <PID> /F
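Problem: Container stops right after starting
Diagnosis:
docker logs <container>
Common Causes:
  • No CMD in the Phase 2 Dockerfile, so the base image's default command (python3) runs and exits immediately when no interactive terminal is attached
  • Application crashes on startup
  • Missing dependencies
Solutions:
For Phase 2: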
docker run -it ai-proyecto-sustituto /bin/bash
For Phase 3:
docker logs apirest-container
# Check for Python errors
Error: Cannot connect to the Docker daemon
Solutions:
  • Windows/Mac: Start Docker Desktop
  • Linux: Start Docker service:
sudo systemctl start docker
  • Permissions (Linux):
sudo usermod -aG docker $USER
# Log out and back in
Error: no space left on device
Solution: Clean up Docker resources:
# Remove unused containers
docker container prune

# Remove unused images
docker image prune -a

# Remove unused volumes
docker volume prune

# Nuclear option - remove all unused containers, networks, images, build cache, and volumes
docker system prune -a --volumes

Security Considerations

Important Security Practices:
  1. Don’t include sensitive data in images:
    # DON'T do this
    COPY kaggle.json /app/
    
  2. Pass credentials at runtime instead of baking them into the image:
    docker run -e KAGGLE_KEY="$(cat kaggle_key.txt)" ...
    
  3. Scan images for vulnerabilities (docker scan has been removed from recent Docker releases; Docker Scout replaces it):
    docker scout cves apirest
    
  4. Keep base images updated:
    docker pull python:3.12
    docker build -t apirest .
    

Next Steps

CLI Usage

Detailed CLI operations and automation

API Deployment

Production API deployment strategies

Phase 2: CLI

CLI tools walkthrough

Phase 3: API

REST API implementation guide
