Welcome

This guide helps you make your first diabetes prediction using the easiest deployment method, the REST API (Phase 3). You’ll have a working prediction endpoint in minutes.
Time Required: ~10 minutes
Prerequisites: Docker installed on your system

Quick Start with REST API

1

Clone or Navigate to Source

First, ensure you have the project source code:
cd ~/workspace/source/fase-3
2

Build Docker Image

Build the FastAPI container:
docker build -t apirest .
This creates a Docker image with all dependencies installed:
  • Python 3.12
  • FastAPI 0.111.0
  • scikit-learn 1.4.1
  • imbalanced-learn 0.12.0
  • All other required packages
3

Run API Container

Start the API server in detached mode:
docker run -d --name apirest-container -p 80:80 apirest
The API is now running and accessible at http://localhost
4

Copy Training Data

Copy your training dataset into the container:
# From the resources directory
docker cp train.csv apirest-container:/app
Make sure you have train.csv from the Kaggle dataset. See Dataset Documentation for download instructions.
5

Access Swagger UI

Open your browser and navigate to:
http://localhost/docs
You’ll see the interactive Swagger API documentation with two endpoints:
  • POST /train - Train the model
  • POST /predict - Make predictions
6

Train the Model

In Swagger UI:
  1. Click on the POST /train endpoint
  2. Click “Try it out”
  3. Click “Execute”
The API will:
  • Load train.csv
  • Encode categorical features
  • Scale features with StandardScaler
  • Apply SMOTEENN resampling
  • Train RandomForestClassifier
  • Save model to model.pkl
Response:
{
  "message": "Model successfully trained"
}
7

Make Your First Prediction

Now predict diabetes for a sample patient:
  1. Click on the POST /predict endpoint
  2. Click “Try it out”
  3. Replace the request body with this patient data:
{
  "gender": "Female",
  "age": 36,
  "hypertension": 0,
  "heart_disease": 0,
  "smoking_history": "current",
  "bmi": 32.27,
  "HbA1c_level": 6.2,
  "blood_glucose_level": 220
}
  4. Click “Execute”
Response:
{
  "message": "Tiene diabetes"
}
Try different patient values to see how the prediction changes! Patients with higher HbA1c levels (≥6.5%) and blood glucose (≥126 mg/dL) are more likely to have diabetes.

Test with Different Patient Profiles

{
  "gender": "Female",
  "age": 36,
  "hypertension": 0,
  "heart_disease": 0,
  "smoking_history": "current",
  "bmi": 32.27,
  "HbA1c_level": 6.2,
  "blood_glucose_level": 220
}
Expected: “Tiene diabetes” (Has diabetes)
Why: High BMI (obese range), elevated HbA1c (prediabetic), very high blood glucose

Using cURL

You can also interact with the API using command-line tools:
curl -X POST "http://localhost/train" \
  -H "accept: application/json"
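If you prefer scripting the same calls, a small Python client using only the standard library might look like the sketch below. The helper names `build_request` and `call_api` are hypothetical, and the sketch assumes the endpoints accept and return JSON as shown in Swagger UI.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost"  # use http://localhost:8080 if you remapped the port

SAMPLE_PATIENT = {
    "gender": "Female",
    "age": 36,
    "hypertension": 0,
    "heart_disease": 0,
    "smoking_history": "current",
    "bmi": 32.27,
    "HbA1c_level": 6.2,
    "blood_glucose_level": 220,
}


def build_request(path: str, payload: Optional[dict] = None,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request; a JSON body is attached only when a payload is given."""
    headers = {"accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(f"{base_url}{path}", data=data,
                                  headers=headers, method="POST")


def call_api(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `call_api(build_request("/train"))` triggers training and `call_api(build_request("/predict", SAMPLE_PATIENT))` returns the prediction message.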

Verify Container Status

Check that your container is running properly:
# View running containers
docker ps

# Expected output:
CONTAINER ID   IMAGE      COMMAND                  PORTS                NAMES
abc123def456   apirest    "fastapi run apirest..."  0.0.0.0:80->80/tcp   apirest-container

# View container logs
docker logs apirest-container

# Enter container shell (optional)
docker exec -it apirest-container /bin/bash

Stopping and Cleaning Up

When you’re done:
# Stop the container
docker stop apirest-container

# Remove the container
docker rm apirest-container

# (Optional) Remove the image
docker rmi apirest

Understanding the API Response

The predict endpoint returns a simple JSON response:
{
  "message": "Tiene diabetes"  // or "No tiene diabetes"
}
The messages are in Spanish:
  • “Tiene diabetes” = Has diabetes (prediction: 1)
  • “No tiene diabetes” = No diabetes (prediction: 0)
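When you consume the response in code, you may want a numeric label instead of the Spanish text. The following is a minimal sketch of that mapping; `parse_prediction` is a hypothetical helper, and it assumes the two messages listed above are the only ones /predict returns.

```python
MESSAGE_TO_LABEL = {
    "Tiene diabetes": 1,     # Has diabetes
    "No tiene diabetes": 0,  # No diabetes
}


def parse_prediction(response_body: dict) -> int:
    """Map the API's Spanish message to a 0/1 label, failing loudly otherwise."""
    message = response_body.get("message")
    if message not in MESSAGE_TO_LABEL:
        raise ValueError(f"Unexpected API message: {message!r}")
    return MESSAGE_TO_LABEL[message]
```

Raising on unknown messages keeps a training confirmation or an error payload from being silently treated as a negative prediction.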

Troubleshooting

If port 80 is occupied, use a different port:
docker run -d --name apirest-container -p 8080:80 apirest
Then access at http://localhost:8080/docs
If /predict fails because the model hasn’t been trained yet, make sure to:
  1. Copy train.csv into the container
  2. Call the /train endpoint before /predict
docker cp train.csv apirest-container:/app
If a prediction request returns an error, check the container logs for detailed error messages:
docker logs apirest-container
Common issues:
  • Invalid patient data format
  • Missing required fields
  • Invalid categorical values (e.g., wrong gender or smoking_history)
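Many of these issues can be caught before the request is sent. The sketch below checks a patient payload on the client side; `validate_patient` is a hypothetical helper, and the allowed categorical values are an assumption based on the values found in the Kaggle dataset, so the API itself may accept a different set.

```python
NUMERIC_FIELDS = ["age", "hypertension", "heart_disease", "bmi",
                  "HbA1c_level", "blood_glucose_level"]

# Assumed category values (taken from the Kaggle dataset, not from the API).
ALLOWED_VALUES = {
    "gender": {"Female", "Male", "Other"},
    "smoking_history": {"never", "No Info", "current", "former",
                        "ever", "not current"},
}


def validate_patient(patient: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Every categorical and numeric field must be present.
    for field in list(ALLOWED_VALUES) + NUMERIC_FIELDS:
        if field not in patient:
            problems.append(f"missing field: {field}")

    # Categorical fields must use one of the known string values.
    for field, allowed in ALLOWED_VALUES.items():
        if field in patient:
            value = patient[field]
            if not isinstance(value, str) or value not in allowed:
                problems.append(f"invalid {field}: {value!r}")

    # Numeric fields must be real numbers (bool is rejected explicitly).
    for field in NUMERIC_FIELDS:
        if field in patient:
            value = patient[field]
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{field} must be a number")

    return problems
```

Calling `validate_patient(payload)` before `/predict` gives a readable list of problems instead of a server-side error to dig out of the logs.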
If you don’t have train.csv yet, you need Kaggle credentials to download the dataset:
  1. Create a Kaggle account
  2. Generate API token (kaggle.json)
  3. Download dataset:
kaggle datasets download -d iammustafatz/diabetes-prediction-dataset
unzip diabetes-prediction-dataset.zip
See Dataset Documentation for detailed instructions.

What’s Next?

Phase 1: Notebook

Explore the data and model in an interactive Jupyter notebook

Phase 2: CLI

Use command-line tools for batch predictions

API Deployment

Advanced API deployment and integration guide

Patient Features

Understand what each patient feature means

Alternative Start: CLI (Phase 2)

If you prefer command-line tools over REST API:
# Navigate to fase-2
cd ~/workspace/source/fase-2

# Build and run container
docker build -t ai-proyecto-sustituto .
docker run -it --name ai-container ai-proyecto-sustituto /bin/bash

# In another terminal, copy data files
docker cp train.csv ai-container:/app
docker cp test.csv ai-container:/app

# Back in the container shell
python train.py --model_file model.pkl --data_file train.csv --overwrite_model
python predict.py --model_file model.pkl --input_file test.csv --predictions_file predictions.csv

# View predictions
cat predictions.csv
See Phase 2: CLI Tools for detailed documentation.
