Overview

The KNN (K-Nearest Neighbors) recommendation system is the core of SmartEat AI’s recipe matching engine. It finds nutritionally similar recipes based on calorie and macronutrient profiles, enabling personalized meal plans and smart recipe swaps.

How It Works

The KNN model operates in the feature space of nutritional values, using Euclidean distance to measure recipe similarity.
1. Feature Vector

Each recipe is represented as a 4-dimensional vector:
FEATURES = [
    'calories',          # Total calories per serving
    'fat_content',       # Grams of fat
    'carbohydrate_content', # Grams of carbs
    'protein_content'    # Grams of protein
]
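As a minimal sketch of what this vector looks like in practice, the snippet below selects the four feature columns from a small DataFrame. The recipes and their values are invented for illustration; only the column names come from the FEATURES list above.

```python
import pandas as pd

FEATURES = ['calories', 'fat_content', 'carbohydrate_content', 'protein_content']

# Hypothetical recipes, made up for this example only.
df = pd.DataFrame({
    'recipe_id': [101, 102, 103],
    'name': ['Oatmeal', 'Chicken salad', 'Pasta'],
    'calories': [320.0, 410.0, 560.0],
    'fat_content': [6.0, 22.0, 14.0],
    'carbohydrate_content': [54.0, 12.0, 88.0],
    'protein_content': [11.0, 38.0, 19.0],
})

# Selecting the FEATURES columns in a fixed order gives one 4-D row per recipe.
X = df[FEATURES].to_numpy()
print(X.shape)  # (3, 4)
```

Keeping the column order fixed matters: the scaler and the KNN model both assume the same order at fit time and at query time.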
2. Normalization

Features are standardized using scikit-learn’s StandardScaler:
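from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(df[FEATURES])  # zero mean, unit variance per column
Standardization puts calories (typically hundreds per serving) and macronutrients (tens of grams) on a comparable scale, so no single feature dominates the distance calculation purely because of its units.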
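To illustrate why scaling matters, here is a small sketch with four invented recipes. Recipe 1 has nearly the same macros as recipe 0 but 200 more calories; recipe 2 has almost the same calories but very different macros. The numbers are assumptions chosen to make the effect visible, not project data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Columns: calories, fat, carbs, protein (hypothetical values).
X = np.array([
    [300.0, 10.0, 40.0, 15.0],   # recipe 0 (the reference)
    [500.0, 12.0, 42.0, 14.0],   # recipe 1: similar macros, more calories
    [310.0, 30.0, 10.0, 40.0],   # recipe 2: similar calories, different macros
    [450.0, 20.0, 30.0, 25.0],   # recipe 3
])

X_scaled = StandardScaler().fit_transform(X)

def dist(M, i, j):
    """Euclidean distance between rows i and j of matrix M."""
    return float(np.linalg.norm(M[i] - M[j]))

raw_d01, raw_d02 = dist(X, 0, 1), dist(X, 0, 2)
scaled_d01, scaled_d02 = dist(X_scaled, 0, 1), dist(X_scaled, 0, 2)

# Raw distances are driven almost entirely by the calorie gap.
print(f"raw:    d(0,1)={raw_d01:.1f}  d(0,2)={raw_d02:.1f}")
# After scaling, the macro differences of recipe 2 count as well.
print(f"scaled: d(0,1)={scaled_d01:.2f}  d(0,2)={scaled_d02:.2f}")
```

On this toy data the ranking flips: recipe 2 is the closer neighbor in raw units, while recipe 1 is closer after standardization.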
3. Similarity Search

The KNN model finds the N nearest neighbors in feature space:
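from sklearn.neighbors import NearestNeighbors

knn = NearestNeighbors(n_neighbors=550, metric='euclidean')
knn.fit(X_scaled)

# recipe_vector: one scaled recipe with shape (1, 4)
distances, indices = knn.kneighbors(recipe_vector)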
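The three steps can be tried end to end on synthetic data. The sketch below assumes random nutritional values and a small n_neighbors so it runs on a 50-recipe toy set; it is not the project's training script.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# 50 synthetic recipes: calories, fat, carbs, protein (ranges are assumptions).
X = rng.uniform([100, 0, 0, 0], [900, 60, 120, 80], size=(50, 4))

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

knn = NearestNeighbors(n_neighbors=5, metric='euclidean')
knn.fit(X_scaled)

# A query must be scaled with the same fitted scaler and be 2-D: (1, 4).
new_recipe = np.array([[450.0, 15.0, 50.0, 30.0]])
recipe_vector = scaler.transform(new_recipe)

distances, indices = knn.kneighbors(recipe_vector)
print(indices[0])     # row positions of the 5 closest recipes
print(distances[0])   # matching distances, sorted ascending
```

Both returned arrays have shape (1, n_neighbors), and the indices are row positions in the fitted matrix, which is why the dataframe and the matrix need to stay aligned.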

Model Architecture

import joblib

class MLModel:
    def __init__(self):
        self.df = None
        self.scaler = None
        self.knn = None
        self.X_scaled_all = None

    def load(self):
        self.df = joblib.load("app/files/df_recetas.joblib")
        self.scaler = joblib.load("app/files/scaler.joblib")
        self.knn = joblib.load("app/files/knn.joblib")
        
        FEATURES = ['calories', 'fat_content', 
                    'carbohydrate_content', 'protein_content']
        self.X_scaled_all = self.scaler.transform(self.df[FEATURES])

ml_model = MLModel()

Model Files

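The KNN system consists of three joblib files stored in backend/app/files/ (the loader above uses paths relative to app/files/, so it presumably runs with backend/ as the working directory):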

  • df_recetas.joblib: Cleaned recipe dataframe with features and metadata
  • scaler.joblib: StandardScaler fitted on training data
  • knn.joblib: Trained NearestNeighbors model
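How these artifacts are produced is not shown here. A plausible sketch, assuming a training script that fits the scaler and the model and then serializes all three objects with joblib.dump, might look like this. The toy dataframe and the temporary output directory are stand-ins.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

FEATURES = ['calories', 'fat_content', 'carbohydrate_content', 'protein_content']

# Toy dataframe standing in for the cleaned recipe data (values invented).
df = pd.DataFrame({
    'recipe_id': [1, 2, 3, 4],
    'calories': [320.0, 410.0, 560.0, 250.0],
    'fat_content': [6.0, 22.0, 14.0, 9.0],
    'carbohydrate_content': [54.0, 12.0, 88.0, 30.0],
    'protein_content': [11.0, 38.0, 19.0, 12.0],
})

scaler = StandardScaler()
X_scaled = scaler.fit_transform(df[FEATURES])
# n_neighbors is tiny here only because the toy set has four rows.
knn = NearestNeighbors(n_neighbors=2, metric='euclidean').fit(X_scaled)

out_dir = Path(tempfile.mkdtemp())  # stand-in for backend/app/files/
joblib.dump(df, str(out_dir / "df_recetas.joblib"))
joblib.dump(scaler, str(out_dir / "scaler.joblib"))
joblib.dump(knn, str(out_dir / "knn.joblib"))

print(sorted(p.name for p in out_dir.iterdir()))
# ['df_recetas.joblib', 'knn.joblib', 'scaler.joblib']
```

The three files should be regenerated together: a scaler fitted on one dataset and a KNN model fitted on another would produce distances that no longer match the dataframe rows.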

Recommendation Process

Finding Similar Recipes

When a user wants to swap a meal, the system performs the following steps:
1. Extract Base Recipe

Get the nutritional profile of the current recipe:
recipe = db.query(Recipe).filter(Recipe.recipe_id == recipe_id).first()
base_index = ml_model.df.index[
    ml_model.df["recipe_id"] == recipe_id
].tolist()[0]
2. Query KNN Model

Find N nearest neighbors (default 550):
recipe_vec = ml_model.X_scaled_all[base_index].reshape(1, -1)
distances, indices = ml_model.knn.kneighbors(
    recipe_vec, n_neighbors=550
)
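One detail worth keeping in mind: when the query vector is itself a row of the fitted matrix, the recipe appears among its own neighbors at distance 0. The sketch below uses random data and a smaller n_neighbors (both assumptions) to show this and to drop the base recipe by index.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)
X_scaled_all = rng.normal(size=(200, 4))  # stand-in for ml_model.X_scaled_all
knn = NearestNeighbors(n_neighbors=10, metric='euclidean').fit(X_scaled_all)

base_index = 7
recipe_vec = X_scaled_all[base_index].reshape(1, -1)
distances, indices = knn.kneighbors(recipe_vec, n_neighbors=10)

# The base recipe is returned as a neighbor of itself (distance 0).
print(base_index in indices[0])  # True

# Filter by index rather than slicing off position 0: with duplicate
# nutritional profiles, the base recipe is not guaranteed to come first.
neighbor_indices = [int(i) for i in indices[0] if i != base_index]
print(len(neighbor_indices))  # 9
```

Note also that scikit-learn raises a ValueError if n_neighbors exceeds the number of fitted samples, so a value of 550 assumes a dataset at least that large.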
3. Apply Filters

Filter candidates by:
  • Meal type (breakfast, lunch, dinner, snack)
  • Dietary restrictions (vegan, vegetarian, etc.)
  • Recipes not already in the current plan
recipe_meals = {m.name.lower() for m in neighbor.meal_types}
recipe_diets = {d.name.lower() for d in neighbor.diet_types}

if meal_label.lower() not in recipe_meals:
    continue

if required_diets and not required_diets.intersection(recipe_diets):
    continue
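The fragment above runs inside a loop over candidate neighbors. A self-contained sketch of that loop follows; the Tag and Neighbor dataclasses and the filter_neighbors helper are hypothetical stand-ins for the ORM models, and the plan check is included as an assumption about how the third filter could be applied.

```python
from dataclasses import dataclass, field

@dataclass
class Tag:
    name: str

@dataclass
class Neighbor:
    recipe_id: int
    meal_types: list = field(default_factory=list)
    diet_types: list = field(default_factory=list)

def filter_neighbors(neighbors, meal_label, required_diets, plan_recipe_ids):
    """Keep candidates that match the meal type and diet and are not in the plan."""
    valid = []
    for neighbor in neighbors:
        if neighbor.recipe_id in plan_recipe_ids:
            continue  # already used in the current plan
        recipe_meals = {m.name.lower() for m in neighbor.meal_types}
        recipe_diets = {d.name.lower() for d in neighbor.diet_types}
        if meal_label.lower() not in recipe_meals:
            continue
        # Passes when at least ONE required diet matches (set intersection).
        if required_diets and not required_diets.intersection(recipe_diets):
            continue
        valid.append(neighbor)
    return valid

neighbors = [
    Neighbor(1, [Tag('Breakfast')], [Tag('Vegan')]),
    Neighbor(2, [Tag('Lunch')], [Tag('Vegan')]),
    Neighbor(3, [Tag('Breakfast')], [Tag('Keto')]),
    Neighbor(4, [Tag('breakfast')], [Tag('vegan'), Tag('Gluten-Free')]),
]
valid = filter_neighbors(neighbors, 'Breakfast', {'vegan'}, plan_recipe_ids={4})
print([n.recipe_id for n in valid])  # [1]
```

As written, the diet check accepts a recipe that shares any one of the required diets. If every required diet must be satisfied, required_diets.issubset(recipe_diets) would be the stricter alternative.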
4. Return Best Match

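Select a valid alternative. The choice is random among the candidates that survive filtering, not strictly the nearest one, which adds variety across repeated swaps: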
if len(valid_neighbors) == 1:
    return valid_neighbors[0]
return random.choice(valid_neighbors)
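The snippet above does not show what happens when no candidate survives the filters; random.choice raises IndexError on an empty list. A defensive sketch, assuming that returning None is an acceptable "no alternative" signal (the helper names are hypothetical), could be:

```python
import random

def pick_alternative(valid_neighbors, rng=None):
    """Pick a random valid candidate, or None if the list is empty."""
    rng = rng or random
    if not valid_neighbors:
        return None  # assumption: caller treats None as "no swap available"
    return rng.choice(valid_neighbors)

def pick_closest(valid_neighbors):
    """Deterministic variant: first candidate, i.e. the nearest one,
    provided the list preserves the distance order from kneighbors()."""
    return valid_neighbors[0] if valid_neighbors else None

candidates = ['recipe_a', 'recipe_b', 'recipe_c']
print(pick_alternative([]))                            # None
print(pick_alternative(['recipe_a']))                  # recipe_a
print(pick_alternative(candidates, random.Random(0)) in candidates)  # True
print(pick_closest(candidates))                        # recipe_a
```

Random selection trades a little nutritional closeness for variety; the deterministic variant is the better fit when the closest possible match matters more.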

Performance Characteristics

  • Query time: ~50-100ms for 550 neighbors
  • Memory: ~200MB for model files
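  • Scalability: roughly O(log n) per query in the number of recipes when a tree index (KD tree or ball tree) is used; with algorithm='auto', scikit-learn picks the index itself and may fall back to brute-force search, which is O(n) per query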
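These figures depend on hardware and dataset size, so it is worth measuring them locally. A minimal timing sketch on synthetic data follows; the 5,000-recipe dataset and the number of runs are assumptions.

```python
import time

import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_scaled = rng.normal(size=(5000, 4))  # synthetic stand-in for scaled recipes

knn = NearestNeighbors(n_neighbors=550, metric='euclidean', algorithm='auto')
knn.fit(X_scaled)

query = X_scaled[:1]
knn.kneighbors(query)  # warm-up call, excluded from the measurement

n_runs = 20
start = time.perf_counter()
for _ in range(n_runs):
    distances, indices = knn.kneighbors(query)
elapsed_ms = (time.perf_counter() - start) * 1000 / n_runs

print(f"mean query time: {elapsed_ms:.2f} ms for {indices.shape[1]} neighbors")
```

Passing algorithm='kd_tree', 'ball_tree' or 'brute' explicitly in place of 'auto' gives a direct comparison of the index types on the same data.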

Integration Points

Backend Services

The KNN model is used by:
  1. Agent Tools (backend/app/services/agent/tools/)
    • suggest_recipe_alternatives: Finds 3 similar recipes
    • replace_meal_in_plan: Swaps a meal with a similar one
  2. Recommender Service (backend/app/core/recommender.py)
    • swap_for_similar(): Core similarity search function
  3. Plan Generation (generate_weekly_plan.py)
    • Ensures nutritionally balanced weekly plans
Note: The KNN model focuses on nutritional similarity only. Additional filters for ingredients, cuisine, and dietary restrictions are applied separately using SQL queries and LLM validation.

Example: Recipe Swap

Here’s how a user swaps a breakfast recipe:
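A rough sketch of the swap flow for a breakfast recipe, using synthetic data. The dataframe, the meal_type column and the swap_for_similar_sketch function are all hypothetical and only mirror the steps described in the Recommendation Process section; this is not the project's swap_for_similar() implementation.

```python
import random

import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

FEATURES = ['calories', 'fat_content', 'carbohydrate_content', 'protein_content']

rng = np.random.default_rng(1)
n = 60
df = pd.DataFrame({
    'recipe_id': np.arange(1000, 1000 + n),
    'calories': rng.uniform(150, 800, n),
    'fat_content': rng.uniform(2, 40, n),
    'carbohydrate_content': rng.uniform(5, 100, n),
    'protein_content': rng.uniform(3, 60, n),
    'meal_type': rng.choice(['breakfast', 'lunch', 'dinner'], n),
})
df.loc[0, 'meal_type'] = 'breakfast'  # recipe 1000 is the breakfast to swap

X_scaled_all = StandardScaler().fit_transform(df[FEATURES])
knn = NearestNeighbors(n_neighbors=20, metric='euclidean').fit(X_scaled_all)

def swap_for_similar_sketch(recipe_id, meal_label, plan_ids, seed=None):
    # 1. Locate the base recipe and its scaled vector.
    base_index = df.index[df['recipe_id'] == recipe_id].tolist()[0]
    recipe_vec = X_scaled_all[base_index].reshape(1, -1)
    # 2. Query the nearest neighbors.
    _, indices = knn.kneighbors(recipe_vec, n_neighbors=20)
    candidates = df.iloc[indices[0]]
    # 3. Keep the same meal type and exclude recipes already in the plan.
    valid = candidates[
        (candidates['meal_type'] == meal_label)
        & (~candidates['recipe_id'].isin(plan_ids))
    ]
    # 4. Pick one valid alternative at random, or None if nothing is left.
    if valid.empty:
        return None
    return int(random.Random(seed).choice(valid['recipe_id'].tolist()))

new_id = swap_for_similar_sketch(1000, 'breakfast', plan_ids={1000}, seed=0)
print(new_id)
```

Including the current recipe in plan_ids keeps it from being returned as its own replacement.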

Model Parameters

Parameter | Value | Rationale
n_neighbors | 550 | Large pool for filtering
metric | euclidean | Intuitive for nutritional space
algorithm | auto | Scikit-learn optimizes based on data
Important: The model is loaded once at application startup. Changes to the model files require a server restart.
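The reason is that the loaded objects live in memory and are not re-read from disk. The sketch below demonstrates this with a throwaway joblib file; the ArtifactHolder class is a hypothetical, simplified analogue of MLModel.

```python
import tempfile
from pathlib import Path

import joblib

path = str(Path(tempfile.mkdtemp()) / "artifact.joblib")
joblib.dump({"version": 1}, path)

class ArtifactHolder:
    """Hypothetical stand-in for MLModel: loads once, keeps the object in memory."""
    def __init__(self):
        self.artifact = None

    def load(self):
        self.artifact = joblib.load(path)

holder = ArtifactHolder()
holder.load()                      # what happens at application startup

joblib.dump({"version": 2}, path)  # the file on disk is replaced later
print(holder.artifact["version"])  # 1 (the in-memory copy is unchanged)

holder.load()                      # equivalent to what a restart achieves
print(holder.artifact["version"])  # 2
```

Whether the application exposes a way to call load() again without restarting is not covered here, so a restart remains the safe assumption.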

Future Enhancements

  • Collaborative Filtering: Learn from user ratings and swaps to improve recommendations
  • Deep Learning: Use embeddings to capture ingredient relationships and flavor profiles
  • A/B Testing: Compare KNN vs. other algorithms for user satisfaction
  • Real-time Training: Update model as new recipes are added to the database
