Fini Marketing Intelligence is an end-to-end marketing analytics platform built for the Fini confectionery brand. It generates a fully reproducible synthetic dataset of products, customers, and sales transactions, loads that data into a PostgreSQL data warehouse, and runs a suite of analytics and machine-learning models to produce customer segmentation, revenue insights, and multi-model sales forecasts. The entire pipeline is orchestrated by a single Python entry point and is designed for analysts who want a realistic, self-contained environment to build dashboards, validate models, or explore candy-brand retail dynamics without needing access to production data.

What the Platform Does

The platform moves data through four sequential stages: synthetic data generation, ETL and loading, analytics and segmentation, and forecasting. In the generation stage, scripts produce 20 products across 7 categories, 5,000 customers with demographic and channel attributes, and 100,000 sales records with realistic seasonal patterns. The ETL stage validates and loads all three datasets into a star-schema PostgreSQL database. The analytics stage calculates product-level insights and applies RFM (Recency, Frequency, Monetary) segmentation to classify customers into behavioral tiers. Finally, three independent forecasting models produce 90-day revenue projections that can be compared side by side.
All data generators use a fixed random seed of 42, so every pipeline run produces identical datasets as long as the same library versions are installed. This makes results reproducible across machines and environments.
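
The RFM step in the analytics stage can be illustrated with a minimal sketch. It is not the project's implementation: the sample transactions, the three-bin rank scoring, and the tier names (`high`, `mid`, `low`) are all assumptions made for illustration, and plain Python is used to keep the example dependency-free even though the platform itself lists pandas and NumPy.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, revenue).
SALES = [
    ("C1", date(2024, 12, 20), 30.0),
    ("C1", date(2024, 12, 28), 45.0),
    ("C2", date(2024, 6, 1), 12.0),
    ("C3", date(2024, 11, 15), 80.0),
    ("C3", date(2024, 12, 30), 60.0),
    ("C3", date(2024, 10, 2), 25.0),
    ("C4", date(2024, 3, 10), 8.0),
]
SNAPSHOT = date(2024, 12, 31)  # assumed analysis date


def rfm_table(rows, as_of):
    """Return {customer: (recency_days, order_count, total_revenue)}."""
    table = {}
    for cust, day, revenue in rows:
        r, f, m = table.get(cust, (None, 0, 0.0))
        recency = (as_of - day).days
        r = recency if r is None else min(r, recency)
        table[cust] = (r, f + 1, m + revenue)
    return table


def rank_score(values, higher_is_better=True, bins=3):
    """Rank-based score from 1 (worst) to `bins` (best)."""
    order = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(order)
    return {cust: 1 + (rank * bins) // n for rank, cust in enumerate(order)}


def segment(rows, as_of):
    """Sum the R, F, and M scores and map the total to an assumed tier name."""
    table = rfm_table(rows, as_of)
    r = rank_score({c: v[0] for c, v in table.items()}, higher_is_better=False)
    f = rank_score({c: v[1] for c, v in table.items()})
    m = rank_score({c: v[2] for c, v in table.items()})
    tiers = {}
    for cust in table:
        total = r[cust] + f[cust] + m[cust]
        tiers[cust] = "high" if total >= 8 else "mid" if total >= 5 else "low"
    return tiers


segments = segment(SALES, SNAPSHOT)
```

Recency is scored in reverse because a smaller number of days since the last purchase is better, while frequency and monetary value are scored in their natural order.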

Key Components

Architecture Overview

Understand the star-schema data warehouse, module layout, and how the pipeline stages connect.

Data Generation

Learn how synthetic products, customers, and 100,000 sales records are generated with controlled seasonality.

RFM Segmentation

See how 5,000 customers are scored and segmented by recency, frequency, and monetary value.

Forecasting Models

Compare the Prophet baseline, seasonality-enriched Prophet, and XGBoost forecasting models.

Dataset at a Glance

The synthetic dataset is designed to mirror realistic confectionery retail dynamics. Products span seven categories with varying seasonal availability, and sales are weighted to spike during four simulated seasonality windows.

| Entity | Count | Key Attributes |
| --- | --- | --- |
| Products | 20 | category, season, unit cost, unit price |
| Customers | 5,000 | age group, region, channel, purchase frequency |
| Sales | 100,000 | date, revenue, cost, margin, discount |

Product Categories

The 20 products cover all major Fini product lines:
  • Gummies — Tropical Mix, Sour Cola Bottles, Watermelon Slices, Sour Worms, Bubblegum Bottles, Sharks, Jelly Hearts, Fruit Rings
  • Belts — Strawberry Belts, Rainbow Belts
  • Seasonal — Halloween Mix, Christmas Mix, Spooky Teeth, Snowflakes
  • Marshmallow — Marshmallow Twist, Watermelon Marshmallow
  • Licorice — Regaliz Twist
  • Foam — Candy Bananas, Fried Eggs
  • Novelty — Mini Burgers

Seasonality Windows

Sales volumes are amplified during four calendar windows to simulate real promotional cycles:
  • 🎃 Halloween — products: Halloween Mix, Spooky Teeth
  • 🎄 Christmas — products: Christmas Mix, Snowflakes
  • ☀️ Summer — products: Watermelon Slices, Bubblegum Bottles, Watermelon Marshmallow, Sharks
  • 💝 Valentine’s Day — products: Jelly Hearts
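
One way to picture the amplification is as a per-product weight that rises inside a window. The sketch below takes the product assignments from the list above, but the date ranges and the 2.5x uplift are assumptions chosen for illustration; the source does not state the actual boundaries or multipliers, and `seasonal_weight` is a hypothetical helper name.

```python
from datetime import date

# (start (month, day), end (month, day), products). Products follow the
# list above; the date ranges are illustrative assumptions.
WINDOWS = {
    "Halloween": ((10, 1), (10, 31), {"Halloween Mix", "Spooky Teeth"}),
    "Christmas": ((12, 1), (12, 31), {"Christmas Mix", "Snowflakes"}),
    "Summer": ((6, 1), (8, 31), {"Watermelon Slices", "Bubblegum Bottles",
                                 "Watermelon Marshmallow", "Sharks"}),
    "Valentine's Day": ((2, 1), (2, 14), {"Jelly Hearts"}),
}


def seasonal_weight(product, day, uplift=2.5):
    """Return the assumed uplift if `product` is in season on `day`, else 1.0."""
    key = (day.month, day.day)
    for start, end, products in WINDOWS.values():
        if product in products and start <= key <= end:
            return uplift
    return 1.0


in_season = seasonal_weight("Halloween Mix", date(2024, 10, 20))
off_season = seasonal_weight("Halloween Mix", date(2024, 7, 1))
```

A generator could use such a weight when sampling which product appears in a sale on a given date, so seasonal products are drawn more often inside their window.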

Customer Segments

Customers are generated across five regions (North, South, East, West, Center) and four purchase channels (Supermarket, E-commerce, Convenience Store, Hypermarket), with purchase frequency distributed as 40% Low, 40% Medium, and 20% High.
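
A seeded generator for these attributes might look like the following sketch. The regions, channels, and the 40/40/20 frequency split come from the text above, as does the seed of 42; the field names, the `generate_customers` helper, the uniform sampling of region and channel, and the use of the standard-library `random` module are assumptions.

```python
import random

REGIONS = ["North", "South", "East", "West", "Center"]
CHANNELS = ["Supermarket", "E-commerce", "Convenience Store", "Hypermarket"]
FREQUENCIES = ["Low", "Medium", "High"]
FREQUENCY_WEIGHTS = [0.4, 0.4, 0.2]


def generate_customers(n, seed=42):
    """Build n customer records from a private, seeded random generator."""
    rng = random.Random(seed)  # isolated from the global random state
    customers = []
    for customer_id in range(1, n + 1):
        customers.append({
            "customer_id": customer_id,
            # Uniform sampling of region and channel is an assumption.
            "region": rng.choice(REGIONS),
            "channel": rng.choice(CHANNELS),
            "purchase_frequency": rng.choices(
                FREQUENCIES, weights=FREQUENCY_WEIGHTS)[0],
        })
    return customers


customers = generate_customers(5000)
high_share = sum(
    c["purchase_frequency"] == "High" for c in customers) / len(customers)
```

Because the generator is constructed from the seed on every call, two calls with the same arguments return identical lists, which is the property the fixed seed is meant to provide.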

Forecasting Models

Three models run independently on the same sales time series, allowing direct comparison of accuracy and forecast shape:

| Model | Description |
| --- | --- |
| Prophet Baseline | Standard Facebook Prophet model with default settings |
| Prophet Enriched | Prophet with custom seasonality regressors for Halloween, Christmas, and Summer windows |
| XGBoost | Gradient-boosted tree model trained on lag features and calendar encodings |

Each model outputs a forecast CSV and a JSON metrics file saved to the outputs/ directory.
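
To make "lag features and calendar encodings" concrete, here is a small sketch of how a daily revenue series could be turned into training rows for a tree model. The lag choices (1 and 7 days), the feature names, and the `build_features` helper are assumptions, not the project's actual feature set, and the sample series is synthetic.

```python
from datetime import date, timedelta


def build_features(series, lags=(1, 7)):
    """Turn [(date, revenue), ...] (one row per day, sorted) into feature rows.

    Rows before the largest lag are skipped because their lag values
    would fall outside the series.
    """
    max_lag = max(lags)
    rows = []
    for i in range(max_lag, len(series)):
        day, target = series[i]
        row = {
            "date": day,
            "target": target,
            "day_of_week": day.weekday(),  # 0 = Monday
            "month": day.month,
        }
        for lag in lags:
            row[f"lag_{lag}"] = series[i - lag][1]
        rows.append(row)
    return rows


# Synthetic ten-day series, for illustration only.
start = date(2024, 1, 1)
daily = [(start + timedelta(days=i), 100.0 + i) for i in range(10)]
features = build_features(daily)
```

Each lag value is taken strictly from earlier days, so a row never contains information from its own target date or later.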

Tech Stack

The platform is entirely Python-based. No cloud services or paid APIs are required to run the full pipeline locally.

| Layer | Technologies |
| --- | --- |
| Data manipulation | Python 3.9+, pandas, NumPy |
| Database | PostgreSQL 16, SQLAlchemy, psycopg2 |
| Forecasting | Prophet, XGBoost, scikit-learn |
| Infrastructure | Docker, Docker Compose |
| Visualization | Power BI (fini_BI.pbix) |
| Environment | python-dotenv |
