The Prophet Baseline model is the most straightforward forecasting approach in the Fini Marketing Intelligence toolkit. It fits Facebook’s Prophet library directly to the aggregated daily revenue series, relying exclusively on the built-in yearly and weekly seasonality components to explain recurring patterns. With no additional regressors and a minimal configuration surface, this model serves as both a reliable production option for stable product lines and a clean benchmark against which the more complex Enriched and XGBoost models can be measured.

Data Preparation

Raw transactions from data/raw/sales.csv are aggregated to one row per calendar day before being passed to Prophet. The sale_date and revenue columns are then renamed to ds and y, the column names Prophet requires.
import pandas as pd

# Load the raw transactions and parse the sale date as a datetime
df = pd.read_csv("data/raw/sales.csv")
df["sale_date"] = pd.to_datetime(df["sale_date"])

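# Sum revenue and units per day; units is carried along but is not a model input
daily = df.groupby("sale_date").agg({
    "revenue": "sum",
    "units": "sum"
}).reset_index()

# Prophet expects the history in chronological order
daily = daily.sort_values("sale_date")

# Rename to the ds (date) and y (target) columns Prophet requires
prophet_df = daily.rename(columns={
    "sale_date": "ds",
    "revenue": "y"
})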
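
Note that groupby only produces rows for dates that appear in the raw file, so a day with no transactions is simply absent from the series. The following is a minimal sketch, run on a small made-up frame rather than the real sales.csv, of the same aggregation followed by a check for calendar gaps. The sample values and the names full_range and missing_days are illustrative assumptions, not part of the original pipeline.

```python
import pandas as pd

# Toy transactions standing in for data/raw/sales.csv (values are made up)
df = pd.DataFrame({
    "sale_date": pd.to_datetime(
        ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-04"]
    ),
    "revenue": [10.0, 5.0, 7.5, 12.0],
    "units": [2, 1, 3, 4],
})

# Same aggregation and renaming as above: one row per date present in the data
daily = (
    df.groupby("sale_date")
      .agg({"revenue": "sum", "units": "sum"})
      .reset_index()
      .sort_values("sale_date")
)
prophet_df = daily.rename(columns={"sale_date": "ds", "revenue": "y"})

# Compare against a complete daily calendar to find days with no transactions
full_range = pd.date_range(prophet_df["ds"].min(), prophet_df["ds"].max(), freq="D")
missing_days = full_range.difference(pd.DatetimeIndex(prophet_df["ds"]))

print(prophet_df)
print("Missing days:", list(missing_days.strftime("%Y-%m-%d")))
```

Whether such days should stay absent or be filled with zero revenue is a modelling decision that the source does not specify.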

Train / Test Split

The hold-out test window covers the final 90 days of the dataset. The cutoff date is derived dynamically from the maximum observed date, so the split remains correct as new data arrives.
cutoff_date = prophet_df["ds"].max() - pd.Timedelta(days=90)

train = prophet_df[prophet_df["ds"] <= cutoff_date]
test  = prophet_df[prophet_df["ds"] >  cutoff_date]
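
As a quick sanity check, the split can be exercised on a synthetic series. This sketch assumes a gap-free daily index covering one year; the placeholder y values are invented for illustration.

```python
import pandas as pd

# Synthetic gap-free daily series covering one year (values are placeholders)
prophet_df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=365, freq="D"),
    "y": range(365),
})

cutoff_date = prophet_df["ds"].max() - pd.Timedelta(days=90)
train = prophet_df[prophet_df["ds"] <= cutoff_date]
test = prophet_df[prophet_df["ds"] > cutoff_date]

# With no missing dates, the hold-out contains exactly 90 rows
print(len(train), len(test))
print(train["ds"].max().date(), test["ds"].min().date())
```

If the real series has missing dates, the test window still spans 90 calendar days but holds fewer than 90 rows.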

Model Configuration

The Prophet model is initialised with yearly and weekly seasonality enabled and daily seasonality disabled. Daily seasonality models cycles within a single day, which cannot be observed once the candy retail data has been aggregated to one value per day, so enabling it would only introduce noise.
from prophet import Prophet

model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False
)

model.fit(train)
No additional regressors are added. Prophet learns trend changepoints and seasonal Fourier components from the training data alone.
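
To make "seasonal Fourier components" concrete: a seasonal term with period P can be represented as a weighted sum of sine and cosine pairs at harmonics 1 to N of that period, with the weights learned during fitting. The sketch below builds such feature columns with NumPy. It illustrates the idea and is not Prophet's internal code; the function name fourier_features is hypothetical, and the orders 3 (weekly) and 10 (yearly) are assumed here to mirror Prophet's defaults.

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Return sin/cos pairs for harmonics 1..order of the given period."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

t = np.arange(14)  # two weeks of daily time steps
weekly = fourier_features(t, period=7.0, order=3)
yearly = fourier_features(t, period=365.25, order=10)

# Each harmonic contributes two columns (sine and cosine)
print(weekly.shape, yearly.shape)
```

Rows seven days apart have identical weekly features, which is what lets the fitted weights express a repeating day-of-week pattern.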

Generating the Forecast

After fitting, make_future_dataframe extends the date index 90 days beyond the last training observation. model.predict returns a DataFrame with one row per date, from which the point estimate (yhat) and the bounds of the 80 % uncertainty interval (yhat_lower, yhat_upper) are extracted. The 80 % level is Prophet's default interval_width of 0.8.
future   = model.make_future_dataframe(periods=90)
forecast = model.predict(future)
The forecast DataFrame is then joined back to the full daily series so that actuals (y) are aligned with predictions wherever an observation exists. Because the join is a left join on ds, any forecast date without a matching row in the daily series has y = NaN.
results = forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].merge(
    prophet_df[["ds", "y"]],
    on="ds",
    how="left"
)
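
The behaviour of the left join can be seen on a toy example. The frames below are hypothetical stand-ins for forecast and prophet_df with made-up numbers. The interval coverage check at the end is an illustrative extra that the original pipeline does not compute.

```python
import pandas as pd

# Hypothetical forecast covering five days (numbers are made up)
forecast = pd.DataFrame({
    "ds": pd.date_range("2024-03-01", periods=5, freq="D"),
    "yhat": [100.0, 110.0, 105.0, 120.0, 115.0],
    "yhat_lower": [80.0, 90.0, 85.0, 100.0, 95.0],
    "yhat_upper": [120.0, 130.0, 125.0, 140.0, 135.0],
})

# Actuals exist only for the first three days
prophet_df = pd.DataFrame({
    "ds": pd.date_range("2024-03-01", periods=3, freq="D"),
    "y": [98.0, 115.0, 101.0],
})

results = forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].merge(
    prophet_df[["ds", "y"]], on="ds", how="left"
)

# Share of observed actuals that fall inside the uncertainty interval
observed = results.dropna(subset=["y"])
coverage = observed["y"].between(
    observed["yhat_lower"], observed["yhat_upper"]
).mean()

print(results)
print("Rows without an actual:", int(results["y"].isna().sum()))
print("Interval coverage:", coverage)
```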

Evaluation

The evaluation DataFrame eval_df is built by calling results.dropna(). This drops every row where y is NaN, leaving only dates with an actual revenue value. Metrics are therefore calculated exclusively over dates with observed revenue, never over forecast dates that lack an actual.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

eval_df = results.dropna()

mae  = mean_absolute_error(eval_df["y"], eval_df["yhat"])
rmse = np.sqrt(mean_squared_error(eval_df["y"], eval_df["yhat"]))
# MAPE divides by the actual, so this assumes no day has zero revenue
mape = np.mean(np.abs((eval_df["y"] - eval_df["yhat"]) / eval_df["y"])) * 100
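
If results is built as in the snippets above, dropna() keeps the in-sample training dates as well as the hold-out days, so the reported figures blend fit quality with out-of-sample error. A hold-out-only evaluation might look like the sketch below. The helper name holdout_metrics and the sample numbers are hypothetical, and skipping zero-revenue days in MAPE is one possible convention rather than something the source prescribes.

```python
import numpy as np
import pandas as pd

def holdout_metrics(results, cutoff_date):
    """Compute MAE, RMSE and MAPE on rows after cutoff_date that have an actual."""
    holdout = results[(results["ds"] > cutoff_date) & results["y"].notna()]
    y = holdout["y"].to_numpy(dtype=float)
    yhat = holdout["yhat"].to_numpy(dtype=float)
    errors = y - yhat

    mae = float(np.mean(np.abs(errors)))
    rmse = float(np.sqrt(np.mean(errors ** 2)))

    # MAPE is undefined where the actual is zero, so those days are skipped
    nonzero = y != 0
    mape = float(np.mean(np.abs(errors[nonzero] / y[nonzero])) * 100)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}

# Made-up example: the first row is in-sample, the last has no actual yet
results = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=6, freq="D"),
    "y": [500.0, 100.0, 200.0, 0.0, 50.0, np.nan],
    "yhat": [400.0, 110.0, 180.0, 10.0, 50.0, 60.0],
})
metrics = holdout_metrics(results, cutoff_date=pd.Timestamp("2024-01-01"))
print(metrics)
```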

Metric Definitions

| Metric | Value | Interpretation |
| --- | --- | --- |
| MAE | 54.85 €/day | On average, the model’s daily revenue prediction is off by 54.85 € |
| RMSE | 77.31 €/day | Penalises large errors more heavily than MAE; useful for spotting outlier days |
| MAPE | 27.56 % | On average, the daily prediction deviates from actual revenue by 27.56 % of the actual value |

The metrics are serialised to JSON for downstream comparison:
metrics = {
    "MAE":  float(mae),
    "RMSE": float(rmse),
    "MAPE": float(mape)
}
The resulting outputs/metrics_baseline.json contains:
{
    "MAE": 54.85448774193466,
    "RMSE": 77.30509736516379,
    "MAPE": 27.557921673518198
}
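
The source does not show the write call itself. One way to do it, sketched here with the standard-library json module, is to dump the dictionary and read it back the way a downstream comparison script might. The temporary directory stands in for outputs/ and is an assumption made so the example leaves no files behind.

```python
import json
import os
import tempfile

# Values rounded from the metrics reported above
metrics = {"MAE": 54.85, "RMSE": 77.31, "MAPE": 27.56}

# A temporary directory stands in for outputs/ so the sketch has no side effects
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "metrics_baseline.json")

with open(path, "w", encoding="utf-8") as fh:
    json.dump(metrics, fh, indent=4)

# Read the file back, as a downstream comparison script might
with open(path, encoding="utf-8") as fh:
    loaded = json.load(fh)

print(loaded)
```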

Output Files

Two files are written to the outputs/ directory on every run.

| File | Columns | Description |
| --- | --- | --- |
| outputs/forecast_baseline.csv | ds, yhat, yhat_lower, yhat_upper, y | Full forecast including historical fit and 90-day future window; y is NaN for future rows |
| outputs/metrics_baseline.json | MAE, RMSE, MAPE | Scalar evaluation metrics calculated on historical actuals |
