Prophet Baseline Model — Fini Marketing Intelligence

The Prophet Baseline model is the most straightforward forecasting approach in the Fini Marketing Intelligence toolkit. It fits Facebook’s Prophet library directly to the aggregated daily revenue series, relying exclusively on the built-in yearly and weekly seasonality components to explain recurring patterns. With no additional regressors and a minimal configuration surface, this model serves as both a reliable production option for stable product lines and a clean benchmark against which the more complex Enriched and XGBoost models can be measured.

Data Preparation

Raw transactions from data/raw/sales.csv are aggregated to one row per calendar day before being passed to Prophet. The sale_date and revenue columns are renamed to ds and y — the column names Prophet requires.

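import pandas as pd

# Load raw transactions and parse the sale date.
df = pd.read_csv("data/raw/sales.csv")
df["sale_date"] = pd.to_datetime(df["sale_date"])

# Aggregate to one row per calendar day.
daily = df.groupby("sale_date").agg({
    "revenue": "sum",
    "units": "sum"
}).reset_index()

daily = daily.sort_values("sale_date")

# Rename to the column names Prophet requires.
prophet_df = daily.rename(columns={
    "sale_date": "ds",
    "revenue": "y"
})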
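To make the aggregation concrete, here is a minimal sketch on a hypothetical three-transaction sample. The values are invented for illustration and are not taken from data/raw/sales.csv. It shows two same-day sales collapsing into a single daily row before the rename.

```python
import pandas as pd

# Hypothetical sample transactions; real data comes from data/raw/sales.csv.
sample = pd.DataFrame({
    "sale_date": ["2024-01-01", "2024-01-01", "2024-01-02"],
    "revenue": [10.0, 5.5, 7.0],
    "units": [2, 1, 3],
})
sample["sale_date"] = pd.to_datetime(sample["sale_date"])

# One row per calendar day, summing revenue and units.
daily_sample = (
    sample.groupby("sale_date")
    .agg({"revenue": "sum", "units": "sum"})
    .reset_index()
    .sort_values("sale_date")
)

# Prophet expects the date column as `ds` and the target as `y`.
prophet_sample = daily_sample.rename(columns={"sale_date": "ds", "revenue": "y"})
print(prophet_sample)
```

The units column is carried along unchanged; Prophet ignores columns it has not been told to use as regressors.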

Train / Test Split

The hold-out test window covers the final 90 days of the dataset. The cutoff date is derived dynamically from the maximum observed date, so the split remains correct as new data arrives.

cutoff_date = prophet_df["ds"].max() - pd.Timedelta(days=90)

train = prophet_df[prophet_df["ds"] <= cutoff_date]
test  = prophet_df[prophet_df["ds"] >  cutoff_date]
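The boundary behaviour of this split can be checked on a synthetic series. The sketch below assumes a hypothetical, gap-free series of 200 consecutive days; with real data that has missing days, the test window still spans 90 calendar days but may hold fewer than 90 rows.

```python
import pandas as pd

# Hypothetical series: 200 consecutive days with a dummy target.
demo = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=200, freq="D"),
    "y": range(200),
})

# Same rule as above: the cutoff is 90 days before the last observed date.
cutoff = demo["ds"].max() - pd.Timedelta(days=90)
train_demo = demo[demo["ds"] <= cutoff]  # the cutoff day itself goes to train
test_demo = demo[demo["ds"] > cutoff]    # strictly after the cutoff

print(len(train_demo), len(test_demo))  # 110 90
```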

Model Configuration

The Prophet model is initialised with yearly and weekly seasonality enabled and daily seasonality disabled. Daily seasonality is turned off because the candy retail series has a single observation per day, so it contains no within-day cycle to model; enabling the component would only introduce noise.

from prophet import Prophet

model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False
)

# Fit on the training window only; the last 90 days stay held out.
model.fit(train)

No additional regressors are added. Prophet learns trend changepoints and seasonal Fourier components from the training data alone.
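For intuition about what "seasonal Fourier components" means, the sketch below builds the sine and cosine features of a truncated Fourier series with NumPy. It is an illustration of the idea, not Prophet's internal code: fourier_features is a hypothetical helper, and the orders 3 (weekly) and 10 (yearly) are Prophet's documented defaults rather than values set explicitly in this project.

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Return an (n, 2 * order) matrix of sin/cos terms for one seasonality."""
    t = np.asarray(t_days, dtype=float)
    cols = []
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

t = np.arange(14)  # two weeks of daily time steps
weekly = fourier_features(t, period=7.0, order=3)      # 6 columns
yearly = fourier_features(t, period=365.25, order=10)  # 20 columns
print(weekly.shape, yearly.shape)
```

Each seasonal component is a weighted sum of these columns, with the weights learned during fitting. The weekly features repeat exactly every 7 rows, which is what lets the model express a day-of-week pattern.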

Generating the Forecast

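After fitting, make_future_dataframe extends the date index 90 days beyond the last training observation; by default it also keeps the training dates, so the forecast covers the historical fit as well. model.predict returns a DataFrame from which the point estimate (yhat) and the bounds of the uncertainty interval (yhat_lower, yhat_upper) are extracted. The interval is 80 % wide because interval_width is left at Prophet's default of 0.80.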

future   = model.make_future_dataframe(periods=90)
forecast = model.predict(future)

The forecast DataFrame is then left-joined to the original daily series so that actuals (y) are aligned with predictions on each date. Any forecast date without an observed actual has y = NaN.

results = forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].merge(
    prophet_df[["ds", "y"]],
    on="ds",
    how="left"
)
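The effect of the left join is easiest to see on a small case. The sketch below uses hypothetical forecast and actual values: five forecast days, of which only the first three have an observed actual.

```python
import pandas as pd

# Hypothetical forecast for five days and actuals for only the first three.
forecast_demo = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=5, freq="D"),
    "yhat": [100.0, 110.0, 105.0, 98.0, 102.0],
})
actuals_demo = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=3, freq="D"),
    "y": [95.0, 120.0, 100.0],
})

# how="left" keeps every forecast row; dates with no actual get y = NaN.
joined = forecast_demo.merge(actuals_demo, on="ds", how="left")
print(joined)
```

An inner join would silently drop the two rows without actuals, which is why the left join is the appropriate choice when the forecast rows must all be kept.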

Evaluation

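The evaluation DataFrame eval_df is built by calling results.dropna(). This removes every row where y is NaN, so metrics are calculated only on dates that have an actual revenue value. With the code as shown, results contains both the training dates and the 90-day hold-out window, so the figures below blend in-sample fit with out-of-sample error; filtering eval_df to ds > cutoff_date would isolate the hold-out score. MAPE divides by y, so days with zero revenue need to be excluded first.

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Keep only rows that have an observed actual.
eval_df = results.dropna()

mae  = mean_absolute_error(eval_df["y"], eval_df["yhat"])
rmse = np.sqrt(mean_squared_error(eval_df["y"], eval_df["yhat"]))
mape = np.mean(np.abs((eval_df["y"] - eval_df["yhat"]) / eval_df["y"])) * 100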
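The three metrics can also be written out with NumPy alone, which makes the arithmetic explicit. The sketch below is a hypothetical helper (regression_metrics is not part of the project code) and makes one assumption that differs from the snippet above: it skips rows where the actual is zero when computing MAPE, since the MAPE expression divides by y.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and MAPE (in percent) as plain floats."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))

    # Assumption: rows with a zero actual are excluded from MAPE only.
    nonzero = y_true != 0
    mape = np.mean(np.abs(err[nonzero] / y_true[nonzero])) * 100

    return {"MAE": float(mae), "RMSE": float(rmse), "MAPE": float(mape)}

# Hypothetical values; the third day has zero revenue.
demo_metrics = regression_metrics([100.0, 200.0, 0.0], [110.0, 180.0, 5.0])
print(demo_metrics)
```

In this example the two non-zero days are each off by 10 % of the actual, so MAPE is 10, while MAE and RMSE still use all three days.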

Metric Definitions

Metric	Value	Interpretation
MAE	54.85 €/day	On average the model’s daily revenue prediction is off by €54.85
RMSE	77.31 €/day	Penalises large errors more than MAE; useful for spotting outlier days
MAPE	27.56 %	On average, the model’s daily prediction deviates from actual revenue by 27.56 % of the actual value
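For reference, these are the standard definitions that the evaluation code computes, where n is the number of rows in eval_df, y_i is the actual revenue on day i, and ŷ_i is the corresponding yhat:

```latex
\[
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert
\]
\[
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
\]
\[
\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left\lvert \frac{y_i - \hat{y}_i}{y_i} \right\rvert
\]
```

RMSE is never smaller than MAE on the same data, and a wide gap between the two (here 77.31 against 54.85) indicates that a few days carry disproportionately large errors.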

The metrics are serialised to JSON for downstream comparison:

metrics = {
    "MAE":  float(mae),
    "RMSE": float(rmse),
    "MAPE": float(mape)
}

{
    "MAE": 54.85448774193466,
    "RMSE": 77.30509736516379,
    "MAPE": 27.557921673518198
}

Output Files

Two files are written to the outputs/ directory on every run.

File	Columns	Description
`outputs/forecast_baseline.csv`	`ds`, `yhat`, `yhat_lower`, `yhat_upper`, `y`	Full forecast including historical fit and 90-day future window; `y` is `NaN` for future rows
`outputs/metrics_baseline.json`	`MAE`, `RMSE`, `MAPE`	Scalar evaluation metrics calculated on historical actuals
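The write step itself is not shown on this page. A minimal sketch of how the two files could be produced follows, assuming a hypothetical helper write_outputs and using only pandas' to_csv and the standard json module; the file names match the table above, while the demo inputs and the temporary directory are for illustration only.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

def write_outputs(results, metrics, out_dir):
    """Write the forecast CSV and metrics JSON; return both paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    forecast_path = out_dir / "forecast_baseline.csv"
    metrics_path = out_dir / "metrics_baseline.json"

    # index=False keeps the CSV columns limited to the DataFrame's columns.
    results.to_csv(forecast_path, index=False)
    with open(metrics_path, "w") as f:
        json.dump(metrics, f, indent=4)

    return forecast_path, metrics_path

# Demo with tiny hypothetical inputs, written to a temporary directory.
demo_results = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=2, freq="D"),
    "yhat": [100.0, 110.0],
    "yhat_lower": [90.0, 100.0],
    "yhat_upper": [110.0, 120.0],
    "y": [95.0, None],
})
demo_dir = tempfile.mkdtemp()
csv_path, json_path = write_outputs(
    demo_results, {"MAE": 1.0, "RMSE": 2.0, "MAPE": 3.0}, demo_dir
)
print(csv_path.name, json_path.name)
```

Rows with a missing y are written as empty cells in the CSV and read back as NaN by pandas.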
