RFM Customer Segmentation for Marketing Intelligence

RFM segmentation is a behaviour-based customer classification technique that scores every buyer on three independent dimensions — how recently they purchased, how often they purchase, and how much revenue they generate. By combining those three scores into a composite profile, the Fini platform can classify its entire customer base into five actionable segments without requiring manual rules or machine-learning infrastructure. The pipeline reads raw transaction data directly from fact_sales, computes all metrics in pandas, and writes the final scored table back to PostgreSQL as customer_rfm, ready for use in Power BI or downstream campaign tooling.

The Three RFM Dimensions

Each dimension is scored from 1 to 5 using pd.qcut, which divides customers into five equal-sized quantile bins. This means scores are always relative to the current customer population rather than fixed absolute thresholds.

Recency (R)

Days since the customer’s last purchase, measured from a snapshot date of max(sale_date) + 1 day. Scored 5 = most recent down to 1 = least recent. Lower recency days → higher R score.

Frequency (F)

Total number of purchase transactions per customer. Scored 1 = fewest purchases up to 5 = most purchases. Higher transaction count → higher F score.

Monetary (M)

Total cumulative revenue generated by the customer. Scored 1 = lowest revenue up to 5 = highest revenue. Higher lifetime spend → higher M score.

RFM Calculation

The script reads customer_id, sale_date, and revenue from fact_sales, then derives the three raw metrics through a single groupby aggregation before scoring each with pd.qcut.

sales = pd.read_sql("""
SELECT
    customer_id,
    sale_date,
    revenue
FROM fact_sales
""", engine)

sales["sale_date"] = pd.to_datetime(
    sales["sale_date"]
)

snapshot_date = (
    sales["sale_date"].max()
    + pd.Timedelta(days=1)
)

rfm = sales.groupby("customer_id").agg({
    "sale_date": lambda x: (
        snapshot_date - x.max()
    ).days,
    "customer_id": "count",
    "revenue": "sum"
})

rfm.columns = [
    "recency",
    "frequency",
    "monetary"
]

rfm.reset_index(inplace=True)

rfm["R_score"] = pd.qcut(
    rfm["recency"],
    5,
    labels=[5, 4, 3, 2, 1]
).astype(int)

rfm["F_score"] = pd.qcut(
    rfm["frequency"],
    5,
    labels=[1, 2, 3, 4, 5]
).astype(int)

rfm["M_score"] = pd.qcut(
    rfm["monetary"],
    5,
    labels=[1, 2, 3, 4, 5]
).astype(int)

The snapshot date is always derived dynamically as the latest sale_date in the dataset plus one day. This ensures recency is computed consistently regardless of when the pipeline runs.

Customer Segments

After scoring, each customer is assigned to exactly one segment by the segment_customer function, which applies priority-ordered conditions. A customer matches the first rule that applies; all remaining customers fall into Others.

def segment_customer(row):

    if row["R_score"] >= 4 and \
       row["F_score"] >= 4 and \
       row["M_score"] >= 4:

        return "Champions"

    elif row["F_score"] >= 4:

        return "Loyal Customers"

    elif row["R_score"] >= 4:

        return "Potential Loyalists"

    elif row["R_score"] <= 2 and \
         row["F_score"] >= 3:

        return "At Risk"

    else:
        return "Others"


rfm["segment"] = rfm.apply(
    segment_customer,
    axis=1
)

Champions

Rule: R_score ≥ 4 AND F_score ≥ 4 AND M_score ≥ 4Customers who purchased recently, purchase frequently, and generate the highest revenue. The brand’s most valuable cohort — ideal targets for loyalty rewards, early product access, and brand ambassador programmes.

Loyal Customers

Rule: F_score ≥ 4 (any R and M)High-frequency buyers regardless of recency or spend level. They return consistently and respond well to subscription offers, bulk discounts, and frequency-based rewards.

Potential Loyalists

Rule: R_score ≥ 4 (any F and M)Recently active customers who have not yet reached high purchase frequency. With the right nurture campaigns — such as onboarding sequences or second-purchase incentives — they can be converted into Loyal Customers.

At Risk

Rule: R_score ≤ 2 AND F_score ≥ 3Previously frequent buyers whose purchasing activity has dropped off significantly. Win-back campaigns, personalised re-engagement offers, and lapsed-customer discounts are recommended for this group.

Customers who do not meet the criteria for any of the four named segments are labelled Others. This group is typically low-frequency, low-recency, and low-spend — occasional buyers who have not yet established a pattern.

Output Table: `customer_rfm`

The final scored DataFrame is written to PostgreSQL using to_sql with if_exists="replace", which drops and recreates the table on every pipeline run to keep scores current.

rfm.to_sql(
    "customer_rfm",
    engine,
    if_exists="replace",
    index=False
)

The customer_rfm table contains one row per customer with the following columns:

Column	Type	Description
`customer_id`	text / integer	Unique customer identifier, matches `dim_customers.customer_id`
`recency`	integer	Days since the customer’s last purchase relative to the snapshot date
`frequency`	integer	Total number of purchase transactions in `fact_sales`
`monetary`	numeric	Total cumulative revenue generated by the customer
`R_score`	integer	Recency quantile score (1–5); 5 = most recent
`F_score`	integer	Frequency quantile score (1–5); 5 = most frequent
`M_score`	integer	Monetary quantile score (1–5); 5 = highest spend
`segment`	text	Assigned segment label: `Champions`, `Loyal Customers`, `Potential Loyalists`, `At Risk`, or `Others`

Connect the customer_rfm table directly to Power BI using the PostgreSQL connector and use the segment column as a slicer. This lets marketing teams filter any dashboard visual — revenue, product mix, regional map — by customer segment without writing additional SQL.

Get Started

Architecture

ETL Pipeline

Analytics & Insights

Forecasting Models

RFM Customer Segmentation for Marketing Intelligence

The Three RFM Dimensions

Recency (R)

Frequency (F)

Monetary (M)

RFM Calculation

Customer Segments

Champions

Loyal Customers

Potential Loyalists

At Risk

Output Table: `customer_rfm`

Build docs developers (and LLMs) love

Get Started

Architecture

ETL Pipeline

Analytics & Insights

Forecasting Models

Documentation Index

​The Three RFM Dimensions

Recency (R)

Frequency (F)

Monetary (M)

​RFM Calculation

​Customer Segments

Champions

Loyal Customers

Potential Loyalists

At Risk

​Output Table: customer_rfm

Build docs developers (and LLMs) love

The Three RFM Dimensions

RFM Calculation

Customer Segments

Output Table: `customer_rfm`