
Overview

Ratio studies compare assessed values to sale prices, providing essential metrics for evaluating mass appraisal accuracy and uniformity. OpenAVM Kit implements IAAO-standard ratio studies with full statistical rigor.
Ratio studies are the primary tool for measuring assessment performance and demonstrating compliance with professional standards.

The RatioStudy Class

OpenAVM Kit provides two ratio study classes:
from openavmkit.ratio_study import RatioStudy, RatioStudyBootstrapped

# Basic ratio study
rs = RatioStudy(
    predictions=predicted_values,
    ground_truth=sale_prices,
    max_trim=0.25
)

# With confidence intervals
rs_boot = RatioStudyBootstrapped(
    predictions=predicted_values,
    ground_truth=sale_prices,
    max_trim=0.25,
    confidence_interval=0.95,
    iterations=10000
)

Key Attributes

The RatioStudy class computes:
  • count: Number of observations
  • median_ratio: Median of prediction/ground_truth ratios
  • mean_ratio: Mean of prediction/ground_truth ratios
  • cod: Coefficient of Dispersion
  • cod_trim: COD after trimming outliers
  • prd: Price-Related Differential
  • prb: Price-Related Bias

Coefficient of Dispersion (COD)

COD measures the average absolute deviation from the median ratio, expressed as a percentage:

Formula

COD = (Average Absolute Deviation / Median Ratio) × 100

Implementation

From openavmkit/utilities/stats.py:
import numpy as np

def calc_cod(ratios: np.ndarray) -> float:
    """
    Calculate Coefficient of Dispersion
    
    Parameters
    ----------
    ratios : np.ndarray
        Array of assessment-to-sale ratios
    
    Returns
    -------
    float
        COD value (lower is better)
    """
    if len(ratios) == 0:
        return float("nan")
    
    median_ratio = np.median(ratios)
    abs_deviations = np.abs(ratios - median_ratio)
    avg_deviation = np.mean(abs_deviations)
    
    return (avg_deviation / median_ratio) * 100
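As a quick sanity check, the formula can be verified by hand on a tiny array (values chosen purely for illustration):

```python
import numpy as np

# Ratios 0.9, 1.0, 1.1: the median is 1.0 and the
# absolute deviations from it are 0.1, 0.0, 0.1
ratios = np.array([0.9, 1.0, 1.1])

median_ratio = np.median(ratios)                        # 1.0
avg_deviation = np.mean(np.abs(ratios - median_ratio))  # 0.2 / 3
cod = (avg_deviation / median_ratio) * 100

print(round(cod, 2))  # 6.67
```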

Interpretation

COD results fall into broad tiers, from most to least uniform:
  • Exceptional uniformity: assessments are highly consistent across properties.
  • Strong performance for residential properties; meets IAAO standards.
  • Acceptable for most residential property; may need improvement for certain segments.
  • Significant variation in assessment ratios; review the model and data quality.
COD standards vary by property type:
  • Single-family residential: Target < 10.0
  • Income-producing properties: Target < 15.0
  • Vacant land: Target < 20.0
Price-Related Differential (PRD)

PRD detects systematic bias related to property value:

Formula

PRD = (Mean Ratio) / (Weighted Mean Ratio)
Where weighted mean ratio uses sale prices as weights.

Implementation

import numpy as np
from openavmkit.utilities.data import div_series_z_safe

def calc_prd(predictions: np.ndarray, ground_truth: np.ndarray) -> float:
    """
    Calculate Price-Related Differential
    
    PRD > 1.0 indicates assessment regressivity (over-assessing low-value properties)
    PRD < 1.0 indicates assessment progressivity (over-assessing high-value properties)
    
    Parameters
    ----------
    predictions : np.ndarray
        Predicted values
    ground_truth : np.ndarray
        Actual sale prices
    
    Returns
    -------
    float
        PRD value (target is 1.00)
    """
    if len(predictions) == 0:
        return float("nan")
    
    ratios = div_series_z_safe(predictions, ground_truth)
    mean_ratio = np.mean(ratios)
    
    # Weighted mean ratio
    weighted_mean = np.sum(predictions) / np.sum(ground_truth)
    
    return mean_ratio / weighted_mean
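To see the regressivity signal concretely, here is a hand-checkable example; it uses plain division in place of the library's div_series_z_safe helper, and the values are invented for illustration:

```python
import numpy as np

# Low-value properties over-assessed, high-value property under-assessed
predictions = np.array([120.0, 100.0, 480.0])
ground_truth = np.array([100.0, 100.0, 500.0])

ratios = predictions / ground_truth        # [1.2, 1.0, 0.96]
mean_ratio = np.mean(ratios)               # ~1.0533
weighted_mean = np.sum(predictions) / np.sum(ground_truth)  # 700 / 700 = 1.0

prd = mean_ratio / weighted_mean
print(round(prd, 3))  # 1.053 -> regressive (above the 1.05 threshold)
```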

Interpretation

PRD Value    Meaning                      Action Required
1.00         Perfect proportionality      None
0.98-1.03    Excellent (IAAO target)      None
1.03-1.05    Slight regressivity          Monitor
> 1.05       Significant regressivity     Model adjustment needed
0.95-0.98    Slight progressivity         Monitor
< 0.95       Significant progressivity    Model adjustment needed

Regressivity (PRD > 1.00): Lower-valued properties are assessed at higher percentages than higher-valued properties. This is generally considered unfair.

Progressivity (PRD < 1.00): Higher-valued properties are assessed at higher percentages. Less common, but also problematic.
Price-Related Bias (PRB)

PRB is an alternative measure of vertical equity based on regression:

Formula

PRB is the slope coefficient from regressing percentage differences on the log of sale price:
Percentage Difference = (Prediction - Sale Price) / Sale Price
PRB = slope from regressing Percentage Difference on log(Sale Price)

Implementation

import numpy as np
from sklearn.linear_model import LinearRegression

def calc_prb(predictions, ground_truth, confidence_interval=0.95):
    """
    Calculate Price-Related Bias

    Returns
    -------
    tuple
        (prb_value, prb_low, prb_high)
    """
    if len(predictions) < 2:
        return (float("nan"), float("nan"), float("nan"))

    # Calculate percentage differences
    pct_diff = (predictions - ground_truth) / ground_truth

    # Prepare regression data
    X = np.log(ground_truth).reshape(-1, 1)
    y = pct_diff

    # Fit regression
    model = LinearRegression()
    model.fit(X, y)
    prb_value = model.coef_[0]

    # Percentile-bootstrap confidence interval: refit the regression on
    # resampled observations and take quantiles of the slope distribution
    iterations = 1000
    rng = np.random.default_rng()
    n = len(predictions)
    boot_coefs = np.empty(iterations)
    for i in range(iterations):
        idx = rng.integers(0, n, n)
        boot_coefs[i] = LinearRegression().fit(X[idx], y[idx]).coef_[0]
    alpha = 1.0 - confidence_interval
    prb_low, prb_high = np.quantile(boot_coefs, [alpha / 2, 1 - alpha / 2])

    return prb_value, prb_low, prb_high

Interpretation

  • PRB = 0: No price-related bias
  • PRB > 0: Progressivity (ratios rise with value, so high-value properties are over-assessed)
  • PRB < 0: Regressivity (ratios fall with value, so low-value properties are over-assessed)
IAAO Standards:
  • Excellent: -0.05 to +0.05
  • Acceptable: -0.10 to +0.10

Trimmed vs. Untrimmed Statistics

Ratio studies report both trimmed and untrimmed statistics:

Why Trim?

Outliers can distort metrics. Trimming removes extreme ratios while retaining the typical distribution:
from openavmkit.utilities.stats import trim_outlier_ratios

# Trim to interquartile range
trim_predictions, trim_ground_truth = trim_outlier_ratios(
    predictions,
    ground_truth,
    max_trim=0.25  # No more than 25% trimmed
)

# Calculate trimmed COD
trim_ratios = trim_predictions / trim_ground_truth
cod_trim = calc_cod(trim_ratios)
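For intuition, IQR-based ratio trimming can be sketched in a few lines. This is a simplified stand-in for trim_outlier_ratios: it uses the conventional 1.5 × IQR fences and does not enforce a max_trim cap, so the library's exact rule may differ:

```python
import numpy as np

def trim_ratios_iqr(predictions, ground_truth, k=1.5):
    """Keep only pairs whose ratio falls within the IQR fences."""
    ratios = predictions / ground_truth
    q1, q3 = np.percentile(ratios, [25, 75])
    iqr = q3 - q1
    mask = (ratios >= q1 - k * iqr) & (ratios <= q3 + k * iqr)
    return predictions[mask], ground_truth[mask]

preds = np.array([100.0, 105.0, 95.0, 102.0, 300.0])
truth = np.array([100.0, 100.0, 100.0, 100.0, 100.0])
p, t = trim_ratios_iqr(preds, truth)
print(p / t)  # the extreme 3.0 ratio is removed
```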

Comparing Results

rs = RatioStudy(predictions, ground_truth, max_trim=0.25)

# Display summary
df_summary = rs.summary()
print(df_summary)
Output:
       Data  Count    COD  Med.Ratio
0  Untrimmed  5,234  12.45      1.024
1    Trimmed  4,123   8.32      1.018
Trimmed statistics focus on the typical property, while untrimmed statistics include all sales. Both are important for comprehensive assessment.

Ratio Study Breakdowns

Analyze quality metrics by property characteristics:

Configuration

analysis:
  ratio_study:
    look_back_years: 1
    breakdowns:
      - by: property_class
      - by: neighborhood
      - by: year_built
        quantiles: 4
      - by: sale_price
        slice_size: 50000

Running Breakdowns

from openavmkit.ratio_study import run_and_write_ratio_study_breakdowns

run_and_write_ratio_study_breakdowns(settings)
This generates reports showing COD, median ratio, and confidence intervals for each breakdown category.
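The per-category computation behind these reports can be approximated with a pandas group-by. The sketch below reuses the calc_cod logic shown earlier and invents a small property_class dataset for illustration; the library's actual breakdown code may differ:

```python
import numpy as np
import pandas as pd

def calc_cod(ratios):
    median = np.median(ratios)
    return np.mean(np.abs(ratios - median)) / median * 100

df = pd.DataFrame({
    "property_class": ["R", "R", "R", "C", "C", "C"],
    "prediction":   [ 98.0, 105.0, 101.0, 210.0, 190.0, 205.0],
    "sale_price":   [100.0, 100.0, 100.0, 200.0, 200.0, 200.0],
})
df["ratio"] = df["prediction"] / df["sale_price"]

# One row per category: observation count, median ratio, COD
breakdown = df.groupby("property_class")["ratio"].agg(
    count="size", median_ratio="median", cod=calc_cod
)
print(breakdown)
```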

Bootstrap Confidence Intervals

The RatioStudyBootstrapped class provides confidence intervals:
rs = RatioStudyBootstrapped(
    predictions,
    ground_truth,
    max_trim=0.25,
    confidence_interval=0.95,
    iterations=10000
)

print(f"COD: {rs.cod.value:.1f} [{rs.cod.low:.1f}, {rs.cod.high:.1f}]")
print(f"Median Ratio: {rs.median_ratio.value:.3f}")
print(f"PRD: {rs.prd.value:.3f} [{rs.prd.low:.3f}, {rs.prd.high:.3f}]")
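The interval construction can be sketched as a percentile bootstrap: resample the observed ratios with replacement, recompute the statistic on each resample, and take quantiles of the resulting distribution. This is a simplified stand-in for what RatioStudyBootstrapped does internally, with synthetic data for illustration:

```python
import numpy as np

def calc_cod(ratios):
    median = np.median(ratios)
    return np.mean(np.abs(ratios - median)) / median * 100

def bootstrap_cod(predictions, ground_truth, confidence=0.95,
                  iterations=2000, seed=0):
    rng = np.random.default_rng(seed)
    ratios = predictions / ground_truth
    n = len(ratios)
    # Recompute COD on each bootstrap resample of the ratio array
    samples = np.array([
        calc_cod(ratios[rng.integers(0, n, n)]) for _ in range(iterations)
    ])
    alpha = 1.0 - confidence
    low, high = np.quantile(samples, [alpha / 2, 1 - alpha / 2])
    return calc_cod(ratios), low, high

# Synthetic sales: ratios scattered around 1.0 with ~10% noise
data_rng = np.random.default_rng(42)
truth = data_rng.uniform(100_000, 500_000, 200)
preds = truth * data_rng.normal(1.0, 0.1, 200)

cod, low, high = bootstrap_cod(preds, truth)
print(f"COD: {cod:.1f} [{low:.1f}, {high:.1f}]")
```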

Summary Output

df = rs.summary()
print(df)

Vacant vs. Improved Properties

Ratio studies should separate vacant land from improved properties:
from openavmkit.data import get_vacant_sales

df_vacant = get_vacant_sales(df_sales, settings)
df_improved = get_vacant_sales(df_sales, settings, invert=True)

# Separate ratio studies
rs_vacant = RatioStudy(
    df_vacant["prediction"],
    df_vacant["sale_price"],
    max_trim=0.25
)

rs_improved = RatioStudy(
    df_improved["prediction"],
    df_improved["sale_price"],
    max_trim=0.25
)
Vacant land typically has higher COD values (20-25) due to greater heterogeneity.

Best Practices

1. Use Recent Sales: limit analysis to sales within 1-2 years of the assessment date.
2. Calculate Both Trimmed and Untrimmed: trimmed statistics show typical performance; untrimmed statistics show overall coverage.
3. Report Confidence Intervals: bootstrap methods provide robust uncertainty estimates.
4. Analyze by Segments: calculate separate statistics for different property types and value ranges.
5. Monitor PRD and PRB: vertical equity is as important as overall accuracy.

Next Steps

  • Equity Studies: learn about horizontal and vertical equity analysis
  • Quality Metrics: explore additional quality evaluation approaches
