Classes
HorizontalEquityStudy
Perform horizontal equity analysis and summarize the results.
Attributes:
- summary (HorizontalEquitySummary): Overall summary statistics
- cluster_summaries (dict[str, HorizontalEquityClusterSummary]): Dictionary mapping cluster IDs to their summaries
__init__
Input DataFrame containing data for horizontal equity analysis
Column name indicating cluster membership
Column name of the values to analyze
HorizontalEquitySummary
Summary statistics for horizontal equity analysis.
Attributes:
- rows (int): Total number of rows in the input DataFrame
- clusters (int): Total number of clusters identified
- min_chd (float): Minimum CHD (Coefficient of Horizontal Dispersion) value of any cluster
- max_chd (float): Maximum CHD value of any cluster
- median_chd (float): Median CHD value of all clusters
- p05_chd (float): 5th percentile CHD value
- p25_chd (float): 25th percentile CHD value
- p75_chd (float): 75th percentile CHD value
- p95_chd (float): 95th percentile CHD value
__init__
Total number of rows in the DataFrame
Total number of clusters
Minimum CHD value
Maximum CHD value
Median CHD value
5th percentile CHD value
25th percentile CHD value
75th percentile CHD value
95th percentile CHD value
print()
Generate a formatted DataFrame summary of the horizontal equity statistics.
Transposed DataFrame with all CHD statistics and cluster information
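As a rough illustration of how the fields above relate to one another, the sketch below derives them from a list of per-cluster CHD values using only the standard library. The function name `summarize_chds` and the choice of linear-interpolation percentiles are assumptions made for this example; the library's own percentile method is not stated here.

```python
from statistics import median, quantiles


def summarize_chds(chds, rows):
    """Summarize per-cluster CHD values (hypothetical helper, not the library API).

    `chds` holds one CHD value per cluster; `rows` is the input row count.
    Needs at least two CHD values, because quantiles() cannot interpolate fewer.
    """
    # quantiles(..., n=100) returns the 99 cut points p01..p99, so the
    # k-th percentile sits at index k - 1. method="inclusive" interpolates
    # linearly between the observed minimum and maximum.
    pct = quantiles(chds, n=100, method="inclusive")
    return {
        "rows": rows,
        "clusters": len(chds),
        "min_chd": min(chds),
        "max_chd": max(chds),
        "median_chd": median(chds),
        "p05_chd": pct[4],
        "p25_chd": pct[24],
        "p75_chd": pct[74],
        "p95_chd": pct[94],
    }


# Five clusters with made-up CHD values, drawn from 120 input rows.
summary = summarize_chds([2.0, 4.0, 6.0, 8.0, 10.0], rows=120)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Reading the percentiles together with the minimum and maximum shows whether a few outlier clusters or a broad spread is driving the overall picture.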
HorizontalEquityClusterSummary
Summary for an individual horizontal equity cluster.
Attributes:
- id (str): Identifier of the cluster
- count (int): Number of records in the cluster
- chd (float): CHD value for the cluster
- min (float): Minimum value in the cluster
- max (float): Maximum value in the cluster
- median (float): Median value in the cluster
__init__
Cluster identifier
Number of records in the cluster
CHD value for the cluster
Minimum value in the cluster
Maximum value in the cluster
Median value in the cluster
Functions
mark_horizontal_equity_clusters_per_model_group_sup
Mark horizontal equity clusters on the ‘universe’ DataFrame of a SalesUniversePair. Updates the ‘universe’ DataFrame with horizontal equity clusters by calling mark_horizontal_equity_clusters, then sets the updated DataFrame in sup.
SalesUniversePair containing sales and universe data
Settings dictionary
If True, prints progress information
If True, uses cached DataFrame if available
If True, marks land horizontal equity clusters
If True, marks improvement horizontal equity clusters
Updated SalesUniversePair with marked horizontal equity clusters
mark_horizontal_equity_clusters
Compute and mark horizontal equity clusters in the DataFrame. Uses clustering (via make_clusters) based on a location field and the categorical and numeric fields specified in settings to generate a horizontal equity cluster ID, which is stored in the column named by id_name.
Input DataFrame
Settings dictionary
If True, prints progress information
The settings object to use for horizontal equity analysis
Name of the column to store the horizontal equity cluster ID
Output folder path (stores information about the clusters for later use)
TimingData object to record performance metrics
DataFrame with a new cluster ID column (id_name)
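To make the idea of a cluster ID concrete, here is a minimal sketch that builds one label per record from a location field, categorical fields, and binned numeric fields. It is not the library's make_clusters implementation: the helper names, the field names, and the fixed bin edges are all hypothetical, chosen only to show how records with similar characteristics end up sharing an ID.

```python
def bin_label(value, edges):
    """Return the index of the first bin edge that `value` falls below."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)  # value is at or above the last edge


def cluster_id(record, location_field, categorical_fields, numeric_bins):
    """Join location, categorical values, and numeric bin labels into one ID.

    Hypothetical helper for illustration only.
    """
    parts = [str(record[location_field])]
    parts += [str(record[field]) for field in categorical_fields]
    parts += [
        f"{field}{bin_label(record[field], edges)}"
        for field, edges in numeric_bins.items()
    ]
    return "_".join(parts)


# Made-up records; the field names are assumptions for this example.
records = [
    {"neighborhood": "N1", "quality": "B", "bldg_area": 1450},
    {"neighborhood": "N1", "quality": "B", "bldg_area": 1800},
    {"neighborhood": "N1", "quality": "B", "bldg_area": 2500},
]
bins = {"bldg_area": [1000, 2000]}  # three bins: <1000, 1000-1999, >=2000

ids = [cluster_id(r, "neighborhood", ["quality"], bins) for r in records]
print(ids)
```

The first two records share an ID because they match on neighborhood and quality and fall in the same building-area bin; the third lands in a different bin and so in a different cluster.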
Metrics
Coefficient of Horizontal Dispersion (CHD)
The CHD is analogous to the Coefficient of Dispersion (COD), but it measures dispersion within clusters of similar properties rather than across all properties.
Formula: by analogy with the COD, CHD = 100 * (mean absolute deviation from the cluster median) / (cluster median)
- Lower CHD indicates more uniform assessments for similar properties
- CHD is calculated for each cluster separately
- The overall horizontal equity is assessed by examining the distribution of CHD values across all clusters
- High CHD in specific clusters indicates properties that should be similar are being assessed inconsistently
Clusters group properties that share:
- Location (e.g., neighborhood, market area)
- Categorical characteristics (e.g., property type, quality grade)
- Numeric characteristics (e.g., building area, land area)
- Vacant vs. improved status
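The per-cluster calculation can be sketched as follows. This assumes the CHD uses the standard COD formula (100 times the mean absolute deviation from the median, divided by the median) applied to each cluster's values on their own; the function names `chd` and `chd_by_cluster` are hypothetical and the data is made up.

```python
from collections import defaultdict
from statistics import mean, median


def chd(values):
    """COD-style dispersion of one cluster's values, in percent (assumed formula)."""
    mid = median(values)
    if mid == 0:
        return float("nan")  # dispersion relative to a zero median is undefined
    return 100.0 * mean([abs(v - mid) for v in values]) / mid


def chd_by_cluster(rows):
    """Group (cluster_id, value) pairs and compute the CHD of each cluster."""
    groups = defaultdict(list)
    for cluster_id, value in rows:
        groups[cluster_id].append(value)
    return {cid: chd(vals) for cid, vals in groups.items()}


# Cluster A is tightly grouped around 100; cluster B is widely spread.
rows = [
    ("A", 90.0), ("A", 100.0), ("A", 110.0),
    ("B", 50.0), ("B", 100.0), ("B", 150.0),
]
result = chd_by_cluster(rows)
for cid, value in result.items():
    print(f"cluster {cid}: CHD = {value:.2f}")
```

Both clusters have the same median, but cluster B's values sit much farther from it, so its CHD is five times larger. That is the kind of cluster the interpretation notes above flag as inconsistently assessed.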