Overview

The quality_control module provides functions to validate and correct land values and other assessment data. It performs sanity checks and applies corrections to ensure data quality.

check_land_values()

from openavmkit.quality_control import check_land_values

df_corrected = check_land_values(df, model_group)
Perform comprehensive sanity checks on land values and apply corrections where necessary.

Parameters

df_in
pd.DataFrame
required
DataFrame containing assessment data with land and market values
model_group
str
required
The model group being validated (e.g., “residential”, “commercial”)

Returns

df
pd.DataFrame
A copy of the input DataFrame with corrected land values

Quality Checks Performed

The function performs the following validation checks:

1. Negative Values

  • Market value: Cannot be negative
  • Land value: Cannot be negative
  • Improvement value: Cannot be negative

2. Land vs Market Value

  • Land > Market: Land value cannot exceed total market value
  • Separate tracking for vacant vs improved properties

3. Land Allocation

  • Improved properties: Land allocation should be less than 1.0 (building has value)
  • Vacant properties: Land allocation should equal 1.0 (no building value)

4. Consistency Checks

  • Market value must equal land value + improvement value
  • Land allocation must equal land value / market value
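The four check categories above can be sketched with pandas boolean masks. This is an illustrative sketch, not the module's implementation: the column names (`market_value`, `land_value`, `impr_value`, `land_alloc`, `bldg_sqft`) and the use of building square footage as a vacancy proxy are assumptions.

```python
import pandas as pd


def sketch_sanity_checks(df: pd.DataFrame) -> dict:
    """Count violations of each check category (illustrative sketch;
    column names are assumed, not the module's actual schema)."""
    # Assumed vacancy proxy: no finished building area means vacant
    is_vacant = df["bldg_sqft"].fillna(0) == 0
    return {
        # 1. Negative values
        "negative_market": (df["market_value"] < 0).sum(),
        "negative_land": (df["land_value"] < 0).sum(),
        "negative_impr": (df["impr_value"] < 0).sum(),
        # 2. Land vs market value
        "land_gt_market": (df["land_value"] > df["market_value"]).sum(),
        # 3. Land allocation vs building presence
        "bldg_yes_land_alloc_ge_1": ((~is_vacant) & (df["land_alloc"] >= 1.0)).sum(),
        "bldg_no_land_alloc_ne_1": (is_vacant & (df["land_alloc"] != 1.0)).sum(),
    }
```

Each mask counts rows independently, so one record can trip several checks at once, which matches how the validation report tallies violations per check rather than per record.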

Corrections Applied

When validation failures are detected, the function applies the following corrections:
  • Negative values: Set to zero or minimum threshold
  • Land > Market: Cap land value at market value
  • Invalid allocations: Recalculate based on building presence
  • Inconsistencies: Recompute derived fields
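A minimal sketch of that correction sequence, using the same illustrative column names; the order shown (clip negatives, cap land at market, then recompute derived fields) is one reasonable ordering, and the real function's thresholds and vacancy handling may differ:

```python
import numpy as np
import pandas as pd


def sketch_corrections(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the corrections listed above, in order (illustrative sketch)."""
    out = df.copy()
    # Negative values: set to zero
    for col in ["market_value", "land_value", "impr_value"]:
        out[col] = out[col].clip(lower=0)
    # Land > market: cap land value at market value
    out["land_value"] = out[["land_value", "market_value"]].min(axis=1)
    # Inconsistencies: recompute derived fields so the identities hold
    out["impr_value"] = out["market_value"] - out["land_value"]
    out["land_alloc"] = np.where(
        out["market_value"] > 0,
        out["land_value"] / out["market_value"],
        1.0,
    )
    return out
```

Recomputing the derived fields last guarantees the consistency checks (market = land + improvement, allocation = land / market) pass after the value-level corrections.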

Example Usage

from openavmkit.quality_control import check_land_values
import pandas as pd

# Load assessment data
df = pd.read_parquet("data/assessments.parquet")

print(f"Records before validation: {len(df)}")
print(f"Invalid records: {(df['land_value'] > df['market_value']).sum()}")

# Perform quality checks
df_clean = check_land_values(df, model_group="residential")

print(f"Records after validation: {len(df_clean)}")
print(f"Corrected records: {(df['land_value'] != df_clean['land_value']).sum()}")

# Review corrections
corrections = df[df['land_value'] != df_clean['land_value']]
print("\nExample corrections:")
print(corrections[['parcel_id', 'market_value', 'land_value', 'building_sqft']].head())

Validation Report

The function tracks the number of violations for each check:
counts = {
    "market_lt_land": 0,           # Market < land (general)
    "negative_market": 0,          # Negative market value
    "negative_land": 0,            # Negative land value
    "negative_impr": 0,            # Negative improvement value
    "land_gt_market": 0,           # Land > market (general)
    "land_gt_market_vacant": 0,    # Land > market (vacant)
    "land_gt_market_improved": 0,  # Land > market (improved)
    "bldg_yes_land_alloc_ge_1": 0, # Building exists but land_alloc >= 1
    "bldg_no_land_alloc_ne_1": 0,  # No building but land_alloc != 1
}
These counts are logged with warning messages indicating the severity of each issue.
Quality control checks overwrite values in the returned copy rather than flagging them. Always inspect the corrections and understand their impact before using corrected values for official assessments.
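A sketch of how such per-check counts might be surfaced as warnings; `report_violations` is a hypothetical helper for illustration, not part of the module:

```python
import logging

logger = logging.getLogger("quality_control")


def report_violations(counts: dict) -> list:
    """Emit one warning per failing check and return the messages
    (hypothetical helper; not part of openavmkit)."""
    messages = [
        f"QC check '{check}' failed for {n} records"
        for check, n in counts.items()
        if n > 0
    ]
    for msg in messages:
        logger.warning(msg)
    return messages
```

Returning the messages alongside logging them makes the report easy to attach to an audit record or test assertion.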

Best Practices

Always review a sample of corrected records to ensure the automated fixes are appropriate for your jurisdiction’s assessment practices.
Monitor the percentage of records requiring correction over time. High correction rates may indicate upstream data quality issues.
Keep the original uncorrected data for audit purposes and to track data quality trends.
Some legitimate cases may trigger false positives (e.g., contaminated sites with negative improvement value). Document these exceptions.
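The monitoring practice above can be sketched as a small audit helper; `correction_rate` is a hypothetical function, and `df` / `df_clean` are assumed to be the before/after frames from the earlier example:

```python
import pandas as pd


def correction_rate(df_before: pd.DataFrame, df_after: pd.DataFrame,
                    col: str = "land_value") -> float:
    """Fraction of records whose value changed during quality control
    (hypothetical audit helper; column name is an assumption)."""
    if len(df_before) == 0:
        return 0.0
    changed = (df_before[col] != df_after[col]).sum()
    return changed / len(df_before)
```

Archiving the uncorrected frame (e.g. `df.to_parquet(...)`) before running quality control lets you recompute this rate later and track data quality trends across assessment cycles.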
