
Overview

The settings.json file is the central configuration file for OpenAVM Kit. It controls all aspects of your locality’s modeling and analysis workflow, from data processing to model training and report generation.

File Location

The settings file should be placed in your locality’s in/ directory:
notebooks/pipeline/data/<locality_slug>/
  ├── in/
  │   └── settings.json
  └── out/
For more information on locality setup, see the Getting Started guide.
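As a rough illustration of this layout, the expected path to a locality's settings file could be assembled as follows. The helper name `settings_path` and its `base_dir` default are hypothetical and are not part of OpenAVM Kit.

```python
from pathlib import Path


def settings_path(locality_slug, base_dir="notebooks/pipeline/data"):
    """Build the expected settings.json path for a locality (hypothetical helper)."""
    return Path(base_dir) / locality_slug / "in" / "settings.json"


path = settings_path("us-tx-travis")
print(path.as_posix())  # notebooks/pipeline/data/us-tx-travis/in/settings.json
```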

Basic Structure

A minimal settings.json file can be as simple as:
{}
This empty configuration will load default values from the built-in template. However, you’ll typically want to customize various aspects of the processing pipeline.

Main Configuration Sections

The settings file is organized into major sections:

Locality

Basic information about your locality, units of measurement, and geographic center

Data

Data loading, processing, merging, filling, and enrichment settings

Modeling

Model configuration, feature selection, and training instructions

Analysis

Ratio studies, equity analysis, and report generation settings

Locality Configuration

Define basic locality information:
{
  "locality": {
    "slug": "us-tx-travis",
    "name": "Travis County",
    "units": "imperial",
    "center": {
      "latitude": 30.2672,
      "longitude": -97.7431
    }
  }
}
locality.slug (string, required): Unique identifier for your locality, in the format <country>-<state>-<locality>
locality.name (string): Human-readable name for your locality
locality.units (string, default: "imperial"): Unit system, either imperial (feet, square feet, miles) or metric (meters, square meters, kilometers)
locality.center (object): Geographic center point of your locality
  latitude (number, required): Latitude coordinate
  longitude (number, required): Longitude coordinate
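The constraints above can be checked before running the pipeline. The following is a minimal sketch, not OpenAVM Kit's own validation: `check_locality` is a hypothetical helper, and the rule that a slug has exactly three non-empty hyphen-separated parts is an assumption drawn from the documented format.

```python
def check_locality(locality):
    """Return a list of problems found in a `locality` block (illustrative only)."""
    problems = []

    # Assumption: a slug has three non-empty parts separated by hyphens.
    parts = locality.get("slug", "").split("-", 2)
    if len(parts) != 3 or not all(parts):
        problems.append("slug should look like <country>-<state>-<locality>")

    # The documented unit systems are "imperial" (the default) and "metric".
    if locality.get("units", "imperial") not in ("imperial", "metric"):
        problems.append("units must be 'imperial' or 'metric'")

    # When a center is given, both coordinates are required and must be in range.
    center = locality.get("center")
    if center is not None:
        lat, lon = center.get("latitude"), center.get("longitude")
        if lat is None or not -90 <= lat <= 90:
            problems.append("center.latitude must be between -90 and 90")
        if lon is None or not -180 <= lon <= 180:
            problems.append("center.longitude must be between -180 and 180")
    return problems


example = {
    "slug": "us-tx-travis",
    "name": "Travis County",
    "units": "imperial",
    "center": {"latitude": 30.2672, "longitude": -97.7431},
}
print(check_locality(example))  # []
```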

Data Configuration

Control how data is loaded and processed:
{
  "data": {
    "load": {},
    "process": {
      "merge": {},
      "fill": {
        "zero": [
          "bldg_area_finished_sqft"
        ]
      },
      "reconcile": {},
      "enrich": {}
    }
  }
}
data.process.fill.zero (array): List of fields to fill with zero values when missing
data.process.enrich (object): Enrichment configuration for Census, OpenStreetMap, and other data sources. See Census API and OpenStreetMap for details.
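To make the effect of `data.process.fill.zero` concrete, here is a small sketch of a zero fill over plain Python rows. It only illustrates the idea: `fill_zero` is a hypothetical function, the sample rows are invented, and OpenAVM Kit's actual implementation may differ.

```python
def fill_zero(rows, fields):
    """Return copies of `rows` with missing values in `fields` replaced by 0."""
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for field in fields:
            if new_row.get(field) is None:
                new_row[field] = 0
        filled.append(new_row)
    return filled


settings = {"data": {"process": {"fill": {"zero": ["bldg_area_finished_sqft"]}}}}
rows = [
    {"key": "A", "bldg_area_finished_sqft": 1800},
    {"key": "B", "bldg_area_finished_sqft": None},  # e.g. a parcel with no building
]
filled = fill_zero(rows, settings["data"]["process"]["fill"]["zero"])
```

With a pandas DataFrame, the same idea is roughly `df[fields] = df[fields].fillna(0)`.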

Modeling Configuration

Configure model training and feature selection:
{
  "modeling": {
    "metadata": {
      "modeler": "Your Name",
      "valuation_date": "2024-01-01"
    },
    "instructions": {
      "feature_selection": {
        "thresholds": {
          "correlation": 0.1,
          "vif": 10,
          "p_value": 0.05,
          "t_value": 2
        },
        "weights": {
          "vif": 3,
          "p_value": 3,
          "t_value": 2
        }
      }
    },
    "models": {
      "default": {
        "dep_vars": [],
        "interactions": {
          "default": true
        }
      }
    }
  }
}
modeling.metadata.modeler (string): Name of the person or organization creating the models
modeling.metadata.valuation_date (string): Date for property valuation, in the format YYYY-MM-DD
modeling.instructions.feature_selection.thresholds (object): Thresholds for automatic feature selection
  correlation (number, default: 0.1): Minimum correlation with the target variable
  vif (number, default: 10): Maximum variance inflation factor (multicollinearity threshold)
  p_value (number, default: 0.05): Maximum p-value for statistical significance
  t_value (number, default: 2): Minimum t-statistic value
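One way to read these thresholds is as a set of pass/fail checks on each candidate feature. The sketch below is an interpretation, not the library's selection logic: `passes_thresholds` is hypothetical, the feature statistics are invented, and comparing correlation and t-value by absolute value is an assumption. The `weights` block from the example above is not covered here.

```python
thresholds = {"correlation": 0.1, "vif": 10, "p_value": 0.05, "t_value": 2}


def passes_thresholds(stats, thresholds):
    """Check one feature's statistics against each threshold (illustrative)."""
    return {
        # Assumption: sign is ignored for correlation and the t-statistic.
        "correlation": abs(stats["correlation"]) >= thresholds["correlation"],
        "vif": stats["vif"] <= thresholds["vif"],
        "p_value": stats["p_value"] <= thresholds["p_value"],
        "t_value": abs(stats["t_value"]) >= thresholds["t_value"],
    }


# Invented statistics for two candidate features
strong = {"correlation": -0.45, "vif": 2.3, "p_value": 0.001, "t_value": -6.1}
weak = {"correlation": 0.05, "vif": 14.0, "p_value": 0.30, "t_value": 1.0}

strong_result = passes_thresholds(strong, thresholds)
weak_result = passes_thresholds(weak, thresholds)
```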

Analysis Configuration

Configure analysis and reporting:
{
  "analysis": {
    "ratio_study": {
      "look_back_years": 1,
      "breakdowns": [
        {"by": "sale_price", "quantiles": 10}
      ]
    },
    "report": {
      "formats": ["pdf", "html", "md"]
    }
  }
}
analysis.ratio_study.look_back_years (number, default: 1): Number of years to look back for sales comparison
analysis.report.formats (array, default: ["pdf", "html", "md"]): Output formats for generated reports. See PDF Reports for PDF setup.
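As an illustration of what a look-back window might mean, the sketch below keeps sales that fall within `look_back_years` before the valuation date. The exact window OpenAVM Kit uses is not stated here, so the inclusive bounds and the choice to end the window at the valuation date are assumptions, and `within_look_back` is a hypothetical helper.

```python
from datetime import date


def within_look_back(sale_date, valuation_date, look_back_years):
    """True if the sale falls in the window ending at the valuation date."""
    start_year = valuation_date.year - look_back_years
    try:
        window_start = valuation_date.replace(year=start_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        window_start = valuation_date.replace(year=start_year, day=28)
    return window_start <= sale_date <= valuation_date


valuation_date = date(2024, 1, 1)
sales = [date(2022, 12, 31), date(2023, 6, 15), date(2024, 1, 1)]
recent = [d for d in sales if within_look_back(d, valuation_date, 1)]
```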

Field Classification

Classify your data fields into categories for proper modeling:
{
  "field_classification": {
    "land": {
      "numeric": [
        "land_area_sqft",
        "dist_to_cbd"
      ],
      "categorical": [
        "zoning",
        "flood_risk"
      ],
      "boolean": [
        "is_corner_lot"
      ]
    },
    "impr": {
      "numeric": [
        "bldg_area_finished_sqft",
        "bldg_age_years"
      ],
      "categorical": [
        "bldg_type",
        "bldg_quality"
      ],
      "boolean": [
        "has_garage"
      ]
    },
    "other": {
      "numeric": [],
      "categorical": [],
      "boolean": []
    }
  }
}
field_classification.land (object): Fields related to land characteristics
field_classification.impr (object): Fields related to building/improvement characteristics
field_classification.other (object): Fields not classified as land or improvements
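A classification like this can be flattened into a lookup from field name to its group and type, which is handy for spotting mistakes. This is a sketch under one assumption that the documentation does not state: that each field should be listed only once. `index_fields` is a hypothetical helper.

```python
def index_fields(field_classification):
    """Map each field to its (group, type) pair; raise if a field is listed twice."""
    index = {}
    for group, types in field_classification.items():
        for type_name, fields in types.items():
            for field in fields:
                # Assumption: a field belongs to exactly one group and type.
                if field in index:
                    raise ValueError(f"{field!r} is classified more than once")
                index[field] = (group, type_name)
    return index


field_classification = {
    "land": {"numeric": ["land_area_sqft"], "categorical": ["zoning"], "boolean": []},
    "impr": {"numeric": ["bldg_age_years"], "categorical": [], "boolean": ["has_garage"]},
}
index = index_fields(field_classification)
```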

Variable Replacement

The settings file supports variable replacement using the $$ prefix:
{
  "custom": {
    "my_field": "land_area_sqft"
  },
  "data": {
    "process": {
      "fill": {
        "zero": ["$$custom.my_field"]
      }
    }
  }
}
The $$custom.my_field reference will be replaced with land_area_sqft during settings loading.
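The replacement can be pictured as a recursive walk over the loaded settings. The sketch below is a simplified model, not the library's loader: `lookup` and `resolve` are hypothetical names, and it assumes a reference occupies the whole string, uses a dotted path, and does not point at another reference.

```python
def lookup(root, dotted_path):
    """Follow a dotted path such as 'custom.my_field' through nested dicts."""
    node = root
    for key in dotted_path.split("."):
        node = node[key]
    return node


def resolve(node, root):
    """Return a copy of `node` with every '$$path' string replaced by its target."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(value, root) for value in node]
    if isinstance(node, str) and node.startswith("$$"):
        return lookup(root, node[2:])
    return node


raw = {
    "custom": {"my_field": "land_area_sqft"},
    "data": {"process": {"fill": {"zero": ["$$custom.my_field"]}}},
}
resolved = resolve(raw, raw)
```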

Settings Merging

Your local settings.json is merged with the built-in template. Use these prefixes to control merge behavior:
+key (prefix): Add to the template array rather than replacing it
!key (prefix): Force-overwrite the template value (stomps the template)
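The two prefixes can be modeled with a small recursive merge. This is a sketch of the described behavior only: `merge` is a hypothetical function, the template contents are invented for the example, and the default rule used here (nested objects merge key by key, everything else is replaced) is an assumption.

```python
import copy


def merge(template, local):
    """Merge `local` into a copy of `template`, honoring the + and ! key prefixes."""
    result = copy.deepcopy(template)
    for raw_key, value in local.items():
        if raw_key.startswith("!"):
            # Force overwrite: discard whatever the template had under this key.
            result[raw_key[1:]] = copy.deepcopy(value)
        elif raw_key.startswith("+"):
            # Append to the template's array instead of replacing it.
            key = raw_key[1:]
            result[key] = list(result.get(key, [])) + list(value)
        elif isinstance(value, dict) and isinstance(result.get(raw_key), dict):
            # Assumed default: nested objects are merged key by key.
            result[raw_key] = merge(result[raw_key], value)
        else:
            result[raw_key] = copy.deepcopy(value)
    return result


# Invented template, for illustration only
template = {
    "data": {"process": {"fill": {"zero": ["bldg_area_finished_sqft"]}}},
    "analysis": {"ratio_study": {"look_back_years": 1}, "report": {"formats": ["pdf", "html", "md"]}},
}
local = {
    "data": {"process": {"fill": {"+zero": ["land_area_sqft"]}}},
    "!analysis": {"report": {"formats": ["md"]}},
}
merged = merge(template, local)
```

In this example `+zero` extends the template's list, while `!analysis` replaces the whole `analysis` block, so the template's `ratio_study` entry is dropped.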

Loading Settings in Code

Load settings programmatically:
from openavmkit.utilities.settings import load_settings

# Load from default location (in/settings.json)
settings = load_settings()

# Load from custom path
settings = load_settings("path/to/settings.json")

# Load from dictionary
settings = load_settings(settings_object={"locality": {"slug": "us-tx-travis"}})

Helper Functions

OpenAVM Kit provides utility functions for accessing settings:
from openavmkit.utilities.settings import (
    get_valuation_date,
    get_center,
    area_unit,
    length_unit,
    get_fields_land,
    get_fields_impr
)

# Get the configured valuation date
val_date = get_valuation_date(settings)

# Get the locality center coordinates
longitude, latitude = get_center(settings)

# Get the area unit (sqft or sqm)
unit = area_unit(settings)  # Returns "sqft" or "sqm"

# Get land fields classified by type
# (df is assumed to be the DataFrame of locality data you are working with)
land_fields = get_fields_land(settings, df)
# Returns: {"categorical": [...], "numeric": [...], "boolean": [...]}

Best Practices

Never store credentials in settings.json
The settings file may be uploaded to cloud storage. Always store API keys, passwords, and tokens in the .env file instead.

Start with defaults
Begin with an empty {} configuration and add customizations as needed. The built-in template provides sensible defaults.

Use comments for documentation
Prefix keys with __ to add comments that will be ignored:
{
  "__comment": "This is a comment that will be removed",
  "locality": {
    "slug": "us-tx-travis"
  }
}
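Removing such comment keys amounts to dropping every key that starts with `__` at any depth. The following sketch shows the idea; `strip_comments` is a hypothetical name, and whether OpenAVM Kit strips nested comment keys in exactly this way is an assumption.

```python
def strip_comments(node):
    """Return a copy of `node` without any dict keys that start with '__'."""
    if isinstance(node, dict):
        return {
            key: strip_comments(value)
            for key, value in node.items()
            if not key.startswith("__")
        }
    if isinstance(node, list):
        return [strip_comments(value) for value in node]
    return node


raw = {
    "__comment": "This is a comment that will be removed",
    "locality": {"slug": "us-tx-travis", "__note": "nested comment"},
}
cleaned = strip_comments(raw)
print(cleaned)  # {'locality': {'slug': 'us-tx-travis'}}
```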

Next Steps

Cloud Storage

Configure Azure, HuggingFace, or SFTP storage

Census API

Set up Census data enrichment

OpenStreetMap

Enable geographic feature enrichment

PDF Reports

Install wkhtmltopdf for PDF generation
