A “locality” in OpenAVM Kit represents a geographic area containing properties you want to analyze. This could be a city, county, neighborhood, or any other region. This guide walks you through creating and configuring a new locality.
Understanding Locality Structure
Each locality follows a specific naming convention and directory structure that OpenAVM Kit relies on.
Naming Convention
Locality names follow this format:
<country_code>-<state_or_province_code>-<locality_name>
All components should be lowercase with no spaces. Use underscores instead of dashes within the locality name itself.
Components:
Country code: 2-letter ISO 3166-1 alpha-2 code (e.g., us, no)
State/province code: ISO 3166-2 subdivision code, which may be letters or digits depending on the country (e.g., tx, ca, ny, 03)
Locality name: Human-readable identifier (no dashes, use underscores)
Examples:
us-nc-guilford # Guilford County, North Carolina, USA
us-tx-austin # City of Austin, Texas, USA
no-03-oslo # City of Oslo, Norway
no-50-orkdal # Orkdal kommune, Norway
us-ny-new_york_city # New York City (underscores, not dashes)
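The convention above can be checked mechanically before you create any folders. The following is a minimal sketch, not part of OpenAVM Kit: the `is_valid_locality` helper and its regular expression are assumptions derived from the rules and examples in this section (a 2-letter country code, a subdivision code of up to three lowercase letters or digits, and a lowercase name that uses underscores).

```python
import re

# Assumed pattern, derived from the examples above:
#   <2-letter country>-<1-3 char alphanumeric subdivision>-<name with underscores>
LOCALITY_PATTERN = re.compile(r"[a-z]{2}-[a-z0-9]{1,3}-[a-z0-9]+(?:_[a-z0-9]+)*")


def is_valid_locality(name: str) -> bool:
    """Return True if `name` follows the naming convention described above."""
    return LOCALITY_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("us-nc-guilford", "no-03-oslo", "us-ny-new-york-city"):
        print(candidate, is_valid_locality(candidate))
```

A name such as `us-ny-new-york-city` is rejected because dashes are reserved for separating the three components.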
Step-by-Step Setup
Create the locality directory
Navigate to the notebooks/pipeline/data/ directory and create your locality folder:
cd notebooks/pipeline/data/
mkdir us-tx-imaginarycounty
cd us-tx-imaginarycounty
Create required subdirectories
Every locality needs in/ and out/ directories. From inside the locality folder, create both:
mkdir in out
The final structure should look like:
notebooks/pipeline/data/
└── us-tx-imaginarycounty/
├── in/
│ └── settings.json
└── out/
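If you prefer to create this layout from Python, for example in a setup script, the directory steps can be sketched as follows. This is an illustration only: `scaffold_locality` is a hypothetical helper, not an OpenAVM Kit function, and it assumes the data root is the notebooks/pipeline/data/ directory mentioned above.

```python
import json
from pathlib import Path


def scaffold_locality(data_root: Path, locality: str) -> Path:
    """Create <data_root>/<locality>/in and /out, plus an empty settings.json."""
    locality_dir = data_root / locality
    (locality_dir / "in").mkdir(parents=True, exist_ok=True)
    (locality_dir / "out").mkdir(parents=True, exist_ok=True)

    settings_path = locality_dir / "in" / "settings.json"
    if not settings_path.exists():  # never overwrite settings that already exist
        settings_path.write_text(json.dumps({}) + "\n", encoding="utf-8")
    return locality_dir
```

Calling `scaffold_locality(Path("notebooks/pipeline/data"), "us-tx-imaginarycounty")` from the repository root would produce the tree shown above. Running it a second time is harmless because existing directories and settings are left untouched.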
Create a settings.json file in the in/ directory. A blank settings file containing just {} is sufficient to get started, because OpenAVM Kit merges your file with its default settings.
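To build intuition for why an empty file works, here is a sketch of one common way to layer user settings over defaults. It is an assumption for illustration: this guide does not describe OpenAVM Kit's actual merge logic, and `merge_settings` is a hypothetical function.

```python
from copy import deepcopy
from typing import Any


def merge_settings(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Recursively overlay `overrides` on `defaults` without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge key by key instead of replacing.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Scalars, lists, and new keys replace whatever the defaults held.
            merged[key] = deepcopy(value)
    return merged
```

Under this scheme, merging `{}` returns the defaults unchanged, and each key you later add overrides only the matching default.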
Place your raw data files in the in/ directory:
Required: geo_parcels, a geospatial file (GeoJSON, Shapefile, or GeoPackage) containing parcel geometries
Optional: Additional CSV, Parquet, or geospatial files with property characteristics and sales data
in/
├── settings.json
├── parcels.gpkg # Parcel geometries (required)
├── property_data.csv # Property characteristics
└── sales.csv # Sales transactions
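Before wiring files into your settings, it can help to confirm what is actually in the in/ directory. The sketch below groups files by extension. The `inventory_inputs` helper and the extension sets are assumptions based on the formats named above, not an OpenAVM Kit API.

```python
from pathlib import Path

# Extensions assumed from the formats listed above.
GEO_EXTENSIONS = {".geojson", ".shp", ".gpkg"}
TABULAR_EXTENSIONS = {".csv", ".parquet"}


def inventory_inputs(in_dir: Path) -> dict[str, list[str]]:
    """Group the files in `in_dir` into geospatial, tabular, and other."""
    found: dict[str, list[str]] = {"geospatial": [], "tabular": [], "other": []}
    for path in sorted(in_dir.iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in GEO_EXTENSIONS:
            found["geospatial"].append(path.name)
        elif suffix in TABULAR_EXTENSIONS:
            found["tabular"].append(path.name)
        else:
            found["other"].append(path.name)
    return found
```

An empty `geospatial` list is a quick signal that the required parcel geometry file is still missing.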
If you want to sync data with cloud storage, create a cloud.json file in your locality directory. Use the configuration that matches your storage provider.
For Azure:
{
  "type": "azure",
  "azure_storage_container_url": "https://landeconomics.blob.core.windows.net/localities-public"
}
For Hugging Face:
{
  "type": "huggingface",
  "hf_repo_id": "your-username/your-repo",
  "hf_revision": "main"
}
Never commit cloud.json or .env files to version control. They may contain sensitive credentials.
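One way to enforce this is to make sure both names appear in your repository's .gitignore. The following is a minimal sketch: `ensure_gitignored` is a hypothetical helper, and it assumes your project uses Git with a .gitignore at the repository root.

```python
from pathlib import Path

SENSITIVE_PATTERNS = ("cloud.json", ".env")


def ensure_gitignored(
    repo_root: Path, patterns: tuple[str, ...] = SENSITIVE_PATTERNS
) -> list[str]:
    """Append any missing patterns to .gitignore and return the ones added."""
    gitignore = repo_root / ".gitignore"
    text = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [pattern for pattern in patterns if pattern not in present]
    if missing:
        # Start on a fresh line if the existing file lacks a trailing newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with gitignore.open("a", encoding="utf-8") as handle:
            handle.write(prefix + "\n".join(missing) + "\n")
    return missing
```

A pattern without a slash matches at any depth, so a single `cloud.json` entry covers every locality folder. Files that were already committed still need to be removed from the index separately.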
Open a Jupyter notebook and initialize your locality:
from openavmkit.pipeline import init_notebook
locality = "us-tx-imaginarycounty"
init_notebook(locality)
This creates the necessary directory structure and prepares the environment.
Configuring Settings
As you develop your locality, you’ll expand the settings.json file to configure:
Data sources: Define which files to load and how to process them
Model groups: Specify how to segment properties (e.g., single-family, commercial)
Modeling parameters: Configure algorithms, variables, and validation settings
Valuation date: Set the target date for valuations
Example settings structure:
{
  "valuation_date": "2024-01-01",
  "data": {
    "load": {
      "geo_parcels": {
        "path": "parcels.gpkg",
        "type": "geopackage"
      },
      "sales": {
        "path": "sales.csv",
        "type": "csv"
      }
    }
  },
  "modeling": {
    "model_groups": {
      "single_family": {
        "filter": [
          { "field": "bldg_type", "operator": "==", "value": "Single Family" }
        ]
      }
    }
  }
}
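As the settings file grows, a quick sanity check can catch typos before you run the pipeline. The sketch below reads a settings file shaped like the example above and reports obvious problems. It is not OpenAVM Kit's own validation: `check_settings` is a hypothetical helper, and it assumes that each `path` under `data.load` is relative to the in/ directory.

```python
import json
from datetime import date
from pathlib import Path


def check_settings(in_dir: Path) -> list[str]:
    """Return a list of human-readable problems found in in_dir/settings.json."""
    problems: list[str] = []
    settings = json.loads((in_dir / "settings.json").read_text(encoding="utf-8"))

    valuation_date = settings.get("valuation_date")
    if valuation_date is not None:
        try:
            date.fromisoformat(valuation_date)  # expects YYYY-MM-DD
        except (TypeError, ValueError):
            problems.append(f"valuation_date is not an ISO date: {valuation_date!r}")

    # Assumption: load paths resolve relative to the in/ directory.
    for name, entry in settings.get("data", {}).get("load", {}).items():
        path = entry.get("path")
        if path is None:
            problems.append(f"data.load.{name}: missing 'path'")
        elif not (in_dir / path).exists():
            problems.append(f"data.load.{name}: file not found: {path}")
    return problems
```

An empty list means the date parses and every referenced file exists. An empty `{}` settings file also passes, since there is nothing to check.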
Switching Between Localities
To work with multiple localities, simply change the locality variable at the top of your notebook:
locality = "us-ca-san_francisco" # Switch to different locality
init_notebook(locality)
When switching localities, restart the notebook kernel and clear its outputs so that data from the previous locality does not carry over.
Next Steps
Data Assembly: Learn how to load and process your data files
Configuration Reference: Explore all available settings options