A “locality” in OpenAVM Kit represents a geographic area containing properties you want to analyze. This could be a city, county, neighborhood, or any other region. This guide walks you through creating and configuring a new locality.

Understanding Locality Structure

Each locality follows a specific naming convention and directory structure that OpenAVM Kit relies on.

Naming Convention

Locality names follow this format:
<country_code>-<state_or_province_code>-<locality_name>
All components should be lowercase with no spaces. Use underscores instead of dashes within the locality name itself.
Components:
  • Country code: 2-letter ISO 3166-1 alpha-2 code (e.g., us, no)
  • State/province code: the subdivision part of the ISO 3166-2 code, usually two letters (e.g., tx, ca, ny) but numeric in some countries (e.g., 03 for Oslo)
  • Locality name: Human-readable identifier (no dashes, use underscores)
Examples:
us-nc-guilford       # Guilford County, North Carolina, USA
us-tx-austin         # City of Austin, Texas, USA
no-03-oslo           # City of Oslo, Norway
no-50-orkdal         # Orkdal kommune, Norway
us-ny-new_york_city  # New York City (underscores, not dashes)
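
The convention above can be checked mechanically before you create any folders. The following is a minimal sketch, not part of OpenAVM Kit: the regular expression and the function names `is_valid_locality_name` and `split_locality_name` are assumptions derived only from the format and examples shown here (two-character country and subdivision codes, underscores inside the locality name).

```python
import re

# Pattern inferred from the naming convention described above. This is an
# assumption of the sketch, not a rule taken from the OpenAVM Kit source.
LOCALITY_PATTERN = re.compile(r"[a-z]{2}-[a-z0-9]{2}-[a-z0-9]+(?:_[a-z0-9]+)*")


def is_valid_locality_name(name: str) -> bool:
    """Return True if name looks like <country>-<subdivision>-<locality_name>."""
    return LOCALITY_PATTERN.fullmatch(name) is not None


def split_locality_name(name: str) -> tuple:
    """Split a valid locality name into (country, subdivision, locality)."""
    if not is_valid_locality_name(name):
        raise ValueError(f"not a valid locality name: {name!r}")
    # Only the first two dashes are separators; the name itself has none.
    country, subdivision, locality = name.split("-", 2)
    return country, subdivision, locality
```

For example, `is_valid_locality_name("us-ny-new-york-city")` is rejected because the locality name contains dashes, while `us-ny-new_york_city` passes.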

Step-by-Step Setup

1. Create the locality directory
Navigate to the notebooks/pipeline/data/ directory and create your locality folder:
cd notebooks/pipeline/data/
mkdir us-tx-imaginarycounty
cd us-tx-imaginarycounty
2. Create required subdirectories
Every locality needs in/ and out/ directories:
mkdir in
mkdir out
Once you add the settings file in the next step, the structure should look like:
notebooks/pipeline/data/
└── us-tx-imaginarycounty/
    ├── in/
    │   └── settings.json
    └── out/
3. Create the settings file
Create a settings.json file in the in/ directory. Start with a minimal configuration:
{}
A blank settings file with just {} is sufficient to get started. OpenAVM Kit will merge this with default settings.
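
If you prefer to script the first three steps instead of running the shell commands by hand, the same layout can be created from Python. This is a minimal sketch using only the standard library; the helper name `scaffold_locality` is hypothetical and is not an OpenAVM Kit function.

```python
import json
from pathlib import Path


def scaffold_locality(data_root: Path, locality: str) -> Path:
    """Create <data_root>/<locality>/in and /out plus an empty settings.json."""
    locality_dir = data_root / locality
    (locality_dir / "in").mkdir(parents=True, exist_ok=True)
    (locality_dir / "out").mkdir(parents=True, exist_ok=True)

    settings_path = locality_dir / "in" / "settings.json"
    if not settings_path.exists():
        # Write the minimal {} configuration; existing settings are left alone.
        settings_path.write_text(json.dumps({}) + "\n", encoding="utf-8")
    return locality_dir
```

A call such as `scaffold_locality(Path("notebooks/pipeline/data"), "us-tx-imaginarycounty")` mirrors the commands above, and running it twice does not overwrite a settings file you have already edited.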
4. Add your data files
Place your raw data files in the in/ directory:
  • Required: geo_parcels - A geospatial file (GeoJSON, Shapefile, GeoPackage) containing parcel geometries
  • Optional: Additional CSV, Parquet, or geospatial files with property characteristics and sales data
Example directory:
    in/
    ├── settings.json
    ├── parcels.gpkg           # Parcel geometries (required)
    ├── property_data.csv      # Property characteristics
    └── sales.csv              # Sales transactions
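
A quick way to confirm that the in/ directory holds at least one geospatial file is to group its contents by extension. The sketch below is illustrative only: the extension sets are assumptions based on the formats named above, the function name `summarize_inputs` is hypothetical, and it does not tell you which file OpenAVM Kit will treat as geo_parcels (that is determined by your settings).

```python
from pathlib import Path

# Extensions for the formats mentioned above (assumed mapping).
GEO_EXTENSIONS = {".geojson", ".shp", ".gpkg"}
TABULAR_EXTENSIONS = {".csv", ".parquet"}


def summarize_inputs(in_dir: Path) -> dict:
    """Group the data files in in_dir into geospatial, tabular, and other."""
    summary = {"geospatial": [], "tabular": [], "other": []}
    for path in sorted(in_dir.iterdir()):
        # Skip subdirectories and the settings file itself.
        if not path.is_file() or path.name == "settings.json":
            continue
        suffix = path.suffix.lower()
        if suffix in GEO_EXTENSIONS:
            summary["geospatial"].append(path.name)
        elif suffix in TABULAR_EXTENSIONS:
            summary["tabular"].append(path.name)
        else:
            summary["other"].append(path.name)
    return summary
```

An empty `geospatial` list is a sign that the required parcel geometry file has not been added yet.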
    
5. Configure cloud sync (optional)
If you want to sync data with cloud storage, create a cloud.json file in your locality directory:
    Azure (Public)
    {
      "type": "azure",
      "azure_storage_container_url": "https://landeconomics.blob.core.windows.net/localities-public"
    }
    
    HuggingFace
    {
      "type": "huggingface",
      "hf_repo_id": "your-username/your-repo",
      "hf_revision": "main"
    }
    
    SFTP
    {
      "type": "sftp"
    }
    
Never commit cloud.json or .env files to version control. They may contain sensitive credentials.
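
Before relying on cloud sync, it can help to sanity-check cloud.json against the examples above. The sketch below is not OpenAVM Kit's own validation: the function name `check_cloud_config` is hypothetical, and treating the keys from the examples as required is an assumption.

```python
import json
from pathlib import Path

# Keys copied from the example configurations above. Treating them as
# required is an assumption of this sketch, not documented behavior.
EXPECTED_KEYS = {
    "azure": {"azure_storage_container_url"},
    "huggingface": {"hf_repo_id"},
    "sftp": set(),
}


def check_cloud_config(path: Path) -> list:
    """Return a list of problems found in cloud.json (empty if none)."""
    config = json.loads(path.read_text(encoding="utf-8"))
    cloud_type = config.get("type")
    if cloud_type not in EXPECTED_KEYS:
        return [f"unknown type: {cloud_type!r}"]
    missing = EXPECTED_KEYS[cloud_type] - set(config)
    return [f"missing key: {key}" for key in sorted(missing)]
```

An empty list means the file matches the shape of one of the three examples; it says nothing about whether the credentials or URLs are valid.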
6. Initialize in a notebook
Open a Jupyter notebook and initialize your locality:
    from openavmkit.pipeline import init_notebook
    
    locality = "us-tx-imaginarycounty"
    init_notebook(locality)
    
This creates the necessary directory structure and prepares the environment.

Configuring Settings

As you develop your locality, you’ll expand the settings.json file to configure:
  • Data sources: Define which files to load and how to process them
  • Model groups: Specify how to segment properties (e.g., single-family, commercial)
  • Modeling parameters: Configure algorithms, variables, and validation settings
  • Valuation date: Set the target date for valuations
Example settings structure (settings.json):
    {
      "valuation_date": "2024-01-01",
      "data": {
        "load": {
          "geo_parcels": {
            "path": "parcels.gpkg",
            "type": "geopackage"
          },
          "sales": {
            "path": "sales.csv",
            "type": "csv"
          }
        }
      },
      "modeling": {
        "model_groups": {
          "single_family": {
            "filter": [
              {"field": "bldg_type", "operator": "==", "value": "Single Family"}
            ]
          }
        }
      }
    }
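
Once settings.json lists data sources, a common mistake is a path that does not match any file in in/. The following sketch reads the "data" → "load" section shown above and reports entries whose file is missing. It assumes paths are relative to the in/ directory, which matches the example layout but is not confirmed here, and `missing_data_files` is a hypothetical helper rather than an OpenAVM Kit function.

```python
import json
from pathlib import Path


def missing_data_files(in_dir: Path) -> list:
    """List data.load entries in settings.json whose file is not in in_dir."""
    settings_path = in_dir / "settings.json"
    settings = json.loads(settings_path.read_text(encoding="utf-8"))
    load_entries = settings.get("data", {}).get("load", {})

    missing = []
    for name, entry in load_entries.items():
        path = entry.get("path")
        # Assumption: paths are resolved relative to the in/ directory.
        if path is None or not (in_dir / path).exists():
            missing.append(f"{name}: {path}")
    return missing
```

With the example settings above and only parcels.gpkg present, the result would contain a single entry for sales.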
    

Switching Between Localities

To work with multiple localities, change the locality variable at the top of your notebook:
    locality = "us-ca-san_francisco"  # Switch to different locality
    init_notebook(locality)
    
When switching localities, restart the notebook kernel and clear its outputs so that data from different localities does not get mixed.

Next Steps

  • Data Assembly: Learn how to load and process your data files
  • Configuration Reference: Explore all available settings options
