Data Models

All models are importable directly from archeo_cluster.models:

from archeo_cluster.models import (
    # Config
    AppConfig, ClusteringConfig, DetectionConfig, PathConfig,
    # Detection
    BatchDetectionResult, ContourFeatures, DetectedObject, DetectionResult,
    # Clustering
    BatchClusteringResult, ClusterInfo, ClusteringResult, ElbowResult, SilhouetteResult,
    # Performance
    PerformanceSummary, StageMetrics,
)

Config models

`DetectionConfig`

Pydantic BaseModel. Configuration for object detection parameters.

target_color

str

default:"#A98876"

Target color in hex format. The detector converts this to HSV and applies hue_offset, saturation_offset, and value_offset to produce the color range mask.

min_area

int

default:"50"

Minimum contour area in pixels. Must be ≥ 1.

max_area

int

default:"5000"

Maximum contour area in pixels. Must be ≥ 1.

kernel_size

tuple[int, int]

default:"(5, 5)"

Size of the morphological operation kernel used for closing and opening passes on the mask.

hue_offset

int

default:"10"

Tolerance for the hue channel in HSV color matching. Range 0–90.

saturation_offset

int

default:"50"

Tolerance for the saturation channel. Range 0–127.

value_offset

int

default:"50"

Tolerance for the value channel. Range 0–127.
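To make the role of target_color and the three offsets concrete, here is a minimal sketch of how a hex color could be turned into an HSV range. It assumes OpenCV-style scaling (hue 0 to 179, saturation and value 0 to 255) and simple clamping at the channel limits. `hsv_range` is a hypothetical helper, not part of archeo_cluster, and the real detector may treat hue wrap-around differently.

```python
import colorsys


def hsv_range(
    target_color: str,
    hue_offset: int = 10,
    saturation_offset: int = 50,
    value_offset: int = 50,
):
    """Return (lower, upper) HSV bounds around a hex color (hypothetical helper)."""
    hex_digits = target_color.lstrip("#")
    r, g, b = (int(hex_digits[i : i + 2], 16) / 255 for i in (0, 2, 4))
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Assumed OpenCV-style scaling: H in 0-179, S and V in 0-255.
    center = (round(h * 180) % 180, round(s * 255), round(v * 255))
    offsets = (hue_offset, saturation_offset, value_offset)
    limits = (179, 255, 255)
    # Clamp each bound to its channel limits; hue wrap-around is ignored here.
    lower = tuple(max(0, c - o) for c, o in zip(center, offsets))
    upper = tuple(min(m, c + o) for c, o, m in zip(center, offsets, limits))
    return lower, upper


lower, upper = hsv_range("#A98876")
print(lower, upper)
```

With the default offsets, widening hue_offset admits more neighbouring hues, while the saturation and value offsets control how much washed-out or shaded variation of the target color is accepted.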

`ClusteringConfig`

Pydantic BaseModel. Configuration for K-Means clustering parameters.

max_k

int

default:"10"

Maximum number of clusters to evaluate in the elbow method. Range 2–50.

random_state

int

default:"42"

Random seed for reproducible K-Means results.

min_samples_per_cluster

int

default:"2"

Minimum samples required to attempt clustering on an image. Must be ≥ 1.

compute_silhouette

bool

default:"True"

When True, silhouette scores are computed as a complementary validation metric alongside the elbow method.

`PathConfig`

Pydantic BaseModel. Configuration for file system paths.

data_dir

Path

default:"Path('data')"

Base directory for input data.

results_dir

Path

default:"Path('results')"

Directory for output results.

plots_dir

Path

default:"Path('plots')"

Directory for generated plots.

PathConfig also provides ensure_directories() which calls mkdir(parents=True, exist_ok=True) on results_dir and plots_dir.
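The behaviour described above can be sketched with the standard library alone. `PathSketch` is a stand-in written for this example, not the real PathConfig; it only mirrors the three documented fields and the documented mkdir call.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class PathSketch:
    """Stand-in for PathConfig, for illustration only."""

    data_dir: Path = Path("data")
    results_dir: Path = Path("results")
    plots_dir: Path = Path("plots")

    def ensure_directories(self) -> None:
        # Only the two output directories are created; data_dir is left alone.
        for directory in (self.results_dir, self.plots_dir):
            directory.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    paths = PathSketch(
        data_dir=base / "data",
        results_dir=base / "run1" / "results",
        plots_dir=base / "run1" / "plots",
    )
    paths.ensure_directories()
    paths.ensure_directories()  # Safe to repeat because exist_ok=True.
    created = sorted(p.name for p in (base / "run1").iterdir())
    data_dir_exists = paths.data_dir.exists()

print(created, data_dir_exists)
```

Because parents=True is passed, nested output paths such as run1/results are created in one call, and data_dir is expected to exist already.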

`AppConfig`

Pydantic BaseModel. Top-level application configuration that combines all sub-configurations.

from archeo_cluster.models import AppConfig
from pathlib import Path

# Load from YAML
config = AppConfig.from_yaml(Path("config.yaml"))

# Or build in code
config = AppConfig(debug=True, log_level="DEBUG")
config.to_yaml(Path("config.yaml"))

detection

DetectionConfig

default:"DetectionConfig()"

Detection sub-configuration.

clustering

ClusteringConfig

default:"ClusteringConfig()"

Clustering sub-configuration.

paths

PathConfig

default:"PathConfig()"

Path sub-configuration.

debug

bool

default:"False"

Enable debug mode.

log_level

str

default:"INFO"

Logging level string (e.g. "DEBUG", "INFO", "WARNING").

Methods

AppConfig.from_yaml(path: Path) -> AppConfig: class method that loads a configuration from a YAML file.
config.to_yaml(path: Path) -> None: instance method that saves the configuration to a YAML file.

Detection models

`ContourFeatures`

Pydantic BaseModel. Features extracted from a single contour. All numeric fields are ≥ 0.

area

float

required

Contour area in pixels.

perimeter

float

required

Contour perimeter in pixels.

centroid_x

int

required

X coordinate of the centroid. Used for spatial analysis.

centroid_y

int

required

Y coordinate of the centroid. Used for spatial analysis.

circularity

float

required

4 * π * area / perimeter². Equals 1.0 for a perfect circle.

aspect_ratio

float

required

Bounding rectangle width divided by height.

solidity

float

required

Contour area divided by convex hull area.

extent

float

required

Contour area divided by bounding rectangle area.
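The four ratio features follow directly from the definitions above. The sketch below computes them from raw measurements; `shape_descriptors` is a hypothetical helper for illustration, and it assumes non-zero perimeter, bounding box, and hull area (the library may guard against degenerate contours differently).

```python
import math


def shape_descriptors(
    area: float,
    perimeter: float,
    bbox_width: float,
    bbox_height: float,
    hull_area: float,
) -> dict:
    """Compute the four ratio features from raw measurements (hypothetical helper)."""
    return {
        "circularity": 4 * math.pi * area / perimeter**2,
        "aspect_ratio": bbox_width / bbox_height,
        "solidity": area / hull_area,
        "extent": area / (bbox_width * bbox_height),
    }


# A 20 x 10 axis-aligned rectangle fills both its bounding box and its convex hull.
rect = shape_descriptors(area=200, perimeter=60, bbox_width=20, bbox_height=10, hull_area=200)

# A circle of radius 5 has circularity 1 and covers pi/4 of its bounding square.
radius = 5.0
circle = shape_descriptors(
    area=math.pi * radius**2,
    perimeter=2 * math.pi * radius,
    bbox_width=2 * radius,
    bbox_height=2 * radius,
    hull_area=math.pi * radius**2,
)
print(rect)
print(circle)
```

Elongated or ragged shapes score lower on circularity, and concave shapes score lower on solidity, which is what makes these ratios useful as clustering features.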

`DetectedObject`

Pydantic BaseModel. A detected object with its source metadata and extracted features.

image_name

str

required

Filename of the source image.

contour_index

int

required

Zero-based index of this contour within the image. Must be ≥ 0.

features

ContourFeatures

required

Extracted geometric features.

`DetectionResult`

Dataclass. Result of detection on a single image.

image_name

str

required

Name of the processed image.

contours

list[NDArray[Any]]

default:"[]"

Filtered OpenCV contour arrays that passed area constraints.

objects

list[DetectedObject]

default:"[]"

Detected objects with extracted features.

processing_steps

dict[str, NDArray[Any]]

default:"{}"

Intermediate processing images keyed by step name. Populated when ObjectDetector.save_intermediate is True.

count

int

Read-only property. Equal to len(objects).

DetectionResult.to_feature_rows() returns a list[dict[str, float | int | str]] suitable for building a pandas DataFrame or writing a CSV.
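As an illustration of the row format, the sketch below flattens a few objects into dictionaries and writes them as CSV. The stand-in classes and the column names (metadata first, then feature fields) are assumptions made for this example; the actual columns produced by to_feature_rows() may differ.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass
class FeaturesSketch:
    """Stand-in for ContourFeatures, trimmed to three fields."""

    area: float
    perimeter: float
    circularity: float


@dataclass
class ObjectSketch:
    """Stand-in for DetectedObject."""

    image_name: str
    contour_index: int
    features: FeaturesSketch


def to_feature_rows(objects: list) -> list:
    # One flat dict per object: metadata first, then the feature columns.
    return [
        {"image_name": o.image_name, "contour_index": o.contour_index, **asdict(o.features)}
        for o in objects
    ]


objects = [
    ObjectSketch("site_01.png", 0, FeaturesSketch(120.0, 42.5, 0.83)),
    ObjectSketch("site_01.png", 1, FeaturesSketch(310.0, 70.1, 0.79)),
]
rows = to_feature_rows(objects)

# Write the rows to an in-memory CSV; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The same list of dictionaries can be passed straight to a pandas DataFrame constructor, which is why a flat, one-row-per-object layout is convenient.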

`BatchDetectionResult`

Dataclass. Aggregated results from processing a directory of images.

results

list[DetectionResult]

default:"[]"

One DetectionResult per successfully processed image.

total_objects

int

Read-only property. Sum of all detected objects across every image.

image_count

int

Read-only property. Number of images in results.

BatchDetectionResult.to_feature_rows() returns a combined list[dict[str, float | int | str]] from all images.

Clustering models

`ClusterInfo`

Pydantic BaseModel. Summary statistics for a single cluster.

cluster_id

int

required

Zero-based cluster identifier. Must be ≥ 0.

size

int

required

Number of objects assigned to this cluster. Must be ≥ 0.

centroid_x

float

required

Mean X coordinate of objects in this cluster.

centroid_y

float

required

Mean Y coordinate of objects in this cluster.

mean_area

float

required

Average area (pixels) of objects in this cluster. Must be ≥ 0.

mean_perimeter

float

required

Average perimeter (pixels) of objects in this cluster. Must be ≥ 0.
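The fields above are per-cluster aggregates over the objects that share a label. A minimal sketch of that aggregation, assuming feature rows are plain dictionaries and labels line up with rows by position (`summarize_clusters` is a hypothetical helper, not the library's implementation):

```python
from collections import defaultdict
from statistics import mean


def summarize_clusters(rows: list, labels: list) -> list:
    """Group feature rows by label and average them (hypothetical helper)."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return [
        {
            "cluster_id": cluster_id,
            "size": len(members),
            "centroid_x": mean(m["centroid_x"] for m in members),
            "centroid_y": mean(m["centroid_y"] for m in members),
            "mean_area": mean(m["area"] for m in members),
            "mean_perimeter": mean(m["perimeter"] for m in members),
        }
        for cluster_id, members in sorted(grouped.items())
    ]


rows = [
    {"centroid_x": 10.0, "centroid_y": 10.0, "area": 100.0, "perimeter": 40.0},
    {"centroid_x": 20.0, "centroid_y": 30.0, "area": 200.0, "perimeter": 60.0},
    {"centroid_x": 100.0, "centroid_y": 100.0, "area": 50.0, "perimeter": 30.0},
]
labels = [0, 0, 1]
summaries = summarize_clusters(rows, labels)
print(summaries)
```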

`ElbowResult`

Pydantic BaseModel. Result of the elbow method K-selection.

k_values

list[int]

required

K values that were evaluated (typically [1, 2, ..., max_k]).

inertias

list[float]

required

Within-cluster sum of squares (WCSS / inertia) for each K value.

optimal_k

int

required

The K at the elbow point. Must be ≥ 1.
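This page does not say how the elbow point is located. One common heuristic, shown here purely as an assumption, picks the K whose (K, inertia) point lies farthest from the straight line joining the first and last points of the curve. `find_elbow` is a hypothetical helper.

```python
def find_elbow(k_values: list, inertias: list) -> int:
    """Pick the K farthest from the chord joining the first and last points."""
    x1, y1 = k_values[0], inertias[0]
    x2, y2 = k_values[-1], inertias[-1]

    def chord_distance(k: int, inertia: float) -> float:
        # Proportional to the perpendicular distance from (k, inertia) to the chord;
        # the constant denominator is dropped because only the argmax matters.
        return abs((y2 - y1) * k - (x2 - x1) * inertia + x2 * y1 - y2 * x1)

    distances = [chord_distance(k, w) for k, w in zip(k_values, inertias)]
    return k_values[distances.index(max(distances))]


k_values = [1, 2, 3, 4, 5, 6]
inertias = [1000.0, 400.0, 150.0, 120.0, 100.0, 90.0]
optimal_k = find_elbow(k_values, inertias)
print(optimal_k)  # 3
```

In this made-up series the inertia drops steeply up to K=3 and flattens afterwards, so K=3 is the point farthest from the chord.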

`SilhouetteResult`

Pydantic BaseModel. Result of silhouette score analysis. The silhouette coefficient ranges from -1 (poorly separated clusters) to 1 (well-separated clusters).

k_values

list[int]

required

K values that were evaluated.

silhouette_scores

list[float | None]

required

Silhouette coefficient for each K. None for K=1 (undefined) and for any K exceeding the sample count.

optimal_k

int | None

default:"None"

K with the maximum silhouette score. None if computation was not possible. Must be ≥ 2 when set.
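Selecting optimal_k from the scores amounts to an argmax that skips the None entries. A minimal sketch, assuming ties go to the smallest K (`best_silhouette_k` is a hypothetical helper, and the tie rule is not stated in this reference):

```python
from typing import Optional


def best_silhouette_k(k_values: list, silhouette_scores: list) -> Optional[int]:
    """Return the K with the highest defined score, or None if none are defined."""
    scored = [(score, k) for k, score in zip(k_values, silhouette_scores) if score is not None]
    if not scored:
        return None
    # max() keeps the first maximum, so ties resolve to the smallest K.
    _, best_k = max(scored, key=lambda pair: pair[0])
    return best_k


print(best_silhouette_k([1, 2, 3, 4], [None, 0.41, 0.62, 0.55]))  # 3
print(best_silhouette_k([1], [None]))  # None
```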

`ClusteringResult`

Dataclass. Result of K-Means clustering on a single image.

image_name

str

required

Name of the source image.

optimal_k

int

required

Optimal number of clusters determined by the elbow method.

labels

list[int]

default:"[]"

Cluster assignment for each object row in the input DataFrame.

clusters

list[ClusterInfo]

default:"[]"

Summary statistics for each cluster.

elbow_result

ElbowResult | None

default:"None"

Output from the elbow method analysis.

silhouette_result

SilhouetteResult | None

default:"None"

Output from silhouette score analysis. None when ClusteringConfig.compute_silhouette is False.

cluster_count

int

Read-only property. Equal to len(clusters).

`BatchClusteringResult`

Dataclass. Aggregated results from clustering all images in a features CSV.

results

list[ClusteringResult]

default:"[]"

One ClusteringResult per image that had enough samples to cluster.

image_count

int

Read-only property. Number of images in results.

total_clusters

int

Read-only property. Sum of cluster_count across all results.

BatchClusteringResult.get_result(image_name: str) -> ClusteringResult | None finds the result for a specific image by name.
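The lookup and the two properties can be sketched with stand-in dataclasses. The class names below are invented for this example, and returning the first match when names repeat is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResultSketch:
    """Stand-in for ClusteringResult, trimmed to what the lookup needs."""

    image_name: str
    cluster_count: int


@dataclass
class BatchSketch:
    """Stand-in for BatchClusteringResult."""

    results: list = field(default_factory=list)

    @property
    def image_count(self) -> int:
        return len(self.results)

    @property
    def total_clusters(self) -> int:
        return sum(r.cluster_count for r in self.results)

    def get_result(self, image_name: str) -> Optional[ResultSketch]:
        # Linear scan; returns None when no result has that name.
        return next((r for r in self.results if r.image_name == image_name), None)


batch = BatchSketch([ResultSketch("site_01.png", 3), ResultSketch("site_02.png", 4)])
found = batch.get_result("site_02.png")
missing = batch.get_result("site_99.png")
print(batch.image_count, batch.total_clusters, found, missing)
```

Since images with too few samples are left out of results, callers should handle the None case rather than assume every input image has an entry.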

Performance models

`StageMetrics`

Pydantic BaseModel. Performance metrics for a single pipeline stage (detection, clustering, or analysis).

stage_name

str

required

Human-readable stage identifier (e.g. "detection").

duration_seconds

float

required

Wall-clock time in seconds with millisecond precision.

memory_peak_mb

float

required

Peak memory delta in megabytes (process RSS).

memory_before_mb

float

default:"0.0"

Process memory before the stage started (MB).

memory_after_mb

float

default:"0.0"

Process memory after the stage completed (MB).

cpu_percent

float

default:"0.0"

Average CPU usage during stage execution (0–100 per core).

timestamp

datetime

default:"datetime.now()"

When the measurement was taken.
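To show where such numbers could come from, the sketch below times a callable and records its peak memory with the standard library. It is an approximation under stated assumptions: tracemalloc tracks Python allocations rather than process RSS, and `measure_stage` is a hypothetical helper, not how archeo_cluster collects its metrics.

```python
import time
import tracemalloc
from datetime import datetime


def measure_stage(stage_name, func, *args, **kwargs):
    """Run func and return (result, metrics dict) for one stage (hypothetical helper)."""
    tracemalloc.start()
    started = time.perf_counter()
    result = func(*args, **kwargs)
    duration = time.perf_counter() - started
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    metrics = {
        "stage_name": stage_name,
        "duration_seconds": round(duration, 3),  # millisecond precision
        "memory_peak_mb": peak_bytes / (1024 * 1024),
        "timestamp": datetime.now(),
    }
    return result, metrics


def sum_of_squares(n: int) -> int:
    return sum(i * i for i in range(n))


result, metrics = measure_stage("detection", sum_of_squares, 100_000)
print(metrics["stage_name"], metrics["duration_seconds"], metrics["memory_peak_mb"])
```

The dictionary keys match the three required StageMetrics fields plus timestamp, so a measurement like this maps onto the model one-to-one.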

`PerformanceSummary`

Pydantic BaseModel. Aggregated performance metrics for an entire pipeline run.

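from archeo_cluster.models import PerformanceSummary, StageMetrics

# Each stage needs its three required fields; the values are illustrative.
detection_metrics = StageMetrics(stage_name="detection", duration_seconds=5.1, memory_peak_mb=184.0)
clustering_metrics = StageMetrics(stage_name="clustering", duration_seconds=4.2, memory_peak_mb=96.0)
analysis_metrics = StageMetrics(stage_name="analysis", duration_seconds=3.1, memory_peak_mb=72.0)

summary = PerformanceSummary(
    detection=detection_metrics,
    clustering=clustering_metrics,
    analysis=analysis_metrics,
    total_duration_seconds=12.4,  # 5.1 + 4.2 + 3.1
    peak_memory_mb=184.0,  # highest stage peak
)
print(summary.is_complete())  # True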

detection

StageMetrics | None

default:"None"

Metrics for the object detection stage.

clustering

StageMetrics | None

default:"None"

Metrics for the K-Means clustering stage.

analysis

StageMetrics | None

default:"None"

Metrics for the spatial analysis stage.

total_duration_seconds

float

default:"0.0"

Sum of all stage durations in seconds.

peak_memory_mb

float

default:"0.0"

Maximum memory seen across all stages in megabytes.

PerformanceSummary.is_complete() -> bool returns True when all three stage metrics (detection, clustering, analysis) are not None.
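The relationship between the stage metrics and the two aggregate fields can be sketched as follows. The stand-in classes and the `aggregate` helper are assumptions for illustration; in the real model total_duration_seconds and peak_memory_mb are plain fields, and this page does not say who fills them in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageSketch:
    """Stand-in for StageMetrics, required fields only."""

    stage_name: str
    duration_seconds: float
    memory_peak_mb: float


@dataclass
class SummarySketch:
    """Stand-in for PerformanceSummary."""

    detection: Optional[StageSketch] = None
    clustering: Optional[StageSketch] = None
    analysis: Optional[StageSketch] = None

    def recorded(self) -> list:
        stages = (self.detection, self.clustering, self.analysis)
        return [s for s in stages if s is not None]

    def is_complete(self) -> bool:
        # Complete only when all three stages were recorded.
        return len(self.recorded()) == 3


def aggregate(summary: SummarySketch) -> tuple:
    """Return (total duration, peak memory) over the recorded stages."""
    stages = summary.recorded()
    total = sum(s.duration_seconds for s in stages)
    peak = max((s.memory_peak_mb for s in stages), default=0.0)
    return total, peak


partial = SummarySketch(detection=StageSketch("detection", 5.0, 184.0))
full = SummarySketch(
    detection=StageSketch("detection", 5.0, 184.0),
    clustering=StageSketch("clustering", 4.5, 96.0),
    analysis=StageSketch("analysis", 3.0, 72.0),
)
print(partial.is_complete(), full.is_complete(), aggregate(full))
```

A run that stops after detection still yields a usable summary; is_complete() simply reports that later stages are missing.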
