archeo_cluster.models:
Config models
DetectionConfig
Pydantic BaseModel. Configuration for object detection parameters.
Target color in hex format. The detector converts this to HSV and applies hue_offset, saturation_offset, and value_offset to produce the color range mask.
Minimum contour area in pixels. Must be ≥ 1.
Maximum contour area in pixels. Must be ≥ 1.
Size of the morphological operation kernel used for closing and opening passes on the mask.
Tolerance for the hue channel in HSV color matching. Range 0–90.
Tolerance for the saturation channel. Range 0–127.
Tolerance for the value channel. Range 0–127.
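As a rough illustration of how a hex color and the three offsets could translate into a mask range, the sketch below computes lower and upper HSV bounds with the standard library only. It assumes OpenCV's 8-bit HSV scale (hue 0–179, saturation and value 0–255), which is consistent with the documented offset ranges but is not confirmed by this reference. `hsv_bounds` is a hypothetical helper, not part of archeo_cluster, and hue wrap-around near red is ignored for brevity.

```python
import colorsys


def hsv_bounds(hex_color, hue_offset, saturation_offset, value_offset):
    """Return (lower, upper) HSV triples around a hex color.

    Hypothetical helper, not part of archeo_cluster. Assumes OpenCV's
    8-bit HSV scale: H in 0-179, S and V in 0-255.
    """
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) / 255.0 for i in (0, 2, 4))
    h, s, v = colorsys.rgb_to_hsv(r, g, b)

    # Rescale from colorsys' 0-1 floats to the assumed 8-bit scale.
    center = (min(round(h * 180), 179), round(s * 255), round(v * 255))
    offsets = (hue_offset, saturation_offset, value_offset)
    limits = (179, 255, 255)

    # Clamp each channel to its valid range; hue wrap-around is not handled.
    lower = tuple(max(c - o, 0) for c, o in zip(center, offsets))
    upper = tuple(min(c + o, m) for c, o, m in zip(center, offsets, limits))
    return lower, upper


lower, upper = hsv_bounds("#00FF00", hue_offset=10, saturation_offset=50, value_offset=50)
print(lower, upper)
```

Larger offsets widen the accepted range on each channel, so more pixels end up in the mask.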
ClusteringConfig
Pydantic BaseModel. Configuration for K-Means clustering parameters.
Maximum number of clusters to evaluate in the elbow method. Range 2–50.
Random seed for reproducible K-Means results.
Minimum samples required to attempt clustering on an image. Must be ≥ 1.
When True, silhouette scores are computed as a complementary validation metric alongside the elbow method.
PathConfig
Pydantic BaseModel. Configuration for file system paths.
Base directory for input data.
Directory for output results.
Directory for generated plots.
PathConfig also provides ensure_directories() which calls mkdir(parents=True, exist_ok=True) on results_dir and plots_dir.
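The behavior of ensure_directories() can be sketched with a plain dataclass. `PathConfigSketch` is a stand-in written for this example (the real PathConfig is a Pydantic BaseModel), and only the two directory fields named above are modeled.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class PathConfigSketch:
    """Stand-in for PathConfig; the real model is a Pydantic BaseModel."""

    results_dir: Path
    plots_dir: Path

    def ensure_directories(self) -> None:
        # parents=True creates missing ancestors; exist_ok=True makes
        # repeated calls harmless.
        for directory in (self.results_dir, self.plots_dir):
            directory.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    paths = PathConfigSketch(
        results_dir=Path(tmp) / "out" / "results",
        plots_dir=Path(tmp) / "out" / "plots",
    )
    paths.ensure_directories()
    paths.ensure_directories()  # second call is a no-op
    created = paths.results_dir.is_dir() and paths.plots_dir.is_dir()

print(created)
```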
AppConfig
Pydantic BaseModel. Top-level application configuration that combines all sub-configurations.
Detection sub-configuration.
Clustering sub-configuration.
Path sub-configuration.
Enable debug mode.
Logging level string (e.g. "DEBUG", "INFO", "WARNING").
AppConfig.from_yaml(path: Path) -> AppConfig loads the configuration from a YAML file.
config.to_yaml(path: Path) -> None saves the configuration to a YAML file.
Detection models
ContourFeatures
Pydantic BaseModel. Features extracted from a single contour. All numeric fields are ≥ 0.
Contour area in pixels.
Contour perimeter in pixels.
X coordinate of the centroid. Used for spatial analysis.
Y coordinate of the centroid. Used for spatial analysis.
4 * π * area / perimeter². Equals 1.0 for a perfect circle.
Bounding rectangle width divided by height.
Contour area divided by convex hull area.
Contour area divided by bounding rectangle area.
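The four ratio features above follow directly from their definitions. The sketch below is a minimal pure-Python version; the function names and the choice to return 0.0 for a zero denominator are assumptions made for this example, not documented behavior of the library.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """4 * pi * area / perimeter**2 (0.0 for a zero perimeter, by assumption)."""
    if perimeter == 0:
        return 0.0
    return 4 * math.pi * area / perimeter ** 2


def aspect_ratio(width: float, height: float) -> float:
    """Bounding rectangle width divided by height."""
    return width / height if height else 0.0


def solidity(area: float, hull_area: float) -> float:
    """Contour area divided by convex hull area."""
    return area / hull_area if hull_area else 0.0


def extent(area: float, width: float, height: float) -> float:
    """Contour area divided by bounding rectangle area."""
    rect_area = width * height
    return area / rect_area if rect_area else 0.0


# A circle of radius 5 and a 2 x 2 square as reference shapes.
circle_score = circularity(math.pi * 5 ** 2, 2 * math.pi * 5)
square_score = circularity(4.0, 8.0)
print(circle_score, square_score)
```

A circle scores 1.0 and a square scores π/4 (about 0.785), so lower circularity indicates a more angular or elongated outline.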
DetectedObject
Pydantic BaseModel. A detected object with its source metadata and extracted features.
Filename of the source image.
Zero-based index of this contour within the image. Must be ≥ 0.
Extracted geometric features.
DetectionResult
Dataclass. Result of detection on a single image.
Name of the processed image.
Filtered OpenCV contour arrays that passed area constraints.
Detected objects with extracted features.
Intermediate processing images keyed by step name. Populated when ObjectDetector.save_intermediate is True.
Read-only property. len(objects).
DetectionResult.to_feature_rows() returns a list[dict[str, float | int | str]] suitable for building a pandas DataFrame or writing a CSV.
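Rows in this shape can be written out with the standard csv module. The example below uses hand-written rows in place of a real to_feature_rows() call; the column names are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical rows in the documented list[dict[str, float | int | str]] shape.
rows = [
    {"image_name": "site_a.png", "object_id": 0, "area": 152.0, "perimeter": 48.3},
    {"image_name": "site_a.png", "object_id": 1, "area": 97.5, "perimeter": 37.1},
]

# Write to an in-memory buffer; swap in open(path, "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The same list can be passed to pandas.DataFrame(rows) when a DataFrame is preferred.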
BatchDetectionResult
Dataclass. Aggregated results from processing a directory of images.
One DetectionResult per successfully processed image.
Read-only property. Sum of all detected objects across every image.
Read-only property. Number of images in results.
BatchDetectionResult.to_feature_rows() returns a combined list[dict[str, float | int | str]] from all images.
Clustering models
ClusterInfo
Pydantic BaseModel. Summary statistics for a single cluster.
Zero-based cluster identifier. Must be ≥ 0.
Number of objects assigned to this cluster. Must be ≥ 0.
Mean X coordinate of objects in this cluster.
Mean Y coordinate of objects in this cluster.
Average area (pixels) of objects in this cluster. Must be ≥ 0.
Average perimeter (pixels) of objects in this cluster. Must be ≥ 0.
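These per-cluster statistics amount to a group-by over the cluster labels. The sketch below shows one way they could be derived from feature rows and a parallel list of labels; all key names are hypothetical, and this is not necessarily how the library computes them.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical per-object rows and the cluster label assigned to each row.
rows = [
    {"centroid_x": 10.0, "centroid_y": 20.0, "area": 100.0, "perimeter": 40.0},
    {"centroid_x": 14.0, "centroid_y": 24.0, "area": 120.0, "perimeter": 44.0},
    {"centroid_x": 90.0, "centroid_y": 80.0, "area": 300.0, "perimeter": 70.0},
]
labels = [0, 0, 1]

# Group rows by their cluster label.
grouped = defaultdict(list)
for row, label in zip(rows, labels):
    grouped[label].append(row)

# One summary dict per cluster, ordered by cluster id.
summaries = [
    {
        "cluster_id": label,
        "size": len(members),
        "center_x": fmean(m["centroid_x"] for m in members),
        "center_y": fmean(m["centroid_y"] for m in members),
        "avg_area": fmean(m["area"] for m in members),
        "avg_perimeter": fmean(m["perimeter"] for m in members),
    }
    for label, members in sorted(grouped.items())
]
print(summaries[0])
```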
ElbowResult
Pydantic BaseModel. Result of the elbow method K-selection.
K values that were evaluated (typically [1, 2, ..., max_k]).
Within-cluster sum of squares (WCSS / inertia) for each K value.
The K at the elbow point. Must be ≥ 1.
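This reference does not state how the elbow point is located. One common heuristic, shown here purely as an assumption, scales both axes to 0–1 and picks the K whose point lies farthest below the straight line joining the first and last (K, WCSS) points. `find_elbow` is a hypothetical helper and the WCSS values are made up.

```python
def find_elbow(k_values, wcss):
    """Return the K farthest from the chord between the curve's endpoints.

    Hypothetical heuristic; the library's own selection rule may differ.
    """
    if not k_values or len(k_values) != len(wcss):
        raise ValueError("k_values and wcss must be non-empty and equal in length")

    k_span = k_values[-1] - k_values[0]
    w_span = wcss[0] - wcss[-1]
    if k_span == 0 or w_span <= 0:
        # Single point or non-decreasing curve: fall back to the smallest K.
        return k_values[0]

    best_k, best_gap = k_values[0], -1.0
    for k, w in zip(k_values, wcss):
        x = (k - k_values[0]) / k_span  # 0 at the first K, 1 at the last
        y = (w - wcss[-1]) / w_span     # 1 at the first WCSS, 0 at the last
        gap = 1.0 - x - y               # proportional to distance below the chord x + y = 1
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


elbow_k = find_elbow([1, 2, 3, 4, 5, 6], [100.0, 40.0, 15.0, 12.0, 10.0, 9.0])
print(elbow_k)
```

With these sample values, inertia drops sharply up to K=3 and flattens afterwards, so the heuristic selects 3.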
SilhouetteResult
Pydantic BaseModel. Result of silhouette score analysis. Silhouette coefficient ranges from -1 (poor) to 1 (perfect).
K values that were evaluated.
Silhouette coefficient for each K. None for K=1 (undefined) and for any K exceeding the sample count.
K with the maximum silhouette score. None if computation was not possible. Must be ≥ 2 when set.
ClusteringResult
Dataclass. Result of K-Means clustering on a single image.
Name of the source image.
Optimal number of clusters determined by the elbow method.
Cluster assignment for each object row in the input DataFrame.
Summary statistics for each cluster.
Output from the elbow method analysis.
Output from silhouette score analysis. None when ClusteringConfig.compute_silhouette is False.
Read-only property. len(clusters).
BatchClusteringResult
Dataclass. Aggregated results from clustering all images in a features CSV.
One ClusteringResult per image that had enough samples to cluster.
Read-only property. Number of images in results.
Read-only property. Sum of cluster_count across all results.
BatchClusteringResult.get_result(image_name: str) -> ClusteringResult | None finds the result for a specific image by name.
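The aggregate properties and the lookup can be pictured with two stand-in dataclasses. The `*Sketch` classes and the `total_clusters` property name are hypothetical; in the real ClusteringResult, cluster_count is a property derived from clusters rather than a stored field.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClusteringResultSketch:
    """Stand-in holding only what the lookup and totals need."""

    image_name: str
    cluster_count: int  # a derived property in the real class


@dataclass
class BatchClusteringResultSketch:
    results: list[ClusteringResultSketch] = field(default_factory=list)

    @property
    def total_clusters(self) -> int:  # hypothetical property name
        return sum(r.cluster_count for r in self.results)

    def get_result(self, image_name: str) -> ClusteringResultSketch | None:
        # Linear scan; returns None when no result matches the name.
        return next((r for r in self.results if r.image_name == image_name), None)


batch = BatchClusteringResultSketch(
    results=[
        ClusteringResultSketch("site_a.png", 3),
        ClusteringResultSketch("site_b.png", 2),
    ]
)
found = batch.get_result("site_b.png")
missing = batch.get_result("site_c.png")
print(batch.total_clusters, found, missing)
```

Because images with too few samples are left out of results, callers should expect None for such images and check for it before use.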
Performance models
StageMetrics
Pydantic BaseModel. Performance metrics for a single pipeline stage (detection, clustering, or analysis).
Human-readable stage identifier (e.g. "detection").
Wall-clock time in seconds with millisecond precision.
Peak memory delta in megabytes (process RSS).
Process memory before the stage started (MB).
Process memory after the stage completed (MB).
Average CPU usage during stage execution (0–100 per core).
When the measurement was taken.
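A stage measurement of this kind can be approximated with a context manager. Note the substitution: the sketch below uses the standard library's tracemalloc, which tracks Python heap allocations, in place of the process RSS figures described above, and it does not sample CPU usage. `measure_stage` and the dictionary keys are hypothetical.

```python
import time
import tracemalloc
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def measure_stage(stage_name):
    """Yield a dict that is filled with metrics when the block exits."""
    metrics = {"stage_name": stage_name}
    tracemalloc.start()
    started = time.perf_counter()
    try:
        yield metrics
    finally:
        elapsed = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        # Round to millisecond precision, matching the documented duration field.
        metrics["duration_seconds"] = round(elapsed, 3)
        # Python-heap peak only; not equivalent to process RSS.
        metrics["peak_memory_mb"] = peak_bytes / (1024 * 1024)
        metrics["timestamp"] = datetime.now(timezone.utc)


with measure_stage("detection") as metrics:
    data = [i * i for i in range(100_000)]  # placeholder workload

print(metrics["stage_name"], metrics["duration_seconds"])
```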
PerformanceSummary
Pydantic BaseModel. Aggregated performance metrics for an entire pipeline run.
Metrics for the object detection stage.
Metrics for the K-Means clustering stage.
Metrics for the spatial analysis stage.
Sum of all stage durations in seconds.
Maximum memory seen across all stages in megabytes.
PerformanceSummary.is_complete() -> bool returns True when all three stage metrics (detection, clustering, analysis) are not None.
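The completeness check and the duration total can be sketched as follows. The field and property names are hypothetical stand-ins, and treating a missing stage as contributing zero seconds to the total is an assumption of this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StageMetricsSketch:
    stage_name: str
    duration_seconds: float


@dataclass
class PerformanceSummarySketch:
    detection: StageMetricsSketch | None = None
    clustering: StageMetricsSketch | None = None
    analysis: StageMetricsSketch | None = None

    def _stages(self):
        return (self.detection, self.clustering, self.analysis)

    def is_complete(self) -> bool:
        # True only when all three stages have recorded metrics.
        return all(stage is not None for stage in self._stages())

    @property
    def total_duration_seconds(self) -> float:
        # Assumption: stages that did not run contribute nothing.
        return sum(s.duration_seconds for s in self._stages() if s is not None)


summary = PerformanceSummarySketch(
    detection=StageMetricsSketch("detection", 1.5),
    clustering=StageMetricsSketch("clustering", 2.25),
)
partial = summary.is_complete()
summary.analysis = StageMetricsSketch("analysis", 0.25)
print(partial, summary.is_complete(), summary.total_duration_seconds)
```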