Manual digitizing of agricultural field boundaries is one of the most time-consuming tasks in precision agronomy: a single field typically takes 15 to 30 minutes using GIS software, and errors compound when hundreds of lots need to be processed before a growing season starts. AgroIA’s poligonizador eliminates this bottleneck by combining Sentinel-2 satellite imagery from Google Earth Engine with Meta’s Segment Anything Model (SAM) to produce a georeferenced GeoJSON polygon from nothing more than a GPS coordinate and an estimated area, in a matter of seconds per field.
How it works
SAM is a promptable image segmentation model trained on over one billion masks. AgroIA feeds it a colourised NDVI image derived from Sentinel-2 Band 8 (NIR) and Band 4 (Red), using the field’s GPS centroid as a foreground prompt and an area-based bounding box to constrain the search region. The model returns a binary mask that is then vectorised, filtered for plausible area, and georeferenced back to WGS-84 coordinates.
Download Sentinel-2 NDVI tile
EngineSatelital.descargar_ndvi() queries COPERNICUS/S2_HARMONIZED for the best available scene within a 60-day window ending at the event date (or the fallback range 2025-12-01 to 2026-03-31). Scenes with more than 20% cloud cover are excluded. The tile is centred on a 2,500 m buffer around the input GPS point.
Normalise NDVI to RGB
ndvi_a_rgb() clips NDVI to the range [-0.2, 0.8], normalises to [0, 255], and applies the RdYlGn colormap. This gives SAM a high-contrast image where healthy vegetation appears green and bare soil or stressed areas appear red.
Compute dynamic bounding box
bounding_box_dinamico() calculates the expected pixel extent of the field from the reported damage area (dano_ha) at 10 m/pixel resolution. The resulting bounding box constrains SAM to the region most likely to correspond to the target field.
Run SAM inference
SegmentadorSAM.segmentar() runs SAM with a single positive point at the image centre and the computed bounding box. If the segmented mask area exceeds 1.5× the expected area (leak detection), the model is re-run with four additional negative points at the bounding box corners to suppress background bleed.
Vectorise and filter
OpenCV findContours extracts the contours of the binary mask, and the largest one is kept. approxPolyDP then simplifies the polygon with a tolerance of 0.005 × the contour perimeter. Polygons whose computed geodesic area falls outside the 5–800 ha range are rejected.
Input formats
CSV / Excel file
A tabular file with latitude, longitude, crop type, event date, and estimated area per row. Column names are matched flexibly: for latitude, lat_dec, latitude, lat, y, and latitud are all recognised automatically.
Shapefile
An existing .shp file is passed directly to run_full_analysis(). The pipeline validates the CRS and reprojects to a dynamically selected UTM zone. This is useful when boundaries are partially known.
Output
The poligonizador produces a GeoJSON FeatureCollection where each feature carries the following properties:
| Property | Type | Description |
|---|---|---|
| id | string | Original row identifier from the input file |
| localidad | string | Locality or municipality |
| cultivo | string | Crop type |
| area_ha | number | Computed geodesic area in hectares |
| error_pct | number | Percentage difference from the reported damage area |
| sam_score | number | SAM mask confidence score (0–1) |
| fecha_sat | string | Date of the Sentinel-2 scene used |
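To make the area_ha and error_pct properties concrete, here is a minimal sketch of how one feature might be assembled. It is not AgroIA's implementation: the function names are hypothetical, the area uses a spherical approximation rather than a true ellipsoidal geodesic area (so it can differ by roughly a percent), and treating error_pct as an absolute difference is an assumption.

```python
import math

EARTH_RADIUS_M = 6_378_137.0  # WGS-84 equatorial radius, used here as a sphere radius


def ring_area_ha(ring):
    """Approximate area in hectares of a closed ring of (lon, lat) pairs.

    Spherical approximation only; a production pipeline would more likely
    use an ellipsoidal geodesic area from a geodesy library.
    """
    total = 0.0
    for (lon1, lat1), (lon2, lat2) in zip(ring, ring[1:]):
        total += math.radians(lon2 - lon1) * (
            2 + math.sin(math.radians(lat1)) + math.sin(math.radians(lat2))
        )
    return abs(total) * EARTH_RADIUS_M ** 2 / 2 / 10_000


def error_pct(area_ha, dano_ha):
    """Percentage difference from the reported area (absolute value is an assumption)."""
    return abs(area_ha - dano_ha) / dano_ha * 100


def build_feature(row_id, ring, dano_ha, min_ha=5, max_ha=800):
    """Return a GeoJSON Feature, or None if the area is outside the 5-800 ha filter."""
    area = ring_area_ha(ring)
    if not (min_ha <= area <= max_ha):
        return None
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [[list(pt) for pt in ring]]},
        "properties": {
            "id": str(row_id),
            "area_ha": round(area, 2),
            "error_pct": round(error_pct(area, dano_ha), 1),
        },
    }
```

The remaining properties (localidad, cultivo, sam_score, fecha_sat) would be copied from the input row and the inference result in the same way.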
The GeoJSON output is automatically saved every 10 records during batch processing, so partial results are preserved if the run is interrupted.
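The periodic save can be pictured with a sketch like the one below. The names run_batch, save_collection, and process_row are hypothetical, and counting processed rows (rather than accepted polygons) towards the 10-record interval is an assumption. Writing to a temporary file and renaming it is one way to avoid leaving a half-written GeoJSON if the run stops mid-save.

```python
import json
from pathlib import Path


def save_collection(features, out_path):
    """Write a FeatureCollection via a temporary file, then rename it into place."""
    out_path = Path(out_path)
    collection = {"type": "FeatureCollection", "features": features}
    tmp_path = out_path.with_suffix(out_path.suffix + ".tmp")
    tmp_path.write_text(json.dumps(collection), encoding="utf-8")
    tmp_path.replace(out_path)


def run_batch(rows, process_row, out_path, save_every=10):
    """Process rows one by one, checkpointing the output every `save_every` rows."""
    features = []
    for count, row in enumerate(rows, start=1):
        feature = process_row(row)  # may return None for rejected polygons
        if feature is not None:
            features.append(feature)
        if count % save_every == 0:
            save_collection(features, out_path)
    save_collection(features, out_path)  # final save for the remainder
    return features
```

If the run is interrupted at row 23, the file on disk still holds the features from the first 20 rows.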
Production runs
AgroIA has been validated in two production polygon runs using real agricultural event data from Argentina:
1st run: TAYPE zone
268 polygons delineated over maize fields in the TAYPE zone. Input data consisted of GPS centroids with estimated damage areas. Output: Poligonizacion/1ER CORRIDA/.
2nd run: Pivot circles
340 pivot irrigation polygons in the Tandil/Balcarce region. Circular pivot fields are particularly challenging for manual digitizing; SAM’s prompt-based approach handles them naturally. Output: Poligonizacion/2DA CORRIDA PIVOTES/.
Validation metrics
The 2nd run was independently validated against INTA Balcarce reference boundaries:
| Metric | Value |
|---|---|
| Geometric precision vs. INTA Balcarce | 75% |
| Mean SAM confidence score | 0.962 |
| Mean area error | 9.8% |
The 75% geometric precision figure compares SAM-derived polygon boundaries against manually digitised INTA reference polygons using intersection-over-union. The 9.8% area error reflects the difference between the computed geodesic area and the declared damage area in the input dataset, not a systematic over- or under-segmentation.
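For readers unfamiliar with intersection-over-union, the sketch below computes it on two rasterised masks of the same size. This is only an illustration: the validation itself compared vector polygons, and how per-polygon IoU values were aggregated into the 75% figure is not stated here. The function name is hypothetical.

```python
def mask_iou(pred, ref):
    """Intersection-over-union of two equally sized grids of 0/1 (or bool) cells."""
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            intersection += bool(p and r)
            union += bool(p or r)
    # Two empty masks have no overlap to measure; report 0.0 by convention.
    return intersection / union if union else 0.0


# A 2x2 predicted block shifted one column from the reference block.
predicted = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
iou = mask_iou(predicted, reference)  # 2 shared cells out of 6 covered cells
```

With vector geometries, the same ratio is the area of the polygons' intersection divided by the area of their union.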
Hardware and model configuration
| Parameter | Value |
|---|---|
| SAM model | vit_b (ViT-Base) |
| Checkpoint | sam_vit_b_01ec64.pth |
| Device | Auto-detected: CUDA if available, else CPU |
| Sentinel-2 resolution | 10 m/pixel |
| Cloud filter | < 20% cloud cover |
| Area filter | 5–800 ha |
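Using the 10 m/pixel resolution listed above, the bounding-box and leak-detection steps from "How it works" can be sketched roughly as follows. This is not the code behind bounding_box_dinamico() or SegmentadorSAM.segmentar(): the function names, the square box shape, and the margin factor are assumptions made for illustration. Only the 1.5× leak threshold, the centre point, and the four corner negatives come from the description.

```python
import math

PIXEL_SIZE_M = 10.0  # Sentinel-2 resolution from the configuration table
LEAK_FACTOR = 1.5    # mask area above 1.5x the expected area counts as a leak


def expected_pixels(dano_ha, pixel_size_m=PIXEL_SIZE_M):
    """Number of pixels a field of `dano_ha` hectares should cover."""
    return dano_ha * 10_000 / pixel_size_m ** 2


def dynamic_bbox(dano_ha, width, height, margin=1.5):
    """Square XYXY box centred on the image; `margin` is a hypothetical padding factor."""
    half = math.sqrt(expected_pixels(dano_ha)) * margin / 2
    cx, cy = width / 2, height / 2
    x0 = max(0, int(round(cx - half)))
    y0 = max(0, int(round(cy - half)))
    x1 = min(width, int(round(cx + half)))
    y1 = min(height, int(round(cy + half)))
    return x0, y0, x1, y1


def is_leak(mask_pixels, dano_ha):
    """True when the segmented mask is larger than 1.5x the expected pixel count."""
    return mask_pixels > LEAK_FACTOR * expected_pixels(dano_ha)


def build_prompts(bbox, width, height, leak=False):
    """One positive point at the image centre; on a leak, add four negative corner points."""
    points = [(width // 2, height // 2)]
    labels = [1]  # 1 = foreground, 0 = background, as in SAM point prompts
    if leak:
        x0, y0, x1, y1 = bbox
        points += [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
        labels += [0, 0, 0, 0]
    return points, labels
```

For a 100 ha field on a 500 × 500 pixel tile, the expected footprint is 10,000 pixels, so a first-pass mask larger than 15,000 pixels would trigger the second run with the corner negatives.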
Notebooks
The following notebooks in Poligonizacion/ provide interactive environments for running and inspecting polygon generation:
| Notebook | Purpose |
|---|---|
| AgroIA_Poligonizador_Master.ipynb | Full pipeline: CSV input → GeoJSON + PDF report + HTML map |
| Herramienta_Definitiva_Poligonizacion.ipynb | Refined version with advanced leak control |
| Poligonizador_Colab.ipynb | Google Colab-compatible version for GPU access without local setup |
poligonizador_final.py accepts --csv (input file) and --output (filename prefix) flags and runs the same pipeline as the notebooks.
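As an illustration of a command-line entry point with these two flags, the sketch below pairs an argparse parser with the flexible latitude-column matching described under "Input formats". The real script's parser may differ: marking both flags as required, the case-insensitive matching, and the helper names are assumptions.

```python
import argparse
import csv

# Latitude aliases listed in the "Input formats" section.
LAT_ALIASES = ("lat_dec", "latitude", "lat", "y", "latitud")


def build_parser():
    """Parser exposing the two documented flags (required-ness is an assumption)."""
    parser = argparse.ArgumentParser(description="Polygon generation from a CSV of field centroids")
    parser.add_argument("--csv", required=True, help="input file")
    parser.add_argument("--output", required=True, help="filename prefix for the outputs")
    return parser


def find_column(header, aliases):
    """Return the first header name matching an alias, ignoring case and padding."""
    normalised = {name.strip().lower(): name for name in header}
    for alias in aliases:
        if alias in normalised:
            return normalised[alias]
    return None


def main(argv=None):
    """Parse the flags and report which column would be read as latitude."""
    args = build_parser().parse_args(argv)
    with open(args.csv, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        lat_column = find_column(reader.fieldnames or [], LAT_ALIASES)
    if lat_column is None:
        raise SystemExit("no latitude column found in " + args.csv)
    print("latitude column:", lat_column, "| output prefix:", args.output)
    return lat_column
```

Calling main(["--csv", "lotes.csv", "--output", "corrida_3"]) mirrors an invocation such as python poligonizador_final.py --csv lotes.csv --output corrida_3, where both file names are placeholders.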
Related pages
Batch processing guide
Process hundreds of polygons from a GeoJSON in a single command.
AgroIA Score
How delineated polygons are scored using NDVI and climate data.
Pipeline guide
Run the full analysis pipeline end-to-end on a single field.
Architecture overview
System-level view of how SAM, GEE, and the RAG engine interact.