Manually digitising agricultural field boundaries is one of the most time-consuming tasks in precision agronomy: a single field typically takes 15 to 30 minutes in GIS software, and errors compound when hundreds of lots must be processed before a growing season starts. AgroIA’s poligonizador removes this bottleneck by combining Sentinel-2 satellite imagery from Google Earth Engine with Meta’s Segment Anything Model (SAM) to produce a georeferenced GeoJSON polygon from nothing more than a GPS coordinate and an estimated area, in a matter of seconds per field.

How it works

SAM is a promptable image segmentation model trained on over one billion masks. AgroIA feeds it a colourised NDVI image derived from Sentinel-2 Band 8 (NIR) and Band 4 (Red), using the field’s GPS centroid as a foreground prompt and an area-based bounding box to constrain the search region. The model returns a binary mask that is then vectorised, filtered for plausible area, and georeferenced back to WGS-84 coordinates.
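The NDVI that feeds the rest of the pipeline is the normalised difference of the two bands named above. A minimal sketch, assuming Band 8 and Band 4 are already loaded as NumPy arrays of the same shape (the function name `calcular_ndvi` is hypothetical and not part of AgroIA):

```python
import numpy as np


def calcular_ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); pixels with a zero denominator get 0."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    denominador = nir + red
    ndvi = np.zeros_like(denominador)
    # Divide only where the denominator is non-zero to avoid NaN and inf values.
    np.divide(nir - red, denominador, out=ndvi, where=denominador != 0)
    return ndvi
```

Dense vegetation reflects strongly in the near infrared and absorbs red light, so its NDVI approaches 1, while bare soil stays close to 0.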

1. Download Sentinel-2 NDVI tile

EngineSatelital.descargar_ndvi() queries COPERNICUS/S2_HARMONIZED for the best available scene within a 60-day window ending at the event date (or the fallback range 2025-12-01 to 2026-03-31). Scenes with more than 20% cloud cover are excluded. The tile covers a 2,500 m buffer around the input GPS point.
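The scene search could look roughly like the sketch below. It is not the body of `descargar_ndvi()`: the helper names `ventana_fechas` and `coleccion_s2` are hypothetical, sorting by `CLOUDY_PIXEL_PERCENTAGE` is only one possible reading of "best available", and using the fallback range when no event date is supplied is an assumption. The Earth Engine calls themselves (`filterBounds`, `filterDate`, `ee.Filter.lt`, `buffer`) are standard API.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

FALLBACK_RANGE = ("2025-12-01", "2026-03-31")  # fallback range quoted in this step


def ventana_fechas(fecha_evento: Optional[date], dias: int = 60) -> Tuple[str, str]:
    """Return (start, end) ISO dates for the scene search window."""
    if fecha_evento is None:  # assumption: fallback applies when no date is given
        return FALLBACK_RANGE
    inicio = fecha_evento - timedelta(days=dias)
    return inicio.isoformat(), fecha_evento.isoformat()


def coleccion_s2(lon: float, lat: float, inicio: str, fin: str):
    """Build the filtered Sentinel-2 collection (requires earthengine-api)."""
    import ee  # imported lazily so the date helper works without Earth Engine

    region = ee.Geometry.Point([lon, lat]).buffer(2500).bounds()
    coleccion = (
        ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
        .filterBounds(region)
        .filterDate(inicio, fin)  # note: Earth Engine treats the end date as exclusive
        .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 20))
        .sort("CLOUDY_PIXEL_PERCENTAGE")  # assumption: least cloudy scene first
    )
    return coleccion, region
```

For an event dated 2026-01-15, `ventana_fechas` yields the window 2025-11-16 to 2026-01-15.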

2. Normalise NDVI to RGB

ndvi_a_rgb() clips NDVI to the range [-0.2, 0.8], normalises to [0, 255], and applies the RdYlGn colormap. This gives SAM a high-contrast image where healthy vegetation appears green and bare soil or stressed areas appear red.
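A minimal sketch of the clip-and-normalise step, using NumPy only. The three-stop red, yellow, green ramp is a stand-in for the full RdYlGn colormap (which has more stops), and the name `ndvi_a_rgb_sketch` is hypothetical; the real `ndvi_a_rgb()` may differ.

```python
import numpy as np

# Assumption: three colour stops approximating RdYlGn (end and middle colours
# of the ColorBrewer scheme); the real colormap interpolates through more stops.
_PARADAS = np.array([0.0, 127.5, 255.0])
_RAMPA = np.array([[165, 0, 38], [255, 255, 191], [0, 104, 55]], dtype=np.float64)


def ndvi_a_rgb_sketch(ndvi, vmin=-0.2, vmax=0.8):
    """Clip NDVI to [vmin, vmax], scale to [0, 255], and map to an RGB image."""
    recortado = np.clip(np.asarray(ndvi, dtype=np.float64), vmin, vmax)
    escala = (recortado - vmin) / (vmax - vmin) * 255.0
    # Interpolate each colour channel separately along the ramp.
    canales = [np.interp(escala, _PARADAS, _RAMPA[:, c]) for c in range(3)]
    return np.stack(canales, axis=-1).round().astype(np.uint8)
```

The result is an H × W × 3 `uint8` array, which is the image layout SAM's predictor expects.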

3. Compute dynamic bounding box

bounding_box_dinamico() calculates the expected pixel extent of the field from the reported damage area (dano_ha) at 10 m/pixel resolution. The resulting bounding box constrains SAM to the region most likely to correspond to the target field.
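The arithmetic behind the box can be sketched as follows. At 10 m/pixel one hectare (10,000 m²) covers 100 pixels, so a 100 ha field covers about 10,000 pixels. Treating the field as an equivalent square and padding it with a `margen` factor are assumptions made for illustration, as is the function name; the real `bounding_box_dinamico()` may use a different shape or margin.

```python
import math


def bounding_box_sketch(dano_ha, ancho_px, alto_px, m_por_px=10.0, margen=1.5):
    """Return (x0, y0, x1, y1) in pixels, centred on the image and clamped to it."""
    area_px = dano_ha * 10_000 / (m_por_px ** 2)  # 1 ha = 10,000 m2
    lado = math.sqrt(area_px) * margen            # side of the padded equivalent square
    cx, cy = ancho_px / 2, alto_px / 2
    x0 = max(0, int(round(cx - lado / 2)))
    y0 = max(0, int(round(cy - lado / 2)))
    x1 = min(ancho_px, int(round(cx + lado / 2)))
    y1 = min(alto_px, int(round(cy + lado / 2)))
    return x0, y0, x1, y1
```

With a 500 × 500 pixel tile (a 2,500 m buffer at 10 m/pixel) and no padding, a 100 ha field maps to the box (200, 200, 300, 300).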

4. Run SAM inference

SegmentadorSAM.segmentar() runs SAM with a single positive point at the image centre and the computed bounding box. If the segmented mask area exceeds 1.5× the expected area (leak detection), the model is re-run with four additional negative points at the bounding box corners to suppress background bleed.
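The leak-detection retry can be sketched as below. The function is hypothetical and not the body of `SegmentadorSAM.segmentar()`; it assumes `predictor` behaves like `segment_anything.SamPredictor` after `set_image()` has been called, and that the expected area has already been converted to pixels.

```python
import numpy as np


def segmentar_con_control_de_fuga(predictor, caja, area_esperada_px, centro, umbral=1.5):
    """Run SAM once; if the mask leaks, retry with negative points at the box corners."""
    caja = np.asarray(caja, dtype=np.float32)      # (x0, y0, x1, y1)
    puntos = np.array([centro], dtype=np.float32)  # single positive prompt
    etiquetas = np.array([1], dtype=np.int32)      # 1 = foreground

    mascaras, scores, _ = predictor.predict(
        point_coords=puntos, point_labels=etiquetas, box=caja, multimask_output=False
    )
    if mascaras[0].sum() > umbral * area_esperada_px:
        # Leak detected: add the four box corners as background prompts.
        x0, y0, x1, y1 = caja
        esquinas = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=np.float32)
        puntos = np.vstack([puntos, esquinas])
        etiquetas = np.concatenate([etiquetas, np.zeros(4, dtype=np.int32)])  # 0 = background
        mascaras, scores, _ = predictor.predict(
            point_coords=puntos, point_labels=etiquetas, box=caja, multimask_output=False
        )
    return mascaras[0].astype(bool), float(scores[0])
```

The returned score is the value SAM reports for the selected mask, which corresponds to the `sam_score` property in the output.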

5. Vectorise and filter

OpenCV findContours extracts the contours of the binary mask, and the largest one is kept. approxPolyDP then simplifies it with a tolerance (epsilon) of 0.005 × the contour perimeter. Polygons whose computed geodesic area falls outside the range of 5–800 ha are rejected.

6. Georeference to WGS-84

Each pixel coordinate is linearly interpolated back to geographic coordinates using the bounds of the GEE bounding box, producing a valid GeoJSON Polygon geometry ready for downstream processing.
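The linear interpolation can be sketched in plain Python. The function name and the `(west, south, east, north)` ordering of the bounds are assumptions, as is dividing by the full image width and height rather than by width − 1 and height − 1.

```python
def pixeles_a_geojson(pixeles, limites, ancho_px, alto_px):
    """Map (col, row) pixel vertices to a GeoJSON Polygon in lon/lat."""
    oeste, sur, este, norte = limites
    anillo = []
    for col, fila in pixeles:
        lon = oeste + (col / ancho_px) * (este - oeste)
        lat = norte - (fila / alto_px) * (norte - sur)  # row 0 is the northern edge
        anillo.append([lon, lat])
    if anillo and anillo[0] != anillo[-1]:
        anillo.append(list(anillo[0]))  # GeoJSON rings must repeat the first vertex
    return {"type": "Polygon", "coordinates": [anillo]}
```

This sketch closes the ring but does not normalise its winding order, which some downstream tools may expect.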

Input formats

CSV / Excel file

A tabular file with latitude, longitude, crop type, event date, and estimated area per row. Column names are matched flexibly — lat_dec, latitude, lat, y, and latitud are all recognised automatically.
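Flexible column matching of this kind can be sketched with a simple alias lookup. The latitude aliases come from the list above; the function name and the case-insensitive, whitespace-tolerant comparison are assumptions about how the matching might work.

```python
ALIAS_LATITUD = ("lat_dec", "latitude", "lat", "y", "latitud")


def encontrar_columna(columnas, alias):
    """Return the first input column whose normalised name matches an alias, else None."""
    # Map the normalised name back to the original header so callers can index with it.
    normalizadas = {str(c).strip().lower(): c for c in columnas}
    for nombre in alias:
        if nombre in normalizadas:
            return normalizadas[nombre]
    return None
```

For a file with headers `ID`, `Latitud`, and `lon`, the lookup returns `Latitud` as the latitude column.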

Shapefile

An existing .shp file is passed directly to run_full_analysis(). The pipeline validates CRS and reprojects to a dynamic UTM zone. Useful when boundaries are partially known.
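Choosing a UTM zone dynamically usually means deriving it from a representative longitude and latitude. The sketch below uses the standard 6° zone rule and the WGS-84 UTM EPSG numbering (326xx north, 327xx south); the function name is hypothetical and the source does not say how AgroIA selects the zone.

```python
def epsg_utm(lon, lat):
    """Return the EPSG code of the WGS-84 UTM zone containing (lon, lat)."""
    zona = int((lon + 180) // 6) + 1
    zona = min(max(zona, 1), 60)       # keep lon = 180 inside zone 60
    base = 32600 if lat >= 0 else 32700  # northern vs southern hemisphere
    return base + zona
```

For a point near Tandil (longitude −59.13, latitude −37.32) this gives EPSG:32721, UTM zone 21 South. With GeoPandas, `GeoDataFrame.estimate_utm_crs()` followed by `to_crs()` is an alternative way to reach a similar result.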

Output

The poligonizador produces a GeoJSON FeatureCollection where each feature carries the following properties:

| Property | Type | Description |
| --- | --- | --- |
| id | string | Original row identifier from the input file |
| localidad | string | Locality or municipality |
| cultivo | string | Crop type |
| area_ha | number | Computed geodesic area in hectares |
| error_pct | number | Percentage difference from the reported damage area |
| sam_score | number | SAM mask confidence score (0–1) |
| fecha_sat | string | Date of the Sentinel-2 scene used |

The GeoJSON output is automatically saved every 10 records during batch processing, so partial results are preserved if the run is interrupted.
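A sketch of how one feature and the periodic save might be assembled, using only the standard library. Both function names are hypothetical, the input row is assumed to be a dict with `id` and `dano_ha` keys, and the signed formula for `error_pct` is an assumption (the pipeline may report an absolute difference instead).

```python
import json
from pathlib import Path


def construir_feature(fila, geometria, area_ha, sam_score, fecha_sat):
    """Build one GeoJSON Feature with the properties listed above."""
    dano_ha = float(fila["dano_ha"])
    return {
        "type": "Feature",
        "geometry": geometria,
        "properties": {
            "id": str(fila["id"]),
            "localidad": fila.get("localidad"),
            "cultivo": fila.get("cultivo"),
            "area_ha": round(area_ha, 2),
            # Assumption: signed difference relative to the reported damage area.
            "error_pct": round((area_ha - dano_ha) / dano_ha * 100, 1),
            "sam_score": round(sam_score, 3),
            "fecha_sat": fecha_sat,
        },
    }


def guardar_parcial(features, ruta, cada=10, forzar=False):
    """Write the FeatureCollection every `cada` records; return True if written."""
    if not forzar and (not features or len(features) % cada != 0):
        return False
    coleccion = {"type": "FeatureCollection", "features": features}
    Path(ruta).write_text(json.dumps(coleccion, ensure_ascii=False), encoding="utf-8")
    return True
```

Calling `guardar_parcial(features, ruta, forzar=True)` once after the loop would flush the final records when the batch size is not a multiple of 10.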

Production runs

AgroIA has been validated in two production polygon runs using real agricultural event data from Argentina:

1st run — TAYPE zone

268 polygons delineated over maize fields in the TAYPE zone. Input data consisted of GPS centroids with estimated damage areas. Output: Poligonizacion/1ER CORRIDA/.

2nd run — Pivot circles

340 pivot irrigation polygons in the Tandil/Balcarce region. Circular pivot fields are particularly challenging for manual digitising; SAM’s prompt-based approach handles them naturally. Output: Poligonizacion/2DA CORRIDA PIVOTES/.

Validation metrics

The 2nd run was independently validated against INTA Balcarce reference boundaries:

| Metric | Value |
| --- | --- |
| Geometric precision vs. INTA Balcarce | 75% |
| Mean SAM confidence score | 0.962 |
| Mean area error | 9.8% |

The 75% geometric precision figure compares SAM-derived polygon boundaries against manually digitised INTA reference polygons using intersection-over-union. The 9.8% area error reflects the difference between the computed geodesic area and the declared damage area in the input dataset, not a systematic over- or under-segmentation.
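Intersection-over-union is the shared area of two shapes divided by their combined area. The sketch below computes it on rasterised boolean masks with NumPy; this is an illustration of the metric only, since the source does not say whether the validation used rasters or vector geometries, nor how per-polygon values were aggregated into the 75% figure.

```python
import numpy as np


def iou(mascara_a, mascara_b):
    """Intersection-over-union of two boolean masks of the same shape."""
    a = np.asarray(mascara_a, dtype=bool)
    b = np.asarray(mascara_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # both masks empty: define IoU as 0 rather than divide by zero
    return float(np.logical_and(a, b).sum() / union)
```

Two equal squares that overlap by half share 50 units of 150 in total, so their IoU is 1/3.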

Hardware and model configuration

| Parameter | Value |
| --- | --- |
| SAM model | vit_b (ViT-Base) |
| Checkpoint | sam_vit_b_01ec64.pth |
| Device | Auto-detected: CUDA if available, else CPU |
| Sentinel-2 resolution | 10 m/pixel |
| Cloud filter | < 20% cloud cover |
| Area filter | 5–800 ha |

Running on a CUDA-capable GPU reduces per-polygon inference time from ~8 seconds (CPU) to under 1 second. For the 340-polygon pivot run, GPU processing cut total wall time from roughly 45 minutes to under 6 minutes.
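Device auto-detection and model loading typically look like the sketch below. The helper names are hypothetical and the checkpoint path is assumed to be relative to the working directory; `sam_model_registry` and `SamPredictor` are the public entry points of the `segment_anything` package.

```python
def detectar_dispositivo():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed: fall back to CPU for this sketch
    return "cuda" if torch.cuda.is_available() else "cpu"


def cargar_predictor(checkpoint="sam_vit_b_01ec64.pth"):
    """Load the ViT-Base SAM checkpoint and wrap it in a predictor."""
    # Imported lazily so device detection works without segment_anything installed.
    from segment_anything import SamPredictor, sam_model_registry

    dispositivo = detectar_dispositivo()
    sam = sam_model_registry["vit_b"](checkpoint=checkpoint)
    sam.to(device=dispositivo)
    return SamPredictor(sam), dispositivo
```

The timing figures are mutually consistent: 340 polygons at about 8 seconds each is roughly 45 minutes, and at under 1 second each is under 6 minutes.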

Notebooks

The following notebooks in Poligonizacion/ provide interactive environments for running and inspecting polygon generation:

| Notebook | Purpose |
| --- | --- |
| AgroIA_Poligonizador_Master.ipynb | Full pipeline: CSV input → GeoJSON + PDF report + HTML map |
| Herramienta_Definitiva_Poligonizacion.ipynb | Refined version with advanced leak control |
| Poligonizador_Colab.ipynb | Google Colab-compatible version for GPU access without local setup |

The command-line entry point poligonizador_final.py accepts --csv (input file) and --output (filename prefix) flags and runs the same pipeline as the notebooks.
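An interface with these two flags could be declared with `argparse` as in the sketch below. This is not the actual source of poligonizador_final.py: whether the flags are required, and their help texts, are assumptions.

```python
import argparse


def construir_parser():
    """Declare a parser exposing the two documented flags."""
    parser = argparse.ArgumentParser(
        prog="poligonizador_final.py",
        description="Delineate field polygons from GPS points and estimated areas.",
    )
    # Assumption: both flags are mandatory in this sketch.
    parser.add_argument("--csv", required=True, help="input file with one field per row")
    parser.add_argument("--output", required=True, help="filename prefix for the outputs")
    return parser
```

With a parser like this, `construir_parser().parse_args(["--csv", "lotes.csv", "--output", "corrida"])` yields `args.csv == "lotes.csv"` and `args.output == "corrida"`; both file names here are placeholders.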
The poligonizador requires a valid Google Earth Engine project ID configured in config/.env under GEE_PROJECT_ID. Run earthengine authenticate before first use. See the environment configuration page for setup instructions.
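Reading the project ID and initialising Earth Engine might be done as sketched here. The helper names are hypothetical, the `.env` parser handles only simple `KEY=value` lines, and letting an environment variable take precedence over the file is an assumption; `ee.Initialize(project=...)` is the standard call in the `earthengine-api` package.

```python
import os
from pathlib import Path


def leer_env(ruta="config/.env"):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    valores = {}
    for linea in Path(ruta).read_text(encoding="utf-8").splitlines():
        linea = linea.strip()
        if not linea or linea.startswith("#") or "=" not in linea:
            continue
        clave, valor = linea.split("=", 1)
        valores[clave.strip()] = valor.strip().strip('"').strip("'")
    return valores


def inicializar_gee(ruta_env="config/.env"):
    """Initialise Earth Engine with GEE_PROJECT_ID (requires prior authentication)."""
    import ee  # imported lazily so leer_env works without earthengine-api

    proyecto = os.environ.get("GEE_PROJECT_ID") or leer_env(ruta_env).get("GEE_PROJECT_ID")
    if not proyecto:
        raise RuntimeError("GEE_PROJECT_ID is not set in the environment or in " + ruta_env)
    ee.Initialize(project=proyecto)
    return proyecto
```

If `earthengine authenticate` has not been run on the machine, `ee.Initialize` raises an error, so the explicit `RuntimeError` above only covers the missing project ID.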

Batch processing guide

Process hundreds of polygons from a GeoJSON in a single command.

AgroIA Score

How delineated polygons are scored using NDVI and climate data.

Pipeline guide

Run the full analysis pipeline end-to-end on a single field.

Architecture overview

System-level view of how SAM, GEE, and the RAG engine interact.
