AgroIA ships as a Docker Compose stack (a FastAPI backend on port 8000 and a Streamlit dashboard on port 8501), plus an optional Telegram bot that is started separately and an Ollama instance that runs on the host machine. The steps below take you from a fresh clone to a verified running system and your first pipeline analysis against a real shapefile. The whole sequence takes about ten minutes, excluding Ollama model download time.
Prerequisites
Before you begin, confirm the following are available on your machine:
- Docker and Docker Compose: Docker Desktop ≥ 4.x or Docker Engine with the Compose plugin.
- Ollama running locally (ollama serve) with the two required models pulled (see below).
- Google Earth Engine credentials: an authenticated earthengine CLI and a registered GEE cloud project.
- Python 3.10+: only required if you plan to run start.py outside Docker.
Ollama must run on the host machine, not inside Docker. The containers reach it via host.docker.internal:11434. On Linux, add --add-host=host.docker.internal:host-gateway to your Docker run command if host.docker.internal is not automatically resolved.
Pull the required Ollama models
nomic-embed-text generates the 768-dimensional embeddings stored in pgvector. gemma3:4b is the generation model used by the RAG engine for natural-language lot queries.
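Once both models have been pulled with ollama pull, a short script can confirm they are installed. The following is a minimal sketch, assuming Ollama listens on its default port 11434 and exposes its model list at /api/tags. The helper names missing_models and fetch_tags are illustrative and not part of AgroIA.

```python
import json
from urllib.request import urlopen

# The two models this page lists as required.
REQUIRED_MODELS = ("nomic-embed-text", "gemma3:4b")


def missing_models(tags_payload, required=REQUIRED_MODELS):
    """Return the required models that are absent from an /api/tags payload."""
    installed = {m.get("name", "") for m in tags_payload.get("models", [])}
    missing = []
    for name in required:
        # Ollama lists untagged pulls as "<name>:latest", so accept both forms.
        candidates = {name} if ":" in name else {name, f"{name}:latest"}
        if not candidates & installed:
            missing.append(name)
    return missing


def fetch_tags(base_url="http://localhost:11434"):
    """Query the local Ollama server for its installed models."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


# Example (requires a running Ollama server):
# print(missing_models(fetch_tags()))
```

An empty list means both models are available. Any name that is returned still needs to be pulled.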
Setup
Configure environment variables
Copy the example configuration file and fill in your credentials. Then open config/.env and set at minimum the required variables.
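Before starting the stack, it can help to confirm that config/.env defines every variable you expect. The sketch below is one way to do that under stated assumptions: GEE_PROJECT_ID is the only variable name taken from this page, and the parser handles only plain KEY=VALUE lines (no inline comments or multi-line values). The function names are hypothetical.

```python
from pathlib import Path

# GEE_PROJECT_ID is the only variable name taken from this page; extend the
# tuple with the other keys listed in your example configuration file.
REQUIRED_KEYS = ("GEE_PROJECT_ID",)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are unset or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path="config/.env"):
    """Read the env file at path and report which required keys are missing."""
    return missing_keys(parse_env(Path(path).read_text(encoding="utf-8")))
```

Calling check_env_file() from the repository root returns an empty list when every required key has a value.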
Start PostgreSQL with pgvector
The database must be running before the API container starts. Spin up a pgvector-enabled PostgreSQL instance with Docker, then apply the schema migration, which is idempotent and safe to run multiple times. The migration creates the informes_lotes and lote_historial tables with all required indexes and the vector extension.
Start the stack with Docker Compose
Docker Compose starts the two containers defined in docker-compose.yml:
| Container | Service | Port |
|---|---|---|
| agroia_api | FastAPI (ingestion + lotes + RAG) | 8000 |
| agroia_ui | Streamlit dashboard | 8501 |
Both containers read config/.env via env_file and connect to host.docker.internal for the database and Ollama.
The Telegram bot is not included in the Compose file. Start it separately with python start.py --bot once the API is running.
Running services outside Docker
If you prefer to run all services directly on your machine, which is useful for pipeline development or debugging, install the Python dependencies and use the unified start.py launcher. The launcher covers three cases:
- All services
- Individual services
- Skip prerequisites check
Press Ctrl+C to stop all processes cleanly.
Running your first pipeline analysis
The pipeline takes a shapefile and a crop type, runs the full GEE + NASA POWER + Score analysis, and automatically ingests the result into the RAG database.
Authenticate Google Earth Engine
Authenticate with the earthengine CLI, then confirm that GEE_PROJECT_ID in config/.env matches your registered GEE cloud project.
Run the pipeline on a shapefile
Pass the pipeline a shapefile and a crop type. Supported crop (cultivo) values are maiz, soja, trigo, and girasol. If omitted, the default is maiz.
The pipeline executes these steps internally:
- init_gee(): GEE authentication and project binding.
- validar_shapefile(): CRS validation and dynamic UTM projection.
- get_nasa_climate_safe(): six years of NASA POWER climate data.
- get_gee_ndvi_validado(): Sentinel-2 SR NDVI with window fallback.
- calcular_score(): AgroIA Score (0–100) and K-Means A/B/C zoning.
- build_report(): PDF written to src/outputs/.
- generar_mapa_offline(): interactive HTML map written to outputs/.
- enviar_al_rag(): automatic ingestion into pgvector.
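The crop rule stated before the step list (four supported values, with maiz as the default) can be sketched as a small validation helper. This is an illustration only. The name resolve_cultivo is hypothetical, and the case-insensitive matching is an assumption rather than documented AgroIA behavior.

```python
# Crop values and default as listed on this page.
SUPPORTED_CULTIVOS = ("maiz", "soja", "trigo", "girasol")
DEFAULT_CULTIVO = "maiz"


def resolve_cultivo(value=None):
    """Return a supported crop name, falling back to the default when omitted."""
    if value is None or not value.strip():
        return DEFAULT_CULTIVO
    # Assumption: input is matched case-insensitively after trimming whitespace.
    cultivo = value.strip().lower()
    if cultivo not in SUPPORTED_CULTIVOS:
        raise ValueError(
            f"Unsupported cultivo {value!r}; expected one of {', '.join(SUPPORTED_CULTIVOS)}"
        )
    return cultivo
```

Rejecting unknown crops before any GEE or NASA POWER request is made avoids spending time on a run that cannot be scored.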
Process a batch from SAM GeoJSON output
To process all polygons produced by the SAM delineator in one operation, run the pipeline against the SAM GeoJSON output. GEE is initialized once and reused across all polygons. The final output prints a success/failure count.
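The batch behavior described above (one shared initialization, per-polygon processing, and a final success/failure count) can be sketched as follows. This is not AgroIA's implementation. The sketch assumes the SAM output is a standard GeoJSON FeatureCollection, and process_polygon is a hypothetical callable standing in for the single-lot pipeline. Any one-time setup such as GEE initialization would happen once, before process_batch is called.

```python
import json


def process_batch(geojson_text, process_polygon):
    """Run process_polygon on every polygon feature and count the outcomes."""
    collection = json.loads(geojson_text)
    succeeded, failed = 0, 0
    for index, feature in enumerate(collection.get("features", [])):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") not in ("Polygon", "MultiPolygon"):
            continue  # skip points, lines, and features without geometry
        try:
            process_polygon(geometry, feature.get("properties") or {})
            succeeded += 1
        except Exception as exc:  # one bad lot should not abort the batch
            failed += 1
            print(f"feature {index} failed: {exc}")
    print(f"Batch finished: {succeeded} succeeded, {failed} failed")
    return succeeded, failed
```

Catching exceptions per feature keeps a single invalid polygon from discarding the results of the rest of the batch.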
Explore results in the dashboard
Open http://localhost:8501 to explore the ingested lots. You can browse the lot ranking by AgroIA Score, inspect NDVI time series charts, view the Folium interactive map, and query the RAG agent in natural language, for example: “Which lots had the lowest NDVI stability in 2024?”
Service URLs reference
| Service | URL | Notes |
|---|---|---|
| FastAPI REST | http://localhost:8000 | Base URL for all API calls |
| Interactive API docs | http://localhost:8000/docs | Swagger UI (auto-generated) |
| Health check | http://localhost:8000/health | Returns {"status":"ok"} |
| Streamlit dashboard | http://localhost:8501 | Lot explorer + RAG chat |
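To verify the running system from a script, you can poll the health check URL listed in the table until it answers. The sketch below uses only the standard library and relies on the documented response body {"status":"ok"}. The function names, retry count, and delay are illustrative choices.

```python
import json
import time
from urllib.error import URLError
from urllib.request import urlopen


def is_healthy(body):
    """True when a /health response body decodes to {"status": "ok"}."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def wait_for_api(url="http://localhost:8000/health", attempts=10, delay=2.0):
    """Poll the health endpoint until it reports ok or the attempts run out."""
    for _ in range(attempts):
        try:
            with urlopen(url, timeout=3) as response:
                if is_healthy(response.read()):
                    return True
        except (URLError, OSError):
            pass  # API is not accepting connections yet
        time.sleep(delay)
    return False
```

Calling wait_for_api() right after starting the stack returns True once the API container is ready to accept requests.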
Next steps
System architecture
Understand how each component connects and how data flows from shapefile input to RAG-powered report.
Pipeline guide
Deep dive into pipeline configuration, cultivo parameters, output formats, and batch processing.
Environment configuration
Full reference for all .env variables, their defaults, and validation rules.
API overview
Complete API reference including authentication, request schemas, and response formats.