The RAG engine is a local agronomic expert that has read every field report you have ever generated. It knows the NDVI history of each lot, the breakdown of every Score component across campaigns, which years were climatically stressed, and how spatial variability has evolved over time. When you ask “Why did POLIGONO_14 score 58 this season?”, it retrieves the exact consolidated report and the most relevant campaign records, then lets Gemma 3 synthesise a structured agronomic response — all without sending a single byte of your data to an external server.

Architecture

The RAG engine (src/rag/core.py) is built on four components that run entirely on your local infrastructure:

**pgvector**

PostgreSQL extension that stores 768-dimensional embeddings in the `informes_lotes` table and enables cosine similarity search via the `<=>` operator.

**nomic-embed-text**

Embedding model served by Ollama. Converts both stored field reports and incoming questions into the same vector space for semantic retrieval.

**Gemma 3 (gemma3:4b)**

Generation model served by Ollama. Receives the retrieved context and user question, then produces a calibrated agronomic response with adaptive length.

**FastAPI ingestion layer**

Writes field reports and time-series records into `informes_lotes` and `lote_historial` after each pipeline run. The RAG engine reads; the API writes.
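
To make the embedding/generation split concrete, here is a minimal sketch of the two Ollama calls, assuming the official `ollama` Python client (the actual wiring inside src/rag/core.py is not shown on this page and may differ):

```python
import ollama

EMBED_MODEL = "nomic-embed-text"  # produces 768-dimensional vectors
GEN_MODEL = "gemma3:4b"           # local generation model

def embed(text: str) -> list[float]:
    # Stored reports and incoming questions pass through the same
    # model, so both land in the same vector space.
    return ollama.embeddings(model=EMBED_MODEL, prompt=text)["embedding"]

def generate(prompt: str) -> str:
    # Single-shot generation; the retrieved context is already
    # part of the prompt string.
    return ollama.generate(model=GEN_MODEL, prompt=prompt)["response"]
```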

Retrieval strategy

A key design decision is that `informes_lotes` has a `UNIQUE(lote_id)` constraint — there is always exactly one consolidated row per field. A naive top-k vector search would therefore return the same single row regardless of `top_k`. AgroIA uses a two-step retrieval strategy to make `top_k` meaningful:
**Step 1: Consolidated report via vector search**

`fetch_context()` embeds the user’s question with nomic-embed-text and queries `informes_lotes` using pgvector cosine similarity (`embedding <=> %s::vector`). This always returns the single consolidated record for the requested field, including Score total, NDVI average, thermal stress, and the full technical content block.
**Step 2: Time series from `lote_historial`**

The function then queries `lote_historial` for the `top_k - 1` most recent campaign years, ordered by `anio DESC`. Each row carries the full Score breakdown (Vigor, Stability, Cleanliness, Climate), NDVI at the critical month, heat hours, and spatial zone flags.
**Step 3: Context assembly**

Both fragments are joined into a single string with citation markers: `[ID-X]` for the consolidated report and `[Campaña YYYY]` for each historical record. This assembled context is prepended to the user question before being sent to Gemma 3.

Increasing `top_k` gives the LLM more campaign years as context. The default is `top_k=3`, which provides the consolidated report plus the 2 most recent campaigns. For multi-year trend questions, pass `top_k=6` or higher.
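
The following is a minimal, hypothetical sketch of this two-step retrieval, assuming a psycopg2 connection, the column names documented in the next section, and an `embed()` helper like the one sketched under Architecture; the real `fetch_context()` in src/rag/core.py may differ in detail:

```python
import psycopg2  # assumed driver; the project may use another client

def fetch_context_sketch(conn, lote_id: str, pregunta: str, top_k: int = 3) -> str:
    q_vec = embed(pregunta)  # nomic-embed-text, 768 dimensions
    with conn.cursor() as cur:
        # Step 1: the single consolidated report, ranked by cosine distance.
        cur.execute(
            """
            SELECT id, score_total, ndvi_promedio, contenido_tecnico
            FROM informes_lotes
            WHERE lote_id = %s
            ORDER BY embedding <=> %s::vector
            LIMIT 1
            """,
            (lote_id, str(q_vec)),
        )
        rep_id, score, ndvi, contenido = cur.fetchone()
        # Step 2: the top_k - 1 most recent campaign years.
        cur.execute(
            """
            SELECT anio, score_vigor, score_estabilidad, score_limpieza,
                   score_clima, ndvi_critico
            FROM lote_historial
            WHERE lote_id = %s
            ORDER BY anio DESC
            LIMIT %s
            """,
            (lote_id, top_k - 1),
        )
        campanias = cur.fetchall()
    # Step 3: assemble the context string with citation markers.
    partes = [f"[ID-{rep_id}] Score {score}/100, NDVI {ndvi}. {contenido}"]
    for anio, vigor, estab, limpieza, clima, ndvi_c in campanias:
        partes.append(
            f"[Campaña {anio}] Vigor={vigor} Estabilidad={estab} "
            f"Limpieza={limpieza} Clima={clima} NDVI crítico={ndvi_c}"
        )
    return "\n\n".join(partes)
```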

What the agent knows

Each field record in the RAG database contains the following information, all derived from the pipeline:
| Data point | Source table | Field |
| --- | --- | --- |
| NDVI at critical month (current) | `informes_lotes` | `ndvi_promedio` |
| Accumulated heat hours (current) | `informes_lotes` | `gdd_acumulados` |
| AgroIA Score total | `informes_lotes` | `score_total` |
| Crop and area | `informes_lotes` | `cultivo`, `superficie_ha` |
| Full technical narrative | `informes_lotes` | `contenido_tecnico` |
| Score components per campaign | `lote_historial` | `score_vigor`, `score_estabilidad`, `score_limpieza`, `score_clima` |
| NDVI per campaign | `lote_historial` | `ndvi_critico` |
| Spatial zone classification | `lote_historial` | `zonificacion_activa`, `puntos_zona_c` |
| Years excluded from scoring | `lote_historial` | `valido_para_score` |
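
For illustration, a consolidated record (as returned by `get_datos_lote_raw()`, described under Public API below) might look like this; the key names follow the fields above, but every value here is invented:

```python
# Hypothetical consolidated record; all values are illustrative.
informe = {
    "lote_id": "POLIGONO_14",
    "cultivo": "maíz",
    "superficie_ha": 52.3,
    "score_total": 58,
    "ndvi_promedio": 0.61,
    "gdd_acumulados": 1240,
    "contenido_tecnico": "Informe consolidado de la campaña actual ...",
}
```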

Agent prompt

The following `BASE_PROMPT` constant, defined in src/rag/core.py, governs the agent’s behaviour and is imported — never duplicated — by the Streamlit dashboard and Telegram bot:
```python
BASE_PROMPT = (
    "Eres un asistente agronómico experto del sistema AgroIA. "
    "Responde basándote estrictamente en el contexto recuperado. "
    "ADAPTA LA EXTENSIÓN: Si la pregunta es simple, responde de forma corta y directa. Si es compleja o pide análisis, sé profundo. "
    "No te limites a repetir números; analiza la relación entre ellos si es relevante. "
    "Si preguntan por el Score, desglosa los componentes (Vigor, Estabilidad, Limpieza, Clima) solo si es necesario para responder la duda. "
    "CITAS: Usa [ID-X] para referirte al informe actual y [Campaña YYYY] en lugar de [HIST-YYYY] para que sea más natural para el usuario. "
    "Si no hay información suficiente, indicá 'Dato no disponible'."
)
```
The prompt enforces three important behaviours: strict grounding in retrieved context, adaptive response length, and human-readable citations rather than internal database IDs.
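
As a hypothetical sketch of how the pieces fit together (the CONTEXTO/PREGUNTA labels and the exact concatenation are assumptions; the real assembly lives inside `consultar_agente()`):

```python
from src.rag.core import BASE_PROMPT, fetch_context

def build_prompt(lote_id: str, pregunta: str, top_k: int = 3) -> str:
    # Retrieval only; fetch_context() makes no LLM call.
    contexto = fetch_context(lote_id, pregunta, top_k=top_k)
    # The retrieved context is prepended to the user question,
    # with BASE_PROMPT acting as the system instruction.
    return f"{BASE_PROMPT}\n\nCONTEXTO:\n{contexto}\n\nPREGUNTA: {pregunta}"
```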

Example questions and responses

**Question:** ¿Cuál es el Score actual de TAYPE_LOTE_001?
**Expected response:** A direct one-line answer citing the Score total from [ID-X], e.g. "El Score actual de TAYPE_LOTE_001 es 74/100 [ID-3]."

**Question:** ¿Por qué el score de POLIGONO_14 bajó respecto a la campaña anterior?
**Expected response:** A comparison of Vigor, Stability, Cleanliness, and Climate between the current consolidated report and the most recent [Campaña YYYY] entries, identifying which component changed most and providing an agronomic interpretation.

**Question:** ¿Cómo evolucionó la estabilidad de INTA_PIVOTE_001 en los últimos 4 años?
**Expected response:** A structured analysis of the `score_estabilidad` values across the 4 most recent campaigns from `lote_historial`, noting any trend or anomaly. Requires `top_k=5` or higher to retrieve sufficient history.

**Question:** ¿Tuvo estrés térmico MAIZSUPERPRUEBA este ciclo?
**Expected response:** A direct answer citing `gdd_acumulados` from the consolidated report and the crop’s heat threshold, concluding whether the Climate component was significantly penalised.

**Question:** ¿Qué pasó con los datos satelitales de 2022 en POLIGONO_07?
**Expected response:** If the year appears with `valido_para_score = false` in `lote_historial`, the agent reports that it was excluded and provides the reason if available in the technical narrative. If no data exists at all, it responds with "Dato no disponible."
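
The stability question above needs more history than the default retrieval window; programmatically it could be asked like this (an illustrative call using the public API described in the next section):

```python
from src.rag.core import consultar_agente

# top_k=5 retrieves the consolidated report plus the 4 most recent
# campaigns required for a 4-year stability trend.
respuesta = consultar_agente(
    "INTA_PIVOTE_001",
    "¿Cómo evolucionó la estabilidad en los últimos 4 años?",
    top_k=5,
)
print(respuesta)
```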

Public API

The following functions are exported from `src/rag/core.py` and should be imported — never reimplemented — by other modules:
| Function | Signature | Description |
| --- | --- | --- |
| `consultar_agente` | `(lote_id, pregunta, top_k=3) → str` | Full RAG pipeline: retrieval + generation |
| `fetch_context` | `(lote_id, pregunta, top_k=3) → str` | Retrieval only, no LLM call |
| `listar_lotes` | `() → list[str]` | All `lote_id` values in `informes_lotes` |
| `get_historial_lote_raw` | `(lote_id) → list[dict]` | Complete time series for a field |
| `get_datos_lote_raw` | `(lote_id) → dict \| None` | Consolidated report as a dict |
| `BASE_PROMPT` | constant | System prompt for import by UI and bot |
Always import `BASE_PROMPT` from `src/rag/core.py` rather than redefining it in other modules. Divergent prompts across the dashboard and bot create inconsistent agent behaviour and make prompt tuning error-prone.
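
A short, hypothetical consumer of this API, assuming the return shapes listed above:

```python
from src.rag.core import get_historial_lote_raw, listar_lotes

# Enumerate every field known to the RAG database and report how
# many campaign years each one carries.
for lote_id in listar_lotes():
    historial = get_historial_lote_raw(lote_id)
    print(f"{lote_id}: {len(historial)} campañas en lote_historial")
```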

Data sovereignty

Every component of the RAG engine runs on your own infrastructure:
  • PostgreSQL + pgvector runs in a local Docker container.
  • Ollama serves both nomic-embed-text and gemma3:4b locally — no API keys, no data egress.
  • Field reports never leave your network. The only external calls are to Google Earth Engine (NDVI retrieval) and NASA POWER (climate data) during pipeline execution, not during RAG queries.
To start all services, including the RAG-enabled dashboard and Telegram bot, run `python start.py`. To verify that the Ollama models are available before starting, run `python start.py --check`.

AgroIA Score

Understand the four components the RAG engine reasons about.

Database configuration

Set up PostgreSQL with pgvector and the informes_lotes schema.

Models configuration

Configure embedding and generation models in Ollama.

RAG API module

API reference for consultar_agente() and fetch_context().
