Vanguardia EPIS — Student and AI Data Model Reference

Vanguardia EPIS operates on a small set of well-defined JSON structures that flow from the raw data files on disk, through the deterministic classifier, and out to the API consumer. This page documents every field in each structure, the Pydantic models that govern the API’s request/response contracts, and the on-disk cache format written by the Gemini AI client. All field names, types, and example values are taken directly from the source files in data/, backend/main.py, backend/classifier.py, and cache/.

Student Record (`data/estudiantes.json`)

The primary dataset lives in data/estudiantes.json as a JSON array of student objects. Each object supplies the raw inputs that the classifier and AI client consume. All three risk-signal fields (asistencia_pct, notas_promedio, participacion) are nullable — missing data is a first-class concept in the system rather than an error.

id

string

required

Unique student identifier. Format: EST-XXX, where XXX is a zero-padded auto-incremented integer (e.g., EST-001, EST-020). Generated by the POST /api/estudiantes endpoint when creating a new record.

nombre

string

required

Student’s full name. Example: "María Quispe Huamán".

grado

string

required

Grade level in free-text Spanish format. Examples: "1ro de secundaria", "3ro de secundaria", "5to de secundaria".

asistencia_pct

float | null

Monthly attendance expressed as a percentage from 0 to 100. A null value means no attendance record was available for the current period. Values outside the 0–100 range are normalised to null by normalizar_campo() before classification. Examples: 95, 60, null.

notas_promedio

float | null

General grade average on Peru’s vigesimal scale (0–20). A null value means no grades were registered for the current period. Values outside 0–20 are normalised to null by normalizar_campo(). Examples: 15.5, 9.5, null.

participacion

string | null

Qualitative classroom participation level. Must be one of "alta", "media", or "baja". Any other string value is normalised to null by normalizar_campo(). Examples: "alta", "baja", null.

lengua_materna

string | null

Student’s mother tongue. Common values in the dataset: "castellano", "quechua". Defaults to "castellano" when not provided via the API. Included for cultural-context awareness in AI-generated recommendations.

observaciones

string | null

Free-text teacher notes attached to the student record. Example: "Múltiples inasistencias consecutivas reportadas.". Passed to the AI client as additional context. null when no notes exist.
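The EST-XXX identifier format described for the id field can be illustrated with a small helper. This is only a sketch: siguiente_id is a hypothetical name, and taking the highest existing number plus one is an assumption about how the endpoint auto-increments, since the source states only the resulting format.

```python
import re

# Matches identifiers such as "EST-001" and captures the numeric part.
ID_PATTERN = re.compile(r"^EST-(\d+)$")


def siguiente_id(ids_existentes: list[str]) -> str:
    """Return the next EST-XXX id (hypothetical helper, not the project's code)."""
    numeros = [
        int(m.group(1))
        for m in (ID_PATTERN.match(i) for i in ids_existentes)
        if m is not None
    ]
    # Assumption: the next id is the current maximum plus one.
    siguiente = max(numeros, default=0) + 1
    # Zero-pad to at least three digits, as in EST-001 and EST-020.
    return f"EST-{siguiente:03d}"


print(siguiente_id(["EST-001", "EST-020"]))  # EST-021
print(siguiente_id([]))                      # EST-001
```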

Example student records

The following entries are taken directly from data/estudiantes.json:

[
  {
    "id": "EST-001",
    "nombre": "María Quispe Huamán",
    "grado": "3ro de secundaria",
    "asistencia_pct": 95,
    "notas_promedio": 15.5,
    "participacion": "alta",
    "lengua_materna": "castellano",
    "observaciones": null
  },
  {
    "id": "EST-002",
    "nombre": "Jhon Huamán Torres",
    "grado": "2do de secundaria",
    "asistencia_pct": 60,
    "notas_promedio": 9.5,
    "participacion": "baja",
    "lengua_materna": "quechua",
    "observaciones": "Múltiples inasistencias consecutivas reportadas."
  },
  {
    "id": "EST-004",
    "nombre": "Luis Ccorahua Ramos",
    "grado": "4to de secundaria",
    "asistencia_pct": null,
    "notas_promedio": null,
    "participacion": null,
    "lengua_materna": "quechua",
    "observaciones": "Sin registros disponibles esta semana."
  },
  {
    "id": "EST-012",
    "nombre": "Raúl Ccahuana Puma",
    "grado": "5to de secundaria",
    "asistencia_pct": null,
    "notas_promedio": 14.5,
    "participacion": null,
    "lengua_materna": "quechua",
    "observaciones": "Falta registro de asistencia y participación del mes."
  },
  {
    "id": "EST-020",
    "nombre": "Brayan Asto Ccopa",
    "grado": "2do de secundaria",
    "asistencia_pct": 82,
    "notas_promedio": null,
    "participacion": "media",
    "lengua_materna": "quechua",
    "observaciones": "Sin calificaciones registradas en el bimestre."
  }
]
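For readers who want a typed view of these records, the field list above maps naturally onto a TypedDict. The sketch below is illustrative only: Estudiante and campos_nulos are hypothetical names that do not come from the project's source, and the sample is the EST-012 record shown above.

```python
import json
from typing import Optional, TypedDict


class Estudiante(TypedDict):
    """Typed view of one object in data/estudiantes.json (illustrative)."""
    id: str
    nombre: str
    grado: str
    asistencia_pct: Optional[float]
    notas_promedio: Optional[float]
    participacion: Optional[str]
    lengua_materna: Optional[str]
    observaciones: Optional[str]


# The three nullable risk-signal fields.
SIGNAL_FIELDS = ("asistencia_pct", "notas_promedio", "participacion")


def campos_nulos(e: Estudiante) -> list[str]:
    """List the signal fields that are null in a record."""
    return [campo for campo in SIGNAL_FIELDS if e.get(campo) is None]


muestra: list[Estudiante] = json.loads("""
[
  {
    "id": "EST-012",
    "nombre": "Raúl Ccahuana Puma",
    "grado": "5to de secundaria",
    "asistencia_pct": null,
    "notas_promedio": 14.5,
    "participacion": null,
    "lengua_materna": "quechua",
    "observaciones": "Falta registro de asistencia y participación del mes."
  }
]
""")

print(campos_nulos(muestra[0]))  # ['asistencia_pct', 'participacion']
```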

Teacher Record (`data/docentes.json`)

Teacher (docente) records are stored in data/docentes.json and managed through the POST /api/docentes and GET /api/docentes endpoints. The structure mirrors the DocenteInput / DocenteResultado Pydantic models in backend/main.py.

id

string

required

Unique teacher identifier. Format: DOC-XXX, zero-padded auto-incremented integer. Example: "DOC-002".

nombre

string

required

Teacher’s full name. Example: "Carlos Ruíz".

email

string

required

Institutional email address. Example: "[email protected]".

asignacion

string

required

Subject or role assignment description. Example: "Tutoría 3° Sec".

grado_clave

string

required

Short grade key that links the teacher to a student cohort. Example: "3ro". Corresponds to the prefix used in student grado strings.

estado

string

required

Current status of the teacher account. Allowed values: "Activo" or "Inactivo". Defaults to "Activo" when creating via the API.

Example teacher record

[
  {
    "id": "DOC-002",
    "nombre": "Carlos Ruíz",
    "email": "[email protected]",
    "asignacion": "Tutoría 3° Sec",
    "grado_clave": "3ro",
    "estado": "Activo"
  }
]
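The link between grado_clave and the student grado strings can be sketched as a prefix match. This is an assumption based on the description of grado_clave above, not code from backend/main.py. Docente and estudiantes_del_docente are hypothetical names, and the email value is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Docente:
    """Illustrative mirror of a record in data/docentes.json."""
    id: str
    nombre: str
    email: str
    asignacion: str
    grado_clave: str
    estado: str = "Activo"  # documented default when creating via the API


def estudiantes_del_docente(docente: Docente, estudiantes: list[dict]) -> list[dict]:
    """Assumed linkage: a student belongs to the cohort when grado starts with grado_clave."""
    return [e for e in estudiantes if e["grado"].startswith(docente.grado_clave)]


docente = Docente(
    id="DOC-002",
    nombre="Carlos Ruíz",
    email="docente@example.org",  # placeholder, not taken from the dataset
    asignacion="Tutoría 3° Sec",
    grado_clave="3ro",
)
estudiantes = [
    {"id": "EST-001", "grado": "3ro de secundaria"},
    {"id": "EST-002", "grado": "2do de secundaria"},
]

print([e["id"] for e in estudiantes_del_docente(docente, estudiantes)])  # ['EST-001']
```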

Classification Result (output of `clasificar_estudiante()`)

clasificar_estudiante() in backend/classifier.py returns a plain Python dict — not a Pydantic model — that is then merged into the EstudianteResultado API response. It contains three keys:

nivel

string

The overall risk level for this student. One of four emoji sentinels:

"🟢" — Bajo (low risk)
"🟡" — Medio (medium risk)
"🔴" — Alto (high risk)
"⚪" — Dato insuficiente (insufficient data)

motivos

list[string]

Human-readable audit trail, one string per evaluated variable plus one for each missing variable. These strings are displayed directly in the dashboard. Examples:

"Asistencia: 95% — nivel 🟢 (umbral 🟢 es ≥90%)"
"Notas: 9.5 — nivel 🔴 (umbral 🟢 es ≥13)"
"Participación: baja — nivel 🟡"
"Asistencia: sin dato — evaluado sin esta variable"

variables_faltantes

list[string]

Subset of {"asistencia", "notas", "participacion"} listing which of the three signal variables had no valid data. Empty list when all three variables are present and valid.

Example classification result

For a student with asistencia_pct=95, notas_promedio=15.5, participacion="alta" (EST-001):

{
    "nivel": "🟢",
    "motivos": [
        "Asistencia: 95% — nivel 🟢 (umbral 🟢 es ≥90%)",
        "Notas: 15.5 — nivel 🟢 (umbral 🟢 es ≥13)",
        "Participación: alta — nivel 🟢"
    ],
    "variables_faltantes": []
}

For a student with all three variables null (EST-004):

{
    "nivel": "⚪",
    "motivos": [
        "Las 3 variables carecen de dato — requiere revisión manual del docente"
    ],
    "variables_faltantes": ["asistencia", "notas", "participacion"]
}
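How variables_faltantes relates to the ⚪ level can be sketched as follows. This is not the body of clasificar_estudiante(); the function names and the CAMPOS mapping are hypothetical, and the per-variable thresholds are deliberately left out because only the 🟢 thresholds appear in this reference.

```python
# Maps the short names used in variables_faltantes to the record fields.
CAMPOS = {
    "asistencia": "asistencia_pct",
    "notas": "notas_promedio",
    "participacion": "participacion",
}


def variables_faltantes(e: dict) -> list[str]:
    """Names of the signal variables that have no valid data."""
    return [nombre for nombre, campo in CAMPOS.items() if e.get(campo) is None]


def es_dato_insuficiente(e: dict) -> bool:
    """True when all three variables are missing (the documented ⚪ case)."""
    return len(variables_faltantes(e)) == len(CAMPOS)


est_004 = {"asistencia_pct": None, "notas_promedio": None, "participacion": None}
est_012 = {"asistencia_pct": None, "notas_promedio": 14.5, "participacion": None}

print(variables_faltantes(est_004), es_dato_insuficiente(est_004))
# ['asistencia', 'notas', 'participacion'] True
print(variables_faltantes(est_012), es_dato_insuficiente(est_012))
# ['asistencia', 'participacion'] False
```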

`EstudianteResultado` — API Response Model

EstudianteResultado is the Pydantic model returned by GET /api/estudiantes, GET /api/estudiantes/{id}, and POST /api/estudiantes. It merges the original student record fields with the classifier output and the AI-generated text.

id

string

Student identifier, e.g. "EST-001".

nombre

string

Student’s full name.

grado

string

Grade level string, e.g. "3ro de secundaria".

nivel_riesgo

string

Risk level emoji sentinel: "🟢", "🟡", "🔴", or "⚪". Direct passthrough from clasificar_estudiante()["nivel"].

motivos

list[string]

Audit trail list from the classifier. Each entry is a short human-readable sentence describing one variable’s contribution to the risk level.

variables_faltantes

list[string]

Variables that were absent or invalid for this student. Subset of ["asistencia", "notas", "participacion"].

explicacion

string | null

AI-generated natural-language explanation of the student’s situation. Written in Spanish, student-specific (not a template), produced by the Gemini client. null when origen_ia is "error_sin_cache", that is, when the live call failed and no prior cache entry exists.

recomendacion

string | null

AI-generated personalised intervention recommendation for the teacher. Written in respectful, non-stigmatising language. null in the same circumstances as explicacion.

origen_ia

string

Provenance code for the AI-generated text. Possible values:

| Value | Meaning |
| --- | --- |
| `"vivo"` | Freshly generated by the Gemini API in this request |
| `"fallback"` | Loaded from the on-disk cache (`cache/respuestas_ia.json`) because the live API call failed or was skipped |
| `"error_sin_cache"` | Live API failed and no cache entry exists; `explicacion`/`recomendacion` will be `null` |
| `"no_aplica"` | Student’s `nivel_riesgo` is `"⚪"`; no AI call is made for insufficient-data cases |
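The four provenance codes can be read as a decision sequence. The sketch below is one way to express it under stated assumptions: resolver_origen and its parameters are hypothetical, and returning null text for "no_aplica" is an assumption, since the reference only says that no AI call is made in that case.

```python
from typing import Optional


def resolver_origen(
    nivel: str,
    texto_vivo: Optional[dict],
    entrada_cache: Optional[dict],
) -> dict:
    """Pick origen_ia and the AI text (illustrative, not the project's code).

    texto_vivo is the result of a successful live call, or None if it failed
    or was skipped; entrada_cache is the student's cache entry, or None.
    """
    if nivel == "⚪":
        # Assumption: no text is returned when no AI call is made.
        return {"origen_ia": "no_aplica", "explicacion": None, "recomendacion": None}
    if texto_vivo is not None:
        return {
            "origen_ia": "vivo",
            "explicacion": texto_vivo["explicacion"],
            "recomendacion": texto_vivo["recomendacion"],
        }
    if entrada_cache is not None:
        return {
            "origen_ia": "fallback",
            "explicacion": entrada_cache["explicacion"],
            "recomendacion": entrada_cache["recomendacion"],
        }
    return {"origen_ia": "error_sin_cache", "explicacion": None, "recomendacion": None}


cache = {"explicacion": "texto en cache", "recomendacion": "recomendación en cache"}
print(resolver_origen("🔴", None, cache)["origen_ia"])  # fallback
print(resolver_origen("🔴", None, None)["origen_ia"])   # error_sin_cache
```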

asistencia_pct

float | null

Raw attendance value from the dataset, passed through unchanged for frontend display.

notas_promedio

float | null

Raw grade average, passed through for frontend display.

participacion

string | null

Raw participation value, passed through for frontend display.

lengua_materna

string | null

Student’s mother tongue, passed through for frontend display.

observaciones

string | null

Free-text teacher notes, passed through for frontend display.

Example `EstudianteResultado` response

{
  "id": "EST-001",
  "nombre": "María Quispe Huamán",
  "grado": "3ro de secundaria",
  "nivel_riesgo": "🟢",
  "motivos": [
    "Asistencia: 95% — nivel 🟢 (umbral 🟢 es ≥90%)",
    "Notas: 15.5 — nivel 🟢 (umbral 🟢 es ≥13)",
    "Participación: alta — nivel 🟢"
  ],
  "variables_faltantes": [],
  "explicacion": "María muestra un desempeño académico sólido y consistente, con una asistencia del 95% y un promedio de 15.5, lo que refleja un compromiso activo con su proceso de aprendizaje. Su participación alta en clase refuerza este panorama positivo.",
  "recomendacion": "Se recomienda continuar el acompañamiento actual y valorar su esfuerzo de manera regular para sostener esta trayectoria. Puede ser un buen momento para identificar si tiene intereses o habilidades que desee desarrollar con apoyo adicional del docente.",
  "origen_ia": "fallback",
  "asistencia_pct": 95,
  "notas_promedio": 15.5,
  "participacion": "alta",
  "lengua_materna": "castellano",
  "observaciones": null
}
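The merge described for EstudianteResultado, including the rename of the classifier's nivel key to nivel_riesgo, might look like the sketch below. construir_resultado is a hypothetical function that uses plain dicts instead of the Pydantic model, and motivos is left empty in the demo for brevity.

```python
def construir_resultado(estudiante: dict, clasificacion: dict, ia: dict) -> dict:
    """Combine the raw record, classifier output and AI text (illustrative)."""
    return {
        "id": estudiante["id"],
        "nombre": estudiante["nombre"],
        "grado": estudiante["grado"],
        # The classifier's "nivel" is exposed as "nivel_riesgo" in the response.
        "nivel_riesgo": clasificacion["nivel"],
        "motivos": clasificacion["motivos"],
        "variables_faltantes": clasificacion["variables_faltantes"],
        "explicacion": ia.get("explicacion"),
        "recomendacion": ia.get("recomendacion"),
        "origen_ia": ia["origen_ia"],
        # Raw inputs are passed through for frontend display.
        "asistencia_pct": estudiante.get("asistencia_pct"),
        "notas_promedio": estudiante.get("notas_promedio"),
        "participacion": estudiante.get("participacion"),
        "lengua_materna": estudiante.get("lengua_materna"),
        "observaciones": estudiante.get("observaciones"),
    }


estudiante = {
    "id": "EST-001",
    "nombre": "María Quispe Huamán",
    "grado": "3ro de secundaria",
    "asistencia_pct": 95,
    "notas_promedio": 15.5,
    "participacion": "alta",
    "lengua_materna": "castellano",
    "observaciones": None,
}
clasificacion = {"nivel": "🟢", "motivos": [], "variables_faltantes": []}
ia = {"origen_ia": "error_sin_cache"}

resultado = construir_resultado(estudiante, clasificacion, ia)
print(resultado["nivel_riesgo"], resultado["origen_ia"], resultado["explicacion"])
# 🟢 error_sin_cache None
```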

AI Cache Entry (`cache/respuestas_ia.json`)

The Gemini AI client persists generated text to cache/respuestas_ia.json so that subsequent API requests can return the same explanation without re-calling the external model. The file is a single JSON object keyed by student ID (a string in EST-XXX format).

Each value is an object with the following fields:

explicacion

string

The full AI-generated explanation paragraph for this student. Written in Spanish. Personalised to the specific data values and context of the student — not a static template.

recomendacion

string

The AI-generated intervention recommendation addressed to the teacher. Phrased respectfully; avoids stigmatising language per the system’s design guidelines.

generado_en

string

ISO 8601 timestamp of when the AI text was generated. Example: "2026-07-01T16:24:00+00:00".

Example cache entries

{
  "EST-001": {
    "explicacion": "María muestra un desempeño académico sólido y consistente, con una asistencia del 95% y un promedio de 15.5, lo que refleja un compromiso activo con su proceso de aprendizaje. Su participación alta en clase refuerza este panorama positivo.",
    "recomendacion": "Se recomienda continuar el acompañamiento actual y valorar su esfuerzo de manera regular para sostener esta trayectoria. Puede ser un buen momento para identificar si tiene intereses o habilidades que desee desarrollar con apoyo adicional del docente.",
    "generado_en": "2026-07-01T16:24:00+00:00"
  },
  "EST-002": {
    "explicacion": "Jhon presenta una asistencia del 60% y un promedio de 9.5, junto con una participación baja en aula, lo que sugiere que puede estar atravesando dificultades que le impiden involucrarse de manera regular en las actividades escolares.",
    "recomendacion": "Se recomienda establecer una conversación privada y empática con Jhon para conocer los factores que están afectando su asistencia y rendimiento. Involucrar a la familia y, de ser posible, al equipo de orientación para diseñar juntos un plan de acompañamiento personalizado.",
    "generado_en": "2026-07-01T16:24:00+00:00"
  }
}

Students with nivel_riesgo "⚪" (such as EST-004) are absent from the cache file entirely. The AI client is never called for ⚪ students, so there is nothing to persist, and their origen_ia is reported as "no_aplica". Looking up a ⚪ student's ID in cache/respuestas_ia.json will therefore find no key.
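Reading the cache defensively follows from the format above. The sketch below is an illustration rather than the Gemini client's code: cargar_cache and buscar_en_cache are hypothetical names, returning an empty dict for a missing file is an assumption, and the sample text is a placeholder.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional


def cargar_cache(ruta: Path) -> dict:
    """Load the cache file; an absent file is treated as an empty cache (assumption)."""
    if not ruta.exists():
        return {}
    with ruta.open(encoding="utf-8") as f:
        return json.load(f)


def buscar_en_cache(cache: dict, estudiante_id: str) -> Optional[dict]:
    """Return the entry with generado_en parsed, or None when no key exists."""
    entrada = cache.get(estudiante_id)
    if entrada is None:
        return None  # e.g. ⚪ students, which are never written to the cache
    return {**entrada, "generado_en": datetime.fromisoformat(entrada["generado_en"])}


# In-memory stand-in for cache/respuestas_ia.json with placeholder text.
cache = {
    "EST-001": {
        "explicacion": "(texto omitido)",
        "recomendacion": "(texto omitido)",
        "generado_en": "2026-07-01T16:24:00+00:00",
    }
}

print(buscar_en_cache(cache, "EST-001")["generado_en"].isoformat())
# 2026-07-01T16:24:00+00:00
print(buscar_en_cache(cache, "EST-004"))  # None
```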

Validation Rules (`normalizar_campo()`)

Before any field reaches the classifier, normalizar_campo() in backend/classifier.py sanitises the three signal variables. Out-of-range or unrecognised values are coerced to null without raising an error (a [WARN] log line is written), meaning they contribute to variables_faltantes rather than triggering a classifier error.

| Field | Valid range / values | Out-of-range treatment |
| --- | --- | --- |
| `asistencia_pct` | `float` in `[0, 100]` | Values outside this range → treated as `null` |
| `notas_promedio` | `float` in `[0, 20]` | Values outside this range → treated as `null` |
| `participacion` | `"alta"`, `"media"`, or `"baja"` | Any other string → treated as `null` |

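# Condensed from normalizar_campo() in backend/classifier.py.
# e is the student record dict. The [WARN] logging is omitted, and the
# "is not None" guards are an assumption about how null inputs are skipped.

PARTICIPACION_VALIDAS = {"alta", "media", "baja"}

# asistencia_pct: must be 0–100
val = e["asistencia_pct"]
if val is not None and not (0 <= val <= 100):
    e["asistencia_pct"] = None

# notas_promedio: must be 0–20
val = e["notas_promedio"]
if val is not None and not (0 <= val <= 20):
    e["notas_promedio"] = None

# participacion: must be one of the three allowed values
if e["participacion"] not in PARTICIPACION_VALIDAS:
    e["participacion"] = None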

Validation happens before classification. A student record that arrives with asistencia_pct: 150 or participacion: "regular" will be classified as if those fields were null, and the field names will appear in variables_faltantes in the API response. No HTTP error is raised.
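The example in the preceding paragraph can be reproduced with a self-contained function that follows the documented rules. This is a sketch, not the project's normalizar_campo(): the name normalizar is hypothetical, it returns a copy rather than modifying the record in place, and it skips the [WARN] logging.

```python
PARTICIPACION_VALIDAS = {"alta", "media", "baja"}


def normalizar(e: dict) -> dict:
    """Return a sanitised copy of a student record (illustrative)."""
    e = dict(e)
    asistencia = e.get("asistencia_pct")
    if asistencia is not None and not (0 <= asistencia <= 100):
        e["asistencia_pct"] = None
    notas = e.get("notas_promedio")
    if notas is not None and not (0 <= notas <= 20):
        e["notas_promedio"] = None
    # None is not in the set either, so missing values stay None.
    if e.get("participacion") not in PARTICIPACION_VALIDAS:
        e["participacion"] = None
    return e


entrada = {"asistencia_pct": 150, "notas_promedio": 14.0, "participacion": "regular"}
limpio = normalizar(entrada)
print(limpio)
# {'asistencia_pct': None, 'notas_promedio': 14.0, 'participacion': None}
```

With both invalid fields coerced to None, the record is classified on notas_promedio alone.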

When writing student records via POST /api/estudiantes, always supply participacion as exactly "alta", "media", or "baja" (lowercase, no accents). Any other capitalisation or spelling — including "Alta", "Media", "Baja" — will be treated as missing data.
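A client can guard against this before sending the request. The helper below is a hypothetical client-side check, not part of the API: it lowercases and trims the value, then rejects anything that is still not one of the three allowed strings, so that a typo fails loudly instead of being stored as missing data.

```python
from typing import Optional

PARTICIPACION_VALIDAS = {"alta", "media", "baja"}


def preparar_participacion(valor: Optional[str]) -> Optional[str]:
    """Normalise case and whitespace; raise on values the server would discard."""
    if valor is None:
        return None  # missing data is allowed
    candidato = valor.strip().lower()
    if candidato not in PARTICIPACION_VALIDAS:
        raise ValueError(f"participacion no válida: {valor!r}")
    return candidato


print(preparar_participacion("Alta"))     # alta
print(preparar_participacion(" media "))  # media
```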
