Vanguardia EPIS operates on a small set of well-defined JSON structures that flow from the raw data files on disk, through the deterministic classifier, and out to the API consumer. This page documents every field in each structure, the Pydantic models that govern the API’s request/response contracts, and the on-disk cache format written by the Gemini AI client. All field names, types, and example values are taken directly from the source files in data/, backend/main.py, backend/classifier.py, and cache/.
Student Record (data/estudiantes.json)
The primary dataset lives in data/estudiantes.json as a JSON array of student objects. Each object supplies the raw inputs that the classifier and AI client consume. All three risk-signal fields (asistencia_pct, notas_promedio, participacion) are nullable — missing data is a first-class concept in the system rather than an error.
Unique student identifier. Format: EST-XXX, where XXX is a zero-padded auto-incremented integer (e.g., EST-001, EST-020). Generated by the POST /api/estudiantes endpoint when creating a new record.
Student’s full name. Example: "María Quispe Huamán".
Grade level in free-text Spanish format. Examples: "1ro de secundaria", "3ro de secundaria", "5to de secundaria".
Monthly attendance expressed as a percentage from 0 to 100. A null value means no attendance record was available for the current period. Values outside the 0–100 range are normalised to null by normalizar_campo() before classification. Examples: 95, 60, null.
General grade average on Peru’s vigesimal scale (0–20). A null value means no grades were registered for the current period. Values outside 0–20 are normalised to null. Examples: 15.5, 9.5, null.
Qualitative classroom participation level. Must be one of "alta", "media", or "baja". Any other string value is normalised to null by normalizar_campo(). Examples: "alta", "baja", null.
Student’s mother tongue. Common values in the dataset: "castellano", "quechua". Defaults to "castellano" when not provided via the API. Included for cultural-context awareness in AI-generated recommendations.
Free-text teacher notes attached to the student record. Example: "Múltiples inasistencias consecutivas reportadas.". Passed to the AI client as additional context. null when no notes exist.
Example student records
The following entries are taken directly from data/estudiantes.json:
Teacher Record (data/docentes.json)
Teacher (docente) records are stored in data/docentes.json and managed through the POST /api/docentes and GET /api/docentes endpoints. The structure mirrors the DocenteInput / DocenteResultado Pydantic models in backend/main.py.
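Both record types use prefixed, zero-padded identifiers (EST-XXX for students, DOC-XXX for teachers). The following is a minimal sketch of how such an identifier could be produced. The function name `siguiente_id` and the "highest existing number plus one" rule are assumptions for illustration; the source only states that the integer is auto-incremented and zero-padded.

```python
import re

def siguiente_id(prefijo: str, ids_existentes: list[str]) -> str:
    """Return the next identifier for a prefix, e.g. 'EST-021' after 'EST-020'.

    Hypothetical helper: assumes the next number is max(existing) + 1
    and that the numeric part is padded to three digits.
    """
    patron = re.compile(rf"^{re.escape(prefijo)}-(\d+)$")
    numeros = []
    for identificador in ids_existentes:
        coincidencia = patron.match(identificador)
        if coincidencia:  # ignore IDs that belong to another prefix
            numeros.append(int(coincidencia.group(1)))
    siguiente = max(numeros, default=0) + 1
    return f"{prefijo}-{siguiente:03d}"

print(siguiente_id("EST", ["EST-001", "EST-020"]))  # EST-021
print(siguiente_id("DOC", []))                      # DOC-001
```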
Unique teacher identifier. Format: DOC-XXX, zero-padded auto-incremented integer. Example: "DOC-002".
Teacher’s full name. Example: "Carlos Ruíz".
Institutional email address. Example: "[email protected]".
Subject or role assignment description. Example: "Tutoría 3° Sec".
Short grade key that links the teacher to a student cohort. Example: "3ro". Corresponds to the prefix used in student grado strings.
Current status of the teacher account. Allowed values: "Activo" or "Inactivo". Defaults to "Activo" when creating via the API.
Example teacher record
Classification Result (output of clasificar_estudiante())
clasificar_estudiante() in backend/classifier.py returns a plain Python dict — not a Pydantic model — that is then merged into the EstudianteResultado API response. It contains three keys:
The overall risk level for this student. One of four emoji sentinels:
"🟢": Bajo (low risk)
"🟡": Medio (medium risk)
"🔴": Alto (high risk)
"⚪": Dato insuficiente (insufficient data)
Human-readable audit trail, one string per evaluated variable plus one for each missing variable. These strings are displayed directly in the dashboard. Examples:
"Asistencia: 95% — nivel 🟢 (umbral 🟢 es ≥90%)"
"Notas: 9.5 — nivel 🔴 (umbral 🟢 es ≥13)"
"Participación: baja — nivel 🟡"
"Asistencia: sin dato — evaluado sin esta variable"
Subset of {"asistencia", "notas", "participacion"} listing which of the three signal variables had no valid data. Empty list when all three variables are present and valid.
Example classification result
For a student with asistencia_pct=95, notas_promedio=15.5, participacion="alta" (EST-001):
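As a rough illustration of the missing-variable list only, the sketch below derives `variables_faltantes` from a record's three signal fields. The helper and the `CAMPOS_SENAL` mapping are hypothetical and not taken from backend/classifier.py; only the field names and the short variable names come from this page.

```python
# Maps each signal field of the student record to the short name used in
# variables_faltantes. The mapping constant and helper are hypothetical.
CAMPOS_SENAL = {
    "asistencia_pct": "asistencia",
    "notas_promedio": "notas",
    "participacion": "participacion",
}

def variables_faltantes(registro: dict) -> list[str]:
    """Return the short names of signal fields that are None or absent."""
    return [
        nombre_corto
        for campo, nombre_corto in CAMPOS_SENAL.items()
        if registro.get(campo) is None
    ]

# Signal values for EST-001 as quoted above: nothing is missing.
est_001 = {"asistencia_pct": 95, "notas_promedio": 15.5, "participacion": "alta"}
# Illustrative record with no grades registered.
sin_notas = {"asistencia_pct": 60, "notas_promedio": None, "participacion": "baja"}

print(variables_faltantes(est_001))    # []
print(variables_faltantes(sin_notas))  # ['notas']
```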
EstudianteResultado — API Response Model
EstudianteResultado is the Pydantic model returned by GET /api/estudiantes, GET /api/estudiantes/{id}, and POST /api/estudiantes. It merges the original student record fields with the classifier output and the AI-generated text.
Student identifier, e.g. "EST-001".
Student’s full name.
Grade level string, e.g. "3ro de secundaria".
Risk level emoji sentinel: "🟢", "🟡", "🔴", or "⚪". Direct passthrough from clasificar_estudiante()["nivel"].
Audit trail list from the classifier. Each entry is a short human-readable sentence describing one variable’s contribution to the risk level.
Variables that were absent or invalid for this student. Subset of ["asistencia", "notas", "participacion"].
AI-generated natural-language explanation of the student’s situation. Written in Spanish, student-specific (not a template), produced by the Gemini client. null when origen_ia is "error_sin_cache" and no prior cache entry exists.
AI-generated personalised intervention recommendation for the teacher. Written in respectful, non-stigmatising language. null in the same circumstances as explicacion.
Provenance code for the AI-generated text. Possible values:
| Value | Meaning |
|---|---|
| "vivo" | Freshly generated by the Gemini API in this request |
| "fallback" | Loaded from the on-disk cache (cache/respuestas_ia.json) because the live API call failed or was skipped |
| "error_sin_cache" | Live API failed and no cache entry exists; explicacion/recomendacion will be null |
| "no_aplica" | Student’s nivel_riesgo is "⚪"; no AI call is made for insufficient-data cases |
Raw attendance value from the dataset, passed through unchanged for frontend display.
Raw grade average, passed through for frontend display.
Raw participation value, passed through for frontend display.
Student’s mother tongue, passed through for frontend display.
Free-text teacher notes, passed through for frontend display.
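A consumer of this model will usually branch on origen_ia before showing the AI text. The sketch below is one hedged way to do that on a response already parsed into a dict. The function `resumen_ia` and its output format are hypothetical; only the four provenance codes and the explicacion field name come from this page.

```python
from typing import Optional

# Short meanings paraphrased from the origen_ia table on this page.
SIGNIFICADO_ORIGEN = {
    "vivo": "freshly generated by the Gemini API",
    "fallback": "loaded from the on-disk cache",
    "error_sin_cache": "live API failed and no cache entry exists",
    "no_aplica": "insufficient data, no AI call is made",
}

def resumen_ia(resultado: dict) -> str:
    """Build a one-line display string for the AI explanation (hypothetical)."""
    origen = resultado.get("origen_ia")
    if origen not in SIGNIFICADO_ORIGEN:
        raise ValueError(f"unknown origen_ia: {origen!r}")
    explicacion: Optional[str] = resultado.get("explicacion")
    if explicacion is None:
        # No text to show; explain why using the provenance code.
        return f"[{origen}] no AI text available ({SIGNIFICADO_ORIGEN[origen]})"
    return f"[{origen}] {explicacion}"

print(resumen_ia({"origen_ia": "vivo", "explicacion": "Texto de ejemplo."}))
print(resumen_ia({"origen_ia": "error_sin_cache", "explicacion": None}))
```

Rejecting unknown codes with an exception, rather than silently showing nothing, makes a new provenance value visible as soon as the backend introduces one.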
Example EstudianteResultado response
AI Cache Entry (cache/respuestas_ia.json)
The Gemini AI client persists generated text to cache/respuestas_ia.json so that subsequent API requests can return the same explanation without re-calling the external model. The file is a single JSON object keyed by student ID.
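The keyed-by-ID layout can be sketched as follows. This is not the Gemini client's code: the helper names are hypothetical, the keys "explicacion" and "recomendacion" are borrowed from the API model on the assumption that the cache uses the same names, and "generado_en" is an invented name for the timestamp field. The sketch writes to a temporary directory rather than to cache/respuestas_ia.json.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

def guardar_entrada(ruta: Path, estudiante_id: str,
                    explicacion: str, recomendacion: str) -> None:
    """Insert or overwrite one student's entry in the JSON cache file."""
    cache = json.loads(ruta.read_text(encoding="utf-8")) if ruta.exists() else {}
    cache[estudiante_id] = {
        "explicacion": explicacion,        # assumed key name
        "recomendacion": recomendacion,    # assumed key name
        # ISO 8601 in UTC, e.g. "2026-07-01T16:24:00+00:00"; key name is hypothetical
        "generado_en": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
    ruta.write_text(json.dumps(cache, ensure_ascii=False, indent=2), encoding="utf-8")

def leer_entrada(ruta: Path, estudiante_id: str) -> Optional[dict]:
    """Return the cached entry for a student, or None if no key exists."""
    if not ruta.exists():
        return None
    return json.loads(ruta.read_text(encoding="utf-8")).get(estudiante_id)

with tempfile.TemporaryDirectory() as tmp:
    ruta = Path(tmp) / "respuestas_ia.json"
    guardar_entrada(ruta, "EST-001", "Texto de explicación.", "Texto de recomendación.")
    entrada_001 = leer_entrada(ruta, "EST-001")
    entrada_004 = leer_entrada(ruta, "EST-004")  # never written, so None
```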
Key: string — Student ID in EST-XXX format.
Value object fields:
The full AI-generated explanation paragraph for this student. Written in Spanish. Personalised to the specific data values and context of the student — not a static template.
The AI-generated intervention recommendation addressed to the teacher. Phrased respectfully; avoids stigmatising language per the system’s design guidelines.
ISO 8601 timestamp of when the AI text was generated. Example: "2026-07-01T16:24:00+00:00".
Example cache entries
Students with nivel_riesgo: "⚪" (such as EST-004) never appear in the cache file. The AI client is not called for them (their origen_ia is "no_aplica"), so there is nothing to persist and no key is written under their ID in cache/respuestas_ia.json.
Validation Rules (normalizar_campo())
Before any field reaches the classifier, normalizar_campo() in backend/classifier.py sanitises the three signal variables. Out-of-range or unrecognised values are coerced to null without raising an error (a [WARN] line is logged), so they contribute to variables_faltantes rather than triggering a classifier error.
| Field | Valid range / values | Out-of-range treatment |
|---|---|---|
| asistencia_pct | float in [0, 100] | Values outside this range → treated as null |
| notas_promedio | float in [0, 20] | Values outside this range → treated as null |
| participacion | "alta", "media", or "baja" | Any other string → treated as null |
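The rules in the table can be sketched as below. This is an illustrative reimplementation, not the code in backend/classifier.py: the `(campo, valor)` signature, the logger name, the pass-through of non-signal fields, and the rejection of non-numeric values for the numeric fields are all assumptions.

```python
import logging
from typing import Union

logger = logging.getLogger("normalizar_sketch")  # hypothetical logger name

RANGOS = {"asistencia_pct": (0, 100), "notas_promedio": (0, 20)}
PARTICIPACION_VALIDA = {"alta", "media", "baja"}

Valor = Union[int, float, str, None]

def normalizar_campo(campo: str, valor: Valor) -> Valor:
    """Return valor if it is valid for campo, otherwise None (null)."""
    if valor is None:
        return None
    if campo in RANGOS:
        minimo, maximo = RANGOS[campo]
        # bool is a subclass of int in Python, so exclude it explicitly.
        es_numero = isinstance(valor, (int, float)) and not isinstance(valor, bool)
        if es_numero and minimo <= valor <= maximo:
            return valor
    elif campo == "participacion":
        if isinstance(valor, str) and valor in PARTICIPACION_VALIDA:
            return valor
    else:
        return valor  # assumption: non-signal fields pass through unchanged
    logger.warning("[WARN] %s=%r is invalid; treated as null", campo, valor)
    return None

print(normalizar_campo("asistencia_pct", 95))          # 95
print(normalizar_campo("asistencia_pct", 120))         # None
print(normalizar_campo("participacion", "excelente"))  # None
```

Because invalid values become None instead of raising, a record with a bad value is classified the same way as a record with that value missing.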