Salud IA Bot separates data ingestion from data consumption into two distinct phases. XML files from SIVIGILA, the Ministerio de Salud, regional provider registries, and the PAI vaccination programme are parsed once on a developer’s machine and stored in a portable SQLite database. In production the application only ever opens that pre-built database: no XML parsing, no in-memory trees, no startup delay. This design keeps RAM consumption low and delivers sub-3-second responses even on shared-memory hosting tiers such as Render’s free tier.
Two-Phase Approach
Migration Phase (one-time, local)
Run the seed and import scripts from the scripts/ directory. Each script reads one or more XML files with fast-xml-parser or xml2js, maps the parsed records to TypeORM entities, and bulk-saves them to data/salud-ia-bot.db in chunks of 100 rows. This phase runs on the developer’s machine before deployment.
XML Data Sources
Each XML file maps to a seed or import script and a corresponding runtime service. The table below lists all source files verified against the data/ directory:
| XML File | Source | Content | Migration Script + Service |
|---|---|---|---|
| Eventos_de_Interés_en_Salud_Pública_20260514.xml | SIVIGILA | Transmissible disease events (dengue, zika, malaria, tuberculosis, etc.) | scripts/import-data.ts + HealthDataService |
| Salud_Mental.xml | Ministerio de Salud | CIE-10 mental health diagnoses and care records | scripts/import-data.ts + MentalHealthService |
| Salud_sexual_-_preguntas.xml | Internal | Sexual and reproductive health Q&A | scripts/import-data.ts + SexualHealthService |
| Prestadores_de_Salud_Departamento_de_Antioquia.xml | Regions | Antioquia health providers | scripts/seed-antioquia.ts + AntioquiaHealthService |
| Centros_de_salud_Yopal._.xml | Regions | Yopal health centres with GPS coordinates | scripts/import-data.ts + YopalHealthService |
| SERVICIOS_OFERTADOS_RED_DE_SALUD_DEL_CENTRO_ESE_POR_SEDE_CALI.xml | Regions | Cali services by sede and complexity level | scripts/import-data.ts + CaliHealthService |
| servicios_salud_boyaca.xml | Regions | Boyacá provider catalogue | scripts/import-data.ts + BoyacaHealthService |
| Coberturas_administrativas_de_vacunación_por_departamento_20260528.xml | PAI | Departmental vaccination coverage | scripts/seed-vaccination.ts + VaccinationService |
| Cobertura_de_Vacunación_PAI_en_el_Valle_del_Cauca.xml | PAI | Valle del Cauca PAI coverage | scripts/seed-vaccination.ts + VaccinationService |
| DATOS_DE_VACUNACIÓN_EN_NIÑOS_Y_NIÑAS.xml | PAI | Children’s vaccination data | scripts/seed-vaccination.ts + VaccinationService |
| Calidad_del_Aire_en_Colombia_(Promedio_Anual)_20260528.xml | External API | Annual average air quality indicators by municipality | AirQualityService |
TypeORM Configuration
The database module configures TypeORM to use the better-sqlite3 driver. The synchronize: false flag is critical: it ensures the schema is never auto-modified at startup and that the tables seeded by the migration scripts remain intact.
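A minimal sketch of such a configuration, assuming the database file is the data/salud-ia-bot.db produced by the migration scripts. The SqliteOptions type is only a local stand-in for the few TypeORM DataSourceOptions fields involved, so the snippet runs without dependencies. The two entity classes are placeholders, and assertSchemaIsFrozen is a hypothetical guard rather than part of the project.

```typescript
// Local stand-in for the subset of TypeORM's DataSourceOptions used here.
// (Assumption: the real module hands an equivalent object to TypeORM.)
interface SqliteOptions {
  type: 'better-sqlite3';
  database: string;                   // path to the pre-built SQLite file
  entities: Array<new () => object>;  // entity classes
  synchronize: boolean;               // must stay false in production
}

// Placeholder entity classes; the real ones are exported from src/entities/index.ts.
class HealthEvent {}
class Vaccination {}

const databaseOptions: SqliteOptions = {
  type: 'better-sqlite3',
  database: 'data/salud-ia-bot.db',
  entities: [HealthEvent, Vaccination],
  synchronize: false, // never let the ORM alter the seeded schema at startup
};

// Hypothetical guard: fail fast if synchronize is ever switched on.
function assertSchemaIsFrozen(options: SqliteOptions): void {
  if (options.synchronize) {
    throw new Error('synchronize must be false so the seeded tables stay intact');
  }
}

assertSchemaIsFrozen(databaseOptions);
```

A guard of this kind is optional; it simply turns an accidental configuration change into an immediate startup error instead of a silently modified database.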
The entities array is imported from src/entities/index.ts and includes all eight entity classes registered in DataModule.
Seed Script Pattern
All seed and import scripts follow the same three-step pattern: parse the XML with fast-xml-parser (or xml2js for complex nested structures), map each row through a typed mapper function, then bulk-save to SQLite using TypeORM’s chunked save.
The chunk: 100 option splits large inserts into batches of 100 rows, so that SQLite’s limit on bound parameters per statement is not exceeded on datasets with thousands of records.
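The mapping and batching steps can be sketched as follows. This is an illustration under assumptions, not the project's script: the row fields evento and total_casos, the HealthEventRecord shape, and the helper names are hypothetical, and the XML parsing step is represented by rows that have already been parsed. With TypeORM itself the batching would normally be delegated to repository.save(records, { chunk: 100 }); toChunks only makes the splitting visible.

```typescript
// One already-parsed XML row (assumption: the parser yields string fields).
interface RawEventRow {
  evento: string;
  total_casos: string;
}

// Simplified stand-in for the HealthEvent entity.
interface HealthEventRecord {
  eventName: string;
  totalCases: number;
}

// Step 2: typed mapper from a raw row to an entity-like record.
function mapEventRow(row: RawEventRow): HealthEventRecord {
  const parsed = parseInt(row.total_casos, 10);
  return {
    eventName: row.evento.trim(),
    totalCases: isNaN(parsed) ? 0 : parsed, // non-numeric counts fall back to 0
  };
}

// Step 3: split the records into batches of at most `size` items.
function toChunks<T>(items: T[], size: number): T[][] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Usage: two rows fit into a single batch of up to 100 records.
const sampleRows: RawEventRow[] = [
  { evento: ' Dengue ', total_casos: '120' },
  { evento: 'Malaria', total_casos: 'n/a' },
];
const sampleBatches = toChunks(sampleRows.map(mapEventRow), 100);
```

Keeping the mapper a pure function means it can be checked without touching the database, which is useful when a source file changes its column names between releases.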
The full SIVIGILA XML schema looks like this:
Data Models
The eight TypeORM entities cover five conceptual domains:
HealthEvent
Maps SIVIGILA transmissible disease records. Fields include event name, total cases, female/male split, urban/rural split, age groups (infant through elderly), and notification date. Queried by HealthDataService and SaludPublicaService.
MentalHealth (Diagnosis)
Stores CIE-10 mental health diagnosis entries from Salud_Mental.xml. Fields include diagnosis code, diagnosis name, total cases, and demographic breakdowns. Queried by MentalHealthService.
SexualHealth (QA)
A question-and-answer store from Salud_sexual_-_preguntas.xml. Each row holds a question string and a pre-written respuesta text. SexualHealthService runs a keyword search across the question fields.
Provider entities
Four separate entities (AntioquiaProvider, BoyacaProvider, CaliProvider, YopalProvider) reflect the different schemas of each regional dataset. YopalProvider includes latitud and longitud columns to support the Haversine geosearch.
Vaccination
Stores PAI departmental and municipal coverage records from all three vaccination XML files. Fields include department name, vaccine type, and coverage percentage. VaccinationService exposes getAllDepartament() and per-vaccine queries consumed by MlPredictionService for the composite risk score.
Benefits of the SQLite Approach
Zero parse overhead
No XML is loaded at application startup. TypeORM opens the SQLite file in milliseconds, and queries are resolved through indexed lookups rather than a full in-memory tree traversal.
Reduced RAM usage
Large datasets such as the Antioquia registry (thousands of providers) and the vaccination data stay on disk. Services such as AntioquiaHealthService and VaccinationService use TypeORM repository queries instead of loading arrays into memory.
Faster response times
The NestJS CacheModule used in DataModule and BotModule keeps frequently queried results in memory. DatasetBuilderService additionally maintains a 24-hour in-process cache for the data tensors fed to the ML prediction models.
The migration scripts are standalone TypeScript files (not NestJS modules) and must be run with ts-node outside the NestJS application lifecycle. They connect directly to TypeORM via DataSource and terminate after the import is complete. They do not run when npm run start:dev or npm run start:prod is executed.
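The connect, import, terminate lifecycle of such a standalone script might look like the sketch below. It assumes only that the connection exposes initialize() and destroy(), as TypeORM's DataSource does; ConnectionLike and runSeed are hypothetical names, and the import step is passed in as a callback so the example runs without a database.

```typescript
// Minimal surface of a connection as used by a seed script. TypeORM's DataSource
// provides initialize() and destroy(); this local type keeps the sketch dependency-free.
interface ConnectionLike {
  initialize(): Promise<unknown>;
  destroy(): Promise<void>;
}

// Runs one import job: connect, import, and always disconnect afterwards.
// Resolves with whatever the import step reports (here, a row count).
async function runSeed(
  connection: ConnectionLike,
  importRows: () => Promise<number>,
): Promise<number> {
  await connection.initialize();
  try {
    return await importRows();
  } finally {
    // Close the connection even when the import throws, so the process can exit.
    await connection.destroy();
  }
}

// Usage with an in-memory fake instead of a real database connection.
const fakeConnection: ConnectionLike = {
  initialize: async () => undefined,
  destroy: async () => undefined,
};
const seededRowCount: Promise<number> = runSeed(fakeConnection, async () => 3);
```

A script built this way would be started directly with ts-node, for example something like npx ts-node scripts/import-data.ts, although the exact command depends on how the project is set up.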