The Chatbot module gives the general population a natural-language interface to the Jurisdicción Sanitaria’s entire knowledge base. Instead of searching menus or reading long documents, a user can type a plain question — in Spanish — and receive a concise, sourced answer drawn directly from official CMS content. No custom model is trained; the system’s intelligence comes from combining PostgreSQL’s pgvector extension for semantic search with an external large-language-model (LLM) API for fluent response generation. This means the chatbot’s knowledge is always current: the moment an editor publishes or updates an article, it becomes available to the chatbot without any manual retraining step.

Architecture: Retrieval Augmented Generation (RAG)

RAG is the pattern that makes it possible to ground an LLM’s output in a specific, controlled knowledge base. The chatbot never generates answers from the model’s training data alone — every response is constructed from content that exists in the CMS at the time of the query.
User Query
    ↓
Embedding Generation (user query → vector)
    ↓
Semantic Search (pgvector cosine similarity)
    ↓
Context Retrieval (top-N relevant content chunks)
    ↓
LLM Prompt Construction (system prompt + context + user query)
    ↓
LLM Response Generation (external LLM API)
    ↓
Cited Response returned to User
Because all knowledge comes from the CMS, the chatbot can only answer questions about topics the Jurisdicción has published content on. If no sufficiently similar content exists, the chatbot returns a graceful “no information available” message rather than hallucinating an answer.
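The flow above can be sketched as a single orchestration function. This is a minimal illustration, not the project's actual implementation: the names `RagSteps`, `answerWithRag`, `NO_INFO_MESSAGE`, and the prompt wording are all assumptions, and each stage is injected so the control flow stays visible.

```typescript
// Hypothetical types and names for illustration; not the project's real code.
interface ContextChunk {
  contentId: string;
  chunkText: string;
}

// Each pipeline stage is injected so the flow can be read (and tested) in isolation.
interface RagSteps {
  embedQuery(question: string): Promise<number[]>;
  // Assumed to return only chunks that already passed the similarity threshold.
  retrieveChunks(queryVector: number[]): Promise<ContextChunk[]>;
  generateAnswer(systemPrompt: string, userPrompt: string): Promise<string>;
}

interface RagResult {
  answer: string;
  sourceIds: string[];
}

const NO_INFO_MESSAGE = "No information available on this topic."; // placeholder wording

async function answerWithRag(question: string, steps: RagSteps): Promise<RagResult> {
  const queryVector = await steps.embedQuery(question);
  const chunks = await steps.retrieveChunks(queryVector);

  // No relevant content: skip the LLM call entirely and return the fallback.
  if (chunks.length === 0) {
    return { answer: NO_INFO_MESSAGE, sourceIds: [] };
  }

  const context = chunks.map((c, i) => `[${i + 1}] ${c.chunkText}`).join("\n\n");
  const systemPrompt =
    "Answer only from the numbered context passages. " +
    "If they do not contain the answer, say that no information is available.";
  const userPrompt = `Context:\n${context}\n\nQuestion: ${question}`;

  const answer = await steps.generateAnswer(systemPrompt, userPrompt);

  // Several chunks may come from the same article; report each source once.
  const sourceIds = chunks
    .map((c) => c.contentId)
    .filter((id, i, all) => all.indexOf(id) === i);
  return { answer, sourceIds };
}
```

Returning the fallback before the LLM is called means an empty retrieval never reaches the model, which is one simple way to keep answers tied to CMS content.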

How Embeddings Work

Embeddings are the bridge between human language and the vector similarity search that powers the chatbot. Every time a piece of content is published or updated in the CMS, the following indexing pipeline runs automatically:
1. Text extraction and chunking

The content body (rich text from the Tiptap editor) is stripped of HTML markup and split into overlapping chunks of approximately 500 tokens. Overlapping ensures that sentences spanning a chunk boundary are not lost.

2. Embedding generation

Each chunk is sent to the embedding model (e.g. OpenAI text-embedding-3-small). The model returns a 1,536-dimension floating-point vector that encodes the semantic meaning of the text.

3. Vector storage via pgvector

The vector is stored in the content_embeddings table alongside the source contentId and chunk index. PostgreSQL’s pgvector extension provides the vector column type and the <=> cosine-distance operator used at query time.

4. Lifecycle management

When a content item is updated, its embeddings are deleted and regenerated from the new body. When a content item is archived or soft-deleted, its embeddings are removed from the search index so the chatbot cannot cite retired content.
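The extraction and chunking step can be illustrated with a short sketch. It rests on several assumptions that are not in the text above: tokens are approximated by whitespace-separated words, the overlap defaults to 50, HTML is stripped with a naive regular expression, and the function names `stripHtml` and `chunkWords` are hypothetical. A production pipeline would more likely use the embedding model's tokenizer and a proper HTML parser.

```typescript
// Naive tag stripping for illustration only; real rich text needs an HTML parser.
function stripHtml(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Split text into overlapping windows of `chunkSize` words.
// Assumption: one whitespace-separated word stands in for one token.
function chunkWords(text: string, chunkSize = 500, overlap = 50): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each window advances

  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once a window has reached the end, to avoid a redundant tail chunk.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

With a window of 4 words and an overlap of 1, the last word of each chunk is repeated as the first word of the next, so a sentence that crosses a boundary is still partly present in both chunks.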

ContentEmbedding Schema

model ContentEmbedding {
  id          String   @id @default(uuid())
  contentId   String
  chunkIndex  Int
  chunkText   String
  embedding   Unsupported("vector(1536)")
  createdAt   DateTime @default(now())
  updatedAt   DateTime @updatedAt
}
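At query time, the user’s question is itself converted to a vector and compared against the stored embeddings using cosine distance. The pgvector <=> operator computes the cosine distance between two vectors, so ordering by it in ascending order returns the chunks whose meaning is closest to the question, and 1 minus the distance gives the similarity score.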
-- Semantic similarity search: find the 5 most relevant chunks
SELECT
  content_id,
  chunk_text,
  1 - (embedding <=> $1::vector) AS similarity
FROM content_embeddings
ORDER BY embedding <=> $1::vector
LIMIT 5;
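To make the `similarity` column concrete, the following sketch reproduces in application code what the query computes: cosine distance is 1 minus cosine similarity. The helper names are hypothetical; `toVectorLiteral` shows one way to format a query embedding as the bracketed text form (`[x,y,z]`) that pgvector accepts for a `::vector` cast.

```typescript
// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance, the quantity the <=> operator orders by.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}

// Format an embedding as a pgvector text literal for use as the $1 parameter.
function toVectorLiteral(values: number[]): string {
  return `[${values.join(",")}]`;
}
```

Vectors pointing in the same direction have similarity 1 (distance 0), orthogonal vectors have similarity 0 (distance 1), and opposite vectors have similarity -1 (distance 2).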
Only chunks whose similarity score exceeds CHATBOT_SIMILARITY_THRESHOLD are included in the LLM prompt context. Chunks below the threshold are discarded, and if no chunks pass the threshold, the chatbot returns its fallback message.
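The threshold gate described here might look like the following sketch. `ScoredChunk` and `selectContextChunks` are hypothetical names, and the row shape is assumed to mirror the columns returned by the query above; "exceeds" is read as strictly greater than.

```typescript
// Assumed shape of one row returned by the similarity query.
interface ScoredChunk {
  contentId: string;
  chunkText: string;
  similarity: number;
}

// Keep only chunks strictly above the threshold, best first, capped at maxChunks.
// An empty result signals that the caller should return the fallback message.
function selectContextChunks(
  rows: ScoredChunk[],
  threshold: number,
  maxChunks: number,
): ScoredChunk[] {
  return rows
    .filter((row) => row.similarity > threshold) // filter() copies, so sort() below is safe
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, maxChunks);
}
```

Applying the threshold in application code rather than in SQL keeps the query simple, at the cost of fetching a few rows that are then discarded.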

Chatbot API

Ask a Question

POST /chatbot/query
Content-Type: application/json
{
  "question": "¿Cómo puedo prevenir el dengue?",
  "language": "es"
}
The language field currently supports "es" (Spanish). The system prompt instructs the LLM to respond in the requested language.
Response:
{
  "answer": "Para prevenir el dengue, elimine los criaderos de agua estancada en su hogar: vacíe cubetas, floreros y llantas que acumulen agua. Use repelente de insectos con DEET y duerma bajo mosquitero. Consulte a su médico ante los primeros síntomas.",
  "sources": [
    {
      "contentId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "title": "Guía de Prevención del Dengue",
      "url": "/content/guia-prevencion-dengue",
      "similarity": 0.94
    },
    {
      "contentId": "f9e8d7c6-b5a4-3210-fedc-ba0987654321",
      "title": "Campaña Patio Limpio 2024",
      "url": "/content/campana-patio-limpio-2024",
      "similarity": 0.87
    }
  ]
}
Every response includes a sources array so users can read the full official articles behind the answer.
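A client might wrap the request and response shapes shown above in typed helpers. This is a sketch under assumptions: the type and function names are hypothetical, only the fields visible in the example payloads are modelled, and error responses are not covered because the source does not describe them.

```typescript
// Shapes taken from the example response; any further fields are unknown.
interface ChatbotSource {
  contentId: string;
  title: string;
  url: string;
  similarity: number;
}

interface ChatbotResponse {
  answer: string;
  sources: ChatbotSource[];
}

// Build the options for a POST /chatbot/query call (usable with fetch or similar).
function buildQueryRequest(question: string, language = "es") {
  const trimmed = question.trim();
  if (trimmed.length === 0) {
    throw new Error("question must not be empty");
  }
  return {
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question: trimmed, language }),
  };
}

// Parse a raw JSON body and check the two top-level fields before trusting it.
function parseChatbotResponse(raw: string): ChatbotResponse {
  const data = JSON.parse(raw);
  if (
    data === null ||
    typeof data !== "object" ||
    typeof data.answer !== "string" ||
    !Array.isArray(data.sources)
  ) {
    throw new Error("Unexpected chatbot response shape");
  }
  return data as ChatbotResponse;
}
```

In a browser or recent Node.js runtime, the request object can be passed as the second argument to `fetch`, with the first argument being the deployment's base URL followed by `/chatbot/query`.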

Force Re-index All Content

Administrators can trigger a full re-indexing of all published content — for example, after changing the chunking strategy or switching embedding models.
POST /chatbot/reindex
Authorization: Bearer <admin-token>
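Re-indexing deletes all existing embeddings and regenerates them from scratch. The process is resource-intensive, and because the index is incomplete until regeneration finishes, chatbot quality is temporarily degraded: questions about content that has not yet been re-embedded may receive the fallback message. Schedule it during off-peak hours and monitor embedding API costs.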

Environment Configuration

# OpenAI (or compatible embedding + chat provider)
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_CHAT_MODEL=gpt-4o

# RAG tuning
CHATBOT_MAX_CONTEXT_CHUNKS=5
CHATBOT_SIMILARITY_THRESHOLD=0.75
| Variable | Purpose |
| --- | --- |
| OPENAI_API_KEY | API key for the embedding and chat completion provider |
| OPENAI_EMBEDDING_MODEL | Model used to generate content and query vectors |
| OPENAI_CHAT_MODEL | Model used to generate the final natural-language response |
| CHATBOT_MAX_CONTEXT_CHUNKS | Maximum number of retrieved chunks passed to the LLM prompt (default 5) |
| CHATBOT_SIMILARITY_THRESHOLD | Minimum cosine similarity score for a chunk to be included in context (0–1) |
Increase CHATBOT_SIMILARITY_THRESHOLD to make the chatbot more conservative (fewer, higher-quality citations). Lower it to increase recall at the risk of including loosely relevant context. A value between 0.70 and 0.80 works well for health content in Spanish.
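One way to read and validate these variables at startup is sketched below. The names `ChatbotConfig` and `loadChatbotConfig` are hypothetical. The default of 5 chunks comes from the table above; falling back to 0.75 and to the sample model names when a variable is unset is an assumption based on the example file, not documented behavior.

```typescript
interface ChatbotConfig {
  embeddingModel: string;
  chatModel: string;
  maxContextChunks: number;
  similarityThreshold: number;
}

// Accepts any env-like object so the function can be tested without process.env.
type EnvLike = { [key: string]: string | undefined };

function loadChatbotConfig(env: EnvLike): ChatbotConfig {
  // Fallback values mirror the sample configuration (assumed defaults).
  const maxContextChunks = Number(env["CHATBOT_MAX_CONTEXT_CHUNKS"] || "5");
  const similarityThreshold = Number(env["CHATBOT_SIMILARITY_THRESHOLD"] || "0.75");

  if (
    isNaN(maxContextChunks) ||
    Math.floor(maxContextChunks) !== maxContextChunks ||
    maxContextChunks < 1
  ) {
    throw new Error("CHATBOT_MAX_CONTEXT_CHUNKS must be a positive integer");
  }
  if (isNaN(similarityThreshold) || similarityThreshold < 0 || similarityThreshold > 1) {
    throw new Error("CHATBOT_SIMILARITY_THRESHOLD must be between 0 and 1");
  }

  return {
    embeddingModel: env["OPENAI_EMBEDDING_MODEL"] || "text-embedding-3-small",
    chatModel: env["OPENAI_CHAT_MODEL"] || "gpt-4o",
    maxContextChunks,
    similarityThreshold,
  };
}
```

In a Node.js service this would typically be called once as `loadChatbotConfig(process.env)`, so that a mistyped threshold fails at boot rather than silently changing retrieval behavior.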

Knowledge Base Maintenance

The chatbot’s knowledge is always derived from published CMS content. Publishing a new article, disease guide, or FAQ entry automatically makes it searchable by the chatbot — no manual retraining or admin action required.
The following events in the CMS trigger automatic embedding updates:
| CMS Event | Embedding Action |
| --- | --- |
| Content published | Chunks generated and embeddings created |
| Content body updated | Existing embeddings deleted and regenerated |
| Content archived | Embeddings deleted from content_embeddings table |
| Content hard-deleted | Embeddings cascade-deleted via foreign key |
This event-driven approach means the embedding index is always consistent with the live CMS state without requiring periodic batch jobs.
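The event-to-action mapping can be expressed as a small lookup. The event and step names below are hypothetical labels for the four cases listed above, not identifiers from the CMS; hard deletion yields no application-level steps because the database cascade is described as handling it.

```typescript
// Hypothetical labels for the four CMS events listed above.
type CmsEventKind = "published" | "bodyUpdated" | "archived" | "hardDeleted";

// Application-level operations on the embedding index.
type EmbeddingStep = "deleteExisting" | "chunkAndEmbed";

// Return the ordered steps the indexing pipeline runs for a given event.
function embeddingStepsFor(kind: CmsEventKind): EmbeddingStep[] {
  switch (kind) {
    case "published":
      return ["chunkAndEmbed"];
    case "bodyUpdated":
      // Old vectors go first so stale chunks never coexist with new ones.
      return ["deleteExisting", "chunkAndEmbed"];
    case "archived":
      return ["deleteExisting"];
    case "hardDeleted":
      // Nothing to do in application code: the foreign key cascade removes the rows.
      return [];
  }
}
```

Keeping the mapping in one function makes it easy to check that every event either refreshes or removes the embeddings for the affected content item.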

Safety and Accuracy

Health information carries a higher-than-average responsibility for accuracy. The chatbot incorporates several safeguards:

Context-only answers

The LLM system prompt explicitly instructs the model to answer only from the provided context chunks. If the answer cannot be found in the context, the model must say so — it must not draw on its training data.

Similarity threshold gate

If no retrieved chunk exceeds CHATBOT_SIMILARITY_THRESHOLD, the chatbot returns a standard “no information available” message and directs the user to call the Jurisdicción’s helpline.

Source citations

Every response includes the source content items with their similarity scores. Users can follow the link to read the full official document and verify the answer.

Medical disclaimer

A disclaimer is appended to every response, reminding the user that the chatbot is an information assistant, not a medical professional, and directing them to consult a health provider for personal medical decisions.
The chatbot is an information assistant, not a medical advisor. All responses must include a disclaimer directing users to consult qualified health professionals before making any medical decisions. Never remove or suppress this disclaimer in production.
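Appending the disclaimer in code, after the LLM call, means it does not depend on the model following instructions. The sketch below assumes a fixed Spanish disclaimer string; both the wording and the names `MEDICAL_DISCLAIMER` and `withDisclaimer` are placeholders, since the actual text is not given here.

```typescript
// Placeholder wording (roughly: "This assistant provides general information and
// does not replace consultation with a health professional."). Not the official text.
const MEDICAL_DISCLAIMER =
  "Este asistente ofrece información general y no sustituye la consulta con un profesional de la salud.";

// Append the disclaimer to an answer exactly once.
function withDisclaimer(answer: string): string {
  const body = answer.trim();
  // If the text already ends with the disclaimer, return it unchanged.
  if (body.slice(-MEDICAL_DISCLAIMER.length) === MEDICAL_DISCLAIMER) {
    return body;
  }
  return `${body}\n\n${MEDICAL_DISCLAIMER}`;
}
```

Because the function is idempotent, it can be applied to both LLM answers and the fallback message without risk of a doubled disclaimer.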

CMS Overview

All chatbot knowledge originates from CMS content.

Content Types

Understand the content types indexed for semantic search.

Timeline

Timeline events are also indexed and citable by the chatbot.
