The Chatbot module gives the general population a natural-language interface to the Jurisdicción Sanitaria’s entire knowledge base. Instead of searching menus or reading long documents, a user can type a plain question — in Spanish — and receive a concise, sourced answer drawn directly from official CMS content. No custom model is trained; the system’s intelligence comes from combining PostgreSQL’sDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/LMendoza70/SSA/llms.txt
Use this file to discover all available pages before exploring further.
pgvector extension for semantic search with an external large-language-model (LLM) API for fluent response generation. This means the chatbot’s knowledge is always current: the moment an editor publishes or updates an article, it becomes available to the chatbot without any manual retraining step.
Architecture: Retrieval Augmented Generation (RAG)
RAG is the pattern that makes it possible to ground an LLM’s output in a specific, controlled knowledge base. The chatbot never generates answers from the model’s training data alone — every response is constructed from content that exists in the CMS at the time of the query.Because all knowledge comes from the CMS, the chatbot can only answer questions about topics the Jurisdicción has published content on. If no sufficiently similar content exists, the chatbot returns a graceful “no information available” message rather than hallucinating an answer.
How Embeddings Work
Embeddings are the bridge between human language and the vector similarity search that powers the chatbot. Every time a piece of content is published or updated in the CMS, the following indexing pipeline runs automatically:Text extraction and chunking
The content body (rich text from the Tiptap editor) is stripped of HTML markup and split into overlapping chunks of approximately 500 tokens. Overlapping ensures that sentences spanning a chunk boundary are not lost.
Embedding generation
Each chunk is sent to the embedding model (e.g. OpenAI
text-embedding-3-small). The model returns a 1,536-dimension floating-point vector that encodes the semantic meaning of the text.Vector storage via pgvector
The vector is stored in the
content_embeddings table alongside the source contentId and chunk index. PostgreSQL’s pgvector extension provides the vector column type and the <=> cosine-distance operator used at query time.ContentEmbedding Schema
Semantic Similarity Search
At query time, the user’s question is itself converted to a vector and compared against all stored embeddings using cosine distance. Thepgvector <=> operator returns the chunks whose meaning is closest to the question.
CHATBOT_SIMILARITY_THRESHOLD are included in the LLM prompt context. Chunks below the threshold are discarded, and if no chunks pass the threshold, the chatbot returns its fallback message.
Chatbot API
Ask a Question
language field currently supports "es" (Spanish). The system prompt instructs the LLM to respond in the requested language.
Response:
sources array so users can read the full official articles behind the answer.
Force Re-index All Content
Administrators can trigger a full re-indexing of all published content — for example, after changing the chunking strategy or switching embedding models.Environment Configuration
| Variable | Purpose |
|---|---|
OPENAI_API_KEY | API key for the embedding and chat completion provider |
OPENAI_EMBEDDING_MODEL | Model used to generate content and query vectors |
OPENAI_CHAT_MODEL | Model used to generate the final natural-language response |
CHATBOT_MAX_CONTEXT_CHUNKS | Maximum number of retrieved chunks passed to the LLM prompt (default 5) |
CHATBOT_SIMILARITY_THRESHOLD | Minimum cosine similarity score for a chunk to be included in context (0–1) |
Knowledge Base Maintenance
The chatbot’s knowledge is always derived from published CMS content. Publishing a new article, disease guide, or FAQ entry automatically makes it searchable by the chatbot — no manual retraining or admin action required.
| CMS Event | Embedding Action |
|---|---|
| Content published | Chunks generated and embeddings created |
| Content body updated | Existing embeddings deleted and regenerated |
| Content archived | Embeddings deleted from content_embeddings table |
| Content hard-deleted | Embeddings cascade-deleted via foreign key |
Safety and Accuracy
Health information carries a higher-than-average responsibility for accuracy. The chatbot incorporates several safeguards:Context-only answers
The LLM system prompt explicitly instructs the model to answer only from the provided context chunks. If the answer cannot be found in the context, the model must say so — it must not draw on its training data.
Similarity threshold gate
If no retrieved chunk exceeds
CHATBOT_SIMILARITY_THRESHOLD, the chatbot returns a standard “no information available” message and directs the user to call the Jurisdicción’s helpline.Source citations
Every response includes the source content items with their similarity scores. Users can follow the link to read the full official document and verify the answer.
Medical disclaimer
Every response is appended with a disclaimer reminding the user that the chatbot is an information assistant, not a medical professional, and directing them to consult a health provider for personal medical decisions.
Related Modules
CMS Overview
All chatbot knowledge originates from CMS content.
Content Types
Understand the content types indexed for semantic search.
Timeline
Timeline events are also indexed and citable by the chatbot.