The RAG (Retrieval-Augmented Generation) System API exposes low-level control over the NISIRA pipeline: checking readiness, triggering document synchronization from Google Drive, and running queries directly against the vector store with fine-grained parameter control. The status, initialization, sync, and query endpoints are distinct from POST /api/chat/: they bypass conversation persistence and are better suited for testing, evaluation scripts, and the admin panel. POST /api/rag/chat/ is the exception, since it saves the user and assistant messages to a conversation. Unless noted, all endpoints require a valid JWT Bearer token.

GET /api/rag/status/

Returns the current readiness state of the RAG pipeline by instantiating a RAGPipeline object and calling is_ready() on each of its components. Requires JWT authentication.

Response

rag_available (boolean): true if the RAG Python modules were successfully imported at server start. false means the dependencies are missing and a 503 is returned instead.
status (object): Component-level readiness map.
timestamp (string): ISO 8601 timestamp of when the status was checked.

Example

curl -X GET https://your-domain.com/api/rag/status/ \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

{
  "rag_available": true,
  "status": {
    "modules_available": true,
    "components": {
      "vector_store": true,
      "embeddings": true,
      "document_processor": true
    },
    "version": "1.0.0"
  },
  "timestamp": "2024-11-10T14:00:00.000Z"
}
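
A deployment or monitoring script might gate traffic on this response. The following is a minimal sketch, assuming the payload shape shown in the example above. The helper name all_components_ready is hypothetical and not part of the NISIRA codebase, and treating an empty component map as "not ready" is a choice made for this sketch.

```python
from typing import Any, Mapping


def all_components_ready(payload: Mapping[str, Any]) -> bool:
    """Return True only if RAG is available and every listed component is ready."""
    if not payload.get("rag_available", False):
        return False
    components = payload.get("status", {}).get("components", {})
    # Assumption of this sketch: an empty component map counts as "not ready".
    return bool(components) and all(components.values())


# Payload copied from the example response above.
example = {
    "rag_available": True,
    "status": {
        "modules_available": True,
        "components": {
            "vector_store": True,
            "embeddings": True,
            "document_processor": True,
        },
        "version": "1.0.0",
    },
    "timestamp": "2024-11-10T14:00:00.000Z",
}
print(all_components_ready(example))
```

A caller could poll the endpoint until this helper returns True before sending the first user query.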

POST /api/rag/initialize/

Initializes the RAGPipeline instance and verifies that all components are operational. This is typically called once after deployment to warm up the pipeline before the first user query. Requires JWT authentication.

Response

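message (string): Human-readable confirmation, e.g., "Sistema RAG inicializado correctamente".
result (object): Initialization result. In the example response it contains a success flag, the per-component readiness map, and a status message.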

Example

curl -X POST https://your-domain.com/api/rag/initialize/ \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

{
  "message": "Sistema RAG inicializado correctamente",
  "result": {
    "success": true,
    "components": {
      "vector_store": true,
      "embeddings": true,
      "document_processor": true
    },
    "message": "Sistema RAG inicializado"
  }
}

POST /api/rag/sync/

Triggers a document synchronization pass via pipeline.sync_and_process_documents(). This downloads new or changed files from Google Drive (or local storage), chunks them, generates embeddings, and upserts the vectors into the configured store. Accepts an optional force_reprocess flag to re-embed files that are already indexed. Requires JWT authentication.

Request Body

force_reprocess (boolean, default: false): When true, re-processes and re-indexes all documents even if they have already been embedded (identified by MD5 hash). Use this after updating the embedding model or chunking strategy.

Response

message (string): Confirmation string on success.
result (object): Sync results object returned by the pipeline.

Example

curl -X POST https://your-domain.com/api/rag/sync/ \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..." \
  -H "Content-Type: application/json" \
  -d '{"force_reprocess": false}'

{
  "message": "Documentos sincronizados y procesados correctamente",
  "result": {
    "success": true,
    "downloaded": 3,
    "processed": 3,
    "skipped": 47,
    "errors": 0
  }
}
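
The skip behavior controlled by force_reprocess can be illustrated with a small sketch. This is an illustration under stated assumptions, not the pipeline's actual code: the functions md5_of and should_process and the in-memory set of indexed hashes are hypothetical. Only the use of an MD5 content hash and the force_reprocess flag come from the description above.

```python
import hashlib
from typing import AbstractSet


def md5_of(content: bytes) -> str:
    """Hex MD5 digest of a file's bytes."""
    return hashlib.md5(content).hexdigest()


def should_process(
    content: bytes,
    indexed_hashes: AbstractSet[str],
    force_reprocess: bool = False,
) -> bool:
    """Process a file if forced, or if its hash has not been indexed yet."""
    if force_reprocess:
        return True
    return md5_of(content) not in indexed_hashes


# Hypothetical state: one document has already been embedded.
indexed = {md5_of(b"already indexed document")}

print(should_process(b"already indexed document", indexed))        # unchanged file, default flag
print(should_process(b"already indexed document", indexed, True))  # same file, forced
print(should_process(b"new document", indexed))                    # unseen file
```

Under this model, a run with force_reprocess set to false only spends embedding work on new or changed files, which matches the large "skipped" count in the example response.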

POST /api/rag/query/

Runs a direct RAG query outside of any conversation context. Useful for evaluation scripts, batch testing, and the admin panel. The top_k parameter can be set explicitly to override the adaptive calculation used by POST /api/chat/.
Note: This endpoint uses the AllowAny permission class in the source code, so a JWT token is not enforced at the Django permission layer. Supplying the Authorization header is still recommended in production environments.

Request Body

question (string, required): The question to answer. Must be a non-empty string.
top_k (integer): Number of document chunks to retrieve. Overrides the adaptive calculation. Clamped to the range [3, 15]. Omit to use calculate_adaptive_top_k() automatically.
include_generation (boolean, default: true): When true, passes the retrieved context to the LLM and returns a natural-language answer. When false, returns only the retrieved chunks without generating a response.
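
The clamping rule for top_k amounts to a few lines. The sketch below is an assumption about how such a clamp could be written, not the server's code: the name clamp_top_k is hypothetical, and only the [3, 15] range and the "omit to use the adaptive value" behavior come from the parameter description.

```python
from typing import Optional

TOP_K_MIN, TOP_K_MAX = 3, 15  # range stated in the top_k description


def clamp_top_k(top_k: Optional[int]) -> Optional[int]:
    """Clamp an explicit top_k to [3, 15]; None means 'use the adaptive value'."""
    if top_k is None:
        return None
    return max(TOP_K_MIN, min(TOP_K_MAX, top_k))


for requested in (1, 8, 50, None):
    print(requested, "->", clamp_top_k(requested))
```

Evaluation scripts that sweep top_k should keep the requested values inside this range, since out-of-range values would otherwise collapse onto the same effective setting.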

Response

question (string): Echo of the submitted question.
answer (string): LLM-generated answer (only present when include_generation is true).
sources (array): Array of source citation objects (same structure as POST /api/chat/ sources).
relevant_documents_count (integer): Number of document chunks that were retrieved from the vector store.
generation_used (boolean): Whether the LLM generation step was executed.
timestamp (string): ISO 8601 timestamp of the query.

Example

curl -X POST https://your-domain.com/api/rag/query/ \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..." \
  -H "Content-Type: application/json" \
  -d '{
    "question": "¿Qué documentos se necesitan para un permiso de funcionamiento?",
    "top_k": 8,
    "include_generation": true
  }'
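
For evaluation scripts, the same request can be assembled in Python with only the standard library. This is a sketch rather than an official client: the function build_query_request and its parameters are hypothetical, the base URL is the placeholder from the curl example, and the request is built but not sent.

```python
import json
import urllib.request
from typing import Optional


def build_query_request(
    base_url: str,
    question: str,
    top_k: Optional[int] = None,
    include_generation: bool = True,
    token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/rag/query/."""
    if not question.strip():
        raise ValueError("question must be a non-empty string")
    body = {"question": question, "include_generation": include_generation}
    if top_k is not None:
        # Omitting top_k leaves the adaptive calculation in effect.
        body["top_k"] = top_k
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/rag/query/",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_query_request(
    "https://your-domain.com",
    "¿Qué documentos se necesitan para un permiso de funcionamiento?",
    top_k=8,
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. Setting include_generation to False is a cheap way to inspect retrieval quality without invoking the LLM.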

POST /api/rag/chat/

An enhanced chat endpoint that integrates the full RAG pipeline with automatic conversation history support. Unlike POST /api/chat/, this endpoint retrieves the 6 most recent messages from the conversation and passes them as context to the LLM, producing richer grounded answers. It auto-creates a new conversation if conversation_id is omitted, and falls back gracefully to the basic chat handler if the RAG modules are unavailable. Returns 201 Created on success. Requires JWT authentication.
For most frontend integrations, POST /api/rag/chat/ is the recommended endpoint. It provides richer metrics tracking (via MetricsTracker) compared to POST /api/chat/, which uses a simpler response generation path.

Request Body

content (string, required): The user’s message. Must be non-empty.
conversation_id (string): Slug or legacy numeric ID of an existing conversation. If omitted, a new conversation is created automatically with the message content as the title (truncated to 50 characters).
use_rag (boolean, default: true): Set to false to bypass the RAG pipeline and use the basic keyword-based response generator. Automatically set to false for short greetings (30 characters or fewer) to avoid unnecessary retrieval.
Note: Conversation history is built automatically by the backend. The 6 most recent messages from the conversation are retrieved from the database and passed to the LLM as context. There is no history request body parameter.
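
The three request-handling rules described above (the short-message bypass, the 6-message history window, and the 50-character title) can be sketched as plain functions. Every name here is hypothetical and the code is not taken from the backend. In particular, this sketch uses message length alone to approximate the "short greeting" rule, which is an assumption; the server may apply additional checks.

```python
from typing import List

HISTORY_WINDOW = 6        # most recent messages passed to the LLM
SHORT_MESSAGE_LIMIT = 30  # messages this short skip retrieval
TITLE_LIMIT = 50          # length of an auto-generated conversation title


def effective_use_rag(content: str, use_rag: bool = True) -> bool:
    """RAG runs only if requested and the message is longer than 30 characters."""
    return use_rag and len(content.strip()) > SHORT_MESSAGE_LIMIT


def recent_history(messages: List[str]) -> List[str]:
    """Keep the last 6 messages, oldest first."""
    return messages[-HISTORY_WINDOW:]


def default_title(content: str) -> str:
    """Title for an auto-created conversation: the message, truncated."""
    return content[:TITLE_LIMIT]


print(effective_use_rag("Hola"))
print(recent_history([f"msg {i}" for i in range(10)]))
print(default_title("¿Cuál es el plazo para renovar una licencia de actividad económica?"))
```

A frontend can compare its own expectation against the rag_used field in the response to see whether retrieval actually ran for a given message.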

Response

conversation_id (string): Slug of the conversation (new or existing).
user_message (object): The saved user message record: {id, content, timestamp}.
assistant_message (object): The saved assistant message record: {id, content, timestamp, rating, rating_issue_tag}.
response (string): The assistant’s answer text (duplicated from assistant_message.content for frontend compatibility).
rag_used (boolean): Whether the RAG pipeline was actually invoked for this response.
sources (array): Source citations for the response. Empty array when rag_used is false.
metrics (object): Performance metrics summary from MetricsTracker: includes latency breakdown, top-k used, and RAGAS scores when available.

Example

curl -X POST https://your-domain.com/api/rag/chat/ \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "¿Cuál es el plazo para renovar una licencia de actividad económica?",
    "conversation_id": "aB3dEfGhIjK"
  }'

GET /api/rag/system-status/

Returns a detailed diagnostic snapshot of the RAG pipeline by calling pipeline.get_system_status(). This is a more verbose version of GET /api/rag/status/ that exposes internal component metadata.
Note: This endpoint uses the AllowAny permission class, so a JWT token is not enforced at the Django permission layer. Supplying the Authorization header is still recommended in production environments.

Response

rag_available (boolean): Whether the RAG modules are importable.
system_status (object): Detailed component status object returned directly by pipeline.get_system_status(). Contents vary by deployment configuration but typically include vector store connection details, embedding model info, and document counts.
timestamp (string): ISO 8601 timestamp of the status check.

Example

curl -X GET https://your-domain.com/api/rag/system-status/ \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

{
  "rag_available": true,
  "system_status": {
    "vector_store": {
      "backend": "postgres",
      "total_documents": 1847,
      "is_ready": true
    },
    "embeddings": {
      "model": "text-embedding-3-small",
      "dimension": 1536,
      "is_ready": true
    }
  },
  "timestamp": "2024-11-10T16:30:00.000Z"
}