NISIRA’s retrieval system does not use a fixed number of documents for every query. Instead, the calculate_adaptive_top_k() function in api/views.py analyses each incoming question and computes an appropriate top_k value — the number of document chunks to retrieve — before the RAG pipeline even starts. This keeps simple questions lean and ensures complex, multi-part queries receive the breadth of context they require.
The Problem with a Fixed top_k
Before the adaptive system was introduced, every query used top_k = 5. That created two opposing failure modes:
| Scenario | Fixed top_k = 5 | Result |
|---|---|---|
| Simple one-line question | Retrieves 5 chunks when 2–3 suffice | Wasted embedding computation and context space |
| Complex multi-part question | Only 5 chunks for a query needing 10+ | Under-retrieval; incomplete answers |
| Comparative analysis across documents | Capped at 5 | Misses relevant chunks from secondary sources |
The Algorithm
calculate_adaptive_top_k(question: str) -> int computes three independent scoring factors, adds them together, and clamps the sum to the hard limits:
Factor 1 — Query Length → Base top_k
| Query length (chars) | Base top_k |
|---|---|
| < 50 | 3 |
| 50 – 99 | 5 |
| 100 – 149 | 7 |
| ≥ 150 | 9 |
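The thresholds in the table can be read as a step function over the query length. The following is a minimal sketch, assuming length is measured with Python's len() on the raw question string; base_top_k and the two constants are hypothetical names used for illustration, not necessarily how api/views.py structures the lookup.

```python
from bisect import bisect_right

# Lengths (in characters) at which the base value steps up, taken from the table.
_LENGTH_THRESHOLDS = [50, 100, 150]
# Base top_k for each band: < 50, 50-99, 100-149, >= 150.
_BASE_VALUES = [3, 5, 7, 9]


def base_top_k(question: str) -> int:
    """Map query length to a base top_k (hypothetical helper)."""
    band = bisect_right(_LENGTH_THRESHOLDS, len(question))
    return _BASE_VALUES[band]


print(base_top_k("¿Qué es ISO 27001?"))  # 18 chars → 3
```

Using bisect_right keeps the boundaries inclusive on the lower edge of each band, so a query of exactly 50 characters lands in the 50–99 row.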
Factor 2 — Multiple Question Marks
Each ? beyond the first adds +2 to top_k, capped at +4.
"¿Qué es ISO 27001? ¿Cuáles son sus requisitos? ¿Cómo se implementa?"
→ 3 question marks → +4 bonus
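The bonus rule above can be sketched in a few lines. This assumes the function counts only the closing "?" character (the Spanish opening "¿" is a different character and is not counted by str.count("?")); question_mark_bonus is a hypothetical helper name.

```python
def question_mark_bonus(question: str) -> int:
    """+2 for each '?' after the first, capped at +4 (hypothetical helper)."""
    extra_marks = max(question.count("?") - 1, 0)
    return min(extra_marks * 2, 4)


query = "¿Qué es ISO 27001? ¿Cuáles son sus requisitos? ¿Cómo se implementa?"
print(question_mark_bonus(query))  # 3 question marks → 2 extra → 4
```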
Factor 3 — Complexity Keywords
The function scans for eight Spanish-language keywords that signal a need for deeper analysis:
comparar, diferencia, analizar, explicar detalladamente, por qué, cómo funciona, implementar, relacionan
Each match adds +1, capped at +3.
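A minimal sketch of the keyword scan follows. It assumes case-insensitive substring matching with each keyword counted at most once; the actual matching strategy in api/views.py is not shown on this page, and keyword_bonus and COMPLEXITY_KEYWORDS are hypothetical names.

```python
# The eight keywords listed above, kept in Spanish because they are matched
# against Spanish-language queries.
COMPLEXITY_KEYWORDS = (
    "comparar",
    "diferencia",
    "analizar",
    "explicar detalladamente",
    "por qué",
    "cómo funciona",
    "implementar",
    "relacionan",
)


def keyword_bonus(question: str) -> int:
    """+1 per distinct keyword found in the query, capped at +3 (hypothetical helper)."""
    text = question.lower()
    matches = sum(1 for keyword in COMPLEXITY_KEYWORDS if keyword in text)
    return min(matches, 3)


print(keyword_bonus("¿Por qué es importante implementar ISO 27001?"))  # → 2
print(keyword_bonus("Compara ISO 27001, analizando sus diferencias"))  # → 1
```

Under this substring assumption, inflected forms matter: "diferencias" contains "diferencia" and matches, whereas "Compara" and "analizando" do not contain "comparar" or "analizar" and are not counted.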
Hard Limits
final_top_k = max(3, min(base_k + question_bonus + keyword_bonus, 15))
- Minimum: 3 — always retrieve at least 3 chunks.
- Maximum: 15 — never exceed 15 to protect latency and context length.
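The clamp can be checked in isolation by feeding it the three component values directly. This is a small sketch of the formula above; clamp_top_k is a hypothetical name.

```python
def clamp_top_k(base_k: int, question_bonus: int, keyword_bonus: int) -> int:
    """Sum the three factors and clamp the result to the range [3, 15]."""
    return max(3, min(base_k + question_bonus + keyword_bonus, 15))


print(clamp_top_k(3, 0, 0))  # → 3
print(clamp_top_k(7, 4, 2))  # → 13
print(clamp_top_k(9, 4, 3))  # 16 before clamping → 15
```

With the factor tables on this page, the smallest possible sum is already 3 and the largest is 16, so the lower bound acts as a safeguard while the upper bound only takes effect when every factor is at its maximum.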
Query Type Ranges
| Query Type | top_k Range | Example Query |
|---|---|---|
| Simple | 3–5 | "¿Qué es ISO 27001?" |
| Medium | 5–8 | "Explica los controles de seguridad de ISO 27001" |
| Complex | 8–12 | "¿Cómo se relacionan ISO 27001 e ISO 27002 y cuáles son sus principales diferencias?" |
| Very Complex | 12–15 | "Compara ISO 27001, ISO 27002 e ISO 27005, analizando sus diferencias, similitudes y cómo se complementan entre sí" |
These ranges are indicative. The value for a specific query is always determined by the three factors above, so an individual example can fall outside its nominal band.
Code Example
from api.views import calculate_adaptive_top_k
# Simple query — length 18 chars, no keywords
calculate_adaptive_top_k("¿Qué es ISO 27001?")
# → 3
# Medium query: 51 chars, 1 complexity keyword ("explicar detalladamente")
calculate_adaptive_top_k("Explicar detalladamente los principios de ISO 27001")
# → 5 (base) + 1 (keyword) = 6
# Complex multi-question query (105 chars → 100–149 range)
calculate_adaptive_top_k(
    "¿Por qué es importante implementar ISO 27001? "
    "¿Cómo se relaciona con GDPR? ¿Qué controles son necesarios?"
)
# → 7 (base, 100–149 chars) + 4 (two extra question marks) + 2 (keywords: "por qué", "implementar") = 13
Integration Points
The adaptive calculation is applied in two places inside api/views.py:
- rag_query view: if the caller does not include a top_k field in the POST body, calculate_adaptive_top_k() is called automatically. If an explicit top_k is supplied, that value is respected as-is.

      if 'top_k' in request.data:
          top_k = request.data.get('top_k')
      else:
          top_k = calculate_adaptive_top_k(question)

- rag_enhanced_chat view: always computes adaptive_top_k regardless of the request payload, because the chat endpoint owns the full conversation context needed for a good estimate.
Impact on Retrieval Quality
| Query Complexity | Fixed top_k = 5 | Adaptive top_k | Outcome |
|---|---|---|---|
| Simple (< 50 chars) | 5 (over-fetches) | 3 | Faster; tighter context window |
| Medium | 5 (usually OK) | 5–7 | Slight improvement in recall |
| Complex | 5 (under-fetches) | 8–12 | Significantly more complete answers |
| Very complex | 5 (severely limited) | 12–15 | Multi-source synthesis enabled |
The adaptive top_k is logged at the INFO level for every request: "Consulta con top_k adaptativo: N (longitud: M chars)", which translates to "Query with adaptive top_k: N (length: M chars)". Check your Django logs to monitor how the system is classifying incoming queries.
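To monitor that classification over time, the log message can be parsed into a histogram of chosen values. The sketch below assumes the message appears in the log exactly as quoted above, with integers in place of N and M; the "INFO" prefix in the sample lines and the name top_k_histogram are illustrative only, since the real prefix depends on your logging formatter.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the message text quoted above; N and M are captured as digits.
LOG_PATTERN = re.compile(
    r"Consulta con top_k adaptativo: (\d+) \(longitud: (\d+) chars\)"
)


def top_k_histogram(log_lines: Iterable[str]) -> Counter:
    """Count how often each adaptive top_k value appears in the given log lines."""
    counts: Counter = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Illustrative sample lines (hypothetical prefix, not real log output).
sample = [
    "INFO Consulta con top_k adaptativo: 3 (longitud: 18 chars)",
    "INFO Consulta con top_k adaptativo: 13 (longitud: 105 chars)",
    "INFO Consulta con top_k adaptativo: 3 (longitud: 20 chars)",
    "INFO some unrelated message",
]
print(sorted(top_k_histogram(sample).items()))  # → [(3, 2), (13, 1)]
```

Because the function accepts any iterable of strings, an open log file can be passed to it directly.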