NISIRA exposes dedicated endpoints for health checks and performance metrics, and aggregates them in the Admin Panel Metrics and Pipeline tabs. Every query the assistant answers is automatically recorded in the database, so metrics reflect real production traffic without any manual instrumentation.
Health check endpoints
GET /api/health/
A lightweight liveness check that returns immediately without requiring authentication.
GET /api/status/
A richer status endpoint that runs all registered health checks and returns component-level results. It is implemented in monitoring/health.py using the SERVICE_REGISTRY, which defines four check functions: api, database, worker, and vector_db.
The overall_status function returns "healthy" when all components pass, "degraded" when some but not all pass, and "down" when all fail.
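The status rule above can be sketched in a few lines. This is a minimal illustration, not the code in monitoring/health.py: the helper names run_check and run_all, the shape of the registry dictionary, and the "status" and "components" keys are assumptions made for the example. The sketch also wraps each check so that its latency is timed and an exception is recorded instead of propagated.

```python
import time
from typing import Any, Callable, Dict


def run_check(check: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Run one check, time it, and turn an exception into a failed result."""
    start = time.perf_counter()
    try:
        details = check() or {}
        ok = True
    except Exception as exc:  # a failing check must not stop the others
        details = {"error": str(exc)}
        ok = False
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {"ok": ok, "latency_ms": latency_ms, "details": details}


def overall_status(results: Dict[str, Dict[str, Any]]) -> str:
    """Map component results to "healthy", "degraded", or "down"."""
    passed = sum(1 for result in results.values() if result["ok"])
    if results and passed == len(results):
        return "healthy"
    if passed > 0:
        return "degraded"
    return "down"  # also returned for an empty registry (assumption)


def run_all(registry: Dict[str, Callable[[], Dict[str, Any]]]) -> Dict[str, Any]:
    """Run every registered check; the response keys here are hypothetical."""
    components = {name: run_check(fn) for name, fn in registry.items()}
    return {"status": overall_status(components), "components": components}


def failing_database_check() -> Dict[str, Any]:
    raise RuntimeError("connection refused")


# Hypothetical registry with one passing and one failing check.
report = run_all({"api": lambda: {}, "database": failing_database_check})
```

With one passing and one failing check, the report in this sketch is "degraded", and the database component carries the exception message in details.error.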
Each component check measures its own execution time and reports it as latency_ms. If a check throws an exception, it is caught, ok is set to false, and the error message is included in details.error; the remaining checks still execute.
Performance metrics
Query performance data is stored in the QueryMetrics model and aggregated on demand by GET /api/admin/metrics/. Four headline numbers are surfaced in the Metrics tab → Summary view:
| Metric | Source field | Unit | Description |
|---|---|---|---|
| Latencia promedio (latenciaTotal) | QueryMetrics.total_latency | seconds | Average wall-clock time from query receipt to full response, calculated as end_time − start_time using time.time() |
| Velocidad (reduccionTiempo) | derived from RAGASMetrics.response_text / total_latency | tokens/second | Average response token count divided by total latency across all recorded queries |
| Calidad RAGAS (calidadRespuesta) | RAGASMetrics.wer_score | 0–1 | Composite RAGAS quality score stored in the wer_score field by the custom evaluator |
| Total queries (totalQueries) | QueryMetrics row count | count | Total recorded queries in the database |
The QueryMetrics model also tracks time_to_first_token, retrieval_time, generation_time, documents_retrieved, is_complex_query, and query_complexity_score. All of these are accessible per-query through the query detail endpoint.
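The headline aggregation can be approximated as follows. This is a sketch under stated assumptions rather than the project's aggregation code: records are plain dictionaries standing in for QueryMetrics and RAGASMetrics rows, tokens are counted by whitespace splitting, and speed is read as average token count divided by average latency. The function name summarize is hypothetical.

```python
from statistics import mean
from typing import Dict, List, Union

Record = Dict[str, Union[float, str]]


def summarize(records: List[Record]) -> Dict[str, float]:
    """Aggregate headline numbers from per-query records.

    Each record is assumed to hold total_latency (seconds) and response_text.
    """
    if not records:
        # No traffic yet: report zeros instead of dividing by zero.
        return {"latenciaTotal": 0.0, "reduccionTiempo": 0.0, "totalQueries": 0}

    avg_latency = mean(float(r["total_latency"]) for r in records)
    # Assumption: one token per whitespace-separated word.
    avg_tokens = mean(len(str(r["response_text"]).split()) for r in records)
    speed = avg_tokens / avg_latency if avg_latency > 0 else 0.0

    return {
        "latenciaTotal": avg_latency,    # seconds
        "reduccionTiempo": speed,        # tokens per second
        "totalQueries": len(records),
    }


# Two illustrative records, not real measurements.
example_summary = summarize([
    {"total_latency": 2.0, "response_text": "a b c d"},
    {"total_latency": 4.0, "response_text": "a b"},
])
```

In this example the average latency is 3.0 seconds and the average response has 3 tokens, so the derived speed is 1.0 token per second.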
Precision metrics
Precision data is stored in the RAGASMetrics model (computed by the custom evaluator, no external API required) and linked to QueryMetrics via a foreign key. Available fields:
| Metric | Field | Range | Calculation |
|---|---|---|---|
| Precision@k | precision_at_k | 0–1 | Fraction of the retrieved k documents whose Jaccard overlap with the response exceeds 20% |
| Recall@k | recall_at_k | 0–1 | Fraction of retrieved contexts that contributed at least one 3-word n-gram to the response |
| Faithfulness | faithfulness_score | 0–1 | Fraction of response sentences whose keywords are ≥ 60% covered by the retrieved contexts |
| Hallucination rate | hallucination_rate | 0–1 | 1.0 − faithfulness_score; auto-computed on save |
| Answer relevancy | answer_relevancy | 0–1 | Fraction of query keywords present in the response, with a length bonus for 20–300 word responses |
| WER | wer_score | 0–∞ | Word Error Rate (Levenshtein distance), recorded only when ground-truth is available |
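Several rows of the table translate directly into code. The sketch below is an illustration of those definitions, not the custom evaluator itself: the tokenizer (lowercased word characters), the function names, and the handling of empty inputs are assumptions. Faithfulness and answer relevancy are omitted because the table does not say how keywords are extracted.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> List[str]:
    """Assumed tokenizer: lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two texts."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def precision_at_k(contexts: List[str], response: str, threshold: float = 0.20) -> float:
    """Fraction of retrieved documents whose overlap with the response exceeds 20%."""
    if not contexts:
        return 0.0
    relevant = sum(1 for c in contexts if jaccard(c, response) > threshold)
    return relevant / len(contexts)


def trigrams(text: str) -> Set[Tuple[str, ...]]:
    words = tokens(text)
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def recall_at_k(contexts: List[str], response: str) -> float:
    """Fraction of contexts sharing at least one 3-word n-gram with the response."""
    if not contexts:
        return 0.0
    response_grams = trigrams(response)
    return sum(1 for c in contexts if trigrams(c) & response_grams) / len(contexts)


def hallucination_rate(faithfulness_score: float) -> float:
    return 1.0 - faithfulness_score


def wer(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = tokens(reference), tokens(hypothesis)
    if not ref:
        return 0.0 if not hyp else float("inf")
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,                           # deletion
                           cur[j - 1] + 1,                        # insertion
                           prev[j - 1] + (ref_word != hyp_word))) # substitution
        prev = cur
    return prev[-1] / len(ref)


# Illustrative inputs only.
contexts = ["the cat sat on the mat", "stock prices fell sharply today"]
response = "the cat sat on the mat quietly"
```

Because WER divides by the reference length, a hypothesis much longer than the reference can score above 1, which matches the 0–∞ range in the table.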
Admin metrics API
GET /api/admin/metrics/
Returns all aggregated metrics in a single JSON response. Requires admin JWT.
Metrics are computed live from the QueryMetrics and RAGASMetrics tables on each request. If totalQueries is 0, no queries have been recorded yet; metrics populate automatically as users interact with the chat assistant.
Query history
GET /api/admin/metrics/queries/
Returns a paginated list of all recorded queries with their performance metrics. Accepts page, page_size, and complex_only query parameters.
Each entry in the queries array includes query_id, query_text (truncated to 200 characters), timestamp, is_complex, complexity_score, a performance block (total_latency, time_to_first_token, retrieval_time, generation_time, documents_retrieved), and a precision block when RAGAS metrics are available.
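A client might assemble the request URL and mirror the truncation rule as sketched below. The host name, the default values for page and page_size, and the "true" encoding for complex_only are assumptions made for illustration; only the path and the parameter names come from the description above.

```python
from urllib.parse import urlencode


def query_history_url(base_url: str, page: int = 1, page_size: int = 20,
                      complex_only: bool = False) -> str:
    """Build the query-history URL. Defaults here are hypothetical."""
    params = {"page": page, "page_size": page_size}
    if complex_only:
        # Assumption: the API accepts the literal string "true".
        params["complex_only"] = "true"
    return f"{base_url.rstrip('/')}/api/admin/metrics/queries/?{urlencode(params)}"


def truncate_query_text(text: str, limit: int = 200) -> str:
    """Mirror the 200-character truncation applied to query_text."""
    return text[:limit]


# "nisira.example" is a placeholder host, not a real deployment.
url = query_history_url("https://nisira.example/", page=2, page_size=50,
                        complex_only=True)
```

The request itself still needs the admin JWT in its headers; that step is left out because the header format is not stated here.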
GET /api/admin/metrics/queries/<query_id>/
Returns the full detail for a single query, including a step-by-step explanation of how each metric was calculated for that specific request, with the exact formula, input values, and a human-readable interpretation. Useful for debugging retrieval quality on individual queries.
The response includes a precision block with k_value, documentos_relevantes, documentos_irrelevantes, and the full calculation string (e.g. "3 documentos relevantes / 5 documentos totales = 0.6000").
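The precision explanation can be reproduced with simple string formatting. This is a sketch of how such a block could be assembled; the function name and the "calculation" key are hypothetical, while the other field names and the string layout follow the example above.

```python
from typing import Dict, Union


def precision_explanation(relevant: int, k: int) -> Dict[str, Union[int, str]]:
    """Build a precision block with a human-readable calculation string."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not 0 <= relevant <= k:
        raise ValueError("relevant must be between 0 and k")
    value = relevant / k
    return {
        "k_value": k,
        "documentos_relevantes": relevant,
        "documentos_irrelevantes": k - relevant,
        # Key name "calculation" is an assumption for this sketch.
        "calculation": (
            f"{relevant} documentos relevantes / {k} documentos totales = {value:.4f}"
        ),
    }


block = precision_explanation(3, 5)
```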
Rating metrics
User thumbs-up / thumbs-down feedback from the chat interface is aggregated by a dedicated ratings endpoint. The response contains total_ratings, a distribution object (likes, dislikes, like_percentage, dislike_percentage), a top_issues list of the most-reported issue tags (irrelevante, sin_evidencia, tardio, alucinacion, accion_incorrecta, otro), and a recent_ratings array with the latest individual feedback entries.
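The aggregation behind these fields might look like the sketch below. The input record shape (a liked flag plus an optional issues list), the percentage scale of 0 to 100, and the representation of top_issues as a plain list of tags are assumptions; the output field names follow the description above.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_ratings(ratings: List[Dict[str, Any]], top_n: int = 3) -> Dict[str, Any]:
    """Aggregate thumbs-up / thumbs-down feedback into a summary."""
    total = len(ratings)
    likes = sum(1 for r in ratings if r["liked"])
    dislikes = total - likes

    def percentage(count: int) -> float:
        # Avoid dividing by zero before any feedback exists.
        return round(100.0 * count / total, 2) if total else 0.0

    issue_counts = Counter(tag for r in ratings for tag in r.get("issues", []))
    return {
        "total_ratings": total,
        "distribution": {
            "likes": likes,
            "dislikes": dislikes,
            "like_percentage": percentage(likes),
            "dislike_percentage": percentage(dislikes),
        },
        "top_issues": [tag for tag, _ in issue_counts.most_common(top_n)],
    }


# Illustrative feedback, not real data.
ratings_summary = summarize_ratings([
    {"liked": True},
    {"liked": False, "issues": ["tardio", "alucinacion"]},
    {"liked": False, "issues": ["tardio"]},
    {"liked": True},
])
```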
Pipeline status
Check the operational state of all RAG subsystems through the pipeline status endpoint. In the response, overall is "operational" when both embeddings and vector_store are true; otherwise it is "degraded". The Pipeline tab in the Admin Panel renders the four boolean flags in the response as status cards with check/cross icons.
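The overall rule depends on only two of the flags, which a short sketch makes explicit. The function name is hypothetical, and any additional flags in the dictionary are ignored here because only embeddings and vector_store are named in the rule.

```python
from typing import Dict


def pipeline_overall(flags: Dict[str, bool]) -> str:
    """Return "operational" only when embeddings and vector_store are both true."""
    core_ok = flags.get("embeddings", False) and flags.get("vector_store", False)
    return "operational" if core_ok else "degraded"


healthy_pipeline = pipeline_overall({"embeddings": True, "vector_store": True})
broken_pipeline = pipeline_overall({"embeddings": True, "vector_store": False})
```

A missing flag is treated as false in this sketch, so an incomplete response is reported as "degraded" instead of raising an error.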
Guardrail status
The experiment guardrail endpoint is available to is_staff users and reports whether the latest ExperimentRun passed its quality thresholds and whether the user satisfaction rate is above the configured floor.
A guardrail_passed: false response means that the last experiment was blocked, the satisfaction rate is below threshold, or there are failed rating feedback events pending review.
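The three failure conditions combine with a logical OR, as the sketch below shows. The dataclass and its field names are hypothetical stand-ins for whatever the endpoint reads internally, and a satisfaction rate exactly equal to the floor is assumed to pass.

```python
from dataclasses import dataclass


@dataclass
class GuardrailInputs:
    """Hypothetical inputs; the real endpoint may use different names."""
    experiment_blocked: bool        # latest ExperimentRun failed its thresholds
    satisfaction_rate: float        # share of positive ratings, 0 to 1
    satisfaction_floor: float       # configured minimum satisfaction rate
    pending_failed_feedback: int    # failed rating feedback events awaiting review


def guardrail_passed(inputs: GuardrailInputs) -> bool:
    """Pass only when none of the three failure conditions holds."""
    failed = (
        inputs.experiment_blocked
        or inputs.satisfaction_rate < inputs.satisfaction_floor
        or inputs.pending_failed_feedback > 0
    )
    return not failed


passing = guardrail_passed(GuardrailInputs(False, 0.85, 0.70, 0))
low_satisfaction = guardrail_passed(GuardrailInputs(False, 0.55, 0.70, 0))
```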
Direct model inspection
For ad-hoc queries against the raw metric tables, use the Django admin at /admin/. The models api | Query metrics and api | RAGAS metrics are both registered and searchable. You can also query the database directly.
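A direct query could look like the following sketch. It runs against an in-memory SQLite database so that it is self-contained; the table name api_querymetrics assumes Django's default app_model naming and is not confirmed by this page, and the inserted rows are placeholder values. Against a real deployment, the connection and possibly the table name would differ.

```python
import sqlite3
from typing import Dict, Union

# Stand-in database: table name and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE api_querymetrics ("
    "id INTEGER PRIMARY KEY, total_latency REAL, is_complex_query INTEGER)"
)
conn.executemany(
    "INSERT INTO api_querymetrics (total_latency, is_complex_query) VALUES (?, ?)",
    [(1.0, 0), (3.0, 1), (5.0, 1)],
)


def latency_summary(connection: sqlite3.Connection) -> Dict[str, Union[int, float]]:
    """Compute the row count and mean latency straight from the table."""
    count, avg_latency = connection.execute(
        "SELECT COUNT(*), AVG(total_latency) FROM api_querymetrics"
    ).fetchone()
    return {"totalQueries": count, "latenciaTotal": avg_latency}


def complex_query_count(connection: sqlite3.Connection) -> int:
    """Count rows flagged as complex queries."""
    (count,) = connection.execute(
        "SELECT COUNT(*) FROM api_querymetrics WHERE is_complex_query = 1"
    ).fetchone()
    return count


summary = latency_summary(conn)
```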