QA Monitoring

QA Service Monitoring

Returns aggregated metrics for the QA service: total request count, average latency, average retrieval accuracy, and hallucination rate.

Endpoint

GET /monitoring

Response

requests_total
integer
required
Total number of QA requests processed since service startup.
avg_latency_ms
number
required
Average response latency in milliseconds, computed across all requests.
avg_retrieval_accuracy
number
required
Average retrieval accuracy score (0.0 to 1.0). Measures how well retrieved documents match the query.
hallucination_rate
number
required
Fraction of responses flagged as potential hallucinations (0.0 to 1.0). Higher values indicate more responses without supporting citations.

Example Request

cURL
curl http://localhost:8000/monitoring

Example Response

{
  "requests_total": 1250,
  "avg_latency_ms": 856.3421,
  "avg_retrieval_accuracy": 0.7823,
  "hallucination_rate": 0.0456
}
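
For callers who prefer a typed client over raw cURL, the request and response above could be consumed roughly as follows. This is a minimal sketch using only the Python standard library; `MonitoringSnapshot`, `parse_monitoring`, and `fetch_monitoring` are hypothetical names introduced for illustration, and the base URL is taken from the cURL example.

```python
import json
from dataclasses import dataclass
from urllib.request import urlopen


@dataclass(frozen=True)
class MonitoringSnapshot:
    """Hypothetical container mirroring the four documented response fields."""
    requests_total: int
    avg_latency_ms: float
    avg_retrieval_accuracy: float
    hallucination_rate: float


def parse_monitoring(body: str) -> MonitoringSnapshot:
    """Parse a /monitoring JSON body; a missing required field raises KeyError."""
    data = json.loads(body)
    return MonitoringSnapshot(
        requests_total=int(data["requests_total"]),
        avg_latency_ms=float(data["avg_latency_ms"]),
        avg_retrieval_accuracy=float(data["avg_retrieval_accuracy"]),
        hallucination_rate=float(data["hallucination_rate"]),
    )


def fetch_monitoring(base_url: str = "http://localhost:8000",
                     timeout: float = 5.0) -> MonitoringSnapshot:
    """GET {base_url}/monitoring and return the parsed snapshot."""
    with urlopen(f"{base_url}/monitoring", timeout=timeout) as resp:
        return parse_monitoring(resp.read().decode("utf-8"))
```

Calling `fetch_monitoring()` against a running service returns a snapshot whose attributes match the response fields one to one. Parsing is kept separate from the HTTP call so it can be exercised without a live server.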

Metrics Interpretation

Latency

  • < 500ms: Excellent performance
  • 500-1000ms: Good performance
  • > 1000ms: Consider optimization (reduce chunk size, use faster embeddings)

Retrieval Accuracy

  • > 0.8: High quality matches
  • 0.6-0.8: Moderate quality, acceptable
  • < 0.6: Poor retrieval, review document chunking strategy

Hallucination Rate

  • < 0.05: Low risk, model stays grounded
  • 0.05-0.10: Moderate risk, monitor closely
  • > 0.10: High risk, review prompts and citation validation
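
The thresholds above can be encoded as a small helper for dashboards or alerting. The sketch below is illustrative and not part of the service; the function names and the returned labels are assumptions. The listed ranges do not say which band a boundary value belongs to, so values that fall exactly on a boundary (500, 1000, 0.6, 0.8, 0.05, 0.10) are placed in the middle band here.

```python
from typing import Dict


def classify_latency(avg_latency_ms: float) -> str:
    """Map average latency (ms) to the bands listed under Latency."""
    if avg_latency_ms < 500:
        return "excellent"
    if avg_latency_ms <= 1000:
        return "good"
    return "consider optimization"


def classify_retrieval(avg_retrieval_accuracy: float) -> str:
    """Map average retrieval accuracy to the bands listed under Retrieval Accuracy."""
    if avg_retrieval_accuracy > 0.8:
        return "high"
    if avg_retrieval_accuracy >= 0.6:
        return "moderate"
    return "poor"


def classify_hallucination(hallucination_rate: float) -> str:
    """Map hallucination rate to the bands listed under Hallucination Rate."""
    if hallucination_rate < 0.05:
        return "low risk"
    if hallucination_rate <= 0.10:
        return "moderate risk"
    return "high risk"


def interpret(metrics: Dict[str, float]) -> Dict[str, str]:
    """Classify a /monitoring response body that has already been decoded to a dict."""
    return {
        "latency": classify_latency(metrics["avg_latency_ms"]),
        "retrieval": classify_retrieval(metrics["avg_retrieval_accuracy"]),
        "hallucination": classify_hallucination(metrics["hallucination_rate"]),
    }
```

Applied to the example response, this yields "good" latency, "moderate" retrieval quality, and "low risk" for hallucinations.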

Implementation

The monitoring metrics are updated after each QA request (see src/qa_api.py:70-78 and src/qa_api.py:211-219):
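from typing import Any, Dict

# Running totals; the names suggest the response averages are derived from these
monitoring: Dict[str, Any] = {
    'requests_total': 0,              # QA requests handled so far
    'latency_ms_total': 0.0,          # sum of per-request latencies (ms)
    'retrieval_accuracy_total': 0.0,  # sum of per-request accuracy scores
    'hallucination_count': 0,         # responses flagged as potential hallucinations
}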
Metrics reset on service restart.
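
The dictionary stores running totals, while the endpoint returns averages and a rate. The source does not show how one is turned into the other; the sketch below assumes the obvious derivation (each total divided by `requests_total`, with a guard for the zero-request case). `new_state`, `record_request`, and `snapshot` are hypothetical helpers, not functions from src/qa_api.py.

```python
from typing import Any, Dict


def new_state() -> Dict[str, Any]:
    """Fresh counters, matching the dictionary shown above."""
    return {
        'requests_total': 0,
        'latency_ms_total': 0.0,
        'retrieval_accuracy_total': 0.0,
        'hallucination_count': 0,
    }


def record_request(state: Dict[str, Any], latency_ms: float,
                   retrieval_accuracy: float, hallucinated: bool) -> None:
    """Fold one finished QA request into the running totals."""
    state['requests_total'] += 1
    state['latency_ms_total'] += latency_ms
    state['retrieval_accuracy_total'] += retrieval_accuracy
    if hallucinated:
        state['hallucination_count'] += 1


def snapshot(state: Dict[str, Any]) -> Dict[str, float]:
    """Derive the response fields; assumed formula: total / requests_total."""
    n = state['requests_total']
    if n == 0:
        # No requests yet: report zeros instead of dividing by zero.
        return {
            'requests_total': 0,
            'avg_latency_ms': 0.0,
            'avg_retrieval_accuracy': 0.0,
            'hallucination_rate': 0.0,
        }
    return {
        'requests_total': n,
        'avg_latency_ms': state['latency_ms_total'] / n,
        'avg_retrieval_accuracy': state['retrieval_accuracy_total'] / n,
        'hallucination_rate': state['hallucination_count'] / n,
    }
```

Keeping totals rather than averages makes each update a constant-time addition. Under this assumption, two requests with latencies of 400 ms and 600 ms, one of them flagged, would produce an average latency of 500 ms and a hallucination rate of 0.5.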

Related Endpoints

  • QA Ask - Submit questions to the QA system
  • QA Stream - Stream answers in real-time
  • QA Health - Check if QA service is ready
