System Health Check
Retrieves the current health status of the API and its dependencies. This endpoint is typically used by monitoring systems, load balancers, and orchestration platforms.
Authorization
No authentication required. This endpoint is public to allow external monitoring.
Response
The health check endpoint returns different responses based on the system status:
Healthy Status
When all systems are operational:
- Status Code: 200 OK
- Response: Healthy
Degraded Status
When some non-critical systems are experiencing issues:
- Status Code: 200 OK
- Response: Details about degraded services
Unhealthy Status
When critical systems are down:
- Status Code: 503 Service Unavailable
- Response: Details about failing health checks
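The three cases above amount to a small mapping from individual check results to an overall status and HTTP status code. The following is a minimal sketch of that mapping, not the service's actual implementation. The `CheckResult` type, the `summarize` function, and the "Degraded" and "Unhealthy" labels are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from http import HTTPStatus


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one dependency check (hypothetical structure)."""
    name: str
    passed: bool
    critical: bool


def summarize(results: list[CheckResult]) -> tuple[str, int]:
    """Return (overall status label, HTTP status code)."""
    failed = [r for r in results if not r.passed]
    if any(r.critical for r in failed):
        # A critical dependency is down: report 503.
        return "Unhealthy", HTTPStatus.SERVICE_UNAVAILABLE.value
    if failed:
        # Only non-critical checks failed: still 200, but degraded.
        return "Degraded", HTTPStatus.OK.value
    return "Healthy", HTTPStatus.OK.value
```

Because a degraded system still answers 200 OK, a client that needs to tell healthy from degraded has to inspect the response body, not just the status code.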
Health Checks Performed
The endpoint validates:
- Database Connectivity - Verifies connection to the application database
- Application Services - Ensures core application services are responsive
Example Request
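The request is a plain, unauthenticated GET. Below is a minimal sketch using only the Python standard library. The `fetch_health` helper is a hypothetical name, and the URL is left as a parameter because the base URL and path of the endpoint depend on your deployment.

```python
from __future__ import annotations

import urllib.error
import urllib.request


def fetch_health(url: str, timeout: float = 5.0) -> tuple[int, str]:
    """GET the health endpoint and return (status code, response body).

    No Authorization header is sent because the endpoint is public.
    """
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; a 503 still carries a body
        # describing the failing checks, so return it instead of raising.
        return err.code, err.read().decode("utf-8", "replace")
```

Connection errors and timeouts are deliberately not caught here, so the caller can decide how to treat them.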
Example Responses
Usage in Production
Load Balancer Configuration
Configure your load balancer to perform health checks:
Kubernetes Liveness Probe
Example Kubernetes configuration:
Readiness Probe
Example readiness probe for Kubernetes:
Monitoring Integration
Integrate with monitoring tools:
Docker Health Check
Example Docker health check:
Best Practices
Timeout Configuration
- Set appropriate timeouts (3-5 seconds recommended)
- Ensure health checks don’t overwhelm the system
- Use exponential backoff for retries
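As an illustration of retrying with exponential backoff, here is a minimal sketch. The function names, the default delays, and the attempt limit are assumptions for the example, not values prescribed by the API. The `probe` callable is expected to apply its own 3-5 second timeout and return `True` when the endpoint reports healthy.

```python
from __future__ import annotations

import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   max_delay: float = 30.0) -> Iterator[float]:
    """Yield base, base*factor, base*factor**2, ... capped at max_delay."""
    delay = base
    while True:
        yield min(delay, max_delay)
        delay *= factor


def probe_with_retries(probe: Callable[[], bool], max_attempts: int = 4,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe until it succeeds or max_attempts is reached."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        if probe():
            return True
        if attempt < max_attempts:
            # Wait longer after each failure so retries do not pile up
            # on a system that is already struggling.
            sleep(next(delays))
    return False
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real waiting.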
Monitoring Frequency
- Production: Every 30-60 seconds
- Development: Every 60-120 seconds
- Don’t check more frequently than necessary
Response Handling
- 200 OK: System is healthy or degraded, route traffic normally
- 503 Service Unavailable: System is unhealthy, stop routing traffic
- Timeout: Treat as unhealthy after threshold
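The handling rules above can be sketched as a small state tracker. This is an illustrative sketch only: the `RoutingDecider` class and the default threshold of three consecutive timeouts are assumptions, since the document does not specify a threshold value.

```python
from __future__ import annotations

from typing import Optional


class RoutingDecider:
    """Track probe outcomes and decide whether to route traffic."""

    def __init__(self, timeout_threshold: int = 3) -> None:
        self.timeout_threshold = timeout_threshold
        self.consecutive_timeouts = 0
        self.routable = True

    def record(self, status_code: Optional[int]) -> bool:
        """Record one probe result; None means the probe timed out."""
        if status_code is None:
            # A single timeout is tolerated; only a run of them counts.
            self.consecutive_timeouts += 1
            if self.consecutive_timeouts >= self.timeout_threshold:
                self.routable = False
        elif status_code == 200:
            self.consecutive_timeouts = 0
            self.routable = True
        elif status_code == 503:
            self.consecutive_timeouts = 0
            self.routable = False
        # Other status codes are not defined by this endpoint, so they
        # leave the current decision unchanged.
        return self.routable
```

Counting consecutive timeouts, instead of reacting to the first one, keeps a single slow probe from removing an otherwise working instance from rotation.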
Database Health
The health check verifies database connectivity by:
- Testing connection pool availability
- Executing a lightweight query
- Validating response time is within acceptable limits
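A lightweight query with a response-time limit might look like the following sketch. It uses the standard-library `sqlite3` module purely as a stand-in for the application database; the `check_database` name and the one-second limit are assumptions. The connection pool check is not shown, because it depends on the pooling library in use.

```python
from __future__ import annotations

import sqlite3
import time


def check_database(conn: sqlite3.Connection, max_seconds: float = 1.0) -> bool:
    """Return True if a trivial query succeeds within max_seconds."""
    started = time.monotonic()
    try:
        row = conn.execute("SELECT 1").fetchone()
    except sqlite3.Error:
        # The query could not run at all: the connectivity check fails.
        return False
    elapsed = time.monotonic() - started
    return row == (1,) and elapsed <= max_seconds
```

`SELECT 1` touches no tables, so the check measures connectivity and round-trip time without adding meaningful load.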
Logging
Health check failures are logged but don’t trigger audit logs to avoid:
- Excessive log volume
- Performance impact during outages
- Filling audit tables with monitoring data
Troubleshooting
Common Issues
503 Service Unavailable
- Check database connection string
- Verify network connectivity
- Review application logs
- Check resource availability (CPU, memory)
Slow Responses or Timeouts
- Database query performance issues
- Network latency
- Resource exhaustion
- Connection pool saturation
Intermittent Failures
- Database connection pool issues
- Network instability
- Resource contention
- Insufficient health check thresholds