The Incidents API is the core of Sentinel SoftServe. It exposes endpoints to list, create, and inspect incidents, manage their lifecycle status, generate and save post-mortems, export full incident bundles, query relevant runbooks from ChromaDB, find similar past incidents via episodic memory, and fetch live Prometheus metrics. All endpoints require a valid Supabase JWT Bearer token.
Incident Status Lifecycle
Incidents move through the following statuses as the LangGraph agent pipeline and human operators interact with them:

| Status | Description |
|---|---|
| `detected` | Incident created; not yet analyzed |
| `investigating` | Agent is gathering logs and metrics |
| `analyzed` | Agent has produced a diagnosis and proposed action |
| `awaiting_approval` | Proposed action is pending human review |
| `executing_solution` | Action is actively being executed |
| `verifying` | Post-execution verification in progress |
| `resolved` | Incident fully resolved |
| `failed` | Execution failed or action was rejected |

`resolved` and `failed` are terminal statuses. The `active` filter alias matches all non-terminal statuses.
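The terminal/active split described above can be mirrored on the client side. The following is a minimal sketch, not the service's own code: the `StatusType` name and its values come from this page, while the enum layout and the `is_active` helper are illustrative assumptions.

```python
from enum import Enum


class StatusType(str, Enum):
    """Status values as listed in the lifecycle table."""
    DETECTED = "detected"
    INVESTIGATING = "investigating"
    ANALYZED = "analyzed"
    AWAITING_APPROVAL = "awaiting_approval"
    EXECUTING_SOLUTION = "executing_solution"
    VERIFYING = "verifying"
    RESOLVED = "resolved"
    FAILED = "failed"


# The two terminal statuses named in the documentation.
TERMINAL_STATUSES = frozenset({StatusType.RESOLVED, StatusType.FAILED})


def is_active(status: str) -> bool:
    """Hypothetical helper: True if `status` would match the `active` alias.

    Raises ValueError for a string that is not a known status.
    """
    return StatusType(status) not in TERMINAL_STATUSES
```

For example, `is_active("verifying")` is true, while `is_active("failed")` is false.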
GET /api/incidents
List all incidents with optional filtering and pagination.
Auth required: Yes

Query parameters:
- Filter by incident source. One of `container`, `database`, or `manual`.
- Filter by container runtime. One of `docker`, `podman`, or `kubernetes`.
- Filter by severity level. One of `critical`, `high`, `medium`, or `low`.
- Filter by status. Pass an exact `StatusType` value, or the special alias `active` to return all incidents not in `resolved` or `failed`.
- Page number (1-based).
- Number of incidents per page.

Response fields:
- Array of incident objects for the current page.
- Total number of incidents matching the applied filters.
- Current page number.
- Page size used for this response.
- Total number of pages (`ceil(total / limit)`).
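As an illustration of filtering and pagination, here is a rough sketch that builds a list URL and reproduces the `ceil(total / limit)` page count. The host in `BASE_URL` and the query parameter names passed in the usage example (`status`, `page`, `limit`) are assumptions, not confirmed by this page.

```python
import math
from urllib.parse import urlencode

BASE_URL = "https://sentinel.example.com"  # hypothetical host


def build_list_url(base_url: str, **filters) -> str:
    """Build a GET /api/incidents URL, skipping filters that are None.

    The keyword names are passed through as query parameters unchanged;
    which names the API actually accepts is an assumption here.
    """
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{base_url}/api/incidents"
    return f"{url}?{query}" if query else url


def total_pages(total: int, limit: int) -> int:
    """Page count as documented: ceil(total / limit)."""
    return math.ceil(total / limit)
```

Calling `build_list_url(BASE_URL, status="active", page=2, limit=20)` yields a URL ending in `?status=active&page=2&limit=20`, and 45 matching incidents at 20 per page span `total_pages(45, 20)` = 3 pages.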
POST /api/incidents
Create a manual incident and immediately trigger the LangGraph analysis pipeline as a background task.
Auth required: Yes
Status code: `201 Created`

Request body:
- Short description of the incident. Maximum 200 characters.
- The affected resource: a container name, database name, or service identifier. Maximum 100 characters.
- Severity level: `critical`, `high`, `medium`, or `low`.
- Origin of the incident: `container`, `database`, or `manual`.
- Optional free-text description or initial log snippet. Maximum 5000 characters. Stored as the incident’s initial `logs` field.

When `source_type` is `container`, `container_runtime` defaults to `docker`. For `database` and `manual` sources, `container_runtime` is set to `null`.
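The body constraints above can be checked before a request is sent. This is a hedged sketch: `source_type`, `container_runtime`, and the length limits come from this page, but the field names `title`, `target`, `severity`, and `description` are assumed for illustration, as are both helper functions.

```python
SEVERITIES = {"critical", "high", "medium", "low"}
SOURCE_TYPES = {"container", "database", "manual"}


def build_incident_payload(title, target, severity, source_type, description=None):
    """Validate the documented limits and return a JSON-ready dict.

    Field names other than source_type are assumptions.
    """
    if not title or len(title) > 200:
        raise ValueError("title must be 1-200 characters")
    if not target or len(target) > 100:
        raise ValueError("target must be 1-100 characters")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source_type: {source_type!r}")
    payload = {
        "title": title,
        "target": target,
        "severity": severity,
        "source_type": source_type,
    }
    if description is not None:
        if len(description) > 5000:
            raise ValueError("description must be at most 5000 characters")
        payload["description"] = description
    return payload


def expected_container_runtime(source_type):
    """Mirror the documented default: docker for containers, otherwise None."""
    return "docker" if source_type == "container" else None
```

Validating locally only avoids an obviously bad round trip; the server remains the authority on what it accepts.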
GET /api/incidents/
Retrieve a single incident by its UUID.
Auth required: Yes
Path parameter: UUID of the incident.
Returns `404 Not Found` with `{"detail": "Incidente no encontrado"}` if the ID does not exist.
PATCH /api/incidents//status
Update the status of an incident. If the new status is `resolved`, a post-mortem generation job is queued as a background task.
Auth required: Yes
Path parameter: UUID of the incident.
Request body: The new status. Must be a valid `StatusType`: `detected`, `investigating`, `analyzed`, `awaiting_approval`, `executing_solution`, `verifying`, `resolved`, or `failed`.
Setting status to `resolved` also writes the current UTC timestamp to `resolved_at` and schedules post-mortem generation in the background.
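A status update with the required Bearer token could be prepared as sketched below, using only the standard library. The base URL, the position of the incident UUID in the path, and the `status` key in the JSON body are assumptions made for illustration; the request is built but not sent.

```python
import json
from urllib.request import Request


def build_status_request(base_url: str, incident_id: str, new_status: str, token: str) -> Request:
    """Prepare (but do not send) a PATCH request for an incident's status.

    Assumes the UUID sits between /api/incidents/ and /status, and that
    the body carries the new value under a "status" key.
    """
    body = json.dumps({"status": new_status}).encode("utf-8")
    return Request(
        f"{base_url}/api/incidents/{incident_id}/status",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # Supabase JWT
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the same `Authorization` header pattern applies to every endpoint on this page.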
GET /api/incidents//post-mortem
Retrieve the post-mortem report for an incident. If the incident is `resolved` but no post-mortem has been saved yet, one is generated on demand synchronously.
Auth required: Yes
Path parameter: UUID of the incident.

Response fields:
- UUID of the incident.
- Current incident status.
- Markdown-formatted post-mortem content. Empty string if not yet available.
- ISO 8601 timestamp of the last update, or `null`.
- `true` if the post-mortem was generated on demand during this request rather than loaded from a previously saved value.
PUT /api/incidents//post-mortem
Save or overwrite the post-mortem content for an incident. Useful for human-edited post-mortems after review.
Auth required: Yes
Path parameter: UUID of the incident.
Request body: Markdown string. Minimum 1 character, maximum 100,000 characters.
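The length bounds on the post-mortem body are easy to check before saving an edited report. The helper below is a small sketch; its name and the constant are hypothetical, and only the 1 to 100,000 character range comes from this page.

```python
MAX_POST_MORTEM_CHARS = 100_000  # documented upper bound


def validate_post_mortem(content: str) -> str:
    """Return content unchanged if it is within the documented bounds.

    Raises ValueError for an empty string or one over 100,000 characters.
    """
    if not 1 <= len(content) <= MAX_POST_MORTEM_CHARS:
        raise ValueError(
            f"post-mortem must be 1 to {MAX_POST_MORTEM_CHARS} characters, got {len(content)}"
        )
    return content
```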
GET /api/incidents//export
Export a complete incident bundle for archiving or sharing. Supports JSON and Markdown formats.
Auth required: Yes
Path parameter: UUID of the incident.
Output format: `json` or `markdown`. Any other value returns `400 Bad Request`.

The exported bundle contains the following keys:

| Key | Description |
|---|---|
| `metadata` | Core incident fields (title, severity, status, timestamps) |
| `evidence.logs` | Raw log content captured during investigation |
| `evidence.metrics_snapshot` | Prometheus metrics snapshot taken at incident time |
| `evidence.agent_reasoning` | LangGraph agent’s step-by-step analysis |
| `timeline` | Ordered list of status transitions with timestamps |
| `decisions` | Proposed and executed actions with outcomes |
| `actions` | Detailed action execution records |
| `post_mortem` | Post-mortem Markdown content (if available) |
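Since any format other than `json` or `markdown` is rejected with `400 Bad Request`, a client can guard against that locally. This sketch assumes the format is passed as a query parameter named `format` and that the incident UUID sits in the path before `/export`; neither detail is confirmed here.

```python
from urllib.parse import urlencode

ALLOWED_FORMATS = ("json", "markdown")


def build_export_url(base_url: str, incident_id: str, fmt: str) -> str:
    """Build an export URL, rejecting formats the API would answer with 400.

    The "format" query parameter name is an assumption.
    """
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{base_url}/api/incidents/{incident_id}/export?{query}"
```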
GET /api/incidents//runbooks
Retrieve up to 5 relevant runbooks from the ChromaDB vector store for this incident. Runbooks are matched using semantic similarity against the incident’s `incident_type` and `title`.
Auth required: Yes
Path parameter: UUID of the incident.
Query parameter: Optional custom query string to override the default search query (`{incident_type} {title}`). Useful for targeted runbook lookups.

The ChromaDB collection is selected from the incident’s `source_type` and `container_runtime`:

| Source | Runtime | ChromaDB Collection |
|---|---|---|
| `container` | `docker` | `runbooks-docker` |
| `container` | `podman` | `runbooks-podman` |
| `container` | `kubernetes` | `runbooks-kubernetes` |
| `database` | (none) | `runbooks-postgres` |

Response fields:
- Runbook title extracted from the `RUNBOOK: <title>` line in the document.
- Full runbook text content.
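The collection table and the default query template translate directly into a lookup. The sketch below illustrates that mapping and is not the service's implementation; the function names are hypothetical, and because the table has no row for `manual` incidents, the lookup returns `None` for them.

```python
# (source_type, container_runtime) -> ChromaDB collection, per the table.
RUNBOOK_COLLECTIONS = {
    ("container", "docker"): "runbooks-docker",
    ("container", "podman"): "runbooks-podman",
    ("container", "kubernetes"): "runbooks-kubernetes",
    ("database", None): "runbooks-postgres",
}


def pick_runbook_collection(source_type, container_runtime=None):
    """Return the collection name, or None when the table has no matching row."""
    # The runtime only matters for container incidents.
    runtime = container_runtime if source_type == "container" else None
    return RUNBOOK_COLLECTIONS.get((source_type, runtime))


def default_runbook_query(incident_type: str, title: str) -> str:
    """Default search query: "{incident_type} {title}"."""
    return f"{incident_type} {title}"
```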
GET /api/incidents//similar
Find up to 5 similar past incidents using ChromaDB episodic memory. The query is built from the incident’s `incident_type` and `title`. The current incident is excluded from results.
Auth required: Yes
Path parameter: UUID of the incident.
Results come from the episodic memory collection that matches the incident (`incidents-docker`, `incidents-postgres`, etc.).
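The two result rules stated above, excluding the current incident and returning at most 5 matches, can be pictured as a post-filter over ranked candidates. This is an illustrative sketch only: the candidate shape (dicts with an `id` key, already ordered by similarity) and the function name are assumptions.

```python
def top_similar(candidates, current_id, limit=5):
    """Drop the current incident, then keep the first `limit` candidates.

    `candidates` is assumed to be ordered from most to least similar.
    """
    return [c for c in candidates if c["id"] != current_id][:limit]
```

Filtering before truncating matters: truncating first could return fewer than 5 results whenever the current incident ranks among its own nearest neighbors.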
GET /api/incidents//metrics
Fetch the current live Prometheus metrics for the incident’s target resource. If Prometheus is unreachable but a `metrics_snapshot` was captured at incident creation time, the snapshot is returned as a fallback.
Auth required: Yes
Path parameter: UUID of the incident.

The data source depends on the incident’s `source_type`:

| `source_type` | Data source | Query |
|---|---|---|
| `database` | Prometheus → PostgreSQL exporter | Queries using `datname` extracted from `target` (`postgres/<datname>`) |
| `container` | Prometheus → cAdvisor | Queries using the container name as `target` |

Returns `503 Service Unavailable` if Prometheus is unreachable and no `metrics_snapshot` fallback exists.
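The fallback order described above (live metrics, then the stored snapshot, then a 503) can be sketched as follows. All names here are hypothetical, and using `ConnectionError` to signal that Prometheus is unreachable is an assumption made for the example.

```python
class MetricsUnavailable(Exception):
    """Stands in for the 503 case: no live data and no snapshot."""


def extract_datname(target: str) -> str:
    """Pull <datname> out of a "postgres/<datname>" target.

    Returning other targets unchanged is an assumption.
    """
    prefix = "postgres/"
    return target[len(prefix):] if target.startswith(prefix) else target


def resolve_metrics(fetch_live, snapshot):
    """Return (origin, data), preferring live metrics over the snapshot.

    `fetch_live` is a zero-argument callable that raises ConnectionError
    when Prometheus cannot be reached.
    """
    try:
        return "live", fetch_live()
    except ConnectionError:
        if snapshot is not None:
            return "snapshot", snapshot
        raise MetricsUnavailable("Prometheus unreachable and no metrics_snapshot")
```

Returning the origin alongside the data lets a caller show whether the numbers are current or were captured at incident creation time.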