Sentinel SoftServe is an agentic AI co-pilot for DevOps and SRE engineers who need to move fast during production incidents. It watches your infrastructure continuously and automatically detects crashes, resource exhaustion, and service degradations. It then orchestrates a full triage pipeline, from log collection and root-cause analysis to proposing a safe corrective action, so you do not have to dig through dashboards manually. Every AI decision passes through a human-in-the-loop approval gate before any remediation command is executed on your infrastructure. The project is an academic-industry collaboration between Universidad EAFIT and SoftServe, deployed live at sentinel-softserve-1.onrender.com.

The problem Sentinel solves

Modern containerised workloads generate thousands of metrics and log lines per minute. When something goes wrong at 2 a.m., engineers waste critical minutes correlating Prometheus alerts with Loki logs, reading runbooks, and deciding whether to restart a container or roll back a deployment. Sentinel eliminates that toil by:
  • Automatically ingesting alerts from Alertmanager the moment Prometheus fires a rule.
  • Fetching and analysing logs from Loki against a ChromaDB runbook knowledge base.
  • Classifying the incident type and routing it to the right specialist agent (Docker, Podman, Kubernetes, or PostgreSQL).
  • Proposing one safe, whitelisted command for human approval — never executing anything autonomously.
  • Generating a post-mortem and writing the incident into episodic memory so future triage gets smarter.
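
The flow in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sentinel's actual implementation: the `runtime` and `name` labels, the `ALLOWED_COMMANDS` whitelist and its command templates, and the `firing_alerts`, `propose`, and `execute_if_approved` helpers are all hypothetical. Only the general webhook shape (an `alerts` array whose entries carry `status` and `labels`) follows Alertmanager's documented payload.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical whitelist: one command template per runtime (an assumption,
# not Sentinel's real list). Runtimes without an entry get no proposal.
ALLOWED_COMMANDS = {
    "docker": "docker restart {target}",
    "podman": "podman restart {target}",
    "kubernetes": "kubectl rollout restart deployment/{target}",
}

# Only plain identifiers may be substituted into a command template.
SAFE_TARGET = re.compile(r"[A-Za-z0-9_.-]+")


@dataclass
class Proposal:
    runtime: str
    command: str
    approved: bool = False  # flipped only by a human reviewer


def firing_alerts(payload: dict) -> list:
    """Return the alerts in an Alertmanager webhook payload that are firing."""
    return [a for a in payload.get("alerts", []) if a.get("status") == "firing"]


def propose(alert: dict) -> Optional[Proposal]:
    """Build one whitelisted command for the alert, or None if none applies."""
    labels = alert.get("labels", {})
    runtime = labels.get("runtime", "")   # assumed label name
    target = labels.get("name", "")       # assumed label name
    template = ALLOWED_COMMANDS.get(runtime)
    if template is None or not SAFE_TARGET.fullmatch(target):
        return None
    return Proposal(runtime=runtime, command=template.format(target=target))


def execute_if_approved(proposal: Proposal, run=print) -> bool:
    """Hand the command to `run` only after explicit human approval."""
    if not proposal.approved:
        return False
    run(proposal.command)
    return True
```

The point of the design is that `propose` can only ever emit a string built from the whitelist, and `execute_if_approved` refuses to act until a reviewer sets `approved`. Here `run` defaults to `print` so the sketch never touches real infrastructure.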

Tech stack

Sentinel is built from purpose-chosen components across every layer of the stack.

| Layer | Technology |
| --- | --- |
| Frontend | React 19 + Vite 7 + Tailwind CSS v4 + shadcn/ui |
| Backend | FastAPI + Uvicorn |
| AI Orchestration | LangGraph + LangChain |
| LLM | OpenAI gpt-4o-mini |
| Knowledge Base | ChromaDB (runbooks RAG + episodic memory) |
| Auth & DB | Supabase (email/password, JWT, Realtime) |
| Agent Observability | LangFuse v2 (self-hosted) |
| Incident Detection | cAdvisor + Prometheus + Alertmanager |
| Logs | Loki + Promtail |
| Dashboards | Grafana |

Supported runtimes

Sentinel ships a dedicated specialist agent for each supported runtime. Each agent carries its own tool palette and ChromaDB runbook collection so investigations stay tightly scoped.

| Runtime | Agent | Tools |
| --- | --- | --- |
| Docker | DockerAgent | docker_inspect, docker_logs, docker_stats, docker_ps |
| Podman | PodmanAgent | podman_inspect, podman_logs, podman_stats, podman_ps |
| Kubernetes | KubernetesAgent | get_pod_status, describe_pod, get_pod_logs, get_pod_events, get_deployment_status, list_failing_pods |
| PostgreSQL | PostgresAgent | pg_stat_activity, pg_stat_database, pg_stat_replication, pg_locks |
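
One way to picture the per-runtime scoping is a small registry that maps each runtime to its agent and tool palette. The agent and tool names below come from the list of supported runtimes; the `AgentSpec` type, the `AGENTS` registry, and the `route` and `is_tool_allowed` helpers are hypothetical and only illustrate the idea, not how Sentinel wires its agents internally.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AgentSpec:
    name: str
    tools: Tuple[str, ...]


# Registry keyed by lowercase runtime name (hypothetical structure).
AGENTS = {
    "docker": AgentSpec(
        "DockerAgent",
        ("docker_inspect", "docker_logs", "docker_stats", "docker_ps"),
    ),
    "podman": AgentSpec(
        "PodmanAgent",
        ("podman_inspect", "podman_logs", "podman_stats", "podman_ps"),
    ),
    "kubernetes": AgentSpec(
        "KubernetesAgent",
        ("get_pod_status", "describe_pod", "get_pod_logs",
         "get_pod_events", "get_deployment_status", "list_failing_pods"),
    ),
    "postgresql": AgentSpec(
        "PostgresAgent",
        ("pg_stat_activity", "pg_stat_database",
         "pg_stat_replication", "pg_locks"),
    ),
}


def route(runtime: str) -> AgentSpec:
    """Return the specialist agent for a runtime, ignoring case and padding."""
    key = runtime.strip().lower()
    if key not in AGENTS:
        raise ValueError(f"unsupported runtime: {runtime!r}")
    return AGENTS[key]


def is_tool_allowed(runtime: str, tool: str) -> bool:
    """True only if the tool belongs to that runtime's own palette."""
    return tool in route(runtime).tools
```

With this shape, `is_tool_allowed("docker", "pg_locks")` is false, which is the sense in which an investigation stays scoped to one runtime.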

Where to go next

  • Quickstart: Run Sentinel locally in under 10 minutes with Docker Compose.
  • Architecture: Understand the LangGraph agent pipeline and observability stack.
  • Supported Runtimes: Deep-dive into each specialist agent and its tool set.
  • API Reference: Explore the FastAPI endpoints that power the dashboard and webhooks.

Prerequisites before you begin:
  • Docker Desktop installed and running
  • Node.js 20+
  • Python 3.9+
  • A Supabase project with a URL, service-role key, anon key, and JWT secret
  • An OpenAI API key (gpt-4o-mini access required)
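
A quick pre-flight check for the list above might look like the following sketch. The Supabase environment variable names are assumptions (the project may use different ones), as are the `REQUIRED_ENV` and `REQUIRED_TOOLS` constants and the `missing_prerequisites` helper. It only checks that `docker` and `node` are on the PATH; it does not verify that Docker is running or that Node.js is version 20 or later.

```python
import os
import shutil
import sys

# Assumed variable names; adjust to whatever the project actually reads.
REQUIRED_ENV = (
    "SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
    "SUPABASE_ANON_KEY",
    "SUPABASE_JWT_SECRET",
    "OPENAI_API_KEY",
)
REQUIRED_TOOLS = ("docker", "node")


def missing_prerequisites(env, which=shutil.which,
                          python_version=sys.version_info[:2]):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if tuple(python_version) < (3, 9):
        problems.append("Python 3.9+ required")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for var in REQUIRED_ENV:
        if not env.get(var):
            problems.append(f"{var} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites(os.environ):
        print(problem)
```

The `which` and `python_version` parameters exist only so the function can be exercised without depending on the machine it runs on.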
