The DevOps Root Cause Analysis Agent is an AI-driven tool that cuts through the noise of modern observability data to pinpoint the root cause of production incidents. By correlating signals across logs, metrics, and distributed traces, the agent surfaces actionable hypotheses ranked by confidence — so your engineers spend time fixing problems, not hunting for them.

Quickstart

Get the agent running and perform your first root cause analysis in minutes.

How It Works

Understand the AI pipeline: signal ingestion, correlation, and hypothesis ranking.

Configuration

Set up environment variables, integrations, and agent behavior settings.

Guides

Step-by-step walkthroughs for common analysis and integration workflows.

What the Agent Does

When an incident fires, the DevOps Root Cause Analysis Agent:

1. Ingests Signals

Pulls logs, metrics, and traces from connected data sources (Prometheus, Elasticsearch, Jaeger, and more) for the incident time window.

2. Correlates Events

Uses an AI model to identify anomalies, temporal correlations, and causal relationships across all ingested signals simultaneously.

3. Ranks Hypotheses

Generates a ranked list of root cause hypotheses, each with supporting evidence drawn directly from your observability data.

4. Presents Results

Displays findings in a Streamlit dashboard and optionally routes summaries to your alerting or ticketing system.
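
The steps above can be sketched as a small data flow. This is an illustrative sketch only, not the agent's actual code: the `Signal` and `Hypothesis` shapes, the function names, and the scoring rule are all assumptions, and a simple counting heuristic stands in for the AI model that the agent uses for correlation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Signal:
    """One observation from a data source (hypothetical shape)."""
    source: str          # "logs", "metrics", or "traces"
    service: str
    timestamp: datetime
    anomalous: bool
    detail: str


@dataclass
class Hypothesis:
    """A candidate root cause with its supporting evidence (hypothetical shape)."""
    service: str
    confidence: float
    evidence: list = field(default_factory=list)


def ingest(signals, start, end):
    """Step 1: keep only signals inside the incident time window."""
    return [s for s in signals if start <= s.timestamp <= end]


def correlate(signals):
    """Step 2 (stand-in): group anomalous signals by service.

    The agent uses an AI model here; this heuristic only shows the data flow.
    """
    by_service = {}
    for s in signals:
        if s.anomalous:
            by_service.setdefault(s.service, []).append(s)
    return by_service


def rank(by_service):
    """Step 3: score each service by how many distinct signal types flagged it."""
    hypotheses = []
    for service, hits in by_service.items():
        kinds = {s.source for s in hits}
        hypotheses.append(Hypothesis(
            service=service,
            confidence=len(kinds) / 3,  # three signal types: logs, metrics, traces
            evidence=[f"{s.source}: {s.detail}" for s in hits],
        ))
    return sorted(hypotheses, key=lambda h: h.confidence, reverse=True)


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)
    sample = [
        Signal("logs", "checkout", t0, True, "OOMKilled"),
        Signal("metrics", "checkout", t0 + timedelta(minutes=1), True, "memory at 98%"),
        Signal("traces", "cart", t0 + timedelta(minutes=2), True, "slow span"),
    ]
    window = ingest(sample, t0, t0 + timedelta(minutes=10))
    # Step 4 (stand-in): print instead of rendering a dashboard.
    for h in rank(correlate(window)):
        print(f"{h.service}: {h.confidence:.2f} {h.evidence}")
```

In this toy scoring, a service flagged by more signal types ranks higher, which mirrors the idea of confidence-ranked hypotheses backed by evidence without claiming to reproduce the agent's reasoning.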

Key Capabilities

Multi-Signal Correlation

Combines logs, metrics, and traces into a unified view for holistic incident analysis.

AI Hypothesis Ranking

Ranks root cause candidates by confidence score using LLM-based reasoning over your data.

Interactive Dashboard

Explore findings and drill into evidence through a Streamlit-powered investigation UI.

Async Processing

Celery and Redis back the analysis pipeline for non-blocking, scalable task execution.

Pluggable Connectors

Add custom data source connectors to pull from any observability or logging platform.

Alerting Integration

Push analysis summaries to PagerDuty, Slack, or any webhook-compatible endpoint.
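
To make the Pluggable Connectors and Alerting Integration capabilities more concrete, here is a minimal sketch using only the Python standard library. Everything named in it is hypothetical: the `Connector` base class, the `register` decorator, the `in_memory_logs` connector, the summary fields, and the example URL are assumptions for illustration, not the project's API. Real services such as Slack or PagerDuty expect their own payload formats, so treat the JSON body as a placeholder.

```python
import json
import urllib.request
from abc import ABC, abstractmethod

# Hypothetical registry mapping a connector name to its class.
CONNECTORS = {}


class Connector(ABC):
    """Hypothetical base class: one subclass per observability platform."""

    name = "base"

    @abstractmethod
    def fetch(self, start, end):
        """Return records (dicts with a numeric 'ts' key) inside [start, end]."""


def register(cls):
    """Class decorator that makes a connector discoverable by name."""
    CONNECTORS[cls.name] = cls
    return cls


@register
class InMemoryLogConnector(Connector):
    """Toy connector that stands in for a real log backend."""

    name = "in_memory_logs"

    def __init__(self, records):
        self.records = records

    def fetch(self, start, end):
        return [r for r in self.records if start <= r["ts"] <= end]


def build_webhook_request(url, summary):
    """Prepare (but do not send) a JSON POST carrying an analysis summary."""
    body = json.dumps(summary).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    connector = CONNECTORS["in_memory_logs"]([
        {"ts": 100, "message": "checkout: OOMKilled"},
        {"ts": 900, "message": "cart: healthy"},
    ])
    records = connector.fetch(start=0, end=500)
    summary = {
        "top_hypothesis": "checkout ran out of memory",
        "evidence": [r["message"] for r in records],
    }
    request = build_webhook_request("https://example.com/hooks/rca", summary)
    print(request.get_method(), request.full_url)
    # Sending is left out on purpose; urllib.request.urlopen(request) would post it.
```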

New to the project? Start with the Quickstart to get the agent running locally, then explore the Configuration section to connect your observability stack.
