Scan Pipeline

The Heimdall scan pipeline is a multi-stage process that turns raw source code into actionable security findings. Each scan runs through up to 9 sequential stages, orchestrated by the ScanPipeline in src/pipeline/mod.rs.

Pipeline Overview

Every scan follows this progression:

Ingest: Clone the repository and build a searchable code index
Tyr (Threat Model): Generate a STRIDE-based threat model with attack surfaces and trust boundaries
Static Analysis: Run pattern-based vulnerability detection (Semgrep, etc.)
Taint Analysis: Trace data flows from user inputs to dangerous sinks
Config Scan: Analyze IaC and configuration files for misconfigurations
Hunt (Agentic Discovery): Deploy AI agents to investigate each attack surface
Víðarr (Adversarial Verification): Challenge and validate findings with skeptical analysis
Garmr (Sandbox Validation): Execute proof-of-concept exploits in isolated containers
Report: Generate executive summaries and remediation guidance
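
As a rough illustration of the ordering above, the stages can be modeled as an ordered enum. This is a sketch only: the `Stage` type, its variant names, and the `next` helper are hypothetical and are not taken from the Heimdall source.

```rust
/// Hypothetical enum mirroring the nine stages listed above, in execution order.
/// `Vidarr` is an ASCII spelling of Víðarr chosen for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Ingest,
    Tyr,
    StaticAnalysis,
    TaintAnalysis,
    ConfigScan,
    Hunt,
    Vidarr,
    Garmr,
    Report,
}

impl Stage {
    /// Every stage, in the order the pipeline runs them.
    pub const ALL: [Stage; 9] = [
        Stage::Ingest,
        Stage::Tyr,
        Stage::StaticAnalysis,
        Stage::TaintAnalysis,
        Stage::ConfigScan,
        Stage::Hunt,
        Stage::Vidarr,
        Stage::Garmr,
        Stage::Report,
    ];

    /// The stage that follows `self`, or `None` when `self` is the last stage.
    pub fn next(self) -> Option<Stage> {
        let idx = Stage::ALL.iter().position(|s| *s == self)?;
        Stage::ALL.get(idx + 1).copied()
    }
}
```

Calling `Stage::Ingest.next()` yields `Some(Stage::Tyr)`, and `Stage::Report.next()` yields `None`.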

Stage Architecture

Each stage is implemented as a self-contained module. The orchestrator constructs each stage and passes its work to run_stage, as this excerpt shows:

src/pipeline/mod.rs

pub async fn run(&self, repo: &Repo) -> HeimdallResult<()> {
    info!("Starting scan pipeline for scan_id={}", self.scan_id);

    // Stage 1: Ingest
    let ingest_output = self.run_stage("ingest", "ingesting", "ingested", async {
        let stage = ingest::IngestStage::new(
            self.scan_id,
            Arc::clone(&self.db),
            self.encryption_key,
            self.data_dir.clone(),
        );
        stage.run(repo).await
    }).await?;

    let code_index = Arc::new(ingest_output.code_index);

    // Stage 2: Tyr (Threat Model)
    let threat_model = self.run_stage("tyr", "modeling", "modeled", async {
        let stage = tyr::TyrStage::new(
            self.scan_id,
            repo.id,
            Arc::clone(&self.db),
            Arc::clone(&self.ai),
            self.default_model.clone(),
        );
        stage.run(&code_index).await
    }).await?;

    // ... additional stages follow
}

Each stage produces artifacts that downstream stages consume. For example, the Tyr threat model guides the Hunt agent’s investigations.
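
To make the hand-off concrete, here is a minimal sketch of how a threat model could be turned into Hunt work items, one per attack surface. The `ThreatModel`, `AttackSurface`, and `HuntTask` types, the `plan_hunt_tasks` function, and the `hunt-N` key format are all assumptions made for illustration; the real artifact types are not shown in this page.

```rust
/// Hypothetical, simplified attack surface as produced by the threat-model stage.
#[derive(Debug, Clone)]
pub struct AttackSurface {
    pub name: String,
    pub trust_boundary: String,
}

/// Hypothetical, simplified threat model artifact.
#[derive(Debug, Clone)]
pub struct ThreatModel {
    pub attack_surfaces: Vec<AttackSurface>,
}

/// Hypothetical unit of work for the Hunt stage.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HuntTask {
    pub task_key: String,
    pub title: String,
}

/// Derive one investigation task per attack surface in the threat model.
pub fn plan_hunt_tasks(model: &ThreatModel) -> Vec<HuntTask> {
    model
        .attack_surfaces
        .iter()
        .enumerate()
        .map(|(i, surface)| HuntTask {
            // Key format is invented for this sketch.
            task_key: format!("hunt-{}", i + 1),
            title: format!("Investigating attack surface: {}", surface.name),
        })
        .collect()
}
```

Under these assumptions, the number of Hunt tasks grows linearly with the number of attack surfaces in the model.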

Monitoring Progress

Heimdall provides real-time visibility into scan execution through multiple channels:

Server-Sent Events (SSE)

Connect to the SSE endpoint to receive live updates:

curl -N -H "Authorization: Bearer YOUR_TOKEN" \
  https://app.heimdall.security/api/v1/scans/{scan_id}/events

You’ll receive events like:

{
  "event": "stage_update",
  "data": {
    "stage": "hunt",
    "status": "running",
    "message": "Investigating attack surface: Admin panel authentication"
  }
}
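
If you consume the stream from your own client rather than curl, you need to split it into events. The sketch below parses a simplified subset of the standard `text/event-stream` framing (`event:` and `data:` fields, comment lines, blank-line dispatch). It assumes the endpoint uses that standard framing; the exact wire format of Heimdall's stream is not shown here, and `SseEvent` and `parse_sse` are hypothetical names.

```rust
/// One dispatched Server-Sent Event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SseEvent {
    pub event: String,
    pub data: String,
}

/// Parse a complete `text/event-stream` body into events.
/// Handles `event:` and `data:` fields and `:` comment lines; other fields are ignored.
pub fn parse_sse(stream: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event_type = String::new();
    let mut data_lines: Vec<&str> = Vec::new();

    for line in stream.lines() {
        if line.is_empty() {
            // A blank line ends the current event; dispatch it if it carried data.
            if !data_lines.is_empty() {
                let event = if event_type.is_empty() {
                    "message".to_string() // default event type in the SSE format
                } else {
                    event_type.clone()
                };
                events.push(SseEvent { event, data: data_lines.join("\n") });
            }
            event_type.clear();
            data_lines.clear();
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        // Split "field: value"; a single space after the colon is not part of the value.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event_type = value.to_string(),
            "data" => data_lines.push(value),
            _ => {} // id, retry, and unknown fields are ignored in this sketch
        }
    }
    events
}
```

The `data` string of each event can then be handed to a JSON parser of your choice. An event that is still buffered when the stream ends without a trailing blank line is dropped.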

Scan Events Table

Query the scan_events table for a complete audit trail:

SELECT stage, task_key, status, title, detail, progress_pct, created_at
FROM scan_events
WHERE scan_id = 'YOUR_SCAN_ID'
ORDER BY created_at ASC;

Stage Status Tracking

Each stage records detailed status in the scan_stages table:

| Field | Description |
| --- | --- |
| `stage` | Stage name (e.g., `hunt`, `tyr`, `garmr`) |
| `status` | Current status: `pending`, `running`, `completed`, `failed` |
| `started_at` | Timestamp when execution began |
| `completed_at` | Timestamp when execution finished |
| `error_message` | Failure reason if status is `failed` |
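
When reading these rows from client code, it can help to map the `status` strings onto a closed set of values. The following is a minimal sketch; the `StageStatus` enum and its `is_terminal` helper are hypothetical and only mirror the four values listed in the table.

```rust
use std::str::FromStr;

/// The four status values documented for the scan_stages table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StageStatus {
    Pending,
    Running,
    Completed,
    Failed,
}

impl StageStatus {
    /// True once the stage can no longer change state.
    pub fn is_terminal(self) -> bool {
        matches!(self, StageStatus::Completed | StageStatus::Failed)
    }
}

impl FromStr for StageStatus {
    type Err = String;

    /// Parse the raw column value; anything outside the documented set is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pending" => Ok(StageStatus::Pending),
            "running" => Ok(StageStatus::Running),
            "completed" => Ok(StageStatus::Completed),
            "failed" => Ok(StageStatus::Failed),
            other => Err(format!("unknown stage status: {other}")),
        }
    }
}
```

A poller could call `"running".parse::<StageStatus>()` on each row and stop once `is_terminal()` returns true.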

Expected Timeframes

Scan duration varies based on repository size and complexity:

Small Repositories (< 10k LOC)

Total Duration: 2-5 minutes

Ingest: 10-20 seconds
Tyr: 30-60 seconds
Static Analysis: 20-40 seconds
Hunt: 1-2 minutes
Garmr: 30-60 seconds
Report: 20-30 seconds

Medium Repositories (10k-100k LOC)

Total Duration: 5-15 minutes

Ingest: 30-90 seconds
Tyr: 60-120 seconds
Static Analysis: 1-3 minutes
Hunt: 3-8 minutes
Garmr: 1-2 minutes
Report: 30-60 seconds

Large Repositories (> 100k LOC)

Total Duration: 15-45 minutes

Ingest: 2-5 minutes
Tyr: 2-4 minutes
Static Analysis: 3-8 minutes
Hunt: 8-25 minutes (parallel investigations)
Garmr: 2-5 minutes
Report: 1-2 minutes

The Hunt stage scales with the number of attack surfaces identified by Tyr. Codebases with many API endpoints or complex authentication flows will take longer to analyze.
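
For capacity planning, the size buckets and total-duration ranges above can be encoded as a small lookup. This is a sketch using only the figures quoted in this section; `RepoSize`, `classify`, and `expected_total_minutes` are hypothetical helpers, and treating exactly 10k and 100k LOC as "medium" is an assumption about the bucket edges.

```rust
/// Repository size buckets used in the timeframe estimates above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RepoSize {
    Small,  // < 10k LOC
    Medium, // 10k-100k LOC
    Large,  // > 100k LOC
}

/// Bucket a repository by lines of code.
pub fn classify(loc: u64) -> RepoSize {
    if loc < 10_000 {
        RepoSize::Small
    } else if loc <= 100_000 {
        RepoSize::Medium
    } else {
        RepoSize::Large
    }
}

/// (min, max) expected total scan duration in minutes for a bucket.
pub fn expected_total_minutes(size: RepoSize) -> (u32, u32) {
    match size {
        RepoSize::Small => (2, 5),
        RepoSize::Medium => (5, 15),
        RepoSize::Large => (15, 45),
    }
}
```

Because Hunt time depends on the number of attack surfaces rather than on LOC alone, treat the returned range as a starting estimate.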

Error Handling

Every stage runs through a shared run_stage wrapper, which records the stage's status, reports failures over SSE, and propagates errors to the caller:

src/pipeline/mod.rs

async fn run_stage<T, F>(
    &self,
    stage_name: &str,
    status_running: &str,
    status_done: &str, // not used within this excerpt
    future: F,
) -> HeimdallResult<T>
where
    F: std::future::Future<Output = HeimdallResult<T>>,
{
    // Check for user cancellation before doing any work
    if self.is_cancelled().await {
        return Err(anyhow::anyhow!("Scan was cancelled"));
    }

    // Mark the scan as running this stage and notify SSE subscribers
    self.db.update_scan_status(self.scan_id, status_running, None).await?;
    self.sse.emit_stage_update(self.scan_id, stage_name, "running", None);

    // `scan_stage` is this stage's scan_stages row; the code that creates it
    // is not shown in this excerpt.
    match future.await {
        Ok(result) => {
            // Record success on the stage row
            self.db.update_scan_stage_status(scan_stage.id, "completed", None).await?;
            Ok(result)
        }
        Err(e) => {
            // Record the formatted error on the stage row, emit it over SSE,
            // then hand the original error back to the caller
            let err_msg = format!("{e:#}");
            self.db.update_scan_stage_status(scan_stage.id, "failed", Some(&err_msg)).await?;
            self.sse.emit_error(self.scan_id, &err_msg);
            Err(e)
        }
    }
}

If a stage fails, run_stage returns the error and the `?` at the call site propagates it, so the pipeline halts and the scan is marked as failed. The error message is captured in both scan_stages.error_message and scans.error_message.
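
The halt-on-first-failure behavior can be reproduced in a few lines of synchronous Rust. This is a simplified sketch, not the Heimdall implementation: `StageRecord` is a hypothetical in-memory stand-in for a scan_stages row, errors are plain strings, and `demo_pipeline` uses made-up stage results.

```rust
/// Hypothetical in-memory stand-in for a scan_stages row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StageRecord {
    pub stage: String,
    pub status: String,
    pub error_message: Option<String>,
}

/// Run one stage, record its outcome, and pass the result through unchanged.
pub fn run_stage<T, F>(records: &mut Vec<StageRecord>, name: &str, stage: F) -> Result<T, String>
where
    F: FnOnce() -> Result<T, String>,
{
    match stage() {
        Ok(value) => {
            records.push(StageRecord {
                stage: name.to_string(),
                status: "completed".to_string(),
                error_message: None,
            });
            Ok(value)
        }
        Err(e) => {
            records.push(StageRecord {
                stage: name.to_string(),
                status: "failed".to_string(),
                error_message: Some(e.clone()),
            });
            Err(e)
        }
    }
}

/// Three chained stages; the second fails, so `?` returns early and the third never runs.
pub fn demo_pipeline(records: &mut Vec<StageRecord>) -> Result<u32, String> {
    let files = run_stage(records, "ingest", || Ok(42u32))?;
    let _model: () = run_stage(records, "tyr", || {
        Err(format!("model timeout after {files} files"))
    })?;
    run_stage(records, "hunt", || Ok(files))
}
```

After `demo_pipeline` returns, `records` holds a completed `ingest` entry and a failed `tyr` entry, with no entry for `hunt`.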

Cancellation Support

Users can cancel running scans at any time:

curl -X POST https://app.heimdall.security/api/v1/scans/{scan_id}/cancel \
  -H "Authorization: Bearer YOUR_TOKEN"

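The pipeline checks for cancellation before starting each stage. With the check shown here, a request that arrives while a stage is running is noticed only when the next stage is about to start: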

src/pipeline/mod.rs

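async fn is_cancelled(&self) -> bool {
    // Re-read the scan row so a cancel request made through the API is seen
    if let Ok(Some(scan)) = self.db.get_scan_by_id(self.scan_id).await {
        scan.status == "cancelled"
    } else {
        // A failed lookup or a missing row is treated as "not cancelled"
        false
    }
}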
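
The effect of checking only at stage boundaries can be seen in a small simulation. This sketch is an assumption-laden model, not the real pipeline: a local boolean stands in for the database status lookup, stages are just names, and `cancel_during` simulates a user calling the cancel endpoint while that stage is in flight.

```rust
/// Simulate a pipeline that checks for cancellation only before each stage starts.
/// Returns the stages that actually ran, plus the overall outcome.
pub fn run_with_cancellation(
    stages: &[&str],
    cancel_during: Option<&str>,
) -> (Vec<String>, Result<(), String>) {
    // Stand-in for `scan.status == "cancelled"` in the database.
    let mut cancelled = false;
    let mut executed = Vec::new();

    for &stage in stages {
        // Boundary check, performed before the stage does any work.
        if cancelled {
            return (executed, Err("Scan was cancelled".to_string()));
        }

        // The stage runs to completion even if a cancel request arrives meanwhile.
        executed.push(stage.to_string());
        if cancel_during == Some(stage) {
            cancelled = true;
        }
    }
    (executed, Ok(()))
}
```

In this model, cancelling during `tyr` lets `tyr` finish and stops the scan before `hunt`. Cancelling during the final stage has no visible effect, because no later boundary check runs.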

Next Steps

Hunt Agent: Learn how the agentic discovery engine works
Threat Modeling: Understand Tyr’s STRIDE-based analysis
Sandbox Validation: See how Garmr validates exploitability
Findings Management: Manage and remediate vulnerabilities
