How Static Analysis Rules Work
Static analysis in Heimdall (src/pipeline/static_analysis/mod.rs) runs deterministic pattern matching across your codebase:
- Pattern matching: Each rule is a compiled regex that scans source files line-by-line
- Language filtering: Rules can target specific languages (e.g., Python-only, JavaScript-only)
- Finding creation: Matches create findings with severity, CWE classification, and code snippets
- Deduplication: Each finding gets a unique fingerprint (SHA-256 hash of rule name + file path + line number)
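The fingerprinting and deduplication step can be sketched as follows. This is a minimal illustration, not Heimdall's implementation: the `Finding` struct and the `fingerprint` and `dedup` functions are hypothetical names, and a 64-bit FNV-1a hash stands in for SHA-256 so that the sketch needs only the standard library.

```rust
use std::collections::HashSet;

/// A finding reduced to the three fields that feed the fingerprint.
/// (Hypothetical type; the real finding carries more data.)
#[derive(Debug, Clone)]
pub struct Finding {
    pub rule_name: String,
    pub file_path: String,
    pub line: usize,
}

/// 64-bit FNV-1a over a byte slice.
/// ASSUMPTION: stand-in for SHA-256, used only to avoid external crates.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Fingerprint = hash of rule name, file path, and line number,
/// joined with NUL separators so adjacent fields cannot run together.
pub fn fingerprint(f: &Finding) -> String {
    let key = format!("{}\0{}\0{}", f.rule_name, f.file_path, f.line);
    format!("{:016x}", fnv1a64(key.as_bytes()))
}

/// Keep the first finding for each fingerprint, preserving order.
pub fn dedup(findings: Vec<Finding>) -> Vec<Finding> {
    let mut seen = HashSet::new();
    findings
        .into_iter()
        .filter(|f| seen.insert(fingerprint(f)))
        .collect()
}
```

Because the line number is part of the key, the same rule firing on two different lines of one file yields two distinct findings, while a repeated scan of an unchanged file yields none that are new.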
Rule Execution Flow
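The flow described above (filter by language, scan line by line, emit a finding per match) might look roughly like this. All names here (`Rule`, `Finding`, `language_of`, `scan_file`) are assumptions made for illustration, and a plain substring search stands in for the compiled regex so the sketch compiles without external crates.

```rust
/// Hypothetical rule shape; `needle` is a substring stand-in for the regex.
pub struct Rule {
    pub name: &'static str,
    pub needle: &'static str,
    pub severity: &'static str,
    pub cwe: &'static str,
    pub languages: &'static [&'static str],
}

#[derive(Debug, PartialEq)]
pub struct Finding {
    pub rule: &'static str,
    pub severity: &'static str,
    pub cwe: &'static str,
    pub file: String,
    pub line: usize, // 1-based line number
    pub snippet: String,
}

/// Map a file extension to a language name (illustrative subset only).
fn language_of(path: &str) -> Option<&'static str> {
    match path.rsplit_once('.')?.1 {
        "py" => Some("python"),
        "js" => Some("javascript"),
        "rs" => Some("rust"),
        _ => None,
    }
}

/// An empty language list means the rule applies to every file.
fn applies(rule: &Rule, path: &str) -> bool {
    rule.languages.is_empty()
        || language_of(path).map_or(false, |lang| rule.languages.contains(&lang))
}

/// Run every applicable rule over each line of one file.
pub fn scan_file(rules: &[Rule], path: &str, source: &str) -> Vec<Finding> {
    let mut findings = Vec::new();
    for rule in rules.iter().filter(|r| applies(r, path)) {
        for (idx, text) in source.lines().enumerate() {
            if text.contains(rule.needle) {
                findings.push(Finding {
                    rule: rule.name,
                    severity: rule.severity,
                    cwe: rule.cwe,
                    file: path.to_string(),
                    line: idx + 1,
                    snippet: text.trim().to_string(),
                });
            }
        }
    }
    findings
}
```

Filtering rules before the inner loop keeps language-specific rules from ever touching files they cannot apply to.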
Rule Format and Structure
Rules are defined as a const array in `src/pipeline/static_analysis/mod.rs`:
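A minimal sketch of what such an array could look like, assuming a struct named `Rule` whose fields mirror the field table in this section. The struct name and the single entry are illustrative; the actual definition in `mod.rs` may differ.

```rust
/// Assumed shape of a rule; field names and types follow the field table.
pub struct Rule {
    pub name: &'static str,
    pub pattern: &'static str,
    pub severity: &'static str,
    pub cwe: &'static str,
    pub description: &'static str,
    pub languages: &'static [&'static str],
}

/// One hypothetical entry, shown only to illustrate the layout.
pub const RULES: &[Rule] = &[
    Rule {
        name: "weak-hash-md5",
        // Raw string, so backslashes reach the regex engine unchanged.
        pattern: r"hashlib\.md5\(",
        severity: "medium",
        cwe: "CWE-327",
        description: "MD5 is not collision resistant; use SHA-256 or stronger",
        languages: &["python"],
    },
];
```

Keeping the rules in a `const` slice means they are baked into the binary, which is why adding a rule requires a rebuild.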
Field Descriptions
| Field | Type | Required | Description |
|---|---|---|---|
| name | &str | Yes | Unique rule identifier (kebab-case) |
| pattern | &str | Yes | Rust regex pattern (supports full regex syntax) |
| severity | &str | Yes | One of: critical, high, medium, low |
| cwe | &str | Yes | CWE identifier (e.g., CWE-89 for SQL injection) |
| description | &str | Yes | Short explanation shown to users |
| languages | &[&str] | Yes | Language filter; empty array = all files |
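The constraints in the table (kebab-case names, a fixed set of severities, `CWE-<number>` identifiers) can be checked mechanically. The helper below is a hypothetical sketch of such a check, not something the source says Heimdall ships; the function names are assumptions.

```rust
/// The four severity strings listed in the field table.
const SEVERITIES: [&str; 4] = ["critical", "high", "medium", "low"];

/// True for names like "sql-injection": lowercase ASCII letters and digits
/// in non-empty segments separated by single hyphens.
fn is_kebab_case(s: &str) -> bool {
    !s.is_empty()
        && s.split('-').all(|part| {
            !part.is_empty()
                && part.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        })
}

/// True for identifiers of the form "CWE-<digits>".
fn is_cwe_id(s: &str) -> bool {
    s.strip_prefix("CWE-")
        .map_or(false, |n| !n.is_empty() && n.chars().all(|c| c.is_ascii_digit()))
}

/// Validate the string fields of a rule; returns the first problem found.
pub fn validate(name: &str, severity: &str, cwe: &str) -> Result<(), String> {
    if !is_kebab_case(name) {
        return Err(format!("rule name `{}` is not kebab-case", name));
    }
    if !SEVERITIES.contains(&severity) {
        return Err(format!("unknown severity `{}`", severity));
    }
    if !is_cwe_id(cwe) {
        return Err(format!("`{}` is not a CWE identifier", cwe));
    }
    Ok(())
}
```

A check like this could run in a unit test over every rule so that a typo in a severity or CWE field fails the build rather than surfacing in a report.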
Adding Custom Rules
Step 1: Define Your Rule
Add a new entry to the `RULES` array in `src/pipeline/static_analysis/mod.rs`:
Step 2: Test the Pattern
Verify your regex matches the code you want to detect:
Step 3: Rebuild and Run
Example Rules
SQL Injection Detection
Hardcoded Secrets
Command Injection
XSS via innerHTML
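The four categories above could be expressed as entries like the following. These are hypothetical rules written for illustration: the `Rule` struct is assumed from the field table, the patterns are untuned and have not been run against Heimdall, and the array name `EXAMPLE_RULES` is invented here. Severities follow the Severity Guidelines table.

```rust
/// Assumed rule shape, mirroring the field table.
pub struct Rule {
    pub name: &'static str,
    pub pattern: &'static str,
    pub severity: &'static str,
    pub cwe: &'static str,
    pub description: &'static str,
    pub languages: &'static [&'static str],
}

/// Illustrative entries only; real rules need tuning against real code.
pub const EXAMPLE_RULES: &[Rule] = &[
    Rule {
        name: "sql-injection-fstring",
        // An f-string passed directly to execute(), e.g. cursor.execute(f"...").
        pattern: r#"execute\(\s*f["']"#,
        severity: "high",
        cwe: "CWE-89",
        description: "SQL query built with an f-string; use parameterized queries",
        languages: &["python"],
    },
    Rule {
        name: "hardcoded-secret",
        // A secret-looking name assigned a quoted literal, case-insensitively.
        pattern: r#"(?i)(password|secret|api_key)\s*=\s*["'][^"']+["']"#,
        severity: "high",
        cwe: "CWE-798",
        description: "Possible hardcoded credential",
        languages: &[], // empty = all files
    },
    Rule {
        name: "command-injection-os-system",
        pattern: r"os\.system\(",
        severity: "critical",
        cwe: "CWE-78",
        description: "os.system() runs its argument through a shell",
        languages: &["python"],
    },
    Rule {
        name: "xss-inner-html",
        pattern: r"\.innerHTML\s*=",
        severity: "medium",
        cwe: "CWE-79",
        description: "Assignment to innerHTML can introduce XSS",
        languages: &["javascript"],
    },
];
```

Each pattern is deliberately narrow and anchored on a literal token; broader patterns catch more cases but raise the false positive rate.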
Advanced Patterns
Multiline Matching
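In Rust regex, `.` does not match a newline by default, and `^` and `$` match only at the start and end of the input. Use `(?m)` so that `^` and `$` match at line boundaries, and `(?s)` so that `.` also matches newlines: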
Case-Insensitive Matching
Use the `(?i)` flag:
Negative Lookahead
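Exclude false positives with a negative lookahead. Note that the standard `regex` crate does not support look-around, so this only works with an engine that does, such as `fancy-regex`:
Testing Rules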
Unit Test Template
Test All Rules Compile
Heimdall includes a test to ensure all patterns are valid regex:
Rule Performance
Catastrophic Backtracking
Avoid regex patterns that can cause exponential time complexity (ReDoS):
Bad:
Benchmarking
For large codebases, test rule performance:
Severity Guidelines
| Severity | Use When | Examples |
|---|---|---|
| Critical | Remote code execution, authentication bypass | Command injection, hardcoded admin credentials |
| High | Data exposure, SQL injection | Unparameterized queries, secrets in code |
| Medium | XSS, CSRF, weak crypto | innerHTML assignment, MD5 usage |
| Low | Information disclosure, debug mode | Stack traces in responses, verbose errors |
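Since rules store severity as a string, comparing or filtering by severity needs an ordering. One way to get it is to parse the string into an enum; the `Severity` type and `meets_threshold` helper below are a hypothetical sketch, not part of the documented rule format.

```rust
use std::str::FromStr;

/// Variants are declared from least to most severe, so the derived
/// `Ord` gives Low < Medium < High < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

impl FromStr for Severity {
    type Err = String;

    /// Parse the lowercase strings used in the rule `severity` field.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "low" => Ok(Severity::Low),
            "medium" => Ok(Severity::Medium),
            "high" => Ok(Severity::High),
            "critical" => Ok(Severity::Critical),
            other => Err(format!("unknown severity: {}", other)),
        }
    }
}

/// True if `severity` parses and is at least as severe as `min`.
/// Unknown strings are treated as not meeting the threshold.
pub fn meets_threshold(severity: &str, min: Severity) -> bool {
    severity.parse::<Severity>().map_or(false, |s| s >= min)
}
```

With this in place, a report could keep only findings for which `meets_threshold(finding_severity, Severity::High)` is true.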
CWE Reference
Common CWEs for rule classification:
- CWE-78: Command Injection
- CWE-79: Cross-site Scripting (XSS)
- CWE-89: SQL Injection
- CWE-798: Hardcoded Credentials
- CWE-327: Weak Crypto Algorithm
- CWE-502: Deserialization of Untrusted Data
- CWE-918: Server-Side Request Forgery (SSRF)
Integration with Semgrep
For more complex rules, Heimdall also integrates with Semgrep:
source = "static".
Rule Deduplication
Each finding gets a unique fingerprint:
Best Practices
- Start with high severity: Only add rules for real vulnerabilities, not code style issues
- Test on real code: Run against your actual codebase to tune false positive rate
- Document exceptions: Use comments to explain why a pattern is safe:
- Use language filters: Avoid scanning irrelevant files (e.g., Python rules on JavaScript)
- Keep patterns simple: Complex regex is hard to maintain and can be slow
Related Files
- `src/pipeline/static_analysis/mod.rs`: Core static analysis engine and rule definitions
- `src/pipeline/static_analysis/semgrep.rs`: Semgrep integration
- `src/pipeline/hunt/mod.rs`: AI-powered Hunt agent (runs after static analysis)