SAW’s threat detection layer operates in one of two modes controlled by the ASA_MODE environment variable. In HYBRID mode (the default), every log is first evaluated against a set of fast string-matching heuristics. If a heuristic fires, detection completes in microseconds with high confidence and no external API call. If no heuristic matches, the system escalates the log to a Gemini model via the google-genai SDK for deeper contextual analysis. In SAFE mode, LLM calls are disabled entirely and the pipeline runs deterministic heuristics only, returning "type": "None" for anything that doesn’t match a pattern.
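
The control flow described above can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: `detect_sketch`, `fallback_result`, and the injected `heuristic` and `llm` callables are hypothetical names, and the fallback payload is assumed to carry only the `"type": "None"` field mentioned above.

```python
import os
from typing import Callable, Dict, Optional

Detection = Dict[str, object]


def fallback_result() -> Detection:
    # Hypothetical shape: the docs only state that "type" is "None".
    return {"type": "None"}


def detect_sketch(
    log: str,
    heuristic: Callable[[str], Optional[Detection]],
    llm: Callable[[str], Detection],
    mode: Optional[str] = None,
) -> Detection:
    """Heuristics first; escalate to the LLM only in HYBRID mode."""
    # ASA_MODE defaults to HYBRID when unset.
    mode = (mode or os.getenv("ASA_MODE", "HYBRID")).upper()

    hit = heuristic(log)
    if hit is not None:
        return hit  # fast path, no external API call

    if mode == "SAFE":
        return fallback_result()  # LLM calls are disabled in SAFE mode

    return llm(log)  # HYBRID: escalate for contextual analysis
```

Passing the heuristic and LLM steps in as callables keeps the sketch testable without network access; the real pipeline presumably wires these together internally.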

How heuristic detection works

detect_threat() in threat_detector.py calls heuristic_detect() first. If the result is non-null, it is returned immediately without touching the LLM.

heuristic_detect() runs canonicalize_signal() on the input before matching. Canonicalization URL-decodes the string, lowercases it, and strips common obfuscation tokens (/**/, %2f%2a%2a%2f, tab/newline characters) to prevent simple bypasses.

The four pattern groups and their outputs:

SQL Injection
if any(pattern in s for pattern in [
    "or '1'='1", "or 1=1", "or1=1",
    "union select", "unionselect",
    "drop table", "'--",
    "sleep(", "benchmark(", "xp_cmdshell",
]):
    return {
        "type": "SQL Injection",
        "confidence": 0.95,
        "severity": "HIGH",
        "detection_mode": "deterministic",
        "reason_source": "heuristic_rule",
        "reason": "Detected SQL injection pattern",
    }
XSS
if "<script>" in s or "javascript:" in s or "onerror=" in s or "onload=" in s:
    return {
        "type": "XSS",
        "confidence": 0.90,
        "severity": "HIGH",
        "detection_mode": "deterministic",
        "reason_source": "heuristic_rule",
        "reason": "Detected script injection pattern",
    }
Path Traversal
if "../" in s or "..\\" in s:
    return {
        "type": "Path Traversal",
        "confidence": 0.92,
        "severity": "HIGH",
        "detection_mode": "deterministic",
        "reason_source": "heuristic_rule",
        "reason": "Detected path traversal pattern",
    }
Brute Force
if "login failed" in s or "invalid password" in s:
    return {
        "type": "Brute Force",
        "confidence": 0.85,
        "severity": "MEDIUM",
        "detection_mode": "deterministic",
        "reason_source": "heuristic_rule",
        "reason": "Repeated login failure pattern",
    }
A successful heuristic match sets detection_mode = "deterministic" and reason_source = "heuristic_rule". When validate_schema() later calls calibrate_confidence(), these results are mapped to the VERY_HIGH bucket (confidence ≥ 0.9) or the HIGH bucket (confidence < 0.9). Because neither bucket is LOW or MEDIUM, RiskAgent will not request an ADK advisory and the deterministic pipeline remains fully authoritative.
You can disable heuristics entirely by setting ASA_ENABLE_HEURISTICS=false. Every log will then go straight to the LLM path (or return a fallback if ASA_MODE=SAFE).
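
The canonicalization step that runs before the pattern checks in this section might look something like the sketch below. It is an approximation based only on the description here: the function name `canonicalize_signal_sketch` and the token tuple are hypothetical, and it assumes tokens are removed outright rather than replaced with a space, which is consistent with patterns such as "unionselect" and "or1=1" in the SQL injection group.

```python
from urllib.parse import unquote

# Tokens removed after decoding and lowercasing. The literal
# "%2f%2a%2a%2f" entry catches a double-encoded "/**/" that
# survives a single round of URL-decoding.
_OBFUSCATION_TOKENS = ("/**/", "%2f%2a%2a%2f", "\t", "\n", "\r")


def canonicalize_signal_sketch(raw: str) -> str:
    """URL-decode once, lowercase, then strip obfuscation tokens."""
    s = unquote(raw).lower()
    for token in _OBFUSCATION_TOKENS:
        s = s.replace(token, "")
    return s
```

With this sketch, an input such as `UNION/**/SELECT` collapses to `unionselect`, which one of the SQL injection patterns would then match.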

ADK advisory: Gemini-backed decision review

Even after detect_threat() completes, RiskAgent can trigger a second, independent LLM call when the confidence bucket is LOW or MEDIUM, or when detection_mode is "fallback". This call goes through the ASAAgent class, which wraps a Google ADK Runner with a CoordinatorAgent root agent:
if context.classification.get("confidence_bucket") in {"LOW", "MEDIUM"} \
        or context.classification.get("detection_mode") == "fallback":
    decision["adk_review"] = await self._review_with_adk(context, decision)
The ADK agent receives the full incident context (analysis, classification, and the current deterministic decision) and is instructed to return a JSON object with recommended_decision, reason, and an optional follow_up_task. The system applies the recommendation only if recommended_decision is one of EXECUTE, OBSERVE, or IGNORE; any other output is ignored. You can disable the ADK advisory entirely by setting ASA_ENABLE_ADK_ADVISORY=false.
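
The acceptance rule for the advisory output could be sketched as below. The function name `parse_adk_recommendation` and the normalized return shape are hypothetical; only the three field names and the three allowed decision values come from the description above.

```python
import json
from typing import Any, Dict, Optional

ALLOWED_DECISIONS = {"EXECUTE", "OBSERVE", "IGNORE"}


def parse_adk_recommendation(raw: Any) -> Optional[Dict[str, Any]]:
    """Return a normalized recommendation, or None if it must be ignored."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None  # not valid JSON (or not a string at all)

    if not isinstance(data, dict):
        return None  # e.g. a bare list or number

    decision = data.get("recommended_decision")
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        return None  # unknown or malformed decision is ignored

    return {
        "recommended_decision": decision,
        "reason": data.get("reason", ""),
        "follow_up_task": data.get("follow_up_task"),  # optional
    }
```

Returning None for every malformed case keeps the deterministic decision in force whenever the model output cannot be trusted.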

Response caching

ASAAgent caches every ADK response in memory to avoid redundant LLM calls for identical inputs:
ADK_CACHE_TTL_SECONDS = int(os.getenv("ASA_ADK_CACHE_TTL_SECONDS", "120"))
The cache key is a deterministic JSON hash of the prompt string and a cache_context dict (which includes surface, incident_id, and confidence_bucket). Cached responses are returned immediately with cache_hit: true. On an ADK error with a retry-after hint, the error result is cached for min(ADK_CACHE_TTL_SECONDS, retry_after_seconds) to prevent hammering a rate-limited API. You can tune the TTL by setting ASA_ADK_CACHE_TTL_SECONDS in your .env.
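
A minimal in-memory sketch of this caching behavior follows. The class `TTLCacheSketch`, its method names, and the choice of SHA-256 over sorted-key JSON are assumptions made for illustration; the source only states that the key is a deterministic JSON hash of the prompt and cache_context.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCacheSketch:
    def __init__(self, ttl_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    @staticmethod
    def make_key(prompt: str, cache_context: Dict[str, Any]) -> str:
        # sort_keys makes the serialization independent of dict order.
        payload = json.dumps(
            {"prompt": prompt, "context": cache_context},
            sort_keys=True, default=str,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop and report a miss
            return None
        return {**value, "cache_hit": True}

    def put(self, key: str, value: Dict[str, Any],
            retry_after_seconds: Optional[float] = None) -> None:
        ttl = self.ttl
        if retry_after_seconds is not None:
            # Error results live no longer than the retry-after hint.
            ttl = min(self.ttl, retry_after_seconds)
        self._store[key] = (self.clock() + ttl, value)
```

Capping the error TTL at the retry-after hint means a rate-limited call is not retried too early, but also is not suppressed longer than the normal cache window.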

Mode summary

| Condition | Detection mode | confidence_bucket | ADK advisory |
| --- | --- | --- | --- |
| Heuristic match (confidence ≥ 0.9) | deterministic | VERY_HIGH | Skipped |
| Heuristic match (confidence < 0.9) | deterministic | HIGH | Skipped |
| No heuristic match, LLM raw ≥ 0.8 | llm-assisted | HIGH | Skipped |
| No heuristic match, LLM raw ≥ 0.55 | llm-assisted | MEDIUM | Eligible |
| No heuristic match, LLM raw < 0.55 | llm-assisted | LOW | Eligible |
| LLM failure or ASA_MODE=SAFE | fallback / deterministic | LOW | Eligible (if fallback) |
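
The thresholds in this summary can be read as a small mapping function. The sketch below is inferred from the table and from the RiskAgent condition shown earlier, not taken from calibrate_confidence() itself; both function names are hypothetical, and treating every other combination as LOW is an assumption.

```python
def confidence_bucket_sketch(detection_mode: str,
                             reason_source: str,
                             confidence: float) -> str:
    """Map a detection result to a bucket, following the summary table."""
    if detection_mode == "deterministic" and reason_source == "heuristic_rule":
        return "VERY_HIGH" if confidence >= 0.9 else "HIGH"
    if detection_mode == "llm-assisted":
        if confidence >= 0.8:
            return "HIGH"
        if confidence >= 0.55:
            return "MEDIUM"
        return "LOW"
    # Assumption: fallback and unmatched SAFE-mode results land in LOW.
    return "LOW"


def adk_eligible_sketch(detection_mode: str, bucket: str) -> bool:
    """Mirror of the RiskAgent condition that triggers an ADK review."""
    return bucket in {"LOW", "MEDIUM"} or detection_mode == "fallback"
```

Under this reading, the four heuristic groups (confidence 0.95, 0.90, 0.92, and 0.85) fall into VERY_HIGH or HIGH and never reach the advisory step.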
