Tuning decision thresholds for your environment

Every log event SAW processes ends with one of three decisions: EXECUTE (apply mitigations automatically), OBSERVE (create an analyst task for human review), or IGNORE (log and discard). The boundary between these outcomes is controlled by two environment variables — ASA_EXECUTE_THRESHOLD and ASA_OBSERVE_THRESHOLD — which act as cutpoints on a continuous risk score. Understanding how that score is computed lets you calibrate the decision engine precisely for your threat environment.

The risk score formula

The base risk score is the product of two factors:

risk_score = confidence × severity_weight

confidence is a float in [0.0, 1.0] representing the system’s certainty about the classification. After raw detection, confidence values are snapped to one of four named buckets:

Bucket	Calibrated value	Typical source
`VERY_HIGH`	0.90	Heuristic match with raw confidence ≥ 0.9
`HIGH`	0.78	Heuristic match with raw confidence < 0.9, or LLM with raw ≥ 0.8
`MEDIUM`	0.58	LLM with raw confidence 0.55–0.79
`LOW`	0.35	LLM with raw confidence < 0.55, or any fallback path

severity_weight converts the categorical severity label to a numeric multiplier:

Severity	Weight
`LOW`	1
`MEDIUM`	2
`HIGH`	3

The maximum risk score the formula can produce is 0.9 × 3 = 2.7, which occurs for a VERY_HIGH confidence, HIGH severity classification (a confident heuristic hit on SQL Injection, XSS, or Path Traversal).
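The calibration and weighting tables above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's code: the function names calibrate and risk_score, the source parameter, and the treatment of raw LLM confidences between 0.79 and 0.8 (counted here as MEDIUM) are assumptions.

```python
# Illustrative sketch only: names and structure are assumptions, not the project's API.
CALIBRATED = {"VERY_HIGH": 0.90, "HIGH": 0.78, "MEDIUM": 0.58, "LOW": 0.35}
SEVERITY_WEIGHT = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def calibrate(raw_confidence: float, source: str) -> str:
    """Snap a raw confidence to a named bucket, following the bucket table."""
    if source == "heuristic":
        return "VERY_HIGH" if raw_confidence >= 0.9 else "HIGH"
    if source == "llm":
        if raw_confidence >= 0.8:
            return "HIGH"
        if raw_confidence >= 0.55:  # assumption: 0.79-0.8 is treated as MEDIUM
            return "MEDIUM"
        return "LOW"
    return "LOW"  # any fallback path


def risk_score(bucket: str, severity: str) -> float:
    """Calibrated confidence times severity weight, rounded to two places."""
    return round(CALIBRATED[bucket] * SEVERITY_WEIGHT[severity], 2)


print(risk_score(calibrate(0.95, "heuristic"), "HIGH"))  # 2.7
print(risk_score(calibrate(0.85, "llm"), "HIGH"))        # 2.34
print(risk_score(calibrate(0.60, "llm"), "MEDIUM"))      # 1.16
```

Rounding to two places keeps the results comparable with the tables on this page and avoids floating-point noise in threshold comparisons.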

The decision engine logic

The decision engine in app/tools/decision_engine.py applies the following logic in order:

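# EXECUTE_THRESHOLD and OBSERVE_THRESHOLD hold the values of
# ASA_EXECUTE_THRESHOLD and ASA_OBSERVE_THRESHOLD.
def decision_engine(threat: dict, escalated: bool = False) -> dict:
    base_risk = threat.get("risk_score", 0.0)
    # Aggressive attackers get a flat +0.5 on top of the base score.
    behavior_adjustment = 0.5 if threat.get("behavior") == "Aggressive Attacker" else 0.0
    risk = round(base_risk + behavior_adjustment, 2)

    # The escalated flag forces EXECUTE regardless of the numeric score.
    if escalated or risk >= EXECUTE_THRESHOLD:
        return {"decision": "EXECUTE", ...}
    if risk >= OBSERVE_THRESHOLD:
        return {"decision": "OBSERVE", ...}
    return {"decision": "IGNORE", ...}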

There are three inputs to the decision:

base_risk: the confidence × severity_weight score from the detection pipeline, computed from the calibrated (bucketed) confidence rather than the raw value.
behavior_adjustment: adds 0.5 to the risk score when the classification's behavior field is "Aggressive Attacker". The RiskAgent sets this when escalation is active (burst or sustained attack pattern detected). On the numeric path alone, a base risk score as low as 2.0 then reaches the default EXECUTE threshold of 2.5.
escalated: a boolean passed by the RiskAgent when evaluate_escalation returns status: true. If escalated is True, the decision is EXECUTE immediately, regardless of the numeric risk score.
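To make the ordering of the three inputs concrete, here is a self-contained sketch of the same logic. It assumes the thresholds are read from the ASA_EXECUTE_THRESHOLD and ASA_OBSERVE_THRESHOLD environment variables with the documented defaults; the load_thresholds and decide helpers are hypothetical and return only the decision string rather than the full result dict.

```python
import os
from typing import Mapping, Optional, Tuple


def load_thresholds(env: Optional[Mapping[str, str]] = None) -> Tuple[float, float]:
    """Return (execute, observe); the defaults mirror the documented 2.5 / 1.5."""
    env = os.environ if env is None else env
    return (
        float(env.get("ASA_EXECUTE_THRESHOLD", "2.5")),
        float(env.get("ASA_OBSERVE_THRESHOLD", "1.5")),
    )


def decide(threat: dict, escalated: bool = False,
           thresholds: Tuple[float, float] = (2.5, 1.5)) -> str:
    """Hypothetical stand-in for decision_engine that returns only the decision."""
    execute, observe = thresholds
    adjustment = 0.5 if threat.get("behavior") == "Aggressive Attacker" else 0.0
    risk = round(threat.get("risk_score", 0.0) + adjustment, 2)
    if escalated or risk >= execute:  # the flag short-circuits the numeric check
        return "EXECUTE"
    if risk >= observe:
        return "OBSERVE"
    return "IGNORE"


defaults = load_thresholds({})  # empty mapping, so the documented defaults apply
print(decide({"risk_score": 2.0}, thresholds=defaults))  # OBSERVE
print(decide({"risk_score": 2.0, "behavior": "Aggressive Attacker"},
             thresholds=defaults))  # EXECUTE
print(decide({"risk_score": 0.35}, escalated=True, thresholds=defaults))  # EXECUTE
```

Passing the environment as a mapping keeps the sketch testable without touching the real process environment.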

Default thresholds and score examples

With defaults (ASA_EXECUTE_THRESHOLD=2.5, ASA_OBSERVE_THRESHOLD=1.5):

Scores that reach EXECUTE

Scenario	Confidence	Severity	Base risk	Adjustment	Final risk	Decision
SQL Injection (heuristic)	VERY_HIGH (0.90)	HIGH	2.70	0.0	2.70	EXECUTE
XSS (heuristic)	VERY_HIGH (0.90)	HIGH	2.70	0.0	2.70	EXECUTE
Any threat, escalation active	any	any	any	+0.5	base + 0.5	EXECUTE (forced by the escalated flag)
Any escalated event	—	—	—	—	—	EXECUTE (forced)

Heuristic HIGH-severity matches reliably reach EXECUTE because their VERY_HIGH confidence (0.90) multiplied by the HIGH severity weight (3) produces 2.70, which clears the 2.5 threshold.

Scores that reach OBSERVE

Scenario	Confidence	Severity	Base risk	Decision
SQL Injection (LLM, high confidence)	HIGH (0.78)	HIGH	2.34	OBSERVE
XSS (LLM, high confidence)	HIGH (0.78)	HIGH	2.34	OBSERVE
Any HIGH severity, MEDIUM confidence LLM	MEDIUM (0.58)	HIGH	1.74	OBSERVE
MEDIUM severity, HIGH confidence LLM	HIGH (0.78)	MEDIUM	1.56	OBSERVE

These events clear the 1.5 OBSERVE threshold but stay below 2.5, so they generate analyst investigation tasks rather than automatic mitigations.

Scores that reach IGNORE

Scenario	Confidence	Severity	Base risk	Decision
Brute Force (heuristic)	HIGH (0.78)	MEDIUM	1.56	OBSERVE*
LOW severity, any confidence	LOW–VERY_HIGH (0.35–0.90)	LOW	0.35–0.90	IGNORE
Unknown / fallback	LOW (0.35)	LOW	0.35	IGNORE
SAFE mode event	0.0	LOW	0.0	IGNORE

*Brute Force heuristic detections land in OBSERVE at default thresholds: at HIGH confidence (0.78), the MEDIUM severity weight (2) yields 1.56, just above the 1.5 cutpoint. Raise ASA_OBSERVE_THRESHOLD to 1.6 to push isolated brute-force events to IGNORE.

Events in IGNORE are persisted to the incident store and appear in the /latest and /tasks responses, but no blocking actions or follow-up tasks are created.
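The three tables above are the confidence bucket and severity grid evaluated against the default cutpoints. The following sketch enumerates that grid so every combination is visible at once; the classify helper is hypothetical, and the behavior adjustment and escalated flag are deliberately left out.

```python
CALIBRATED = {"VERY_HIGH": 0.90, "HIGH": 0.78, "MEDIUM": 0.58, "LOW": 0.35}
SEVERITY_WEIGHT = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def classify(score: float, execute: float = 2.5, observe: float = 1.5) -> str:
    """Map a risk score to a decision at the given thresholds."""
    if score >= execute:
        return "EXECUTE"
    return "OBSERVE" if score >= observe else "IGNORE"


# Base risk score for every (confidence bucket, severity) pair.
grid = {
    (bucket, severity): round(conf * weight, 2)
    for bucket, conf in CALIBRATED.items()
    for severity, weight in SEVERITY_WEIGHT.items()
}

# Print the grid from highest to lowest score.
for (bucket, severity), score in sorted(grid.items(), key=lambda kv: -kv[1]):
    print(f"{bucket:<10}{severity:<8}{score:>5.2f}  {classify(score)}")
```

At the defaults, exactly one of the twelve combinations reaches EXECUTE (2.70), four land in OBSERVE (2.34, 1.80, 1.74, 1.56), and the remaining seven are IGNORE.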

Behavior adjustment: Aggressive Attacker

When ASA_ENABLE_ESCALATION=true and the RiskAgent detects a burst or sustained attack pattern, it sets behavior: "Aggressive Attacker" on the classification. The decision engine then adds 0.5 to the computed risk score before comparing against thresholds:

adjusted_risk = base_risk + 0.5

This means a MEDIUM-severity, MEDIUM-confidence event that would normally score 0.58 × 2 = 1.16 (IGNORE) reaches 1.66 (OBSERVE) once the attacker profile is established. Combined with the escalated=True flag that also fires in the same code path, the practical effect is that any IP identified as an aggressive attacker produces an EXECUTE decision on the next event regardless of that event’s individual classification.
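As a worked example of the arithmetic above, this sketch applies the 0.5 adjustment to a few base scores and shows where each lands at the default thresholds. It models the numeric adjustment only; in the real code path the escalated flag forces EXECUTE anyway. The classify and shift helpers are hypothetical.

```python
BEHAVIOR_ADJUSTMENT = 0.5  # added when behavior == "Aggressive Attacker"


def classify(score: float, execute: float = 2.5, observe: float = 1.5) -> str:
    """Map a final risk score to a decision at the given thresholds."""
    if score >= execute:
        return "EXECUTE"
    return "OBSERVE" if score >= observe else "IGNORE"


def shift(base: float) -> tuple:
    """Return (decision before, adjusted score, decision after)."""
    adjusted = round(base + BEHAVIOR_ADJUSTMENT, 2)
    return classify(base), adjusted, classify(adjusted)


for base in (1.16, 1.56, 2.34):
    before, adjusted, after = shift(base)
    print(f"{base:.2f} ({before}) -> {adjusted:.2f} ({after})")
# 1.16 (IGNORE) -> 1.66 (OBSERVE)
# 1.56 (OBSERVE) -> 2.06 (OBSERVE)
# 2.34 (OBSERVE) -> 2.84 (EXECUTE)
```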

Tuning for your environment

Different deployment contexts call for different threshold positions:

Strict SOC: minimize missed threats

Lower ASA_OBSERVE_THRESHOLD to catch more borderline events. Lower ASA_EXECUTE_THRESHOLD to automate mitigations for a wider range of confirmed threats.

ASA_EXECUTE_THRESHOLD=2.0
ASA_OBSERVE_THRESHOLD=1.0

At these settings, any LLM HIGH-confidence detection of a HIGH-severity threat (0.78 × 3 = 2.34) reaches EXECUTE. MEDIUM-severity events at MEDIUM confidence (0.58 × 2 = 1.16) are now flagged for OBSERVE instead of IGNORE.

Lowering thresholds increases false-positive rates. More events reach OBSERVE (generating more analyst tasks) and potentially EXECUTE (triggering automated mitigations on benign traffic). Validate with representative log samples before applying to production.

Monitoring-only: reduce alert noise

Raise ASA_OBSERVE_THRESHOLD to reduce the volume of analyst tasks created for low-confidence detections. Set ASA_EXECUTE_THRESHOLD high enough that automated mitigations only fire on definitive heuristic hits.

ASA_EXECUTE_THRESHOLD=2.6
ASA_OBSERVE_THRESHOLD=2.0

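At these settings, only VERY_HIGH confidence / HIGH severity events (score 2.70) reach EXECUTE on their base score. Everything else is either OBSERVE (score 2.0–2.59) or IGNORE (below 2.0). In practice, HIGH-confidence detections of HIGH-severity threats (2.34) stay in OBSERVE, while the 1.74 and 1.56 cases that are OBSERVE at the defaults drop to IGNORE. Escalated events still produce EXECUTE, because the escalated flag bypasses the thresholds. This reduces automated blocking actions while preserving visibility.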

SAFE mode: no automated decisions

Set ASA_MODE=SAFE to disable LLM calls entirely. Combined with the default thresholds, heuristic-matched HIGH-severity events still reach EXECUTE. If you want no automated mitigations at all in SAFE mode, raise ASA_EXECUTE_THRESHOLD above the maximum possible score:

ASA_MODE=SAFE
ASA_EXECUTE_THRESHOLD=3.1
ASA_OBSERVE_THRESHOLD=1.5

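The base formula tops out at 2.70, so no event can reach 3.1 on its base score, and detected threats produce only OBSERVE (analyst tasks) or IGNORE. Two escalation paths can still yield EXECUTE: the escalated flag bypasses the thresholds entirely, and the Aggressive Attacker adjustment can lift a 2.70 base score to 3.20. If you need to rule out automated mitigations completely, also keep escalation disabled via ASA_ENABLE_ESCALATION.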

Threshold changes take effect when the server restarts. There is no gradual rollout mechanism: once the server is back up, every event is evaluated against the new values. Test threshold changes in a staging environment using your own representative traffic before deploying to production.
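One lightweight way to preview a threshold change before restarting the server is to replay a sample of final risk scores against each candidate profile and count the outcomes. The sketch below does that for the default, strict, and monitoring-only settings from this section; the sample scores and the count_decisions helper are made up for illustration, and in practice you would feed in scores from your own traffic.

```python
from collections import Counter

# (execute, observe) pairs taken from the settings shown in this section.
PROFILES = {
    "default": (2.5, 1.5),
    "strict": (2.0, 1.0),
    "monitoring": (2.6, 2.0),
}


def count_decisions(scores, execute: float, observe: float) -> Counter:
    """Tally decisions for a list of final risk scores (escalation ignored)."""
    counts = Counter()
    for score in scores:
        if score >= execute:
            counts["EXECUTE"] += 1
        elif score >= observe:
            counts["OBSERVE"] += 1
        else:
            counts["IGNORE"] += 1
    return counts


# Hypothetical sample: one score per scenario discussed on this page.
sample = [2.70, 2.34, 1.74, 1.56, 1.16, 0.35]

for name, (execute, observe) in PROFILES.items():
    print(name, dict(count_decisions(sample, execute, observe)))
```

On this sample, the strict profile moves the 2.34 event from OBSERVE to EXECUTE and the 1.16 event from IGNORE to OBSERVE, while the monitoring-only profile leaves only the 2.34 event in OBSERVE.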

Checking the active thresholds

The current threshold values are embedded in every triage response under agent_results.RiskAgent.output.decision.decision_thresholds:

"decision_thresholds": {
  "execute": 2.5,
  "observe": 1.5
}

You can also verify them at startup by checking the server logs, which include the environment variable state during module import.
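If you want to check the active thresholds programmatically, the key path above can be walked in a parsed response. A minimal sketch, assuming the triage response has already been parsed into a Python dict; the active_thresholds helper and the trimmed sample payload are hypothetical, and only the key path comes from this page.

```python
from typing import Optional

# Key path documented above: agent_results.RiskAgent.output.decision.decision_thresholds
THRESHOLD_PATH = ("agent_results", "RiskAgent", "output", "decision", "decision_thresholds")


def active_thresholds(response: dict) -> Optional[dict]:
    """Return the decision_thresholds object, or None if any key is missing."""
    node = response
    for key in THRESHOLD_PATH:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical, heavily trimmed response used only to exercise the helper.
sample = {
    "agent_results": {
        "RiskAgent": {
            "output": {
                "decision": {
                    "decision": "OBSERVE",
                    "decision_thresholds": {"execute": 2.5, "observe": 1.5},
                }
            }
        }
    }
}

print(active_thresholds(sample))  # {'execute': 2.5, 'observe': 1.5}
```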
