whatwaf uses three primary detection techniques to identify WAFs: HTTP status code matching, header pattern matching, and response body pattern matching. Detectors combine these techniques to reliably identify specific WAF vendors.

HTTP status code matching

Many WAFs return specific status codes when blocking malicious requests:
  • 403 Forbidden - Most common blocking response
  • 404 Not Found - Some WAFs return 404 to hide blocking behavior
  • 406 Not Acceptable - Used by certain WAFs for content filtering
  • 418 I’m a teapot - Occasionally used as a blocking signal
The HttpResponse type provides helper methods for status code checks:
// Check specific status codes
resp.is_forbidden()  // status == 403
resp.is_not_found()  // status == 404

// Check status code ranges
resp.is_error()      // 400-599
resp.is_redirect()   // 300-399
resp.is_success()    // 200-299
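The implementation of these helpers is not shown on this page. As a rough sketch, assuming the response stores its status as a u16 (the struct below is a hypothetical stand-in, not whatwaf's actual HttpResponse), they could be written as plain comparisons and range checks:

```rust
// Hypothetical stand-in for whatwaf's HttpResponse; only the status is modeled.
pub struct HttpResponse {
    pub status: u16,
}

impl HttpResponse {
    /// True only for 403 Forbidden.
    pub fn is_forbidden(&self) -> bool {
        self.status == 403
    }

    /// True only for 404 Not Found.
    pub fn is_not_found(&self) -> bool {
        self.status == 404
    }

    /// True for any 4xx or 5xx status.
    pub fn is_error(&self) -> bool {
        (400..=599).contains(&self.status)
    }

    /// True for any 3xx status.
    pub fn is_redirect(&self) -> bool {
        (300..=399).contains(&self.status)
    }

    /// True for any 2xx status.
    pub fn is_success(&self) -> bool {
        (200..=299).contains(&self.status)
    }
}
```

Under this sketch, a 418 response counts as an error but matches neither of the specific helpers, so a detector keyed on 418 would need to compare the status directly.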

Header pattern matching

HTTP response headers often contain vendor-specific signatures:

Header existence checks

Detect WAFs by checking if specific headers are present:
// Check if any of these headers exist
resp.has_header(&["x-datadome", "x-dd-request-id"], MatchMode::Any)
Real example - Datadome detector (src/detectors/datadome.rs:13):
pub fn detect(&self, resp: &HttpResponse) -> bool {
    resp.has_header(&["x-datadome"], MatchMode::Any)
}
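To make the semantics of an existence check concrete, here is a minimal sketch of how has_header could behave. It assumes header names are compared case-insensitively and that values are ignored; the HttpResponse type, its new constructor, and the storage layout are illustrative assumptions rather than whatwaf's real internals.

```rust
use std::collections::HashMap;

// Hypothetical mirror of the MatchMode enum described in this guide.
#[derive(Clone, Copy)]
pub enum MatchMode {
    Any,
    All,
}

// Hypothetical response type: header names are stored lowercased.
pub struct HttpResponse {
    pub headers: HashMap<String, String>,
}

impl HttpResponse {
    /// Builds a response from (name, value) pairs, lowercasing the names.
    pub fn new(headers: &[(&str, &str)]) -> Self {
        let headers = headers
            .iter()
            .map(|(name, value)| (name.to_ascii_lowercase(), value.to_string()))
            .collect();
        Self { headers }
    }

    /// Checks header presence only; header values are not inspected.
    pub fn has_header(&self, names: &[&str], mode: MatchMode) -> bool {
        let present = |name: &&str| self.headers.contains_key(&name.to_ascii_lowercase());
        match mode {
            MatchMode::Any => names.iter().any(present),
            MatchMode::All => names.iter().all(present),
        }
    }
}
```

With a response carrying only X-DataDome, the Any check over both header names succeeds while the All check fails.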

Header value matching

Detect WAFs by matching text within header values:
// Check if Server header contains "ArvanCloud"
resp.header_has("server", &["ArvanCloud"], MatchMode::Any)
Real example - ArvanCloud detector (src/detectors/arvancloud.rs:13):
pub fn detect(&self, resp: &HttpResponse) -> bool {
    resp.header_has("server", &["ArvanCloud"], MatchMode::Any)
}
Many WAFs also set tracking cookies, which can be matched through the Set-Cookie header.
Real example - Incapsula detector (src/detectors/incapsula.rs:13):
pub fn detect(&self, resp: &HttpResponse) -> bool {
    resp.header_has("set-cookie", &["incap_ses", "visid_incap"], MatchMode::Any)
}
Incapsula is detected by checking if the Set-Cookie header contains either “incap_ses” or “visid_incap”.
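Cookie matching has one wrinkle: a response may carry several Set-Cookie headers. The sketch below shows one way a header_has style check could handle that, assuming case-insensitive substring matching over every value of the named header. The types and the new constructor are hypothetical and only illustrate the idea.

```rust
// Hypothetical mirror of the MatchMode enum described in this guide.
#[derive(Clone, Copy)]
pub enum MatchMode {
    Any,
    All,
}

// Hypothetical response type. A Vec keeps repeated headers such as Set-Cookie.
pub struct HttpResponse {
    pub headers: Vec<(String, String)>,
}

impl HttpResponse {
    /// Stores names and values lowercased so later comparisons ignore case.
    pub fn new(headers: &[(&str, &str)]) -> Self {
        Self {
            headers: headers
                .iter()
                .map(|(n, v)| (n.to_ascii_lowercase(), v.to_ascii_lowercase()))
                .collect(),
        }
    }

    /// Substring match against every value of the header called `name`.
    pub fn header_has(&self, name: &str, needles: &[&str], mode: MatchMode) -> bool {
        let name = name.to_ascii_lowercase();
        let found = |needle: &&str| {
            let needle = needle.to_ascii_lowercase();
            self.headers
                .iter()
                .any(|(n, v)| *n == name && v.contains(&needle))
        };
        match mode {
            MatchMode::Any => needles.iter().any(found),
            MatchMode::All => needles.iter().all(found),
        }
    }
}
```

A response that sets only a visid_incap cookie satisfies the Any check used by the Incapsula detector, but would not satisfy an All check over both cookie names.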

Regex header matching

For complex patterns, detectors can use regular expressions:
use regex::Regex;
use once_cell::sync::Lazy;

static HEADER_PATTERN: Lazy<Vec<Regex>> = Lazy::new(|| {
    vec![Regex::new(r"X-WAF-[A-Z]+").unwrap()]
});

// Match header value against regex
resp.header_matches("x-protection", &HEADER_PATTERN, MatchMode::Any)

Response body pattern matching

WAF blocking pages often contain distinctive text or HTML structures.

Simple text matching

Detect keywords in the response body:
// Check if body contains any of these phrases (case-insensitive)
resp.body_has(&["Access Denied", "Web Application Firewall"], MatchMode::Any)

// Require ALL phrases to be present
resp.body_has(&["Sorry", "you have been blocked"], MatchMode::All)
Real example - Cloudflare detector (src/detectors/cloudflare.rs:13):
pub fn detect(&self, resp: &HttpResponse) -> bool {
    resp.body_has(
        &["Sorry, you have been blocked", "Cloudflare Ray ID"],
        MatchMode::All,
    ) && resp.is_forbidden()
}
Cloudflare requires:
  1. Body contains BOTH phrases (“Sorry, you have been blocked” AND “Cloudflare Ray ID”)
  2. HTTP status is 403 Forbidden
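The two requirements above can be exercised in isolation. The following sketch assumes a simplified response with only a status and a body, plus a case-insensitive body_has; all type definitions and the function name looks_like_cloudflare_block are illustrative stand-ins, not code from the repository.

```rust
// Hypothetical mirror of the MatchMode enum described in this guide.
#[derive(Clone, Copy)]
pub enum MatchMode {
    Any,
    All,
}

// Hypothetical response type with just the fields this check needs.
pub struct HttpResponse {
    pub status: u16,
    pub body: String,
}

impl HttpResponse {
    pub fn is_forbidden(&self) -> bool {
        self.status == 403
    }

    /// Case-insensitive substring search over the body.
    pub fn body_has(&self, phrases: &[&str], mode: MatchMode) -> bool {
        let body = self.body.to_lowercase();
        let found = |phrase: &&str| body.contains(&phrase.to_lowercase());
        match mode {
            MatchMode::Any => phrases.iter().any(found),
            MatchMode::All => phrases.iter().all(found),
        }
    }
}

/// Same shape as the Cloudflare check: both phrases AND a 403 status.
pub fn looks_like_cloudflare_block(resp: &HttpResponse) -> bool {
    resp.body_has(
        &["Sorry, you have been blocked", "Cloudflare Ray ID"],
        MatchMode::All,
    ) && resp.is_forbidden()
}
```

A page that contains both phrases but is served with a 200 status, or a 403 page that lacks the Ray ID text, is rejected by this check.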

Regex body matching

For variable content like IDs or timestamps, use regex patterns.
Real example - FortiWeb detector (src/detectors/fortiweb.rs:9):
use once_cell::sync::Lazy;
use regex::Regex;

static BODY: Lazy<Vec<Regex>> = Lazy::new(|| {
    vec![Regex::new(r"Attack ID:\s*2(?:0*\d{2})").unwrap()]
});

impl Detector for FortiWeb {
    fn detect(&self, resp: &HttpResponse) -> bool {
        resp.body_has(&["<h2 class=\"fgd_icon\">block</h2>"], MatchMode::Any)
            && resp.body_matches(&BODY, MatchMode::Any)
    }
}
FortiWeb detection requires:
  1. Body contains the static HTML <h2 class="fgd_icon">block</h2>
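  2. Body matches the regex for an attack ID: the text “Attack ID:”, optional whitespace, the digit 2, and at least two more digits
The regex r"Attack ID:\s*2(?:0*\d{2})" matches FortiWeb attack identifiers documented in their log message reference. Because 0* can match zero characters, any two digits after the leading 2 satisfy the pattern, so a string such as “Attack ID: 20000051” matches while “Attack ID: 2” does not.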
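To see which inputs the pattern accepts without pulling in the regex crate, the sketch below hand-writes an equivalent check. It is an ASCII-only approximation (the regex crate's \d is Unicode-aware by default), it is case-sensitive like the pattern as written, and the function name has_attack_id is made up for this illustration.

```rust
/// ASCII-only approximation of the pattern `Attack ID:\s*2(?:0*\d{2})`.
pub fn has_attack_id(body: &str) -> bool {
    const PREFIX: &str = "Attack ID:";
    // Try every occurrence of the prefix, as an unanchored regex search would.
    body.match_indices(PREFIX).any(|(start, _)| {
        // `\s*`: skip any whitespace after the colon.
        let rest = body[start + PREFIX.len()..].trim_start();
        let mut chars = rest.chars();
        // A literal '2' must come first, followed by at least two digits.
        // The `0*` part may match nothing, so any two digits are enough.
        chars.next() == Some('2')
            && chars.take(2).filter(|c| c.is_ascii_digit()).count() == 2
    })
}
```

For instance, "Attack ID: 20000051" and "Attack ID:2007" pass, while "Attack ID: 30000051" and "Attack ID: 2x1" do not.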

Combining multiple patterns

Reliable detection combines multiple indicators to avoid false positives.

AND logic (all conditions must match)

Example - Barracuda detector (src/detectors/barracuda.rs:13):
pub fn detect(&self, resp: &HttpResponse) -> bool {
    resp.body_has(&["Barracuda Networks"], MatchMode::Any) && resp.is_not_found()
}
Barracuda is only detected when:
  1. Body contains “Barracuda Networks” AND
  2. Status code is 404
This prevents false positives from sites that merely mention Barracuda in their content.

OR logic (any condition matches)

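// Detect if ANY signature is present (MatchMode::Any acts as OR across the patterns).
// Matching is by substring, so "CloudFront" alone already covers "Amazon CloudFront".
resp.header_has("server", &["CloudFront", "Amazon CloudFront"], MatchMode::Any)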
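OR logic can also span different kinds of signals by joining checks with Rust's || operator. The sketch below is a minimal illustration under stated assumptions: the vendor "ExampleWAF", the response struct, and the helper names are all invented for this example.

```rust
// Hypothetical response type for this sketch; not whatwaf's HttpResponse.
pub struct HttpResponse {
    pub server: Option<String>,
    pub body: String,
}

/// Case-insensitive substring test.
pub fn contains_ci(haystack: &str, needle: &str) -> bool {
    haystack.to_lowercase().contains(&needle.to_lowercase())
}

/// Header signal: the Server header names the (made-up) vendor.
pub fn header_hit(resp: &HttpResponse) -> bool {
    resp.server
        .as_deref()
        .map_or(false, |value| contains_ci(value, "ExampleWAF"))
}

/// Body signal: the block page names the (made-up) vendor.
pub fn body_hit(resp: &HttpResponse) -> bool {
    contains_ci(&resp.body, "Blocked by ExampleWAF")
}

/// Either signal is enough; `||` skips the body scan when the header matches.
pub fn detect(resp: &HttpResponse) -> bool {
    header_hit(resp) || body_hit(resp)
}
```

Putting the cheaper header check on the left means the body is only scanned when the header signal is absent.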

Match modes

The MatchMode enum controls how multiple patterns are evaluated:
pub enum MatchMode {
    Any,  // At least one pattern must match (OR logic)
    All,  // All patterns must match (AND logic)
}
Example with MatchMode::All:
// ALL three phrases must be present in body
resp.body_has(
    &["Access Denied", "Reference ID", "Contact Support"],
    MatchMode::All
)
Example with MatchMode::Any:
// ANY of these headers indicate a WAF
resp.has_header(
    &["x-waf-protection", "x-firewall", "x-security-gateway"],
    MatchMode::Any
)
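One way to picture how a match mode is applied: every pattern is tested with the same predicate, and the mode decides how the individual results are folded together. The evaluate function below is a hypothetical helper written for this illustration, not part of whatwaf's API.

```rust
// Hypothetical mirror of the MatchMode enum shown above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MatchMode {
    Any,
    All,
}

/// Applies `pred` to each pattern and combines the results according to `mode`.
pub fn evaluate<T>(patterns: &[T], mode: MatchMode, pred: impl Fn(&T) -> bool) -> bool {
    match mode {
        // OR: stops at the first pattern that matches.
        MatchMode::Any => patterns.iter().any(pred),
        // AND: stops at the first pattern that fails.
        MatchMode::All => patterns.iter().all(pred),
    }
}
```

One consequence of this formulation is the empty-list edge case: with no patterns, Any yields false and All yields true, so a detector built this way should always pass at least one pattern.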

Detection reliability

To maximize accuracy:
  1. Combine techniques - Use multiple detection methods together
  2. Use specific patterns - Match vendor-specific text or headers
  3. Verify status codes - Require the status code that the vendor’s block page is known to return
  4. Test with multiple probes - whatwaf sends XSS, SQLi, and LFI payloads to trigger blocking

Performance considerations

  • Lazy regex compilation - Use once_cell::sync::Lazy to compile patterns once
  • Case-insensitive matching - All text matching is case-insensitive by default
  • Parallel detection - All detectors run independently and can be evaluated in parallel
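  • Short-circuit evaluation - Rust’s && and || operators short-circuit, so placing the cheapest check (such as a status code comparison) first skips the more expensive body scans when it fails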
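Because detectors only read the response, running them in parallel needs no locking. The sketch below illustrates the idea with std::thread::scope, spawning one thread per detector. The Detector struct, the two detector functions, and detect_all are assumptions made for this example; how whatwaf actually schedules detectors is not described here, and a real implementation would more likely use a bounded thread pool.

```rust
use std::thread;

// Hypothetical response type; shared read-only between threads.
pub struct HttpResponse {
    pub status: u16,
    pub body: String,
}

// Hypothetical detector record: a vendor name plus a pure check function.
pub struct Detector {
    pub name: &'static str,
    pub detect: fn(&HttpResponse) -> bool,
}

pub fn detect_cloudflare(resp: &HttpResponse) -> bool {
    // Cheap status comparison first, body scan second.
    resp.status == 403 && resp.body.contains("Cloudflare Ray ID")
}

pub fn detect_barracuda(resp: &HttpResponse) -> bool {
    resp.status == 404 && resp.body.contains("Barracuda Networks")
}

/// Runs every detector on its own scoped thread and returns the names that matched.
pub fn detect_all(resp: &HttpResponse, detectors: &[Detector]) -> Vec<&'static str> {
    thread::scope(|scope| {
        // Each thread borrows `resp` immutably; the scope guarantees the
        // borrow ends before `detect_all` returns.
        let handles: Vec<_> = detectors
            .iter()
            .map(|d| scope.spawn(move || (d.name, (d.detect)(resp))))
            .collect();
        // Join in spawn order so the result order is deterministic.
        handles
            .into_iter()
            .filter_map(|handle| match handle.join() {
                Ok((name, true)) => Some(name),
                _ => None,
            })
            .collect()
    })
}
```

A detector that panics is treated as a non-match in this sketch, since its join result is discarded.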
