Extractors allow you to extract and reuse data from protocol responses. They can extract values using regex, JSON paths, XPath queries, key-value pairs, or DSL expressions. Extracted data can be used in subsequent requests or displayed in output.
Regex
KVal
JSON
XPath
DSL
Extract data using regular expressions with optional capture groups.

extractors:
  - type: regex
    name: token
    regex:
      - "access_token\\$production\\$[0-9a-z]{16}\\$[0-9a-f]{32}"
      - "Author:(?:[A-Za-z0-9 -\\_=\"]+)?<span(?:[A-Za-z0-9 -\\_=\"]+)?>([A-Za-z0-9]+)<\\/span>"
    group: 1  # Extract first capture group
Note: Go's regex engine (RE2 syntax) does not support lookaheads or lookbehinds.
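Because lookbehinds are unavailable, a common workaround is to match the surrounding context and keep only a capture group. The fragment below is a minimal sketch, assuming a response that contains token=<value>; the extractor name and the pattern are illustrative and not taken from a shipped template.

```yaml
extractors:
  - type: regex
    name: session-token   # hypothetical name
    group: 1              # keep only the value, not the "token=" prefix
    regex:
      # A lookbehind such as (?<=token=)[a-f0-9]+ does not compile in Go,
      # so match the prefix literally and capture the part to keep.
      - "token=([a-f0-9]+)"
```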
Extract key-value pairs from headers and cookies (case-insensitive).

extractors:
  - type: kval
    name: session
    kval:
      - "server"        # Extract Server header
      - "phpsessid"     # Extract PHPSESSID cookie
      - "content_type"  # Extract Content-Type (note: use _ instead of -)
KVal inputs are case-insensitive and do not support dashes (-). Replace dashes with underscores (_).
Example: Content-Type becomes content_type
Extract data using jq-style JSON queries.

extractors:
  - type: json
    name: ids
    json:
      - ".[] | .id"
      - ".batters | .batter | .[] | .id"
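To make the second query concrete, the sketch below pairs it with an assumed response body, shown in the comments. The sample JSON and the extractor name are illustrative; the values follow from standard jq semantics, where .[] emits one result per array element.

```yaml
# Assumed response body (illustrative):
#   {"batters": {"batter": [{"id": "1001"}, {"id": "1002"}]}}
extractors:
  - type: json
    name: batter-ids   # hypothetical name
    json:
      # Walks into batters.batter, iterates the array, and selects each id,
      # so the query yields two values: 1001 and 1002.
      - ".batters | .batter | .[] | .id"
```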
Extract data using XPath queries from HTML/XML responses.

extractors:
  - type: xpath
    name: links
    xpath:
      - "/html/body/div/p[2]/a"
    attribute: href  # Extract href attribute
Extract data using Domain Specific Language (DSL) expressions.

extractors:
  - type: dsl
    name: info
    dsl:
      - "'Server: ' + header['Server']"
      - "status_code + ' - ' + content_length"
name - Name of the extractor, used to reference extracted values. Must be lowercase, without spaces or underscores.

extractors:
  - type: regex
    name: "api-token"
    regex:
      - "token=([a-zA-Z0-9]+)"
part - Part of the response to extract from. Each protocol exposes different parts.

extractors:
  - type: kval
    part: header
    kval:
      - "set_cookie"  # KVal keys use underscores instead of dashes
internal - When true, extracted values can be used in subsequent requests but won’t appear in output.

extractors:
  - type: regex
    name: csrf
    internal: true
    regex:
      - "csrf_token:\\s*([a-zA-Z0-9]+)"
group - Regex capture group to extract. Use 0 for the full match, or 1 and above for a specific capture group.

extractors:
  - type: regex
    name: version
    group: 1
    regex:
      - "Version:\\s*v?([0-9.]+)"
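The difference between group 0 and group 1 is easiest to see side by side. The sketch below assumes a response containing the text "Version: v1.2.3"; both extractor names are hypothetical.

```yaml
# Assumed response text (illustrative): "Version: v1.2.3"
extractors:
  - type: regex
    name: full-match     # hypothetical name
    group: 0             # whole match: "Version: v1.2.3"
    regex:
      - "Version:\\s*v?([0-9.]+)"

  - type: regex
    name: version-only   # hypothetical name
    group: 1             # first capture group only: "1.2.3"
    regex:
      - "Version:\\s*v?([0-9.]+)"
```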
attribute - XPath attribute to extract from matched elements.

extractors:
  - type: xpath
    name: image-urls
    xpath:
      - "//img"
    attribute: src
case-insensitive - Enable case-insensitive extraction for regex extractors.

extractors:
  - type: regex
    case-insensitive: true
    regex:
      - "(?i)error"
Internal extractors are crucial for multi-step templates. They extract data from one request and make it available to subsequent requests:
Dynamic Values
http:
  - raw:
      - |
        POST /login HTTP/1.1
        Host: {{Hostname}}

        username=admin&password=test
      - |
        GET /api/data?token={{token}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        name: token
        internal: true
        group: 1
        regex:
          - "Token: '([A-Za-z0-9]+)'"
Extract data and use it in subsequent requests:
id: extract-and-iterate

info:
  name: Extract Emails and Check
  author: pdteam
  severity: info

flow: |
  http(1)
  for (let email of template["emails"]) {
    set("email", email);
    http(2);
  }

http:
  - method: GET
    path:
      - "{{BaseURL}}"
    extractors:
      - type: regex
        name: emails
        internal: true
        regex:
          - "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}"

  - method: GET
    path:
      - "{{BaseURL}}/user/{{base64(email)}}"
    matchers:
      - type: word
        words:
          - "Welcome"
Protocol-specific parts
HTTP
body - Response body (default)
header - Response headers
raw - Raw HTTP response
request - HTTP request
all - Body + headers
cookies_from_response - Cookies in name:value format
headers_from_response - Headers in name:value format

DNS
raw - Raw DNS response (default)
answer - DNS answer field
question - DNS question field
ns - DNS nameserver field
extra - DNS extra field

Network
raw - Raw network response (default)
data - Response data
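As an illustration of choosing a part, the fragment below restricts a regex extractor to the DNS answer field. It is a sketch under stated assumptions: the extractor name is hypothetical, and the pattern assumes answer records are rendered in the usual zone-file style (name, TTL, IN, CNAME, target), which may differ between versions.

```yaml
dns:
  - name: "{{FQDN}}"
    type: CNAME
    extractors:
      - type: regex
        name: cname-target   # hypothetical name
        part: answer         # search only the DNS answer field
        group: 1             # keep the CNAME target, not the whole record
        regex:
          # Assumes records look like "<name> <ttl> IN CNAME <target>"
          - "IN\\s+CNAME\\s+(\\S+)"
```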
Real-world examples
AWS Token Extraction
id: aws-token-extract

info:
  name: Extract AWS Tokens
  author: pdteam
  severity: info

file:
  - extensions:
      - all
    extractors:
      - type: regex
        name: aws-keys
        regex:
          - "AKIA[0-9A-Z]{16}"
          - "amzn\\.mws\\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
By default, non-internal extractors display their results in the output:
[server-info] [http] [info] https://example.com
[server-info] nginx/1.18.0
[x_powered_by] PHP/7.4.3
[regex] nginx/1.18.0
[dsl] Server: nginx/1.18.0 | Status: 200
Internal extractors are only available in template variables and don’t appear in output.
Best practices
Use internal extractors for chaining - Mark extractors as internal when their values are only needed in subsequent requests
Name extractors descriptively - Use clear names that indicate what data is being extracted
Extract minimal data - Only extract the data you need to reduce memory usage
Use capture groups - For regex extractors, use capture groups to extract specific parts
Validate extracted data - Use matchers to verify extracted data meets expected format
Combine extractor types - Use multiple extractor types for different data formats
Common patterns
Extract and validate
extractors:
  - type: regex
    name: version
    group: 1
    regex:
      - "Version:\\s*([0-9.]+)"
matchers:
  - type: regex
    regex:
      - "[0-9]+\\.[0-9]+\\.[0-9]+"  # Validate semver format

extractors:
  - type: regex
    name: endpoints
    regex:
      - "/api/v[0-9]+/[a-z]+"

extractors:
  - type: dsl
    name: normalized-domain
    dsl:
      - "to_lower(trim_suffix(cname, '.'))"