Overview
SafeNetworking stores all data in Elasticsearch using a structured document model. The system uses five primary indices for threat events, domain caching, IoT intelligence, tag metadata, and AutoFocus API tracking.
Index Architecture
Index Naming Conventions
Index | Purpose
threat-* | Time-based indices for firewall threat events (DNS, URL)
sfn-domain-details | Cached domain reputation data from AutoFocus
sfn-iot-details | IoT threat intelligence from honeypot database
sfn-tag-details | AutoFocus tag metadata cache
af-details | AutoFocus API quota tracking (single document)
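As a minimal sketch of how the naming conventions above can be used, the snippet below maps a concrete index name back to its documented purpose. The `INDEX_PURPOSES` table and the `describe_index` helper are hypothetical and not part of SafeNetworking, and the date suffix used in the usage note is an assumption, since the exact `threat-*` suffix format is not stated here.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns and purposes copied from the naming conventions above.
INDEX_PURPOSES = {
    "threat-*": "Firewall threat events (DNS, URL)",
    "sfn-domain-details": "Cached domain reputation data from AutoFocus",
    "sfn-iot-details": "IoT threat intelligence from honeypot database",
    "sfn-tag-details": "AutoFocus tag metadata cache",
    "af-details": "AutoFocus API quota tracking",
}


def describe_index(index_name: str) -> Optional[str]:
    """Return the documented purpose of an index, or None if the name is unknown."""
    for pattern, purpose in INDEX_PURPOSES.items():
        # Case-sensitive shell-style match, so "threat-*" covers every dated index.
        if fnmatchcase(index_name, pattern):
            return purpose
    return None
```

For example, `describe_index("threat-2026.03.04")` resolves through the `threat-*` pattern, while an unrelated name returns `None`.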
Document Schemas
DNS Event Document
Primary event document for DNS-based threats stored in threat-* indices.
Document Class: DNSEventDoc (defined in project/dns/dns.py:58-88)
class DNSEventDoc(DocType):
    '''
    Each event is its own entity in the DB
    '''
    SFN = Object(SFNDNS)

    class Index:
        name = 'threat-*'
SFN Object Schema
The SFN nested object contains SafeNetworking enrichment data:
class SFNDNS(InnerDoc):
    event_type = Text()  # "DNS"
    domain_name = Text(analyzer='snowball', fields={'raw': Keyword()})
    device_name = Text(analyzer='snowball', fields={'raw': Keyword()})
    host = Text(analyzer='snowball', fields={'raw': Keyword()})
    threat_id = Text(analyzer='snowball')
    threat_name = Text(analyzer='snowball')
    tag_name = Text(fields={'raw': Keyword()})  # AutoFocus tag
    tag_class = Text(fields={'raw': Keyword()})  # campaign/actor/malware_family
    tag_group = Text(fields={'raw': Keyword()})  # Tag category
    tag_description = Text(analyzer='snowball')
    public_tag_name = Text(analyzer='snowball')  # Display name
    confidence_level = Integer()  # 0-90% confidence
    sample_date = Date()  # Most recent sample date
    file_type = Text(fields={'raw': Keyword()})  # Malware file type
    updated_at = Date()  # Enrichment timestamp
    processed = Integer()  # Processing state
    src_ip = Ip()
    dst_ip = Ip()
Defined in project/dns/dns.py:37-56
Processing States
The processed field indicates event enrichment status:
Value | State | Description
0 | Unprocessed | Event awaiting enrichment
1 | Enriched | Successfully enriched with AutoFocus data
55 | No Tags | Domain found but no threat tags available
From project/dns/runner.py:170-174
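The three documented states can be given names so that calling code avoids bare integers. This is a minimal sketch: the `ProcessedState` enum and the `needs_enrichment` helper are hypothetical names that do not appear in the SafeNetworking source, and treating a missing `processed` field as unprocessed is an assumption.

```python
from enum import IntEnum


class ProcessedState(IntEnum):
    """Hypothetical names for the documented values of SFN.processed."""
    UNPROCESSED = 0  # event awaiting enrichment
    ENRICHED = 1     # successfully enriched with AutoFocus data
    NO_TAGS = 55     # domain found but no threat tags available


def needs_enrichment(event: dict) -> bool:
    """Return True when SFN.processed is 0 (or absent, by assumption)."""
    state = event.get("SFN", {}).get("processed", ProcessedState.UNPROCESSED)
    return state == ProcessedState.UNPROCESSED
```

Because `IntEnum` members compare equal to plain integers, values read straight from an Elasticsearch document can be compared without conversion.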
Example Document
{
  "@timestamp": "2026-03-04T10:23:45.123Z",
  "SFN": {
    "event_type": "DNS",
    "domain_name": "malicious.example.com",
    "device_name": "PA-VM-001",
    "host": "192.168.1.100",
    "tag_name": "Unit42.Gootkit",
    "public_tag_name": "Gootkit",
    "tag_class": "malware_family",
    "tag_group": "Banking Trojan",
    "tag_description": "Gootkit is a banking trojan...",
    "confidence_level": 90,
    "sample_date": "2026-03-03T08:15:30",
    "file_type": "PE32",
    "updated_at": "2026-03-04T10:23:46",
    "processed": 1,
    "src_ip": "192.168.1.100",
    "dst_ip": "203.0.113.42"
  }
}
Domain Details Document
Cached domain intelligence stored in the sfn-domain-details index.
Document Class: DomainDetailsDoc (defined in project/dns/dns.py:5-34)
class DomainDetailsDoc(DocType):
    '''
    Document storage for domain cache
    '''
    name = Text(analyzer='snowball', fields={'raw': Keyword()})
    tags = Keyword()  # List of tag tuples
    doc_created = Date()
    doc_updated = Date()
    processed = Integer()

    class Index:
        name = 'sfn-domain-details'
The tags field stores a list of tuples containing sample and tag information:
[
    (
        "2026-03-03T08:15:30",  # sample_date
        "PE32",  # file_type
        [
            (
                "Gootkit",  # public_tag_name
                "Unit42.Gootkit",  # tag_name
                "malware_family",  # tag_class
                "Banking Trojan",  # tag_group
                "Gootkit is a banking..."  # description
            )
        ]
    )
]
Constructed in project/dns/dnsutils.py:440-448
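Because the tags structure is nested two levels deep, consumers often want one flat record per sample and tag pair. The sketch below shows one way to unpack the layout described above; the `flatten_tags` function and its output key names are hypothetical, chosen to mirror the field comments in the example.

```python
from typing import Any, Dict, List, Sequence


def flatten_tags(tags: Sequence[Sequence[Any]]) -> List[Dict[str, str]]:
    """Expand (sample_date, file_type, [tag tuples]) entries into flat records."""
    records = []
    for sample_date, file_type, tag_entries in tags:
        # Each inner entry follows the documented five-element order.
        for public_tag_name, tag_name, tag_class, tag_group, description in tag_entries:
            records.append({
                "sample_date": sample_date,
                "file_type": file_type,
                "public_tag_name": public_tag_name,
                "tag_name": tag_name,
                "tag_class": tag_class,
                "tag_group": tag_group,
                "description": description,
            })
    return records
```

The function accepts lists as well as tuples, which matters because tuples come back as lists after a JSON round trip.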
Cache Lifecycle
1. Cache Miss: Event processor queries sfn-domain-details for domain (project/dns/runner.py:54-56)
2. AutoFocus Lookup: If not cached or expired, query AutoFocus API for domain samples (project/dns/dnsutils.py:343-459)
3. Tag Processing: Extract tags from samples and fetch tag metadata (project/dns/dnsutils.py:131-150)
4. Cache Storage: Store domain details with doc_updated timestamp (project/dns/dnsutils.py:506-512)
5. Cache Validation: Check age on subsequent lookups against DNS_DOMAIN_INFO_MAX_AGE (default: 30 days)
From project/dns/dnsutils.py:462-520
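The validation step in the lifecycle above comes down to comparing the stored doc_updated timestamp with a maximum age. The following is a minimal sketch of that comparison, not the project's implementation: the `is_cache_fresh` name is hypothetical, and expressing the limit in days is an assumption based on the documented 30-day default.

```python
from datetime import datetime, timedelta

# Documented default for DNS_DOMAIN_INFO_MAX_AGE; the unit (days) is assumed.
DEFAULT_MAX_AGE_DAYS = 30


def is_cache_fresh(doc_updated: str, now: datetime,
                   max_age_days: int = DEFAULT_MAX_AGE_DAYS) -> bool:
    """Return True if the cached entry is no older than max_age_days."""
    # fromisoformat accepts both "T" and a space as the date/time separator.
    updated = datetime.fromisoformat(doc_updated)
    return now - updated <= timedelta(days=max_age_days)
```

Passing `now` explicitly keeps the check deterministic in tests; a caller would normally supply `datetime.now()`.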
IoT Event Document
Document Class: IoTEventDoc (defined in project/iot/iot.py:66-96)
class IoTEventDoc(DocType):
    '''
    Each event is its own entity in the DB
    '''
    IoT = Object(SFNIOT)

    class Index:
        name = 'iot-*'
The SFNIOT inner document has the same structure as SFNDNS but is used for IoT-specific events.
IoT Details Document
Cached IoT threat intelligence stored in the sfn-iot-details index.
Document Class: IoTDetailsDoc (defined in project/iot/iot.py:5-41)
class IoTDetailsDoc(DocType):
    '''
    Document storage for IoT IP cache
    '''
    id = Text(analyzer='snowball', fields={'raw': Keyword()})
    time = Keyword()  # Observation timestamp
    ip = Ip()  # Malicious IP address
    filetype = Text()  # Malware file type
    tag_name = Text()  # Normalized tag name
    public_tag_name = Text()  # Display name
    tag_description = Text()
    tag_class = Text()  # Threat classification
    tag_group_name = Text()  # Threat category

    class Index:
        name = 'sfn-iot-details'
Family Name Normalization
IoT malware families are normalized to Unit42 naming conventions:
def __normalizeFamilyInfo(familyInfo):
    if (familyInfo['family'] == 'mirai') and (familyInfo['filetype'] == "elf"):
        return "Unit42.ELFMirai", "ELFMirai"
    elif (familyInfo['family'] == 'xorddos') and (familyInfo['filetype'] == "elf"):
        return "Commodity.XorDDoS", "XorDDoS"
    # ...
From project/iot/runner.py:34-62
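The same normalization can also be written as a lookup table, which may be easier to extend than a chain of conditionals. This is an illustrative alternative, not the project's code: `KNOWN_FAMILIES` and `normalize_family` are hypothetical names, the table holds only the two mappings visible in the excerpt above, and returning `None` for unknown families is an assumption.

```python
from typing import Dict, Optional, Tuple

# Only the two mappings shown in the excerpt; the remaining branches
# are elided in the source and are deliberately not guessed here.
KNOWN_FAMILIES: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("mirai", "elf"): ("Unit42.ELFMirai", "ELFMirai"),
    ("xorddos", "elf"): ("Commodity.XorDDoS", "XorDDoS"),
}


def normalize_family(family_info: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (tag_name, public_tag_name) for a known family, else None."""
    key = (family_info["family"], family_info["filetype"])
    return KNOWN_FAMILIES.get(key)
```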
Example Document
{
  "id": "iot-12345",
  "time": "2026-03-04 09:15:30",
  "ip": "198.51.100.42",
  "filetype": "elf",
  "tag_name": "Unit42.ELFMirai",
  "public_tag_name": "ELFMirai",
  "tag_description": "Mirai IoT botnet malware",
  "tag_class": "malware_family",
  "tag_group_name": "IoT Botnet"
}
Tag Details Document
Cached AutoFocus tag metadata stored in the sfn-tag-details index.
Document Class: TagDetailsDoc (defined in project/dns/dns.py:126-157)
class TagDetailsDoc(DocType):
    '''
    Stores/caches information about each tag in the DB
    '''
    name = Text(analyzer='snowball', fields={'raw': Keyword()})
    tag = Keyword()  # Full tag object from AF
    tag_groups = Keyword()  # Tag categorization
    doc_created = Date()
    doc_updated = Date()
    processed = Integer()

    class Index:
        name = 'sfn-tag-details'
Tag Object Structure
The tag field stores the complete AutoFocus tag response:
{
  "tag_name": "Unit42.Gootkit",
  "public_tag_name": "Gootkit",
  "tag_class": "malware_family",
  "description": "Gootkit is a banking trojan that targets financial institutions..."
}
Tag Groups
The tag_groups field provides hierarchical categorization:
[
  {
    "tag_group_name": "Banking Trojan",
    "description": "Malware designed to steal financial credentials"
  }
]
From project/lib/sfnutils.py:72-167
If a tag is not found in AutoFocus, SafeNetworking creates a placeholder cache entry with tag_class: "Tag not found in AF" to prevent repeated failed lookups (project/lib/sfnutils.py:149-158).
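The placeholder entry is a form of negative caching: a failed lookup is stored so that it is not repeated. The sketch below illustrates the idea with an in-memory dict standing in for sfn-tag-details. The `get_tag` function, the `fetch` callback, and the exact shape of the placeholder beyond its tag_class value are assumptions for illustration.

```python
from typing import Callable, Dict, Optional

TAG_NOT_FOUND = "Tag not found in AF"  # placeholder tag_class documented above


def get_tag(name: str,
            cache: Dict[str, dict],
            fetch: Callable[[str], Optional[dict]]) -> dict:
    """Return tag metadata, caching a placeholder when the lookup fails."""
    if name in cache:
        return cache[name]  # hit, including previously stored placeholders
    tag = fetch(name)  # stand-in for the AutoFocus request
    if tag is None:
        # Negative cache entry so the next lookup skips the remote call.
        tag = {"tag_name": name, "tag_class": TAG_NOT_FOUND}
    cache[name] = tag
    return tag
```

With this pattern a second call for the same missing tag is served from the cache and the remote lookup runs only once.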
AutoFocus Details Document
Tracks AutoFocus API quota usage. A single document with ID af-details is stored in the af-details index.
Document Class: AFDetailsDoc (defined in project/dns/dns.py:91-123)
class AFDetailsDoc(DocType):
    '''
    Stores the information returned from AutoFocus about API logistics
    '''
    daily_points = Integer()  # Total daily quota
    daily_points_remaining = Integer()  # Points left today
    minute_points = Integer()  # Per-minute quota
    minute_points_remaining = Integer()  # Points left this minute
    minute_bucket_start = Date()  # Minute window start
    daily_bucket_start = Date()  # Daily window start

    class Index:
        name = 'af-details'
        id = 'af-details'
Example Document
{
  "_id": "af-details",
  "_source": {
    "daily_points": 50000,
    "daily_points_remaining": 32451,
    "minute_points": 16,
    "minute_points_remaining": 8,
    "minute_bucket_start": "2026-03-04T10:23:00",
    "daily_bucket_start": "2026-03-04T00:00:00"
  }
}
Updated every AF_POOL_TIME seconds (default: 600) by the AutoFocus monitoring thread.
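A consumer of this document would typically check both the per-minute and the daily bucket before issuing an AutoFocus request. The following is a hedged sketch of such a check; the `has_quota` helper and its `cost` parameter are hypothetical, and the actual point cost of an AutoFocus call is not stated in this document.

```python
def has_quota(af_details: dict, cost: int = 1) -> bool:
    """Return True if both remaining-point buckets can cover the given cost."""
    return (
        af_details["minute_points_remaining"] >= cost
        and af_details["daily_points_remaining"] >= cost
    )
```

Applied to the `_source` of the example document above, a request costing 8 points or fewer would pass, while anything larger would be limited by the per-minute bucket.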
Field Types and Analyzers
Text vs Keyword Fields
SafeNetworking uses dual-field indexing for searchability:
domain_name = Text(analyzer='snowball', fields={'raw': Keyword()})
Text (analyzed): Full-text search with stemming (e.g., “banking” matches “bank”)
Keyword (exact): Aggregations, sorting, exact matching (e.g., “example.com”)
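To make the distinction concrete, the sketch below builds the two kinds of Elasticsearch request bodies as plain dictionaries: a `match` query against the analyzed field and a `term` query against its `raw` keyword sub-field. The helper names are hypothetical; the field path `SFN.domain_name` used in the usage note follows the schema shown earlier.

```python
def full_text_query(field: str, text: str) -> dict:
    """Match query: the text is analyzed (stemmed) before comparison."""
    return {"query": {"match": {field: text}}}


def exact_query(field: str, value: str) -> dict:
    """Term query on the 'raw' keyword sub-field: no analysis, exact match."""
    return {"query": {"term": {f"{field}.raw": value}}}
```

For instance, `exact_query("SFN.domain_name", "example.com")` targets `SFN.domain_name.raw`, which is also the field to use for aggregations and sorting.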
Date Handling
All date fields use the Elasticsearch Date type. Timestamps are written in ISO 8601 format, truncated to whole seconds, with a space in place of the "T" separator:
doc_updated = datetime.datetime.now().replace(microsecond=0).isoformat(' ')
# Output: "2026-03-04 10:23:45"
From project/dns/dnsutils.py:469
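The formatting call above can be wrapped together with its inverse so that timestamps round-trip cleanly. This is a small sketch using only the standard library; the function names `format_timestamp` and `parse_timestamp` are hypothetical.

```python
from datetime import datetime


def format_timestamp(moment: datetime) -> str:
    """Drop microseconds and format with a space between date and time."""
    return moment.replace(microsecond=0).isoformat(' ')


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp written by format_timestamp back into a datetime."""
    return datetime.fromisoformat(value)
```

`datetime.fromisoformat` accepts either a space or "T" as the separator, so the same parser also reads values such as "2026-03-04T10:23:46" from the event documents.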
Query Patterns
Finding Unprocessed Events
eventSearch = Search(index="threat-*") \
    .query("match", tags="DNS") \
    .query("match", **{"SFN.processed": 0}) \
    .sort({"@timestamp": {"order": "desc"}})
eventSearch = eventSearch[:1000]
From project/dns/runner.py:34-38
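For readers working with the REST API directly, the chained query above corresponds roughly to the request body sketched below: the two `match` clauses are combined in a `bool` query, and the slice becomes `from` and `size`. The `unprocessed_dns_query` helper is hypothetical, and the body is an approximation of what the DSL generates rather than output captured from the project.

```python
def unprocessed_dns_query(batch_size: int = 1000) -> dict:
    """Build a search body for the newest unprocessed DNS events."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"tags": "DNS"}},
                    {"match": {"SFN.processed": 0}},
                ]
            }
        },
        # Newest events first, as in the chained .sort() call.
        "sort": [{"@timestamp": {"order": "desc"}}],
        # Equivalent of slicing the Search object with [:batch_size].
        "from": 0,
        "size": batch_size,
    }
```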
Checking Domain Cache
domainSearch = Search(index="sfn-domain-details") \
    .query("match", name=domainName)
if domainSearch.execute():
    # Cache hit - use cached data
    pass
else:
    # Cache miss - query AutoFocus
    pass
From project/dns/runner.py:54-65
Retrieving Latest IoT Update
eventSearch = Search(index="sfn-iot-details") \
    .sort({"time.keyword": {"order": "desc"}})
eventSearch = eventSearch[:1]
latestDoc = eventSearch.execute().hits[0]
From project/lib/sfnutils.py:18-24
Data Retention
Threat Events (threat-*)
Retention is managed by Elasticsearch Index Lifecycle Management (ILM). A retention period of 90-180 days is recommended, depending on compliance requirements.
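As an illustration of the ILM-based retention described above, the sketch below builds a lifecycle policy body containing only a delete phase. It is an assumption-laden example: the `retention_policy` helper is hypothetical, no such policy is defined in this document, and `min_age` is measured from index creation unless rollover is in use.

```python
def retention_policy(days: int) -> dict:
    """Build an ILM policy body that deletes indices older than `days`."""
    if days <= 0:
        raise ValueError("retention period must be a positive number of days")
    return {
        "policy": {
            "phases": {
                "delete": {
                    "min_age": f"{days}d",
                    "actions": {"delete": {}},
                }
            }
        }
    }
```

A body such as `retention_policy(90)` could be sent to the `_ilm/policy/<policy-name>` endpoint and then attached to the time-based indices through an index template; the policy name and template are left to the operator.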
Domain Cache (sfn-domain-details)
Cache validated on read. Entries older than DNS_DOMAIN_INFO_MAX_AGE (30 days) trigger a re-query to AutoFocus.
Tag Cache (sfn-tag-details)
Cache validated on read. Entries older than DOMAIN_TAG_INFO_MAX_AGE (120 days) trigger a re-query to AutoFocus.
IoT Cache (sfn-iot-details)
Continuously updated from the external honeypot database. No automatic expiration.
AutoFocus Details (af-details)
Single document, updated every 10 minutes. Historical data is not retained.
Index Management
Manual Index Creation
Indices are created automatically on first document insertion, but you can pre-create with custom settings:
curl -X PUT "localhost:9200/sfn-domain-details?pretty" -H 'Content-Type: application/json' -d '
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
}
}
'
Monitoring Index Size
# Check index sizes
curl "localhost:9200/_cat/indices/sfn-*?v&s=index"
# Check document counts
curl "localhost:9200/sfn-domain-details/_count?pretty"
Next Steps
Architecture: Understand how components interact in the system
Event Processing: Learn about enrichment workflows and scoring algorithms