Overview
This project provides Python and R scripts for the automated reverse engineering of Controller Area Network (CAN) payloads observed from passenger vehicles. The tools enable security researchers and automotive engineers to analyze proprietary CAN bus communications without access to manufacturer specifications.
Research Background
This code was originally developed by Dr. Brent Stone at the Air Force Institute of Technology (AFIT) in pursuit of a Doctor of Philosophy in Computer Science. The research focuses on enabling auditing and intrusion detection capabilities for proprietary Controller Area Networks.
For detailed information about the methods and algorithms used, refer to the included dissertation, “Enabling Auditing and Intrusion Detection for Proprietary Controller Area Networks”.
Key Capabilities
The CAN reverse engineering pipeline provides three main analysis stages:
1. Pre-Processing
- Imports CAN log files in multiple formats (original and can-utils)
- Identifies SAE J1979 standard communications (OBD-II)
- Analyzes arbitration ID transmission frequencies
- Performs data cleaning and normalization
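The import and frequency steps above can be illustrated with a minimal sketch. This is not the project's own code: it assumes the can-utils `candump -l` line layout `(timestamp) interface ID#DATA` with classic CAN frames, and the names `parse_candump`, `mean_period_by_id`, and `is_obd2_response` are hypothetical. The check for J1979 traffic assumes the conventional 11-bit OBD-II response range 0x7E8 to 0x7EF.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed candump -l layout: "(1436509052.249713) can0 244#0000000116"
LOG_LINE = re.compile(
    r"\((\d+\.\d+)\)\s+(\S+)\s+([0-9A-Fa-f]+)#((?:[0-9A-Fa-f]{2})*)"
)


def parse_candump(lines):
    """Return (timestamp, arbitration_id, payload) tuples; skip unparseable lines."""
    frames = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        timestamp, _interface, arb_id, data = match.groups()
        frames.append((float(timestamp), int(arb_id, 16), bytes.fromhex(data)))
    return frames


def mean_period_by_id(frames):
    """Mean time between consecutive frames of each arbitration ID, in seconds."""
    times = defaultdict(list)
    for timestamp, arb_id, _payload in frames:
        times[arb_id].append(timestamp)
    return {
        arb_id: mean(b - a for a, b in zip(stamps, stamps[1:]))
        for arb_id, stamps in times.items()
        if len(stamps) > 1  # a single frame has no measurable period
    }


def is_obd2_response(arb_id):
    """True for the conventional 11-bit OBD-II response IDs (assumption)."""
    return 0x7E8 <= arb_id <= 0x7EF
```

IDs seen only once are left out of the period table, since no interval can be measured for them.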
2. Lexical Analysis
- Detects individual time series signals within CAN payloads
- Tokenizes binary data to identify signal boundaries
- Extracts and normalizes signal values
- Generates signal dictionaries organized by arbitration ID
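To make the tokenization idea concrete, here is a minimal sketch of one common heuristic. It is an illustration under stated assumptions, not necessarily the algorithm the project or dissertation uses. It assumes big-endian signals, where the flip rate rises from a signal's most significant bit to its least significant bit, so a drop in flip rate between adjacent bits suggests a new signal. The names `bit_flip_rates`, `tokenize`, and `extract` are hypothetical.

```python
def bit_flip_rates(payloads, n_bits=64):
    """Fraction of consecutive payload pairs in which each bit changes (bit 0 = MSB)."""
    n_bytes = n_bits // 8
    values = [int.from_bytes(p.ljust(n_bytes, b"\x00"), "big") for p in payloads]
    flips = [0] * n_bits
    for prev, cur in zip(values, values[1:]):
        changed = prev ^ cur
        for bit in range(n_bits):
            if (changed >> (n_bits - 1 - bit)) & 1:
                flips[bit] += 1
    pairs = max(len(values) - 1, 1)
    return [count / pairs for count in flips]


def tokenize(rates):
    """Group adjacent non-constant bits into (first_bit, last_bit) tokens.

    Constant bits (rate 0) are treated as padding. A new token starts
    whenever the flip rate drops relative to the previous bit.
    """
    tokens, start = [], None
    for bit, rate in enumerate(rates):
        if rate == 0:
            if start is not None:
                tokens.append((start, bit - 1))
                start = None
        elif start is None:
            start = bit
        elif rate < rates[bit - 1]:
            tokens.append((start, bit - 1))
            start = bit
    if start is not None:
        tokens.append((start, len(rates) - 1))
    return tokens


def extract(payload, token, n_bits=64):
    """Read the unsigned integer stored in a token's bit range."""
    first, last = token
    value = int.from_bytes(payload.ljust(n_bits // 8, b"\x00"), "big")
    width = last - first + 1
    return (value >> (n_bits - 1 - last)) & ((1 << width) - 1)
```

Applying `extract` to every payload of one arbitration ID yields one time series per token, which could then be normalized and stored in a dictionary keyed by arbitration ID.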
3. Semantic Analysis
- Correlates signals across different arbitration IDs
- Performs hierarchical clustering of related signals
- Labels signals by comparing with known J1979 data
- Produces visualizations and correlation matrices
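The correlation and clustering steps can be sketched as follows. This is a simplified stand-in rather than the project's implementation: it uses Pearson correlation with distance 1 - |r| and single-linkage agglomerative clustering cut at a fixed distance, which is equivalent to merging every pair of signals closer than the threshold. It also assumes the signals have already been resampled onto a common time base so the series have equal length. The names `pearson` and `cluster_signals` and the `max_distance` default are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def cluster_signals(signals, max_distance=0.2):
    """Single-linkage clustering cut at max_distance, with distance = 1 - |r|.

    `signals` maps a signal name to its time series. Returns sorted clusters.
    """
    names = list(signals)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            distance = 1 - abs(pearson(signals[a], signals[b]))
            if distance <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

If a cluster contains a signal already identified from J1979 traffic (for example vehicle speed, PID 0x0D), that label becomes a reasonable hypothesis for the proprietary signals grouped with it.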
Use Cases
Security Research
Identify potential attack surfaces and abnormal CAN bus behavior for intrusion detection systems
Vehicle Diagnostics
Reverse engineer proprietary diagnostic protocols for aftermarket tools and research
Signal Discovery
Map unknown CAN signals to physical vehicle parameters like RPM, speed, and brake pressure
Protocol Analysis
Understand proprietary communication patterns and timing characteristics