Overview
The Signal class represents a time series extracted from a specific bit range within the payload of a CAN Arbitration ID. Each signal is analyzed for entropy (Shannon Index), normalized for comparison, and can be correlated with J1979 diagnostic signals.
Constructor
The CAN Arbitration ID that contains this signal (e.g., 0x123)
Starting bit position (inclusive) within the payload. Bit 0 is the leftmost bit of byte 0
Ending bit position (inclusive) within the payload. For an 8-byte payload, maximum is 63
Instance Attributes
Signal Identification
The CAN Arbitration ID containing this signal
Starting bit position (inclusive) of the signal within the payload
Ending bit position (inclusive) of the signal within the payload
Signal Data and Metadata
Pandas Series containing the decoded integer values of this signal over time. Index represents timestamps, values represent the signal data extracted from the bit range
Indicates whether the signal is static (constant) rather than dynamic. Set to False if shannon_index >= 0.000001, meaning the signal shows variation
Entropy measure quantifying the randomness/information content of the signal. Calculated as:
$$SI = -\sum_i p_i \log(p_i)$$
where p_i is the proportion of each unique value in the time series. Higher values indicate more diverse/random signals
Human-readable title for plotting, generated automatically by set_plot_title()
J1979 Correlation
If correlated with a J1979 diagnostic signal, stores the PID name (e.g., “Engine RPM”, “Speed km/h”)
Pearson Correlation Coefficient with the matched J1979 signal. Range: [-1.0, 1.0]. Values near ±1.0 indicate strong correlation
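As a rough illustration of what the stored coefficient measures, the sketch below computes a Pearson Correlation Coefficient for two equal-length series using only the standard library. The function name `pearson_cc` and the sample data are hypothetical; this page does not say which routine the pipeline itself uses to compute the value.

```python
import math


def pearson_cc(xs, ys):
    """Pearson Correlation Coefficient of two equal-length numeric sequences.

    Hypothetical helper for illustration only, not part of the Signal class.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up sample data: a candidate signal and a reference series
candidate = [1, 2, 3, 4]
reference = [2, 4, 6, 8]
print(pearson_cc(candidate, reference))        # perfectly correlated: 1.0
print(pearson_cc(candidate, reference[::-1]))  # perfectly anti-correlated: -1.0
```

A constant (static) series has zero variance, so the coefficient is undefined for it; the sketch raises an error in that case.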
Methods
normalize_and_set_metadata
Normalization function applied to time_series.values. Typically sklearn.preprocessing.minmax_scale. Must accept parameters: (array, copy)
- Calls set_shannon_index() to calculate entropy
- Calls update_static() to set dynamic/static classification
- Calls set_plot_title() to generate human-readable title
- Calls normalize() to scale time series values
set_shannon_index
- p_i = proportion of value i in the total population
- Summation is over all unique values in the time series

- SI ≈ 0: Signal contains mostly one value (low entropy, likely static or counter)
- SI ≈ 1-2: Signal contains moderate diversity (typical sensors)
- SI > 2: Signal contains high diversity (random, multi-state, or high-resolution sensor)
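The calculation can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the class's actual implementation: the function name `shannon_index` is hypothetical, and the natural logarithm is an assumption, since this page does not state which log base the class uses (a different base rescales every value by a constant factor).

```python
import math
from collections import Counter


def shannon_index(values):
    """Return -sum(p_i * log(p_i)) over the unique values in `values`.

    Assumption: natural logarithm; the log base is not stated in the docs.
    """
    n = len(values)
    if n == 0:
        return 0.0
    si = 0.0
    for count in Counter(values).values():
        p_i = count / n  # proportion of this value in the total population
        si -= p_i * math.log(p_i)
    return si


constant = [7] * 100     # a single unique value
two_state = [0, 1] * 50  # two equally likely values
print(shannon_index(constant))   # 0.0
print(shannon_index(two_state))  # ln(2), about 0.693
```

A series with a single unique value has p_i = 1, so its index is exactly zero, which is the case the static threshold is meant to catch.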
update_static
Updates the static attribute based on the Shannon Index. Sets static = False if the signal shows meaningful variation.
Threshold: shannon_index >= 0.000001
This conservative threshold ensures that even minimal entropy marks a signal as dynamic, filtering out only truly constant values.
Example:
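A minimal sketch of the threshold rule, assuming a hypothetical helper `is_static`. The real method updates the `static` attribute on the instance rather than returning a value, and the sample entropy values are made up.

```python
STATIC_THRESHOLD = 0.000001  # threshold quoted in the text above


def is_static(shannon_index: float) -> bool:
    """Return True only when the entropy is below the threshold.

    Hypothetical helper; mirrors the rule "static = False if
    shannon_index >= 0.000001".
    """
    return shannon_index < STATIC_THRESHOLD


print(is_static(0.0))       # True: a constant signal
print(is_static(0.000001))  # False: exactly at the threshold counts as dynamic
print(is_static(0.693))     # False: clearly varying signal
```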
set_plot_title
normalize
Normalization function (e.g., sklearn.preprocessing.minmax_scale). Applied to time_series.values with copy=False for in-place modification
- Min-Max Scaling: Scales values to [0, 1] range
- Z-Score Normalization: Centers data around mean with unit variance
- Robust Scaling: Uses median and IQR for outlier resistance
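To make the (array, copy) contract concrete, here is a small standard-library stand-in for a min-max normalizer. The name `minmax_scale_sketch` is hypothetical and it operates on a plain list; it is not sklearn's implementation, only an illustration of how copy=False lets the caller's data be rescaled in place.

```python
def minmax_scale_sketch(array, copy=True):
    """Scale a list of numbers to the [0, 1] range.

    Hypothetical stand-in that mirrors the (array, copy) calling convention.
    With copy=False the input list itself is modified and returned.
    """
    target = list(array) if copy else array
    lo, hi = min(target), max(target)
    span = hi - lo
    for i, value in enumerate(target):
        # Assumption of this sketch: a constant series (zero span) maps to 0.0
        target[i] = (value - lo) / span if span else 0.0
    return target


raw = [10, 20, 15, 30]
scaled = minmax_scale_sketch(raw, copy=False)
print(scaled)         # [0.0, 0.5, 0.25, 1.0]
print(scaled is raw)  # True: the original list was modified in place
```

Any replacement normalizer (z-score, robust scaling) would need to accept the same two arguments to be a drop-in substitute.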
Usage Example
Pipeline Integration
The Signal class is used in multiple pipeline stages:
- Lexical Analysis (LexicalAnalysis.generate_signals()):
  - Creates Signal instances from tokenized bit ranges
  - Extracts time series data from ArbID boolean matrices
  - Calls normalize_and_set_metadata() for each signal
- Semantic Analysis (SemanticAnalysis.subset_selection()):
  - Filters signals based on Shannon Index and static flags
  - Selects subset for correlation analysis
- J1979 Labeling (SemanticAnalysis.j1979_signal_labeling()):
  - Correlates signals with J1979 diagnostic data
  - Sets j1979_title and j1979_pcc for matched signals
- Visualization (Plotter.plot_signals_by_arb_id()):
  - Uses plot_title and j1979_title for labeling
  - Displays normalized time series
Bit Indexing Convention
Signals use zero-based bit indexing where:
- Bit 0 = leftmost bit of byte 0 (b0)
- Bits 0-7 = byte 0 (b0)
- Bits 8-15 = byte 1 (b1)
- Bits 16-23 = byte 2 (b2)
- Bits 56-63 = byte 7 (b7)
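The convention above can be checked with a short sketch that maps a bit index to its byte and pulls an inclusive bit range out of a payload. The helper names `locate_bit` and `extract_bits` are hypothetical, and reading a multi-bit range as an unsigned big-endian integer is an assumption; this page does not state the byte order used when decoding.

```python
def locate_bit(bit_index: int):
    """Return (byte index, position within byte); position 0 is the leftmost bit."""
    return divmod(bit_index, 8)


def extract_bits(payload: bytes, start: int, stop: int) -> int:
    """Return the unsigned integer held in bits start..stop (both inclusive).

    Bit 0 is the leftmost bit of byte 0. Big-endian interpretation is an
    assumption of this sketch.
    """
    total_bits = len(payload) * 8
    if not 0 <= start <= stop < total_bits:
        raise ValueError("bit range outside payload")
    as_int = int.from_bytes(payload, "big")
    width = stop - start + 1
    shift = total_bits - 1 - stop  # bits to the right of the range
    return (as_int >> shift) & ((1 << width) - 1)


payload = bytes([0x12, 0x34, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF])
print(locate_bit(8))                       # (1, 0): first bit of byte 1
print(hex(extract_bits(payload, 8, 15)))   # 0x34: all of byte 1
print(hex(extract_bits(payload, 56, 63)))  # 0xff: all of byte 7
print(hex(extract_bits(payload, 4, 11)))   # 0x23: range spanning bytes 0 and 1
```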