Time-Series Analysis with R

Overview

The R scripts provide advanced time-series analysis capabilities using Empirical Dynamic Modeling (EDM) to identify causal relationships between CAN signals. This goes beyond simple correlation to detect directional causality.

EDM analysis is optional but powerful. Use it when you need to understand why signals correlate, not just that they correlate.

When to Use EDM vs Built-in Correlation

Method	Use When	Provides
Built-in Correlation	Initial exploration, clustering signals	Symmetric correlation strength
EDM / CCM	Understanding causal direction, validating hypotheses	Directional causality, nonlinearity detection

Installation and Setup

R Package Installation

The analysis requires the rEDM package developed by the Sugihara Lab at UC San Diego:

install.packages("rEDM")
library(rEDM)

Set Working Directory

Point R to your data location:

setwd("C:/path/to/your/CAN/data")

CSV Data Format Requirements

EDM requires time-series data in CSV format with named columns:

speed,brake,rpm
45.2,0.0,1850
45.5,0.0,1860
45.8,12.3,1840
44.1,34.7,1820
...

Ensure your CSV has:

Header row with signal names
No missing values (or handle them appropriately)
Consistent sampling rate (or interpolate)
Sufficient length (thousands of samples recommended)

EDM Analysis Workflow

The commands_list.txt provides a complete analysis pipeline. Here’s the step-by-step workflow:

Step 1: Load Data

speed_brake_rpm <- read.csv("speed_brake_rpm.csv")

# Extract individual time series
ts_speed <- speed_brake_rpm$speed
ts_rpm <- speed_brake_rpm$rpm
ts_brake <- speed_brake_rpm$brake

Step 2: Define Train-Test Split

Split data into library (training) and prediction (testing) sets:

# Use first 50% for training, second 50% for testing
lib_speed <- c(1, 0.5*length(ts_speed))
pred_speed <- c(0.5*length(ts_speed)+1, length(ts_speed))

lib_rpm <- c(1, 0.5*length(ts_rpm))
pred_rpm <- c(0.5*length(ts_rpm)+1, length(ts_rpm))

lib_brake <- c(1, 0.5*length(ts_brake))
pred_brake <- c(0.5*length(ts_brake)+1, length(ts_brake))

Step 3: Simplex Projection (Find Optimal Embedding Dimension)

Determine the best embedding dimension E for each signal:

simplex_output_speed <- simplex(ts_speed, lib_speed, pred_speed, silent = TRUE)
simplex_output_rpm <- simplex(ts_rpm, lib_rpm, pred_rpm, silent = TRUE)
simplex_output_brake <- simplex(ts_brake, lib_brake, pred_brake, silent = TRUE)

# Save results
save(simplex_output_speed, file = "simplex_output_speed.Rda")
save(simplex_output_rpm, file = "simplex_output_rpm.Rda")
save(simplex_output_brake, file = "simplex_output_brake.Rda")

Plot embedding results:

par(mfrow=c(1,3), cex=1.5)

plot(simplex_output_speed$E, simplex_output_speed$rho, type = "l", 
     xlab = "Embedding Dimension (E)", 
     ylab = expression(paste("Forecast Skill (", rho, ")")), 
     main="Vehicle Speed")

plot(simplex_output_rpm$E, simplex_output_rpm$rho, type = "l", 
     xlab = "Embedding Dimension (E)", main="Engine RPM")

plot(simplex_output_brake$E, simplex_output_brake$rho, type = "l", 
     xlab = "Embedding Dimension (E)", main="Brake Pressure")

The optimal E is where forecast skill (ρ) peaks. Typical values for vehicle CAN data are E=7-10.

Step 4: S-Map (Detect Nonlinearity)

S-maps distinguish between red noise and nonlinear deterministic behavior:

smap_output_speed <- s_map(ts_speed, lib_speed, pred_speed, E = 9, silent = TRUE)
smap_output_rpm <- s_map(ts_rpm, lib_rpm, pred_rpm, E = 9, silent = TRUE)
smap_output_brake <- s_map(ts_brake, lib_brake, pred_brake, E = 9, silent = TRUE)

Interpret S-map results:

If forecast skill increases with θ (theta): nonlinear dynamics present
If flat across θ values: linear or random dynamics

Step 5: Convergent Cross Mapping (CCM)

CCM reveals causal direction between signals:

# Test if Speed causes RPM (and vice versa)
speed_xmap_rpm <- ccm(speed_brake_rpm, 
                      lib_column = "speed", 
                      target_column = "rpm", 
                      E = 9, 
                      lib_sizes = seq(0, 15000, by = 3000), 
                      num_samples = 10, 
                      random_libs = TRUE, 
                      replace = TRUE, 
                      num_neighbors = "E+1", 
                      tp = 0, 
                      silent = TRUE)

rpm_xmap_speed <- ccm(speed_brake_rpm, 
                      lib_column = "rpm", 
                      target_column = "speed", 
                      E = 9, 
                      lib_sizes = seq(0, 15000, by = 3000), 
                      num_samples = 10, 
                      random_libs = TRUE, 
                      replace = TRUE, 
                      num_neighbors = "E+1", 
                      tp = 0, 
                      silent = TRUE)

# Save results
save(speed_xmap_rpm, file = "speed_xmap_rpm.Rda")
save(rpm_xmap_speed, file = "rpm_xmap_speed.Rda")

Calculate mean cross-map skill:

speed_xmap_rpm_mean <- ccm_means(speed_xmap_rpm)
rpm_xmap_speed_mean <- ccm_means(rpm_xmap_speed)

Interpreting CCM Results

CCM reveals causal direction through convergence:

plot(speed_xmap_rpm_mean$lib_size, pmax(0, speed_xmap_rpm_mean$rho), 
     type = "l", col = "red", 
     xlab = "Library Size", 
     ylab = expression(paste("Cross Map Skill (", rho, ")")), 
     ylim = c(0.86, 0.95))
lines(rpm_xmap_speed_mean$lib_size, pmax(0, rpm_xmap_speed_mean$rho), 
      col = "blue")
legend(x = "topleft", 
       legend = c("Speed xmap RPM", "RPM xmap Speed"), 
       col = c("red", "blue"), lwd = 1)

Reading CCM Plots

X xmap Y means: “Can we predict Y using X’s attractor?”

If X xmap Y converges (skill increases with library size): Y causes X
If Y xmap X converges: X causes Y
If both converge: Bidirectional coupling
If neither converges: No causal relationship

Example Interpretation

For Speed and RPM:

Speed xmap RPM: Moderate convergence (ρ ≈ 0.92)
RPM xmap Speed: Strong convergence (ρ ≈ 0.94)

Conclusion: Bidirectional coupling with RPM having slightly stronger causal influence on Speed.

Real Example Analyses from commands_list.txt

The repository includes example analyses for three signal pairs:

1. Speed & RPM

speed_xmap_rpm <- ccm(speed_brake_rpm, lib_column = "speed", 
                      target_column = "rpm", E = 9, 
                      lib_sizes = seq(0, 15000, by = 3000), 
                      num_samples = 10, random_libs = TRUE, 
                      replace = TRUE, silent = TRUE)

Expected Result: Strong bidirectional coupling (both converge above ρ=0.86)

2. Brake Pressure & RPM

brake_xmap_rpm <- ccm(speed_brake_rpm, lib_column = "brake", 
                      target_column = "rpm", E = 9, 
                      lib_sizes = seq(0, 15000, by = 3000), 
                      num_samples = 10, random_libs = TRUE, 
                      replace = TRUE, silent = TRUE)

Expected Result: Weaker coupling (ρ ≈ 0.35-0.49), asymmetric causality

3. Speed & Brake Pressure

speed_xmap_brake <- ccm(speed_brake_rpm, lib_column = "speed", 
                        target_column = "brake", E = 9, 
                        lib_sizes = seq(0, 15000, by = 3000), 
                        num_samples = 10, random_libs = TRUE, 
                        replace = TRUE, silent = TRUE)

Expected Result: Moderate coupling (ρ ≈ 0.6-0.9), driver behavior mediated

Understanding .Rda Output Files

.Rda files store R data frames for later reuse:

# Save current analysis
save(speed_xmap_rpm, file = "speed_xmap_rpm.Rda")

# Load previous analysis
load(file = "speed_xmap_rpm.Rda")

Benefits:

Avoid re-running expensive CCM calculations
Share results without sharing raw data
Reproducible analysis workflow

Cross-Correlation Analysis

For simpler correlation without causality:

R/cross-correlation_speed_rpm.R

for (ccm_from in vars) { 
  for (ccm_to in vars[vars != ccm_from]) { 
    cf_temp <- ccf(speed_brake_rpm[, ccm_from], 
                   speed_brake_rpm[, ccm_to], 
                   type = "correlation", 
                   lag.max = 500, 
                   plot = FALSE)$acf
    corr_matrix[ccm_from, ccm_to] <- max(abs(cf_temp)) 
  } 
}

This finds maximum absolute correlation across all time lags up to 500 samples.

Visualizing Causal Relationships

Create publication-ready plots:

pdf(file = "ccm_one_row.pdf", width = 5.5, height = 2.0, family = "Helvetica")
op <- par(mfrow = c(1,3), cex.axis=1, cex.lab=1, 
          oma = c(3,3,1,0) + 0.1, mar = c(0,1,1,1) + 0.1)

plot(speed_xmap_rpm_mean$lib_size, pmax(0, speed_xmap_rpm_mean$rho), 
     type = "l", col = "red", main="Speed & RPM", 
     xlim = c(5000, 15000), ylim = c(0.86, 0.95))
lines(rpm_xmap_speed_mean$lib_size, pmax(0, rpm_xmap_speed_mean$rho), 
      col = "blue")
legend(x = "bottomright", legend = c("S xmap R", "R xmap S"), 
       col = c("red", "blue"), lwd = 1)

title(xlab = "Library Size", ylab = "Cross Map Skill (rho)", outer = TRUE)
par(op)
dev.off()

Best Practices

Sufficient data length: Use at least 5000+ samples for reliable CCM
Proper train-test split: 50-50 or 70-30 split recommended
Multiple library sizes: Test convergence with varying library sizes
Save intermediate results: EDM computations are expensive
Validate with domain knowledge: CCM results should match physical expectations

For vehicle CAN data, expect:

Strong: Speed ↔ RPM, Speed ↔ Wheel speeds
Moderate: Brake ↔ Speed, Throttle ↔ RPM
Weak: Brake ↔ Steering, RPM ↔ Temperature

Next Steps

Validation

Quantify pipeline consistency with train-test validation

Multi-File Processing

Process multiple CAN samples in batch

Get Started

Core Concepts

Pipeline

Advanced

Overview

When to Use EDM vs Built-in Correlation

Installation and Setup

R Package Installation

Set Working Directory

CSV Data Format Requirements

EDM Analysis Workflow

Step 1: Load Data

Step 2: Define Train-Test Split

Step 3: Simplex Projection (Find Optimal Embedding Dimension)

Step 4: S-Map (Detect Nonlinearity)

Step 5: Convergent Cross Mapping (CCM)

Interpreting CCM Results

Reading CCM Plots

Example Interpretation

Real Example Analyses from commands_list.txt

1. Speed & RPM

2. Brake Pressure & RPM

3. Speed & Brake Pressure

Understanding .Rda Output Files

Cross-Correlation Analysis

Visualizing Causal Relationships

Best Practices

Further Reading

Next Steps

Validation

Multi-File Processing

Build docs developers (and LLMs) love

Get Started

Core Concepts

Pipeline

Advanced

​Overview

​When to Use EDM vs Built-in Correlation

​Installation and Setup

​R Package Installation

​Set Working Directory

​CSV Data Format Requirements

​EDM Analysis Workflow

​Step 1: Load Data

​Step 2: Define Train-Test Split

​Step 3: Simplex Projection (Find Optimal Embedding Dimension)

​Step 4: S-Map (Detect Nonlinearity)

​Step 5: Convergent Cross Mapping (CCM)

​Interpreting CCM Results

​Reading CCM Plots

​Example Interpretation

​Real Example Analyses from commands_list.txt

​1. Speed & RPM

​2. Brake Pressure & RPM

​3. Speed & Brake Pressure

​Understanding .Rda Output Files

​Cross-Correlation Analysis

​Visualizing Causal Relationships

​Best Practices

​Further Reading

​Next Steps

Validation

Multi-File Processing

Build docs developers (and LLMs) love

Overview

When to Use EDM vs Built-in Correlation

Installation and Setup

R Package Installation

Set Working Directory

CSV Data Format Requirements

EDM Analysis Workflow

Step 1: Load Data

Step 2: Define Train-Test Split

Step 3: Simplex Projection (Find Optimal Embedding Dimension)

Step 4: S-Map (Detect Nonlinearity)

Step 5: Convergent Cross Mapping (CCM)

Interpreting CCM Results

Reading CCM Plots

Example Interpretation

Real Example Analyses from commands_list.txt

1. Speed & RPM

2. Brake Pressure & RPM

3. Speed & Brake Pressure

Understanding .Rda Output Files

Cross-Correlation Analysis

Visualizing Causal Relationships

Best Practices

Further Reading

Next Steps