Overview
Data processing functions handle the entire pipeline from Excel file upload to creating aggregated edges and indices ready for visualization.
File Handling
handleFileSelect
async function handleFileSelect(event)
Handles Excel file selection from the file input, reads the workbook, and initiates data parsing.
Parameters
Change event from the file input element containing the selected file
Behavior
Validate File
Checks if a file was selected and if XLSX library is loaded
Update State
Stores the file object and filename in appState
Read Workbook
Reads the file as an ArrayBuffer and parses it with XLSX.read()
Sheet Selection
If multiple sheets exist, prompts user to select one
Parse Data
Calls parseExcelData() to extract headers and rows
Error Handling
- Shows error message if XLSX library is not loaded
- Shows error message if file reading fails
- Shows error message if sheet selection is cancelled
Example
// Attach to file input
const fileInput = document.getElementById('fileInput');
fileInput.addEventListener('change', handleFileSelect);
parseExcelData
function parseExcelData(workbook)
Parses the selected Excel worksheet and extracts headers and data rows.
Defined in app.js:133-178
Parameters
Parsed workbook object from XLSX library
Behavior
Get Worksheet
Retrieves the worksheet using appState.sheet name
Convert to JSON
Uses XLSX.utils.sheet_to_json() with empty cells as empty strings
Extract Headers
Gets column names from the keys of the first data object
Validate Data
Checks that headers and rows are not empty
Update State
Populates appState.headers, appState.rows, and sets isValid = true
Populate UI
Calls populateColumnSelects() to fill dropdown menus
Updates State
appState.headers - Array of column names
appState.rows - Array of row objects
appState.isValid - Set to true if data is valid
Example
// Called internally by handleFileSelect
const workbook = XLSX.read(arrayBuffer, { type: 'array' });
parseExcelData(workbook);
// After parsing, state contains:
console.log(appState.headers); // ["Origin", "Destination", "Weight", ...]
console.log(appState.rows.length); // 1500
Column Configuration
handleColumnChange
function handleColumnChange()
Handles changes to column selections (origin, destination, weight, coordinates) and triggers data processing.
Defined in app.js:220-293
Parameters
None - reads values from DOM select elements
Behavior
Read Selection
Gets selected values from all column dropdowns
Update State
Stores column names in appState.originCol, destCol, etc.
Validate Selection
Ensures origin and destination are not the same column
Process Data
Calls processAggregatedEdges() if both origin and dest are selected
Populate Filters
Fills filter dropdowns with unique values from selected columns
Render Visualizations
Calls render functions for Sankey, Network, and Map (if coordinates exist)
Validation
// Prevents selecting the same column for origin and destination
if (originCol === destCol && originCol !== '') {
showConfig('⚠ Origin and Destination cannot be the same column.', 'warning');
// Resets selection
}
Example Usage
// Attach to column select elements
const originCol = document.getElementById('originCol');
const destCol = document.getElementById('destCol');
originCol.addEventListener('change', handleColumnChange);
destCol.addEventListener('change', handleColumnChange);
Data Processing
processAggregatedEdges
function processAggregatedEdges()
Processes raw data rows into aggregated edges with summed weights and counts.
Defined in app.js:339-378
Parameters
None - uses appState.rows, originCol, destCol, weightCol
Algorithm
Aggregation
Node Collection
Edge Array
// Creates a Map with key "source|target"
const edgesMap = new Map();
rows.forEach(row => {
const source = String(row[originCol] ?? '').trim();
const target = String(row[destCol] ?? '').trim();
const key = `${source}|${target}`;
const weight = weightCol ? parseFloat(row[weightCol]) || 1 : 1;
if (edgesMap.has(key)) {
edge.value += weight; // Sum weights
edge.count += 1; // Count occurrences
} else {
edgesMap.set(key, { source, target, value: weight, count: 1 });
}
});
// Collects unique node names
appState.uniqueNodes.add(source);
appState.uniqueNodes.add(target);
// Converts Map to array
appState.aggregatedEdges = Array.from(edgesMap.values());
Returns
Array of aggregated edges with structure:{
source: string, // Origin node name
target: string, // Destination node name
value: number, // Sum of weights
count: number // Number of rows aggregated
}
Set of all unique node names encountered
Side Effects
- Calls
buildIndices() to create index structures
- Logs processing statistics to console
Example
// After processing, state contains aggregated edges
processAggregatedEdges();
console.log(appState.aggregatedEdges);
// [
// { source: "NYC", target: "LA", value: 150, count: 3 },
// { source: "NYC", target: "SF", value: 200, count: 5 },
// ...
// ]
console.log(appState.uniqueNodes.size); // 25
buildIndices
Builds outgoing and incoming edge indices for fast node lookups.
Defined in app.js:383-400
Parameters
None - uses appState.aggregatedEdges
Creates Indices
Maps each source node to its outgoing edges:appState.outIndex = new Map();
// "NYC" -> [
// { source: "NYC", target: "LA", value: 150 },
// { source: "NYC", target: "SF", value: 200 }
// ]
Maps each target node to its incoming edges:appState.inIndex = new Map();
// "LA" -> [
// { source: "NYC", target: "LA", value: 150 },
// { source: "SF", target: "LA", value: 100 }
// ]
Indices enable O(1) lookup of edges for a given node, critical for:
- Ego network generation
- Node detail popups
- Neighborhood highlighting
Example
// Build indices after processing edges
buildIndices();
// Fast lookup of outgoing edges
const outEdges = appState.outIndex.get("NYC") || [];
console.log(`NYC has ${outEdges.length} outgoing connections`);
// Fast lookup of incoming edges
const inEdges = appState.inIndex.get("LA") || [];
console.log(`LA receives from ${inEdges.length} sources`);
function extractNodeCoordinates()
Extracts geographic coordinates for nodes from configured lat/lng columns.
Defined in app.js:1619-1676
Parameters
None - uses appState.rows and coordinate column configuration
Behavior
Check Configuration
Verifies that at least origin or destination coordinate columns are set
Clear Previous Data
Clears appState.nodeCoordinates Map
Extract Origin Coords
If originLatCol and originLngCol are set, extracts coordinates for origin nodes
Extract Dest Coords
If destLatCol and destLngCol are set, extracts coordinates for destination nodes
Validate Values
Parses lat/lng as floats and checks for NaN
Store Coordinates
Stores valid coordinates in nodeCoordinates Map (first occurrence wins)
Updates State
appState.nodeCoordinates
Map<string, {lat: number, lng: number}>
Map of node names to their geographic coordinates
appState.nodesWithOriginCoords
Set of nodes that have coordinates as origins
appState.nodesWithDestCoords
Set of nodes that have coordinates as destinations
Example
// Configure coordinate columns
appState.originLatCol = "Origin_Lat";
appState.originLngCol = "Origin_Lng";
appState.destLatCol = "Dest_Lat";
appState.destLngCol = "Dest_Lng";
// Extract coordinates
extractNodeCoordinates();
// Check results
console.log(`Extracted coordinates for ${appState.nodeCoordinates.size} nodes`);
const nycCoords = appState.nodeCoordinates.get("NYC");
if (nycCoords) {
console.log(`NYC: ${nycCoords.lat}, ${nycCoords.lng}`);
}
Helper Functions
populateColumnSelects
function populateColumnSelects()
Populates all column selection dropdown menus with headers from the loaded data.
Defined in app.js:183-215
Updates Elements
originCol select
destCol select
weightCol select (includes ”— Automatic (count) —” option)
originLatCol, originLngCol, destLatCol, destLngCol selects (include ”— Optional —”)
- Ego network destination selects (4 panels)
populateEgoDestinations
function populateEgoDestinations()
Populates ego network destination selects with unique values from the destination column.
Defined in app.js:298-334
Parameters
None - uses appState.destCol and appState.rows
Behavior
- Extracts unique values from the selected destination column
- Sorts values alphabetically
- Populates all 4 ego network select elements
- Includes HTML escaping for safety
Data Flow Diagram