Overview
The pipeline includes two optimized configuration templates that represent common deployment scenarios. Templates are JSON files that can be loaded via the --config-template CLI argument.
Edge Template
Optimized for resource-constrained edge devices with limited memory and compute. Location: configs/pipeline.edge.template.json
Edge Template Characteristics
- Small chunks to minimize the memory footprint
- Conservative batch size to avoid memory exhaustion
- Single-threaded execution to reduce overhead on limited cores
- Strict 256MB memory limit
- Compute throttled to 40% to leave resources for other processes
- Fewer benchmark iterations to reduce processing time
- Disk spilling enabled: critical for handling datasets larger than available RAM
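As a rough illustration of the characteristics above, the edge settings could be expressed as the following dictionary and serialized to JSON. This is a sketch, not the contents of the shipped file: only chunk_size and n_jobs are key names that appear on this page, the remaining key names are hypothetical, and the values marked "illustrative" are placeholders rather than the template's real numbers.

```python
import json

# Hypothetical edge template. Key names other than chunk_size and n_jobs
# are assumptions made for this sketch.
edge_template = {
    "chunk_size": 64,              # illustrative value: small chunks
    "batch_size": 16,              # illustrative value: conservative batches
    "n_jobs": 1,                   # single-threaded execution
    "max_memory_mb": 256,          # strict 256MB memory limit
    "max_compute_fraction": 0.4,   # compute throttled to 40%
    "benchmark_runs": 3,           # illustrative value: fewer iterations
    "spill_to_disk": True,         # disk spilling enabled
}

# Serialize the dictionary the way a JSON template file would store it.
edge_json = json.dumps(edge_template, indent=2)
print(edge_json)
```

Check configs/pipeline.edge.template.json for the actual key names and values before relying on any of these.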
Use Cases
- Raspberry Pi or similar single-board computers
- IoT devices with limited resources
- Mobile or embedded systems
- Environments where memory is <512MB
Server Template
Optimized for high-performance server environments with ample resources. Location: configs/pipeline.server.template.json
Server Template Characteristics
- Large chunks to maximize throughput on powerful hardware
- Large batches to leverage vectorization for faster processing
- Multi-threaded execution for parallel processing
- Generous 4GB memory allocation for complex operations
- Full compute resources available (100%)
- More benchmark iterations for statistically robust results
- Disk spilling disabled: all data stays in memory for maximum performance
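The server characteristics above might translate into settings like the ones below. Treat this as a sketch under assumptions: chunk_size and n_jobs are the only key names taken from this page, the other key names are hypothetical, and values marked "illustrative" are placeholders, not the numbers in the shipped file.

```python
import json

# Hypothetical server template. Key names other than chunk_size and n_jobs
# are assumptions made for this sketch.
server_template = {
    "chunk_size": 4096,            # illustrative value: large chunks
    "batch_size": 512,             # illustrative value: large batches
    "n_jobs": 4,                   # illustrative value: multi-threaded
    "max_memory_mb": 4096,         # 4GB memory allocation
    "max_compute_fraction": 1.0,   # full compute resources (100%)
    "benchmark_runs": 10,          # illustrative value: more iterations
    "spill_to_disk": False,        # disk spilling disabled
}

# Serialize the dictionary the way a JSON template file would store it.
server_json = json.dumps(server_template, indent=2)
print(server_json)
```

The authoritative values live in configs/pipeline.server.template.json.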
Use Cases
- Cloud compute instances (AWS, GCP, Azure)
- On-premise data processing servers
- Development workstations
- Environments with >8GB RAM
Using Templates
Load Template via CLI
Override Template Values
CLI arguments take precedence over template values. For example, you can load a template and pass CLI arguments that override chunk_size to 128 and n_jobs to 2.
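The precedence rule can be pictured as a dictionary merge in which only the arguments actually passed on the command line replace template values. The function below is a minimal sketch of that idea, not the pipeline's own merge code; apply_overrides, max_memory_mb, and the template values are hypothetical.

```python
def apply_overrides(template, cli_args):
    """Return a new config in which CLI values that were set win."""
    merged = dict(template)  # copy so the template itself is not mutated
    for key, value in cli_args.items():
        if value is not None:  # None stands for "flag not passed"
            merged[key] = value
    return merged


# Illustrative template values; max_memory_mb is a hypothetical key name.
template = {"chunk_size": 64, "n_jobs": 1, "max_memory_mb": 256}
cli_args = {"chunk_size": 128, "n_jobs": 2, "max_memory_mb": None}

config = apply_overrides(template, cli_args)
print(config)  # {'chunk_size': 128, 'n_jobs': 2, 'max_memory_mb': 256}
```

Settings that are not overridden keep the value from the template.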
Load Template in Python
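Because templates are plain JSON files, one way to read them from Python is with the standard library alone. The sketch below assumes nothing about the project's own loading API; load_template is a hypothetical helper, and the demo writes a throwaway file so the snippet runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_template(path):
    """Read a JSON template file and return its settings as a dict."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError(f"{path} must contain a JSON object")
    return data


# Demo with a temporary file holding illustrative values. In the project
# you would point at configs/pipeline.edge.template.json instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.template.json"
    demo_path.write_text(
        json.dumps({"chunk_size": 64, "n_jobs": 1}), encoding="utf-8"
    )
    loaded = load_template(demo_path)

print(loaded["chunk_size"])
```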
Creating Custom Templates
You can create your own templates for specific environments. Save the template as a JSON file and load it with --config-template path/to/your/template.json.
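One way to produce a custom template is to start from existing settings, change a few of them, and write the result out as JSON. The following is a sketch under assumptions: write_template is a hypothetical helper, the file name is made up, and the base values are illustrative rather than copied from a shipped template.

```python
import json
import tempfile
from pathlib import Path


def write_template(base, changes, path):
    """Merge changes into base and save the result as a JSON template."""
    custom = {**base, **changes}  # entries in changes win over base
    Path(path).write_text(json.dumps(custom, indent=2) + "\n", encoding="utf-8")
    return custom


base = {"chunk_size": 64, "n_jobs": 1}  # illustrative starting point

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "pipeline.laptop.template.json"
    custom = write_template(base, {"n_jobs": 2}, out_path)
    round_trip = json.loads(out_path.read_text(encoding="utf-8"))

print(round_trip)
```

In practice you would write the file to a permanent location and pass that path to --config-template.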
Template Selection Guide
| Criteria | Edge Template | Server Template |
|---|---|---|
| Available RAM | <512MB | >4GB |
| CPU Cores | 1-2 | 4+ |
| Dataset Size | <100MB | Any size |
| Priority | Resource efficiency | Maximum performance |
| Disk Spilling | Enabled | Disabled |
| Processing Time | Slower, conservative | Faster, aggressive |
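The RAM and CPU rows of the table can be turned into a small helper that suggests a template path. This is one possible reading of the guide, not part of the pipeline: choose_template is hypothetical, and machines that fall between the two profiles get no suggestion, since the table does not cover them.

```python
EDGE = "configs/pipeline.edge.template.json"
SERVER = "configs/pipeline.server.template.json"


def choose_template(ram_mb, cpu_cores):
    """Suggest a template path from the RAM and core thresholds in the table."""
    if ram_mb > 4096 and cpu_cores >= 4:
        return SERVER  # more than 4GB of RAM and 4+ cores
    if ram_mb < 512 or cpu_cores <= 2:
        return EDGE  # under 512MB of RAM, or only 1-2 cores
    return None  # in between: a custom template may fit better


print(choose_template(256, 1))
print(choose_template(16384, 8))
print(choose_template(2048, 4))
```

Whether low RAM or a low core count should be enough on its own to select the edge template is a design choice; the sketch takes the conservative option.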
Next Steps
- Configuration Overview: learn about all configuration options
- CLI Reference: see all command-line arguments