Overview
tif1 is a high-performance Python library designed for accessing Formula 1 timing data from TracingInsights (2018-current). It provides a fastf1-compatible API with significant performance improvements through modern Python tooling and optimized data pipelines.
Design Principles
Performance First
Optimize for speed at every layer - network, parsing, and caching
API Compatibility
Maintain a fastf1-like interface for easy migration
Reliability
Robust error handling and automatic retry logic
Flexibility
Support multiple backends (pandas/polars) and async operations
System Architecture
Core Components
1. Public API Layer
Location: src/tif1/__init__.py, src/tif1/core.py
Purpose: User-facing interface for data access
Key Functions:
- get_events(year): List available events for a season
- get_sessions(year, event): List sessions for an event
- get_session(year, event, session): Load session data
- Simple, intuitive function names
- Automatic session type normalization
- Lazy loading for performance
- Lazy __getattr__ exports in __init__.py for fast imports
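The lazy exports in the last bullet rely on module-level __getattr__ (PEP 562). The following is a minimal sketch of that technique, not tif1's actual __init__.py: the module name lazy_pkg and the export table are hypothetical, and stdlib modules stand in for heavy submodules.

```python
import importlib
import types

# Hypothetical export table: public name -> module that defines it.
# Stdlib modules stand in for heavy submodules.
_LAZY_EXPORTS = {"sqrt": "math", "loads": "json"}


def make_lazy_module(name: str, exports: dict[str, str]) -> types.ModuleType:
    """Build a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        # Called only when normal attribute lookup fails (PEP 562).
        if attr not in exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(exports[attr]), attr)
        setattr(module, attr, value)  # cache so later lookups skip this hook
        return value

    module.__getattr__ = __getattr__
    return module


lazy_pkg = make_lazy_module("lazy_pkg", _LAZY_EXPORTS)
```

In a real package the same hook is usually written as a top-level def __getattr__(name) inside __init__.py, so importing the package does not import its submodules until a name is first used.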
2. Session Management
Location: src/tif1/core.py (monolithic, ~4900 lines)
Purpose: Core session data container and operations
Key Classes:
- Session: Main session object with laps, drivers, telemetry
- Driver: Driver-specific data and operations
- Lap: Individual lap with telemetry access
- Laps: Collection of laps with filtering methods
- Telemetry: Telemetry data with processing methods
- Lazy property loading (data fetched on demand)
- Async data fetching support (laps_async, get_fastest_laps_tels)
- Backend flexibility (pandas/polars)
- Optimized fastest lap queries
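Lazy property loading can be illustrated with functools.cached_property. This is a sketch under assumptions: LazySession, _fetch_laps, and fetch_count are hypothetical names for illustration and do not reflect how tif1's Session class is implemented.

```python
from functools import cached_property


class LazySession:
    """Hypothetical stand-in for a session object; not tif1's Session class."""

    def __init__(self, year: int, event: str, session: str) -> None:
        self.year, self.event, self.session = year, event, session
        self.fetch_count = 0  # counts simulated network round trips

    def _fetch_laps(self) -> list[dict]:
        # Placeholder for a CDN request plus parsing.
        self.fetch_count += 1
        return [{"Driver": "VER", "LapNumber": 1}, {"Driver": "HAM", "LapNumber": 1}]

    @cached_property
    def laps(self) -> list[dict]:
        # Runs on first access only; the result is then stored on the instance.
        return self._fetch_laps()


demo = LazySession(2024, "Monza", "Race")
```

Constructing the object costs nothing; the simulated fetch happens the first time demo.laps is read and is reused afterwards.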
3. Data Loading Pipeline
Location: src/tif1/http_session.py, src/tif1/async_fetch.py
Purpose: Fetch and parse data from CDN
Key Components:
- http_session.py: niquests-based HTTP session
- async_fetch.py: Parallel async fetching
- HTTP/2 support (20-30% faster)
- Connection pooling and reuse
- Automatic retry with exponential backoff
- Parallel requests for multiple data files
- Circuit breaker pattern for resilience
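The circuit breaker pattern named in the last bullet can be sketched as follows. The class, its thresholds, and the injectable clock are assumptions made for illustration; the source does not describe tif1's actual breaker.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors, then rejects calls
    until `reset_after` seconds have passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the behaviour can be tested without waiting
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: request skipped")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```

While the circuit is open, callers fail fast instead of waiting on a CDN that is already known to be unhealthy.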
4. Cache System
Location: src/tif1/cache.py
Purpose: Local caching with SQLite
Features:
- Automatic cache key generation
- Fast SQL-based lookups
- JSON storage format (orjson for speed)
- Separate telemetry cache table
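A minimal sketch of a SQLite-backed JSON cache with generated keys. The key scheme, table name, and class are hypothetical, and stdlib json is used so the example has no dependencies (the source names orjson for the real implementation).

```python
import hashlib
import json
import sqlite3


def cache_key(year: int, event: str, session: str, resource: str) -> str:
    """Derive a stable key from the request parameters (hypothetical scheme)."""
    raw = f"{year}/{event}/{session}/{resource}".lower()
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class JsonCache:
    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )

    def get(self, key: str):
        row = self.conn.execute("SELECT payload FROM cache WHERE key = ?", (key,)).fetchone()
        return None if row is None else json.loads(row[0])

    def set(self, key: str, value) -> None:
        # stdlib json keeps the sketch dependency-free; swap in orjson for speed.
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, payload) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()
```

A miss returns None, so the caller can fall through to the network and then store the parsed result under the same key.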
5. Validation Layer
Location: src/tif1/validation.py
Purpose: Data integrity and type safety with Pydantic
Schemas:
- Event schedule validation (schedule_schema.py)
- Runtime data validation (optional, configurable)
- Type checking for DataFrames
6. Backend Abstraction
Location: src/tif1/core_utils/backend_conversion.py
Purpose: Support multiple DataFrame libraries
Implementations:
- Pandas: Default backend (pandas >=2.3)
- Polars: High-performance alternative (polars >=1.36, optional)
- Lazy polars loading (_ensure_polars_available)
- Automatic backend conversion
- Type optimization (categorical, nullable types)
- 2x faster for large datasets with polars
7. Exception Hierarchy
Location: src/tif1/exceptions.py
Purpose: Clear, actionable error messages
Data Flow
Cold Cache (First Request)
Warm Cache (Subsequent Requests)
Async Flow (Parallel Loading)
Performance Optimizations
Network Layer
HTTP/2 with niquests
- Multiplexing: Multiple requests over single connection
- Header compression: HPACK algorithm
- 20-30% faster than HTTP/1.1
Connection Pooling
- Reuse TCP connections
- Avoid handshake overhead
- Managed by niquests session
CDN Strategy
- jsDelivr global edge network
- Automatic CDN fallback (cdn.py)
- Never use raw.githubusercontent.com (rate limits)
Retry Logic
- Exponential backoff: 1s, 2s, 4s
- Circuit breaker pattern
- Max 3 retries (configurable)
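The 1s, 2s, 4s schedule above corresponds to a delay of base_delay * 2**attempt. A minimal sketch, assuming a hypothetical retry_with_backoff helper with an injectable sleep function; production code would catch only transient network errors rather than every exception.

```python
import time


def retry_with_backoff(func, max_retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call `func`; on failure wait base_delay * 2**attempt and try again."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:  # a real client would narrow this to network errors
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2**attempt)  # 1s, 2s, 4s with the defaults
```

With the defaults, a call that keeps failing is attempted four times in total (one initial try plus three retries) before the error propagates.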
Data Processing
Categorical Types
- 50% memory reduction for repeated strings
- Applied to Driver, Team, Compound columns
- Faster filtering and grouping
Nullable Integer Types
- Use Int64 instead of float64 for integers with NaN
- Proper null handling without object dtype
- Better type safety
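Both dtype optimizations (categorical strings and nullable integers) can be tried on synthetic data with plain pandas. The frame below is made up for illustration; the Position column is a hypothetical example of an integer column with missing values, and actual savings depend on the data.

```python
import pandas as pd

# Synthetic lap data; Driver, Team and Compound are the columns named above.
laps = pd.DataFrame(
    {
        "Driver": ["VER", "HAM", "LEC"] * 1000,
        "Team": ["Red Bull", "Mercedes", "Ferrari"] * 1000,
        "Compound": ["SOFT", "MEDIUM", "HARD"] * 1000,
        "Position": [1, None, 3] * 1000,  # missing values force float64
    }
)

optimized = laps.astype(
    {"Driver": "category", "Team": "category", "Compound": "category", "Position": "Int64"}
)

before = laps.memory_usage(deep=True).sum()
after = optimized.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
print(optimized.dtypes)
```

Categorical columns store each distinct string once plus a small integer code per row, and Int64 keeps whole numbers as integers while representing gaps as pd.NA.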
Lazy Loading
- Only load data when accessed
- No upfront session.load() required
- Direct lap telemetry access
Polars Backend
- 2x faster for large datasets
- Apache Arrow memory format
- Lazy evaluation and query optimization
Caching Strategy
SQLite Storage
- Fast SQL-based lookups
- ACID guarantees
- Built-in Python support (no dependencies)
JSON Encoding
- orjson for fast serialization (not stdlib json)
- Compact storage format
- Easy inspection and debugging
Separate Telemetry Table
- Optimized schema for telemetry queries
- Composite primary key: (year, gp, session, driver, lap)
- Fast lap-specific lookups
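A sketch of a telemetry table keyed by (year, gp, session, driver, lap). Only the key columns come from the description above; the payload column, the helper names, and the use of stdlib json are assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE telemetry (
        year    INTEGER NOT NULL,
        gp      TEXT    NOT NULL,
        session TEXT    NOT NULL,
        driver  TEXT    NOT NULL,
        lap     INTEGER NOT NULL,
        payload TEXT    NOT NULL,
        PRIMARY KEY (year, gp, session, driver, lap)
    )
    """
)


def put_lap(year, gp, session, driver, lap, samples) -> None:
    # INSERT OR REPLACE overwrites an existing row with the same composite key.
    conn.execute(
        "INSERT OR REPLACE INTO telemetry VALUES (?, ?, ?, ?, ?, ?)",
        (year, gp, session, driver, lap, json.dumps(samples)),
    )


def get_lap(year, gp, session, driver, lap):
    # An equality match on every key column is resolved by the primary-key index.
    row = conn.execute(
        "SELECT payload FROM telemetry "
        "WHERE year = ? AND gp = ? AND session = ? AND driver = ? AND lap = ?",
        (year, gp, session, driver, lap),
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

Because the key is composite, one lap of one driver is fetched with a single indexed lookup instead of scanning or deserializing a whole session.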
Async Operations
Parallel Fetching
- asyncio.gather() for concurrent requests
- 4-5x faster than sequential
- Especially beneficial with cold cache
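Concurrent fetching with asyncio.gather() can be sketched with simulated downloads. The fetch coroutine, the resource names, and the in-flight counters are hypothetical; asyncio.sleep stands in for network latency, so no timing figures should be read from this example.

```python
import asyncio

in_flight = 0       # downloads currently running
max_in_flight = 0   # highest concurrency observed


async def fetch(resource: str, delay: float = 0.05) -> str:
    """Simulated download; asyncio.sleep stands in for network latency."""
    global in_flight, max_in_flight
    in_flight += 1
    max_in_flight = max(max_in_flight, in_flight)
    await asyncio.sleep(delay)
    in_flight -= 1
    return f"{resource}: done"


async def fetch_all(resources: list[str]) -> list[str]:
    # gather schedules every coroutine at once and returns results in input order.
    return await asyncio.gather(*(fetch(r) for r in resources))


results = asyncio.run(fetch_all(["laps", "drivers", "telemetry"]))
```

All three simulated requests overlap, so the total wait is close to the slowest single request rather than the sum of all of them.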
Non-blocking I/O
- No thread overhead
- Efficient CPU utilization
- Scales to many concurrent requests
Batch Telemetry Loading
- get_fastest_laps_tels() loads multiple drivers in parallel
- ~0.13s for 3 drivers vs ~0.4s sequential
- Critical for data analysis workflows
Configuration
Environment Variables
Configuration File (~/.tif1rc)
Runtime Configuration
Testing Strategy
Test Structure
Coverage Goals
- Line coverage: >80% (enforced)
- Branch coverage: >85%
- Critical paths: 100%
Running Tests
Deployment
Package Structure
Dependencies
Core:
- niquests (HTTP/2)
- pandas>=2.3
- pydantic (validation)
- orjson (fast JSON)
Optional:
- polars>=1.36 (performance)
Development:
- pytest + pytest-xdist (testing)
- ruff (linting + formatting)
- ty (type checking)
- prek (git hooks)
Security
No credentials required - all data is fetched from public CDN. No authentication or API keys needed.
Monitoring
Logging Levels
- DEBUG: Cache hits/misses, HTTP requests, detailed operations
- INFO: High-level operations, data loading progress
- WARNING: Retry attempts, fallback operations, non-critical issues
- ERROR: Failed operations, invalid data, critical errors
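To see DEBUG-level output such as cache hits and HTTP requests, the standard logging module can be configured as below. This is a sketch that assumes the library logs under a logger named "tif1"; the source does not state the logger name.

```python
import logging

logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")

# Assumption: the library logs under a logger named "tif1".
logger = logging.getLogger("tif1")
logger.setLevel(logging.DEBUG)  # show cache hits/misses and HTTP requests
```

Setting the level back to logging.WARNING limits output to retries, fallbacks, and errors.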
Key Metrics
Cache Hit Rate
Measure effectiveness of caching strategy
Request Latency
Track p50, p95, p99 for CDN requests
Memory Usage
Monitor DataFrame memory consumption
Error Rates
Track errors by type and frequency
Future Enhancements
Short Term
- Weather data integration
- Track status information
- Radio messages
- Pit stop data
Medium Term
- Real-time data support (live timing)
- Advanced analytics (tire degradation, pace analysis)
- Visualization helpers
- Export formats (CSV, Parquet)
Long Term
- Machine learning features
- Predictive analytics
- Multi-season analysis tools
- Custom data sources
Contributing
See the Contributing Guide for detailed information on:
- Development setup
- Code style guidelines
- Testing requirements
- Pull request process