Overview

tif1 is a high-performance Python library for accessing Formula 1 timing data from TracingInsights (2018 to the current season). It provides a fastf1-compatible API and gets its speed from HTTP/2 fetching, parallel async requests, orjson parsing, and a local SQLite cache.

Design Principles

Performance First

Optimize for speed at every layer - network, parsing, and caching

API Compatibility

Maintain fastf1-like interface for easy migration

Reliability

Robust error handling and automatic retry logic

Flexibility

Support multiple backends (pandas/polars) and async operations

System Architecture

Core Components

1. Public API Layer

Location: src/tif1/__init__.py, src/tif1/core.py
Purpose: User-facing interface for data access
Key Functions:
  • get_events(year): List available events for a season
  • get_sessions(year, event): List sessions for an event
  • get_session(year, event, session): Load session data
Design Decisions:
  • Simple, intuitive function names
  • Automatic session type normalization
  • Lazy loading for performance
  • Lazy __getattr__ exports in __init__.py for fast imports
import tif1

# Events and sessions
events = tif1.get_events(2025)
sessions = tif1.get_sessions(2025, "Abu Dhabi Grand Prix")

# Get session (lazy loading)
session = tif1.get_session(2025, "Abu Dhabi Grand Prix", "Race")
laps = session.laps  # Data loads on first access
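
The lazy `__getattr__` exports listed under Design Decisions rely on module-level `__getattr__` (PEP 562). The following is a minimal sketch of that idea, not a copy of tif1's `__init__.py`: it builds a throwaway module object, and the names `make_lazy_module` and `_LAZY_EXPORTS` are hypothetical.

```python
import importlib
import types

# Hypothetical export table: public name -> module that really defines it.
_LAZY_EXPORTS = {"dumps": "json", "sqrt": "math"}


def make_lazy_module(name, exports):
    """Build a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr not in exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(exports[attr]), attr)
        setattr(module, attr, value)  # cache: later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", _LAZY_EXPORTS)
```

Nothing is imported until an attribute such as `lazy.sqrt` is first touched, which is what keeps the initial import cheap.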

2. Session Management

Location: src/tif1/core.py (monolithic, ~4900 lines)
Purpose: Core session data container and operations
Key Classes:
  • Session: Main session object with laps, drivers, telemetry
  • Driver: Driver-specific data and operations
  • Lap: Individual lap with telemetry access
  • Laps: Collection of laps with filtering methods
  • Telemetry: Telemetry data with processing methods
Features:
  • Lazy property loading (data fetched on demand)
  • Async data fetching support (laps_async, get_fastest_laps_tels)
  • Backend flexibility (pandas/polars)
  • Optimized fastest lap queries
session = tif1.get_session(2025, "Abu Dhabi Grand Prix", "Race")

# Access data (lazy loaded)
driver = session.get_driver("VER")
lap = driver.get_lap(19)
telemetry = lap.telemetry

# Async loading (4-5x faster than sequential loading).
# await needs an async context: an async function or a notebook cell.
laps = await session.laps_async()
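
Lazy property loading can be illustrated with the standard library's `functools.cached_property`. This is a sketch under assumptions: `LazySession` and its `loader` argument are hypothetical stand-ins, and the real `Session` may implement laziness differently.

```python
from functools import cached_property


class LazySession:
    """Toy stand-in for a session whose laps load on first access."""

    def __init__(self, loader):
        self._loader = loader  # callable that performs the expensive load
        self.load_count = 0

    @cached_property
    def laps(self):
        # Runs once; the result is then stored on the instance.
        self.load_count += 1
        return self._loader()


demo_session = LazySession(lambda: [{"driver": "VER", "lap": 19}])
```

Creating `demo_session` does no work; the loader runs on the first `demo_session.laps` access and never again.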

3. Data Loading Pipeline

Location: src/tif1/http_session.py, src/tif1/async_fetch.py
Purpose: Fetch and parse data from CDN
Key Components:
  • http_session.py: niquests-based HTTP session
  • async_fetch.py: Parallel async fetching
  • HTTP/2 support (20-30% faster than HTTP/1.1)
Optimization Strategies:
  • Connection pooling and reuse
  • Automatic retry with exponential backoff
  • Parallel requests for multiple data files
  • Circuit breaker pattern for resilience
# Parallel telemetry fetching
tels = session.get_fastest_laps_tels(
    by_driver=True, 
    drivers=["VER", "HAM", "LEC"]
)  # ~0.13s for 3 drivers (parallel)
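
Automatic retry with exponential backoff can be sketched as below. The function name, its parameters, and the choice of retryable exceptions are assumptions made for illustration; they are not tif1's `retry.py`. The `sleep` argument is injectable so the delays can be observed without waiting.

```python
import time


def retry_with_backoff(func, max_retries=3, base_delay=1.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call func(), retrying failed attempts with doubling delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
```

With the defaults, a call that keeps failing is attempted four times in total (one initial attempt plus three retries).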

4. Cache System

Location: src/tif1/cache.py
Purpose: Local caching with SQLite
Features:
  • Automatic cache key generation
  • Fast SQL-based lookups
  • JSON storage format (orjson for speed)
  • Separate telemetry cache table
Storage Schema:
CREATE TABLE cache (
    key TEXT PRIMARY KEY,
    data TEXT  -- JSON data
);

CREATE TABLE telemetry_cache (
    year INTEGER,
    gp TEXT,
    session TEXT,
    driver TEXT,
    lap INTEGER,
    data TEXT,  -- JSON data
    PRIMARY KEY (year, gp, session, driver, lap)
);
Usage:
cache = tif1.get_cache()
print(f"Cache location: {cache.cache_dir}")

# Clear cache
cache.clear()

# Disable caching for specific session
session = tif1.get_session(..., enable_cache=False)
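
To show how the composite primary key of `telemetry_cache` supports lap-specific lookups, here is a small sketch using the standard library. It assumes the schema shown above and swaps in stdlib `json` for orjson to stay dependency-free; the helper names `open_cache`, `put_telemetry`, and `get_telemetry` are hypothetical.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS telemetry_cache (
    year INTEGER, gp TEXT, session TEXT, driver TEXT, lap INTEGER,
    data TEXT,
    PRIMARY KEY (year, gp, session, driver, lap)
)
"""


def open_cache(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def put_telemetry(conn, key, payload):
    """key is (year, gp, session, driver, lap); payload is JSON-serializable."""
    conn.execute(
        "INSERT OR REPLACE INTO telemetry_cache VALUES (?, ?, ?, ?, ?, ?)",
        (*key, json.dumps(payload)),
    )
    conn.commit()


def get_telemetry(conn, key):
    """Return the decoded payload for key, or None on a cache miss."""
    row = conn.execute(
        "SELECT data FROM telemetry_cache "
        "WHERE year = ? AND gp = ? AND session = ? AND driver = ? AND lap = ?",
        key,
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

Because the lookup matches every column of the primary key, SQLite can resolve it through the key's index rather than scanning the table.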

5. Validation Layer

Location: src/tif1/validation.py
Purpose: Data integrity and type safety with Pydantic
Schemas:
  • Event schedule validation (schedule_schema.py)
  • Runtime data validation (optional, configurable)
  • Type checking for DataFrames
Configuration:
# Enable validation
config = tif1.get_config()
config.set("validate_data", True)
config.save()

6. Backend Abstraction

Location: src/tif1/core_utils/backend_conversion.py
Purpose: Support multiple DataFrame libraries
Implementations:
  • Pandas: Default backend (pandas >=2.3)
  • Polars: High-performance alternative (polars >=1.36, optional)
Features:
  • Lazy polars loading (_ensure_polars_available)
  • Automatic backend conversion
  • Type optimization (categorical, nullable types)
  • 2x faster for large datasets with polars
# Use Polars backend
session = tif1.get_session(
    2025, 
    "Abu Dhabi Grand Prix", 
    "Race",
    lib="polars"
)

# Convert between backends (laps_polars is the polars-backed session.laps)
laps_polars = session.laps
laps_pandas = laps_polars.to_pandas()
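
Lazy loading of an optional backend can be sketched with `importlib`. This illustrates the general pattern only; `ensure_backend_available` is a hypothetical helper and is not the library's `_ensure_polars_available`.

```python
import importlib
import importlib.util


def ensure_backend_available(module_name):
    """Import an optional dependency on demand, with a clear error if absent."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"The {module_name!r} backend is not installed; "
            f"install it to use lib={module_name!r}."
        )
    return importlib.import_module(module_name)
```

Deferring the import this way means users of the default backend never pay the import cost of the optional one.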

7. Exception Hierarchy

Location: src/tif1/exceptions.py
Purpose: Clear, actionable error messages
TIF1Error (base)
├── DataNotFoundError
│   ├── DriverNotFoundError
│   └── LapNotFoundError
├── NetworkError
├── InvalidDataError
├── CacheError
└── SessionNotLoadedError
Usage:
try:
    session = tif1.get_session(2025, "Invalid GP", "Race")
    laps = session.laps
except tif1.DataNotFoundError:
    print("Data not available")
except tif1.NetworkError:
    print("Network error")
except tif1.TIF1Error:
    print("General error")
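
The order of the `except` clauses matters because subclasses must be caught before their parents. The sketch below re-declares the tree as bare classes purely for illustration (the real classes may carry extra attributes), and `classify` is a hypothetical helper.

```python
class TIF1Error(Exception):
    """Base class for all library errors."""


class DataNotFoundError(TIF1Error): ...
class DriverNotFoundError(DataNotFoundError): ...
class LapNotFoundError(DataNotFoundError): ...
class NetworkError(TIF1Error): ...
class InvalidDataError(TIF1Error): ...
class CacheError(TIF1Error): ...
class SessionNotLoadedError(TIF1Error): ...


def classify(exc):
    """Return the label of the first matching handler, most specific first."""
    try:
        raise exc
    except DriverNotFoundError:
        return "driver"
    except DataNotFoundError:
        return "data"
    except TIF1Error:
        return "general"
```

If the `TIF1Error` clause came first it would swallow every library error and the more specific handlers would never run.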

Data Flow

Cold Cache (First Request)

1. User calls get_session()
   Session object created with lazy loading
2. User accesses .laps property
   Triggers data loading pipeline
3. Cache check (miss)
   No cached data found
4. HTTP fetch from jsDelivr CDN
   Parallel async requests for lap data
5. JSON parsing (orjson)
   Fast JSON deserialization
6. DataFrame construction
   Create pandas/polars DataFrame
7. Type optimization
   Apply categorical and nullable types
8. Cache store
   Save to SQLite for future requests
9. Return data
   Data available to user (10-100x faster from cache next time)

Warm Cache (Subsequent Requests)

1. User calls get_session()
   Session object created
2. User accesses .laps property
   Triggers lazy loading
3. Cache check (hit)
   Cached data found in SQLite
4. Deserialize DataFrame
   Fast reconstruction from cache
5. Return data
   10-100x faster than cold cache
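
The cold and warm paths above together form a get-or-fetch pattern. Here is a minimal sketch against the `cache` table from the storage schema, assuming stdlib `json` in place of orjson; `get_or_fetch` and its `fetch` callback (a stand-in for the CDN request) are hypothetical.

```python
import json
import sqlite3


def get_or_fetch(conn, key, fetch):
    """Return (data, was_hit); call fetch() and store the result on a miss."""
    row = conn.execute("SELECT data FROM cache WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return json.loads(row[0]), True  # warm path: cache hit
    data = fetch()  # cold path: stand-in for the CDN request
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, data) VALUES (?, ?)",
        (key, json.dumps(data)),
    )
    conn.commit()
    return data, False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, data TEXT)")
```

The second request for the same key skips `fetch` entirely, which is where the warm-cache speedup comes from.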

Async Flow (Parallel Loading)

# Parallel telemetry fetching
tels = session.get_fastest_laps_tels(
    by_driver=True, 
    drivers=["VER", "HAM", "LEC"]
)

1. Identify fastest laps
   Find lap numbers for each driver
2. Create parallel tasks
   One task per driver telemetry fetch
3. Execute asyncio.gather()
   Concurrent HTTP/2 requests
4. Process in parallel
   DataFrame construction for each lap
5. Return combined data
   4-5x faster than sequential
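
The task-per-driver step can be sketched with `asyncio.gather()`. The fetch here is simulated with `asyncio.sleep`, and `fetch_telemetry` and `fetch_all` are hypothetical names; the sketch also records the peak number of in-flight fetches to show that they overlap.

```python
import asyncio


async def fetch_telemetry(driver, state):
    """Simulated fetch; a real one would await an HTTP request."""
    state["in_flight"] += 1
    state["peak"] = max(state["peak"], state["in_flight"])
    await asyncio.sleep(0.01)  # stand-in for network latency
    state["in_flight"] -= 1
    return driver, f"telemetry for {driver}"


async def fetch_all(drivers):
    state = {"in_flight": 0, "peak": 0}
    # gather() runs the coroutines concurrently and preserves input order.
    results = await asyncio.gather(*(fetch_telemetry(d, state) for d in drivers))
    return dict(results), state["peak"]


tels, peak = asyncio.run(fetch_all(["VER", "HAM", "LEC"]))
```

All three fetches are in flight at once, so the total wait is roughly one request's latency rather than the sum of three.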

Performance Optimizations

Network Layer

HTTP/2:
  • Multiplexing: Multiple requests over single connection
  • Header compression: HPACK algorithm
  • 20-30% faster than HTTP/1.1
Connection pooling:
  • Reuse TCP connections
  • Avoid handshake overhead
  • Managed by niquests session
CDN:
  • jsDelivr global edge network
  • Automatic CDN fallback (cdn.py)
  • Never use raw.githubusercontent.com (rate limits)
Retry logic:
  • Exponential backoff: 1s, 2s, 4s
  • Circuit breaker pattern
  • Max 3 retries (configurable)
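
The circuit breaker pattern stops sending requests to an endpoint that keeps failing and tries again after a cool-down. The class below is a generic sketch of the pattern, not tif1's implementation; the threshold, the timeout, and all names are assumptions.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open)
            raise
        self._failures = 0
        self._opened_at = None  # success closes the circuit
        return result
```

While the circuit is open, callers fail fast with `CircuitOpenError` instead of waiting on timeouts, which pairs naturally with a fallback to another CDN.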

Data Processing

Categorical types:
  • 50% memory reduction for repeated strings
  • Applied to Driver, Team, Compound columns
  • Faster filtering and grouping
Nullable types:
  • Use Int64 instead of float64 for integer columns with missing values
  • Proper null handling without object dtype
  • Better type safety
Lazy loading:
  • Only load data when accessed
  • No upfront session.load() required
  • Direct lap telemetry access
Polars backend:
  • 2x faster for large datasets
  • Apache Arrow memory format
  • Lazy evaluation and query optimization
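
The memory saving for repeated strings comes from dictionary encoding: each distinct value is stored once and every row holds a small integer code. The sketch below shows the idea in plain Python; it is an illustration, not pandas internals, and it keeps categories in first-seen order.

```python
def dictionary_encode(values):
    """Split values into unique categories plus one integer code per value."""
    categories = []
    index = {}
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(categories)
            categories.append(value)
        codes.append(index[value])
    return categories, codes


def decode(categories, codes):
    """Rebuild the original values from categories and codes."""
    return [categories[code] for code in codes]


compounds = ["SOFT", "SOFT", "MEDIUM", "SOFT", "HARD", "MEDIUM"]
categories, codes = dictionary_encode(compounds)
```

Columns such as Driver, Team, and Compound have few distinct values across many rows, which is the case where this encoding pays off most.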

Caching Strategy

SQLite storage:
  • Fast SQL-based lookups
  • ACID guarantees
  • Built-in Python support (no dependencies)
JSON serialization:
  • orjson for fast serialization (not stdlib json)
  • Compact storage format
  • Easy inspection and debugging
Telemetry cache table:
  • Optimized schema for telemetry queries
  • Composite primary key: (year, gp, session, driver, lap)
  • Fast lap-specific lookups

Async Operations

Parallel fetching:
  • asyncio.gather() for concurrent requests
  • 4-5x faster than sequential
  • Especially beneficial with cold cache
Async I/O:
  • No thread overhead
  • Efficient CPU utilization
  • Scales to many concurrent requests
Parallel telemetry:
  • get_fastest_laps_tels() loads multiple drivers in parallel
  • ~0.13s for 3 drivers vs ~0.4s sequential
  • Critical for data analysis workflows

Configuration

Environment Variables

TIF1_CACHE_DIR=~/.tif1/cache    # Cache location
TIF1_LOG_LEVEL=INFO              # Logging level
TIF1_TIMEOUT=30                  # Request timeout (seconds)
TIF1_MAX_RETRIES=3               # Retry attempts
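
Reading these variables into typed settings might look like the sketch below. It is not tif1's `config.py`: the function name and returned keys are hypothetical, and treating the values listed above as defaults is an assumption.

```python
import os
from pathlib import Path

# Assumption: the values listed above double as defaults.
DEFAULTS = {
    "TIF1_CACHE_DIR": "~/.tif1/cache",
    "TIF1_LOG_LEVEL": "INFO",
    "TIF1_TIMEOUT": "30",
    "TIF1_MAX_RETRIES": "3",
}


def load_env_settings(environ=os.environ):
    """Parse TIF1_* variables from a mapping into typed settings."""
    raw = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "cache_dir": Path(raw["TIF1_CACHE_DIR"]).expanduser(),
        "log_level": raw["TIF1_LOG_LEVEL"].upper(),
        "timeout": float(raw["TIF1_TIMEOUT"]),
        "max_retries": int(raw["TIF1_MAX_RETRIES"]),
    }
```

Passing the environment as a mapping keeps the parser easy to test without mutating `os.environ`.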

Configuration File (~/.tif1rc)

{
  "max_retries": 5,
  "validate_data": true,
  "backend": "polars",
  "cache_enabled": true,
  "log_level": "INFO"
}

Runtime Configuration

import tif1
import logging

# Logging
tif1.setup_logging(logging.DEBUG)

# Cache management
cache = tif1.get_cache()
cache.clear()

# Backend selection
session = tif1.get_session(..., lib="polars")

# Disable caching
session = tif1.get_session(..., enable_cache=False)

Testing Strategy

Test Structure

tests/
├── unit/              # Component tests (mocked)
├── integration/       # End-to-end tests (real data)
├── property/          # Property-based tests
└── benchmarks/        # Performance tests

Coverage Goals

  • Line coverage: >80% (enforced)
  • Branch coverage: >85%
  • Critical paths: 100%

Running Tests

# All tests (parallel with xdist)
uv run pytest tests/ -v

# Unit tests only
uv run pytest tests/unit/ -v

# Integration tests (serial)
uv run pytest -o addopts='' tests/integration/ -v -n 0

# Benchmarks (serial for stable timing)
uv run pytest -o addopts='' tests/benchmarks/ -v -m benchmark --benchmark-only --no-cov -n 0

Deployment

Package Structure

tif1/
├── src/tif1/
│   ├── __init__.py              # Lazy exports
│   ├── core.py                  # Monolith (~4900 lines)
│   ├── cache.py                 # SQLite cache
│   ├── http_session.py          # HTTP client
│   ├── async_fetch.py           # Async loading
│   ├── exceptions.py            # Error types
│   ├── validation.py            # Pydantic schemas
│   ├── events.py                # Event schedule
│   ├── config.py                # Configuration
│   ├── cdn.py                   # CDN fallback
│   ├── retry.py                 # Retry logic
│   └── core_utils/              # Shared helpers
├── tests/                       # Test suite
├── examples/                    # Usage examples
└── pyproject.toml              # Project metadata

Dependencies

Core:
  • niquests (HTTP/2)
  • pandas >=2.3
  • pydantic (validation)
  • orjson (fast JSON)
Optional:
  • polars >=1.36 (performance)
Dev:
  • pytest + pytest-xdist (testing)
  • ruff (linting + formatting)
  • ty (type checking)
  • prek (git hooks)

Security

All user inputs are validated before processing. Cache paths are sanitized to prevent path traversal attacks.
No credentials, authentication, or API keys are required: all data is fetched from a public CDN.

Monitoring

Logging Levels

  • DEBUG: Cache hits/misses, HTTP requests, detailed operations
  • INFO: High-level operations, data loading progress
  • WARNING: Retry attempts, fallback operations, non-critical issues
  • ERROR: Failed operations, invalid data, critical errors

Key Metrics

Cache Hit Rate

Measure effectiveness of caching strategy

Request Latency

Track p50, p95, p99 for CDN requests

Memory Usage

Monitor DataFrame memory consumption

Error Rates

Track errors by type and frequency
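
Two of these metrics are simple to compute from collected samples. The sketch below uses the standard library's `statistics.quantiles`; the function names are hypothetical, and how the samples are gathered is left open.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95, and p99 of request latencies in milliseconds."""
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def cache_hit_rate(hits, misses):
    """Fraction of lookups served from cache; 0.0 when there were none."""
    total = hits + misses
    return hits / total if total else 0.0
```

`statistics.quantiles` needs at least two samples, so a caller would typically wait until a few requests have been recorded before reporting percentiles.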

Future Enhancements

Short Term

  • Weather data integration
  • Track status information
  • Radio messages
  • Pit stop data

Medium Term

  • Real-time data support (live timing)
  • Advanced analytics (tire degradation, pace analysis)
  • Visualization helpers
  • Export formats (CSV, Parquet)

Long Term

  • Machine learning features
  • Predictive analytics
  • Multi-season analysis tools
  • Custom data sources

Contributing

See the Contributing Guide for detailed information on:
  • Development setup
  • Code style guidelines
  • Testing requirements
  • Pull request process
Performance is critical to tif1. Always benchmark changes that affect data loading, parsing, or caching.
