Architecture

Overview

tif1 is a high-performance Python library for accessing Formula 1 timing data from TracingInsights (2018 to the current season). It provides a fastf1-compatible API and improves performance through HTTP/2 networking, parallel async fetching, and SQLite caching.

Design Principles

Performance First

Optimize for speed at every layer - network, parsing, and caching

API Compatibility

Maintain a fastf1-like interface for easy migration

Reliability

Robust error handling and automatic retry logic

Flexibility

Support multiple backends (pandas/polars) and async operations

System Architecture

Core Components

1. Public API Layer

Location: src/tif1/__init__.py, src/tif1/core.py
Purpose: User-facing interface for data access
Key Functions:

get_events(year): List available events for a season
get_sessions(year, event): List sessions for an event
get_session(year, event, session): Load session data

Design Decisions:

Simple, intuitive function names
Automatic session type normalization
Lazy loading for performance
Lazy __getattr__ exports in __init__.py for fast imports

import tif1

# Events and sessions
events = tif1.get_events(2025)
sessions = tif1.get_sessions(2025, "Abu Dhabi Grand Prix")

# Get session (lazy loading)
session = tif1.get_session(2025, "Abu Dhabi Grand Prix", "Race")
laps = session.laps  # Data loads on first access
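
The lazy `__getattr__` exports mentioned above rely on the module-level `__getattr__` hook from PEP 562. The following is a minimal sketch of that technique, not tif1's actual `__init__.py`: the `make_lazy_module` helper and the export table are hypothetical, and standard-library modules stand in for tif1 submodules so that the example runs on its own.

```python
import importlib
import types


def make_lazy_module(name, exports):
    """Build a module whose attributes are imported on first access.

    `exports` maps a public attribute name to the module that defines it.
    Both the helper and the mapping are illustrative, not part of tif1.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails (PEP 562).
        if attr not in exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(exports[attr]), attr)
        setattr(module, attr, value)  # cache so the hook is not hit again
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", {"sqrt": "math", "dumps": "json"})
print(lazy.sqrt(9.0))        # the target module is resolved on first access
print("sqrt" in vars(lazy))  # the resolved attribute is now cached on the module
```

The benefit of this design is that importing the package stays cheap: heavy submodules are only imported when one of their names is actually used.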

2. Session Management

Location: src/tif1/core.py (monolithic, ~4900 lines)
Purpose: Core session data container and operations
Key Classes:

Session: Main session object with laps, drivers, telemetry
Driver: Driver-specific data and operations
Lap: Individual lap with telemetry access
Laps: Collection of laps with filtering methods
Telemetry: Telemetry data with processing methods

Features:

Lazy property loading (data fetched on demand)
Async data fetching support (laps_async, get_fastest_laps_tels)
Backend flexibility (pandas/polars)
Optimized fastest lap queries

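import asyncio

import tif1

session = tif1.get_session(2025, "Abu Dhabi Grand Prix", "Race")

# Access data (lazy loaded)
driver = session.get_driver("VER")
lap = driver.get_lap(19)
telemetry = lap.telemetry

# Async loading (4-5x faster); await is only valid inside a coroutine
async def load_laps():
    return await session.laps_async()

laps = asyncio.run(load_laps())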
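
Lazy property loading of the kind listed under Features can be sketched with `functools.cached_property`. The `LazySession` class and its `_fetch_laps` loader below are hypothetical stand-ins, not tif1's `Session` implementation; the point is only that the loader runs on first access and the result is reused afterwards.

```python
from functools import cached_property


class LazySession:
    """Hypothetical session object that defers loading until first access."""

    def __init__(self, year, event, session_type):
        self.year = year
        self.event = event
        self.session_type = session_type
        self.fetch_count = 0  # how many times the loader actually ran

    def _fetch_laps(self):
        # Stand-in for the real network/cache pipeline.
        self.fetch_count += 1
        return [
            {"Driver": "VER", "LapNumber": 1},
            {"Driver": "HAM", "LapNumber": 1},
        ]

    @cached_property
    def laps(self):
        # Runs once; the result is then stored on the instance.
        return self._fetch_laps()


demo = LazySession(2025, "Abu Dhabi Grand Prix", "Race")
print(demo.fetch_count)  # nothing is loaded at construction time
first = demo.laps
second = demo.laps
print(demo.fetch_count, first is second)
```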

3. Data Loading Pipeline

Location: src/tif1/http_session.py, src/tif1/async_fetch.py
Purpose: Fetch and parse data from CDN
Key Components:

http_session.py: niquests-based HTTP session
async_fetch.py: Parallel async fetching
HTTP/2 support (20-30% faster than HTTP/1.1)

Optimization Strategies:

Connection pooling and reuse
Automatic retry with exponential backoff
Parallel requests for multiple data files
Circuit breaker pattern for resilience

# Parallel telemetry fetching
tels = session.get_fastest_laps_tels(
    by_driver=True, 
    drivers=["VER", "HAM", "LEC"]
)  # ~0.13s for 3 drivers (parallel)
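
The circuit breaker pattern listed above stops calling a failing endpoint for a cooldown period instead of retrying indefinitely. A minimal sketch follows; the `CircuitBreaker` class, its thresholds, and the injectable clock are assumptions made for illustration and do not describe tif1's internals.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so the cooldown can be tested without waiting
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Cooldown elapsed: allow a single trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0  # any success closes the circuit again
        return result


def always_fails():
    raise ConnectionError("CDN unreachable")


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)
for attempt in range(3):
    try:
        breaker.call(always_fails)
    except CircuitOpenError as exc:
        print(f"attempt {attempt}: rejected ({exc})")
    except ConnectionError as exc:
        print(f"attempt {attempt}: failed ({exc})")
```

In the loop, the first two attempts reach the failing function and the third is rejected without calling it, which is the behavior that protects a struggling endpoint.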

4. Cache System

Location: src/tif1/cache.py
Purpose: Local caching with SQLite
Features:

Automatic cache key generation
Fast SQL-based lookups
JSON storage format (orjson for speed)
Separate telemetry cache table

Storage Schema:

CREATE TABLE cache (
    key TEXT PRIMARY KEY,
    data TEXT  -- JSON data
);

CREATE TABLE telemetry_cache (
    year INTEGER,
    gp TEXT,
    session TEXT,
    driver TEXT,
    lap INTEGER,
    data TEXT,  -- JSON data
    PRIMARY KEY (year, gp, session, driver, lap)
);

Usage:

cache = tif1.get_cache()
print(f"Cache location: {cache.cache_dir}")

# Clear cache
cache.clear()

# Disable caching for specific session
session = tif1.get_session(..., enable_cache=False)
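
To make the storage schema concrete, the sketch below creates both tables in an in-memory SQLite database and round-trips one entry through each. It uses the standard-library `json` module in place of orjson so that it runs without extra dependencies; the helper functions, the key format, and the sample payloads are invented for illustration and are not tif1's actual cache code or cache keys.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS cache (
    key TEXT PRIMARY KEY,
    data TEXT
);
CREATE TABLE IF NOT EXISTS telemetry_cache (
    year INTEGER,
    gp TEXT,
    session TEXT,
    driver TEXT,
    lap INTEGER,
    data TEXT,
    PRIMARY KEY (year, gp, session, driver, lap)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)


def put(key, value):
    """Store a JSON-serializable value under a text key."""
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, data) VALUES (?, ?)",
        (key, json.dumps(value)),
    )


def get(key):
    """Return the cached value, or None on a cache miss."""
    row = conn.execute("SELECT data FROM cache WHERE key = ?", (key,)).fetchone()
    return None if row is None else json.loads(row[0])


def put_telemetry(year, gp, session, driver, lap, value):
    """Store telemetry for one lap, keyed by the composite primary key."""
    conn.execute(
        "INSERT OR REPLACE INTO telemetry_cache VALUES (?, ?, ?, ?, ?, ?)",
        (year, gp, session, driver, lap, json.dumps(value)),
    )


def get_telemetry(year, gp, session, driver, lap):
    """Look up telemetry for one lap, or return None if it is not cached."""
    row = conn.execute(
        "SELECT data FROM telemetry_cache "
        "WHERE year = ? AND gp = ? AND session = ? AND driver = ? AND lap = ?",
        (year, gp, session, driver, lap),
    ).fetchone()
    return None if row is None else json.loads(row[0])


put("laps:2025:abu-dhabi:race", [{"Driver": "VER", "LapNumber": 19}])
put_telemetry(2025, "Abu Dhabi Grand Prix", "Race", "VER", 19, {"Speed": [280, 295]})

print(get("laps:2025:abu-dhabi:race"))
print(get_telemetry(2025, "Abu Dhabi Grand Prix", "Race", "VER", 19))
print(get("missing-key"))
```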

5. Validation Layer

Location: src/tif1/validation.py
Purpose: Data integrity and type safety with Pydantic
Schemas:

Event schedule validation (schedule_schema.py)
Runtime data validation (optional, configurable)
Type checking for DataFrames

Configuration:

# Enable validation
config = tif1.get_config()
config.set("validate_data", True)
config.save()

6. Backend Abstraction

Location: src/tif1/core_utils/backend_conversion.py
Purpose: Support multiple DataFrame libraries
Implementations:

Pandas: Default backend (pandas >=2.3)
Polars: High-performance alternative (polars >=1.36, optional)

Features:

Lazy polars loading (_ensure_polars_available)
Automatic backend conversion
Type optimization (categorical, nullable types)
2x faster for large datasets with polars

# Use Polars backend
session = tif1.get_session(
    2025, 
    "Abu Dhabi Grand Prix", 
    "Race",
    lib="polars"
)

# Convert between backends
laps_polars = session.laps  # assumed to be a polars DataFrame when lib="polars"
laps_pandas = laps_polars.to_pandas()
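
Lazy loading of an optional backend, as the name `_ensure_polars_available` suggests, usually means deferring the import until the backend is requested and failing with a clear message if it is missing. The helper below is a hypothetical sketch of that idea using `importlib`; its name, signature, and error text are assumptions rather than tif1's implementation, and the standard-library `json` module stands in for polars so the example runs anywhere.

```python
import importlib
import importlib.util


def ensure_backend_available(lib):
    """Import an optional backend on demand, with an actionable error if absent."""
    if importlib.util.find_spec(lib) is None:
        raise ImportError(
            f"The {lib!r} backend was requested but the {lib!r} package is not installed."
        )
    # The import happens here, not at package import time.
    return importlib.import_module(lib)


backend = ensure_backend_available("json")  # stand-in for an installed backend
print(backend.__name__)

try:
    ensure_backend_available("not_a_real_backend_xyz")
except ImportError as exc:
    print(exc)
```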

7. Exception Hierarchy

Location: src/tif1/exceptions.py
Purpose: Clear, actionable error messages

TIF1Error (base)
├── DataNotFoundError
│   ├── DriverNotFoundError
│   └── LapNotFoundError
├── NetworkError
├── InvalidDataError
├── CacheError
└── SessionNotLoadedError

Usage:

try:
    session = tif1.get_session(2025, "Invalid GP", "Race")
    laps = session.laps
except tif1.DataNotFoundError:
    print("Data not available")
except tif1.NetworkError:
    print("Network error")
except tif1.TIF1Error:
    print("General error")
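
Because the exceptions form a tree, handlers should be ordered from most specific to most general, as in the usage above. The sketch below redeclares the hierarchy locally (mirroring the tree, not copied from `exceptions.py`) to show that catching `DataNotFoundError` also catches its subclasses. The `find_driver` function and its driver list are hypothetical.

```python
class TIF1Error(Exception):
    """Base class for all errors in this sketch."""


class DataNotFoundError(TIF1Error):
    pass


class DriverNotFoundError(DataNotFoundError):
    pass


class LapNotFoundError(DataNotFoundError):
    pass


class NetworkError(TIF1Error):
    pass


class InvalidDataError(TIF1Error):
    pass


class CacheError(TIF1Error):
    pass


class SessionNotLoadedError(TIF1Error):
    pass


def find_driver(code, known=("VER", "HAM", "LEC")):
    """Hypothetical lookup that raises the most specific error available."""
    if code not in known:
        raise DriverNotFoundError(f"Driver {code!r} not found")
    return code


try:
    find_driver("XXX")
except DataNotFoundError as exc:
    # The parent handler also receives DriverNotFoundError and LapNotFoundError.
    print(type(exc).__name__, "-", exc)
```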

Data Flow

Cold Cache (First Request)

1. User calls get_session(): Session object created with lazy loading
2. User accesses .laps property: Triggers data loading pipeline
3. Cache check (miss): No cached data found
4. HTTP fetch from jsDelivr CDN: Parallel async requests for lap data
5. JSON parsing (orjson): Fast JSON deserialization
6. DataFrame construction: Create pandas/polars DataFrame
7. Type optimization: Apply categorical and nullable types
8. Cache store: Save to SQLite for future requests
9. Return data: Data available to user (10-100x faster from cache next time)

Warm Cache (Subsequent Requests)

1. User calls get_session(): Session object created
2. User accesses .laps property: Triggers lazy loading
3. Cache check (hit): Cached data found in SQLite
4. Deserialize DataFrame: Fast reconstruction from cache
5. Return data: 10-100x faster than cold cache
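
The cold and warm paths above amount to a cache-aside lookup: check the cache, fall back to a fetch on a miss, then store the result. The following sketch illustrates only that control flow, with a dictionary standing in for SQLite, a stub standing in for the CDN request, and the DataFrame construction step left out; `load_laps`, `fake_fetch`, and the key format are hypothetical.

```python
import json


def load_laps(key, cache, fetch):
    """Return (data, source), where source is "cache" or "network"."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), "cache"  # warm path: deserialize and return
    raw = fetch(key)                         # cold path: go to the network
    data = json.loads(raw)                   # parse the JSON payload
    cache[key] = json.dumps(data)            # store for future requests
    return data, "network"


fetch_calls = []


def fake_fetch(key):
    # Stand-in for an HTTP request; records each call so misses can be counted.
    fetch_calls.append(key)
    return '[{"Driver": "VER", "LapNumber": 1}]'


store = {}
key = "2025/abu-dhabi/race/laps"
print(load_laps(key, store, fake_fetch)[1])  # first request misses the cache
print(load_laps(key, store, fake_fetch)[1])  # second request is served locally
print(len(fetch_calls))
```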

Async Flow (Parallel Loading)

# Parallel telemetry fetching
tels = session.get_fastest_laps_tels(
    by_driver=True, 
    drivers=["VER", "HAM", "LEC"]
)

1. Identify fastest laps: Find lap numbers for each driver
2. Create parallel tasks: One task per driver telemetry fetch
3. Execute asyncio.gather(): Concurrent HTTP/2 requests
4. Process in parallel: DataFrame construction for each lap
5. Return combined data: 4-5x faster than sequential
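
The parallel step can be illustrated with `asyncio.gather()`. In this sketch, `fetch_telemetry` is a hypothetical coroutine that simulates a request with `asyncio.sleep`; the lap numbers are made up, and no real HTTP/2 traffic is involved.

```python
import asyncio
import time


async def fetch_telemetry(driver, lap, delay=0.05):
    """Simulated telemetry request; a real client would await an HTTP call here."""
    await asyncio.sleep(delay)
    return {"driver": driver, "lap": lap, "samples": []}


async def fetch_fastest_laps(fastest):
    # One task per driver; gather runs them concurrently on a single event loop
    # and returns the results in the same order as the tasks.
    tasks = [fetch_telemetry(driver, lap) for driver, lap in fastest.items()]
    results = await asyncio.gather(*tasks)
    return {result["driver"]: result for result in results}


fastest = {"VER": 19, "HAM": 23, "LEC": 31}
start = time.perf_counter()
tels = asyncio.run(fetch_fastest_laps(fastest))
elapsed = time.perf_counter() - start
print(sorted(tels), f"{elapsed:.2f}s")
```

Since the three simulated requests overlap, the total time should be close to one request's delay rather than the sum of all three.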

Performance Optimizations

Network Layer

HTTP/2 with niquests

Multiplexing: Multiple requests over single connection
Header compression: HPACK algorithm
20-30% faster than HTTP/1.1

Connection Pooling

Reuse TCP connections
Avoid handshake overhead
Managed by niquests session

CDN Strategy

jsDelivr global edge network
Automatic CDN fallback (cdn.py)
Never use raw.githubusercontent.com (rate limits)

Retry Logic

Exponential backoff: 1s, 2s, 4s
Circuit breaker pattern
Max 3 retries (configurable)
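
The backoff schedule listed above (1s, 2s, 4s with at most 3 retries) can be sketched as follows. `backoff_delays` and `retry_with_backoff` are illustrative names, not the API of tif1's `retry.py`, and the `sleep` argument is injectable so the example does not actually wait.

```python
import time


def backoff_delays(max_retries=3, base=1.0, factor=2.0):
    """Delays before each retry: base, base*factor, base*factor**2, ..."""
    return [base * factor**i for i in range(max_retries)]


def retry_with_backoff(
    func,
    max_retries=3,
    base=1.0,
    retry_on=(ConnectionError, TimeoutError),
    sleep=time.sleep,
):
    """Call func, retrying on transient errors with exponentially growing waits."""
    delays = backoff_delays(max_retries, base)
    for attempt in range(max_retries + 1):  # one initial try plus the retries
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])


attempts = []


def flaky():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


waited = []
print(retry_with_backoff(flaky, sleep=waited.append), waited)
```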

Data Processing

Categorical Types

50% memory reduction for repeated strings
Applied to Driver, Team, Compound columns
Faster filtering and grouping

Nullable Integer Types

Use Int64 instead of float64 for integers with NaN
Proper null handling without object dtype
Better type safety
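
The effect of both type optimizations can be checked directly in pandas. The sample frame below is invented for illustration (the `Stint` column in particular is a hypothetical stand-in for any integer column with gaps); the exact memory saving depends on the data and the pandas version, so the script prints the measured sizes rather than assuming the 50% figure.

```python
import pandas as pd

laps = pd.DataFrame(
    {
        "Driver": ["VER", "HAM", "LEC"] * 1000,
        "Compound": ["SOFT", "MEDIUM", "HARD"] * 1000,
        "Stint": [1, 2, None] * 1000,  # missing values force a float dtype by default
    }
)

before = laps.memory_usage(deep=True).sum()

optimized = laps.astype(
    {"Driver": "category", "Compound": "category", "Stint": "Int64"}
)
after = optimized.memory_usage(deep=True).sum()

print(laps["Stint"].dtype, "->", optimized["Stint"].dtype)
print(f"{before} bytes -> {after} bytes")
print(optimized["Stint"].isna().sum(), "missing values preserved as <NA>")
```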

Lazy Loading

Only load data when accessed
No upfront session.load() required
Direct lap telemetry access

Polars Backend

2x faster for large datasets
Apache Arrow memory format
Lazy evaluation and query optimization

Caching Strategy

SQLite Storage

Fast SQL-based lookups
ACID guarantees
Built-in Python support (no dependencies)

JSON Encoding

orjson for fast serialization (not stdlib json)
Compact storage format
Easy inspection and debugging

Separate Telemetry Table

Optimized schema for telemetry queries
Composite primary key: (year, gp, session, driver, lap)
Fast lap-specific lookups

Async Operations

Parallel Fetching

asyncio.gather() for concurrent requests
4-5x faster than sequential
Especially beneficial with cold cache

Non-blocking I/O

No thread overhead
Efficient CPU utilization
Scales to many concurrent requests

Batch Telemetry Loading

get_fastest_laps_tels() loads multiple drivers in parallel
~0.13s for 3 drivers vs ~0.4s sequential
Critical for data analysis workflows

Configuration

Environment Variables

TIF1_CACHE_DIR=~/.tif1/cache    # Cache location
TIF1_LOG_LEVEL=INFO              # Logging level
TIF1_TIMEOUT=30                  # Request timeout (seconds)
TIF1_MAX_RETRIES=3               # Retry attempts
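
As a rough sketch of how such variables might be consumed, the function below reads the four names above from an environment mapping and converts them to typed values. The function name, the returned keys, and the use of the example values as defaults are assumptions; the source does not show how tif1 parses its environment.

```python
import os
from pathlib import Path

# Defaults assumed from the example values shown above.
DEFAULTS = {
    "TIF1_CACHE_DIR": "~/.tif1/cache",
    "TIF1_LOG_LEVEL": "INFO",
    "TIF1_TIMEOUT": "30",
    "TIF1_MAX_RETRIES": "3",
}


def read_env_settings(environ=os.environ):
    """Return typed settings, falling back to DEFAULTS for unset variables."""

    def value(name):
        return environ.get(name, DEFAULTS[name])

    return {
        "cache_dir": Path(value("TIF1_CACHE_DIR")).expanduser(),
        "log_level": value("TIF1_LOG_LEVEL").upper(),
        "timeout": float(value("TIF1_TIMEOUT")),
        "max_retries": int(value("TIF1_MAX_RETRIES")),
    }


settings = read_env_settings({"TIF1_TIMEOUT": "10", "TIF1_LOG_LEVEL": "debug"})
print(settings["timeout"], settings["log_level"], settings["max_retries"])
```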

Configuration File (~/.tif1rc)

{
  "max_retries": 5,
  "validate_data": true,
  "backend": "polars",
  "cache_enabled": true,
  "log_level": "INFO"
}

Runtime Configuration

import tif1
import logging

# Logging
tif1.setup_logging(logging.DEBUG)

# Cache management
cache = tif1.get_cache()
cache.clear()

# Backend selection
session = tif1.get_session(..., lib="polars")

# Disable caching
session = tif1.get_session(..., enable_cache=False)

Testing Strategy

Test Structure

tests/
├── unit/              # Component tests (mocked)
├── integration/       # End-to-end tests (real data)
├── property/          # Property-based tests
└── benchmarks/        # Performance tests

Coverage Goals

Line coverage: >80% (enforced)
Branch coverage: >85%
Critical paths: 100%

Running Tests

# All tests (parallel with xdist)
uv run pytest tests/ -v

# Unit tests only
uv run pytest tests/unit/ -v

# Integration tests (serial)
uv run pytest -o addopts='' tests/integration/ -v -n 0

# Benchmarks (serial for stable timing)
uv run pytest -o addopts='' tests/benchmarks/ -v -m benchmark --benchmark-only --no-cov -n 0

Deployment

Package Structure

tif1/
├── src/tif1/
│   ├── __init__.py              # Lazy exports
│   ├── core.py                  # Monolith (~4900 lines)
│   ├── cache.py                 # SQLite cache
│   ├── http_session.py          # HTTP client
│   ├── async_fetch.py           # Async loading
│   ├── exceptions.py            # Error types
│   ├── validation.py            # Pydantic schemas
│   ├── events.py                # Event schedule
│   ├── config.py                # Configuration
│   ├── cdn.py                   # CDN fallback
│   ├── retry.py                 # Retry logic
│   └── core_utils/              # Shared helpers
├── tests/                       # Test suite
├── examples/                    # Usage examples
└── pyproject.toml              # Project metadata

Dependencies

Core:

niquests (HTTP/2)
pandas >=2.3
pydantic (validation)
orjson (fast JSON)

Optional:

polars >=1.36 (performance)

Dev:

pytest + pytest-xdist (testing)
ruff (linting + formatting)
ty (type checking)
prek (git hooks)

Security

All user inputs are validated before processing. Cache paths are sanitized to prevent path traversal attacks.
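
One common way to guard against path traversal is to resolve the candidate path and confirm that it still lies inside the cache directory. The sketch below shows that check with `pathlib`; `safe_cache_path` is a hypothetical helper, not tif1's sanitization code, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def safe_cache_path(cache_dir, relative_name):
    """Resolve relative_name under cache_dir, rejecting anything that escapes it."""
    base = Path(cache_dir).resolve()
    candidate = (base / relative_name).resolve()  # collapses ".." segments
    if not candidate.is_relative_to(base):
        raise ValueError(f"unsafe cache path: {relative_name!r}")
    return candidate


demo_dir = Path(tempfile.gettempdir()) / "tif1-demo-cache"
print(safe_cache_path(demo_dir, "2025/race.json").name)

try:
    safe_cache_path(demo_dir, "../../etc/passwd")
except ValueError as exc:
    print(exc)
```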

No credentials required - all data is fetched from public CDN. No authentication or API keys needed.

Monitoring

Logging Levels

DEBUG: Cache hits/misses, HTTP requests, detailed operations
INFO: High-level operations, data loading progress
WARNING: Retry attempts, fallback operations, non-critical issues
ERROR: Failed operations, invalid data, critical errors

Key Metrics

Cache Hit Rate

Measure effectiveness of caching strategy

Request Latency

Track p50, p95, p99 for CDN requests

Memory Usage

Monitor DataFrame memory consumption

Error Rates

Track errors by type and frequency
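
Two of these metrics are simple to compute once raw measurements are collected. The sketch below derives p50/p95/p99 from a list of latency samples with `statistics.quantiles` and computes a hit rate from counters; the function names and the synthetic samples are illustrative, and how tif1 would collect the measurements is not specified here.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95, and p99 from a list of latency samples."""
    # n=100 yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def cache_hit_rate(hits, misses):
    """Fraction of lookups served from cache; 0.0 when nothing was looked up."""
    total = hits + misses
    return hits / total if total else 0.0


samples = list(range(1, 102))  # synthetic latencies: 1 ms to 101 ms
print(latency_percentiles(samples))
print(cache_hit_rate(hits=90, misses=10))
```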

Future Enhancements

Short Term

Weather data integration
Track status information
Radio messages
Pit stop data

Medium Term

Real-time data support (live timing)
Advanced analytics (tire degradation, pace analysis)
Visualization helpers
Export formats (CSV, Parquet)

Long Term

Machine learning features
Predictive analytics
Multi-season analysis tools
Custom data sources

Contributing

See the Contributing Guide for detailed information on:

Development setup
Code style guidelines
Testing requirements
Pull request process

Performance is critical to tif1. Always benchmark changes that affect data loading, parsing, or caching.
