Overview

tif1 is built with performance as the top priority. Every layer of the library is optimized for speed, from network fetching to data parsing to DataFrame construction. This page covers the key optimization strategies and how to use them.
Performance First: tif1 exists to be fast. Optimization, speed, and performance drive every design decision.

Lazy Loading

All session data is loaded lazily: nothing is fetched until you access it.

How It Works

import tif1

# Session object created instantly - no network requests yet
session = tif1.get_session(2024, "Bahrain", "Race")

# Network request happens here when laps are accessed
laps = session.laps  # Fetches lap data from CDN

# Telemetry fetched on-demand per driver/lap
tel = session.laps.pick_driver("VER").pick_lap(1).get_telemetry()

Benefits

  • Instant initialization: Create sessions without waiting for data
  • Selective loading: Only fetch what you need
  • Reduced memory: Don’t load unused data
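
The pattern behind lazy loading can be sketched in a few lines. This is a minimal illustration, not tif1's implementation: `LazySession`, `_fetch_laps`, and `fetch_count` are hypothetical names, and the "fetch" just returns a fixed list.

```python
from functools import cached_property


class LazySession:
    """Hypothetical stand-in for a session object; not tif1's actual class."""

    def __init__(self, year, event, session_type):
        # Construction only stores identifiers; no I/O happens here.
        self.key = (year, event, session_type)
        self.fetch_count = 0

    def _fetch_laps(self):
        # Placeholder for a network or cache read.
        self.fetch_count += 1
        return [{"Driver": "VER", "LapNumber": 1}]

    @cached_property
    def laps(self):
        # Runs on first access only; the result is stored on the instance.
        return self._fetch_laps()


session = LazySession(2024, "Bahrain", "Race")
assert session.fetch_count == 0  # nothing fetched yet
_ = session.laps                 # first access triggers the fetch
_ = session.laps                 # second access reuses the stored value
assert session.fetch_count == 1
```

Because the fetch is deferred to attribute access, creating many session objects costs almost nothing until their data is read.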

Caching System

tif1 uses a high-performance SQLite-backed cache with an in-memory LRU layer.

Cache Architecture

┌─────────────────────────────────────┐
│      In-Memory LRU Cache            │
│  • Lock-free reads                  │
│  • OrderedDict (1024 items)         │
│  • <1ms access time                 │
└─────────────────────────────────────┘
              ↓ (on miss)
┌─────────────────────────────────────┐
│      SQLite Cache (WAL mode)        │
│  • Persistent storage               │
│  • Batched writes (every 25 ops)    │
│  • 64MB cache_size                  │
└─────────────────────────────────────┘
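
To make the two layers concrete, here is a rough sketch of an LRU dictionary in front of a SQLite table with batched commits. It assumes nothing about tif1's internals: `TwoTierCache` and its methods are hypothetical, the defaults simply echo the numbers in the diagram, and the sketch is not thread-safe.

```python
import json
import sqlite3
from collections import OrderedDict


class TwoTierCache:
    """Illustrative only: an LRU dict in front of a SQLite table."""

    def __init__(self, path=":memory:", max_items=1024, commit_interval=25):
        self.memory = OrderedDict()
        self.max_items = max_items
        self.commit_interval = commit_interval
        self.pending = 0
        self.db = sqlite3.connect(path)
        # WAL applies to file-backed databases; in-memory ones ignore it.
        self.db.execute("PRAGMA journal_mode=WAL")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    def _remember(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        if len(self.memory) > self.max_items:
            self.memory.popitem(last=False)  # evict least recently used

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        row = self.db.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None  # caller falls back to the network
        value = json.loads(row[0])
        self._remember(key, value)
        return value

    def set(self, key, value):
        self._remember(key, value)
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.pending += 1
        if self.pending >= self.commit_interval:
            self.db.commit()  # batch writes instead of committing each one
            self.pending = 0


cache = TwoTierCache(max_items=2)
for name in ("a", "b", "c"):
    cache.set(name, {"name": name})
print(list(cache.memory))  # "a" has been evicted from the memory layer
print(cache.get("a"))      # but is still served from SQLite
```

Batching commits trades a small durability window for fewer disk syncs, which is the usual reason to commit every N operations rather than on every write.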

Cache Configuration

The cache system is highly tunable via configuration:
from tif1 import get_config

config = get_config()

# In-memory cache size (default: 1024 items)
config.set("memory_cache_max_items", 2048)

# Telemetry cache size (default: 2048 items)  
config.set("memory_telemetry_cache_max_items", 4096)

# SQLite commit interval (default: 25)
config.set("cache_commit_interval", 50)

# SQLite timeout (default: 30.0s)
config.set("sqlite_timeout", 60.0)

Cache Location

The cache is stored in platform-specific directories:
  • Linux: ~/.cache/tif1/ (or ~/.tif1/ if ~/.cache doesn’t exist)
  • macOS: ~/Library/Caches/tif1/
  • Windows: %LOCALAPPDATA%/Temp/tif1/
You can override the location with the TIF1_CACHE_DIR environment variable:
export TIF1_CACHE_DIR="/custom/cache/path"

Cache Performance

The dual-layer cache provides dramatic speedups:
  • Memory hit: <1ms (lock-free read from OrderedDict)
  • SQLite hit: 5-10ms (database query + JSON deserialize)
  • Network fetch: 200-500ms (CDN request + validation)
Result: Cache hits are 10-100x faster than network requests.
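
Dividing the latencies listed above gives a rough sense of where that range comes from. The figures are the indicative numbers from this page, not measurements, and the memory-hit value uses the 1 ms upper bound as an assumption.

```python
# Latency figures from the list above, in milliseconds.
MEMORY_HIT_MS = 1.0           # upper bound of "<1ms"
SQLITE_HIT_MS = (5.0, 10.0)
NETWORK_MS = (200.0, 500.0)

# SQLite hit versus network fetch: least and most favourable pairings.
sqlite_speedup_low = NETWORK_MS[0] / SQLITE_HIT_MS[1]    # 200 / 10
sqlite_speedup_high = NETWORK_MS[1] / SQLITE_HIT_MS[0]   # 500 / 5

# A memory hit of at most 1 ms beats even the fastest fetch by at least this.
memory_speedup_min = NETWORK_MS[0] / MEMORY_HIT_MS       # 200 / 1

print(sqlite_speedup_low, sqlite_speedup_high, memory_speedup_min)
# → 20.0 100.0 200.0
```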

Managing Cache

from tif1.cache import get_cache

cache = get_cache()

# Check if session data is cached
has_data = cache.has_session_data(2024, "Bahrain", "Race")

# Clear all cache
cache.clear()

# Close cache connection
cache.close()

Async Fetching

tif1 supports parallel data fetching with asyncio for 4-5x speedups.

Parallel Requests

import asyncio
import tif1
from tif1.async_fetch import fetch_multiple_async

# Fetch several files for one session in parallel
async def load_session_files():
    requests = [
        (2024, "Bahrain", "Race", "laptimes.json"),
        (2024, "Bahrain", "Race", "drivers.json"),
        (2024, "Bahrain", "Race", "weather.json"),
    ]

    results = await fetch_multiple_async(
        requests,
        max_concurrent_requests=10,  # Control parallelism
    )
    return results

# Run async code
results = asyncio.run(load_session_files())
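
Why parallel fetching helps can be shown without any network access. The sketch below uses only the standard library; `fake_fetch` is a hypothetical stand-in that sleeps instead of making a request, so it illustrates the general asyncio pattern rather than what `fetch_multiple_async` does internally.

```python
import asyncio
import time


async def fake_fetch(name, delay=0.05):
    """Stand-in for one HTTP request; it only sleeps."""
    await asyncio.sleep(delay)
    return name


async def fetch_all(names):
    # gather() schedules every coroutine up front and returns
    # results in the same order as the inputs.
    return await asyncio.gather(*(fake_fetch(n) for n in names))


names = ["laptimes.json", "drivers.json", "weather.json"]
start = time.perf_counter()
fetched = asyncio.run(fetch_all(names))
elapsed = time.perf_counter() - start
print(fetched, f"{elapsed:.2f}s")  # roughly one delay, not three
```

The waits overlap, so total time approaches the slowest single request instead of the sum of all of them.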

Concurrency Control

Control how many parallel requests to make:
from tif1 import get_config

config = get_config()

# Max concurrent requests (default: 20)
config.set("max_concurrent_requests", 50)

# Max worker threads (default: 20)
config.set("max_workers", 50)

# Telemetry prefetch concurrency (default: 32)
config.set("telemetry_prefetch_max_concurrent_requests", 64)

Rate Limiting

Use semaphores to prevent overwhelming the CDN:
import asyncio
from tif1.async_fetch import fetch_with_rate_limit, fetch_json_async

async def fetch_limited():
    # Create semaphore for max 5 concurrent requests
    semaphore = asyncio.Semaphore(5)
    
    result = await fetch_with_rate_limit(
        fetch_json_async,
        2024, "Bahrain", "Race", "drivers.json",
        semaphore=semaphore
    )
    return result
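
A semaphore caps how many requests are in flight at once; it does not limit requests per second. The standard-library sketch below shows the mechanism with simulated jobs. `limited` and `main` are hypothetical helpers, not part of tif1.

```python
import asyncio


async def limited(semaphore, coro_fn, *args):
    """Run coro_fn(*args) while holding one semaphore slot."""
    async with semaphore:
        return await coro_fn(*args)


async def main(limit=2, jobs=6):
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def job(i):
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # simulated request
        active -= 1
        return i

    results = await asyncio.gather(
        *(limited(semaphore, job, i) for i in range(jobs))
    )
    return results, peak


job_results, peak = asyncio.run(main())
print(job_results, peak)  # peak never exceeds the limit
```

The cap only takes effect when every task shares the same semaphore instance, so create it once and pass it to each call.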

Backend Selection

tif1 supports both pandas and polars backends, with polars offering 2x faster performance for large datasets.

Pandas (Default)

import tif1

# Uses pandas by default
session = tif1.get_session(2024, "Bahrain", "Race")
print(type(session.laps))  # pandas.DataFrame

Polars

Switch to polars for better performance:
import os

import tif1
from tif1 import get_config

config = get_config()
config.set("lib", "polars")

# Or via environment variable
os.environ["TIF1_LIB"] = "polars"

session = tif1.get_session(2024, "Bahrain", "Race")
print(type(session.laps))  # polars.DataFrame

Performance Comparison

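| Operation    | pandas | polars | Speedup |
|--------------|--------|--------|---------|
| Load laps    | 150ms  | 75ms   | 2.0x    |
| Filter laps  | 20ms   | 8ms    | 2.5x    |
| Aggregations | 50ms   | 20ms   | 2.5x    |
| Memory usage | 100MB  | 60MB   | 1.67x   |

See Polars Backend for detailed comparison.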

Memory Optimization

Categorical Types

tif1 automatically converts repeated strings to categorical types, saving 50% memory:
# Driver codes: "VER", "HAM", "LEC" etc. stored once
# Each lap references category index instead of full string
laps = session.laps
print(laps["Driver"].dtype)  # category (pandas)
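
The saving comes from dictionary encoding: each distinct string is stored once and every row holds a small integer code. Here is a pure-Python sketch of that idea, independent of how pandas implements it, with made-up sample data.

```python
import sys
from array import array

drivers = ["VER", "HAM", "LEC"] * 1000  # one driver code per lap row

# Store each distinct string once...
categories = sorted(set(drivers))
index = {name: i for i, name in enumerate(categories)}
# ...and keep a one-byte integer per row instead of a string reference.
codes = array("b", (index[name] for name in drivers))

decoded = [categories[c] for c in codes]
assert decoded == drivers  # the encoding is lossless

print(categories)  # → ['HAM', 'LEC', 'VER']
print(sys.getsizeof(codes), "bytes of codes for", len(drivers), "rows")
```

The fewer distinct values a column has relative to its length, the larger the saving, which is why driver and team columns are good candidates.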

Nullable Types

Proper null handling without object dtype overhead:
# Uses pandas nullable types: Int64, Float64, boolean
# Not generic object dtype
print(laps["LapNumber"].dtype)  # Int64 (not object)

Ultra Cold Start Mode

Optimize the first request by skipping retries:
from tif1 import get_config

config = get_config()
config.set("ultra_cold_start", True)  # default
config.set("ultra_cold_skip_retries", True)  # default

# First request uses zero retries for fastest possible response
# Subsequent requests use normal retry logic
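
One way to picture "zero retries first, normal retries afterwards" is a small policy object. This is a sketch under stated assumptions, not tif1's retry code: `RetryPolicy`, `call_with_retries`, and the value 3 for `normal_retries` are all hypothetical.

```python
class RetryPolicy:
    """Hypothetical policy object; names are illustrative, not tif1's API."""

    def __init__(self, normal_retries=3, skip_first=True):
        self.normal_retries = normal_retries  # assumed value for illustration
        self.skip_first = skip_first
        self.requests_made = 0

    def retries_for_next_request(self):
        first = self.requests_made == 0
        self.requests_made += 1
        return 0 if (first and self.skip_first) else self.normal_retries


def call_with_retries(func, retries):
    """Call func once, plus up to `retries` more times on OSError."""
    for attempt in range(retries + 1):
        try:
            return func()
        except OSError:
            if attempt == retries:
                raise


policy = RetryPolicy()
print(policy.retries_for_next_request())  # → 0
print(policy.retries_for_next_request())  # → 3
```

The trade-off is that a transient error on the very first request surfaces immediately instead of being retried.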

Network Optimization

HTTP/2 Connection Pooling

tif1 uses niquests for HTTP/2 support:
  • Connection pooling: Reuse TCP connections
  • Header compression: HPACK compression
  • Multiplexing: Multiple requests per connection
from tif1 import get_config

config = get_config()

# HTTP/2 multiplexing (default: True)
config.set("http_multiplexed", True)

# Connection keepalive (default: 120s)
config.set("keepalive_timeout", 180)

# Max requests per connection (default: 1000)
config.set("keepalive_max_requests", 2000)

CDN Optimization

The jsDelivr CDN provides a global edge network with:
  • Compression: Gzip/Brotli support
  • Edge caching: Serve from nearest location
  • High availability: 99.9% uptime
# Enable CDN minification (experimental)
config.set("cdn_use_minification", True)  # 20-40% smaller files

Connection Statistics

Monitor connection reuse:
from tif1.http_session import get_connection_stats

stats = get_connection_stats()
print(f"Total requests: {stats['total']}")
print(f"Reused connections: {stats['reused']}")
print(f"Reuse rate: {stats['reuse_rate']:.1%}")

Validation Trade-offs

Validation adds safety but costs performance. All validation options default to off; keep them off when speed matters most:
from tif1 import get_config

config = get_config()

# Disable all validation (fastest)
config.set("validate_data", False)  # default: False
config.set("validate_lap_times", False)  # default: False  
config.set("validate_telemetry", False)  # default: False
Performance impact:
  • Validation disabled: ~5-10% faster data loading
  • Validation enabled: Catches data corruption early
See Validation for details on validation options.

Prefetching Strategies

Driver Laps Prefetch

Automatically prefetch all laps when getting a driver:
config.set("prefetch_driver_laps_on_get_driver", True)  # default

# When you access a driver, all their laps are fetched
driver = session.laps.pick_driver("VER")
# All laps already loaded - no additional requests

Telemetry Prefetch

Prefetch all telemetry data in parallel:
# Prefetch after loading laps (background)
config.set("prefetch_all_telemetry_after_laps_load", False)  # default

# Prefetch on first telemetry request
config.set("prefetch_all_telemetry_on_first_lap_request", False)  # default

# Set to True for "download everything" mode
config.set("prefetch_all_telemetry_after_laps_load", True)
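
The "download everything" idea amounts to filling a cache in parallel before anything asks for it. Below is a generic standard-library sketch; `fetch_telemetry` and `prefetch_all` are hypothetical, and the fetch returns placeholder data rather than real telemetry.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_telemetry(lap_number):
    """Stand-in for one telemetry request."""
    return {"lap": lap_number, "samples": []}


def prefetch_all(lap_numbers, max_workers=8):
    laps = list(lap_numbers)
    # map() preserves input order, so results line up with laps.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(laps, pool.map(fetch_telemetry, laps)))


telemetry_cache = prefetch_all(range(1, 11))
print(len(telemetry_cache))  # later lookups are plain dictionary reads
```

Prefetching trades bandwidth and memory for latency, so it pays off mainly when most of the prefetched data will actually be used.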

Benchmarking

Measure Your Code

import time
import tif1

# Cold start (assumes nothing is cached yet)
start = time.perf_counter()
session = tif1.get_session(2024, "Bahrain", "Race")
laps = session.laps
print(f"Cold: {time.perf_counter() - start:.2f}s")

# Warm start (cached)
start = time.perf_counter()
session = tif1.get_session(2024, "Bahrain", "Race")
laps = session.laps
print(f"Warm: {time.perf_counter() - start:.2f}s")
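
For repeated measurements, a small timing helper keeps the bookkeeping out of the way. This is a generic sketch using only the standard library; `timed` is a hypothetical helper, and the `time.sleep` call stands in for the code being measured.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record wall-clock seconds for the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("sleep", timings):
    time.sleep(0.01)  # replace with the code you want to measure
print(f"sleep: {timings['sleep']:.3f}s")
```

`time.perf_counter()` is monotonic and high resolution, which makes it a better fit for short intervals than `time.time()`.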

Enable Debug Logging

import logging
logging.basicConfig(level=logging.DEBUG)

# See cache hits, network requests, timing info
session = tif1.get_session(2024, "Bahrain", "Race")

Best Practices

Use Caching

Leave caching enabled (default) for automatic performance.

Batch Operations

Use async APIs to fetch multiple resources in parallel.

Choose Backend

Use polars for 2x speedup on large datasets (>10k laps).

Monitor Stats

Check connection reuse stats to verify optimization.

Performance Checklist

  • Cache enabled: Default, provides 10-100x speedup
  • HTTP/2 pooling: Automatic with niquests
  • Lazy loading: Only fetch what you need
  • Polars backend: 2x faster for large datasets
  • Categorical types: 50% memory reduction
  • Ultra cold start: Fastest first request
  • Async fetching: 4-5x speedup for parallel loads

Next Steps

Polars Backend

Learn about the high-performance polars backend

Circuit Breaker

Understand retry logic and failure handling

Validation

Configure data validation trade-offs
