Documentation Index Fetch the complete documentation index at: https://mintlify.com/TracingInsights/tif1/llms.txt
Use this file to discover all available pages before exploring further.
Overview
tif1 supports polars as an alternative to pandas, offering 2x faster performance for large datasets with 40% less memory usage.
Zero-copy conversion : tif1 uses Apache Arrow for zero-copy conversion between pandas and polars, making backend switching nearly free.
Installation
Install polars as an optional dependency:
Or install polars separately:
Enabling Polars
Via Configuration
from tif1 import get_config
config = get_config()
config.set( "lib" , "polars" )
# Now all DataFrames will be polars.DataFrame
import tif1
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
print ( type (session.laps)) # <class 'polars.dataframe.frame.DataFrame'>
Via Environment Variable
Then use tif1 normally:
import tif1
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
print ( type (session.laps)) # polars.DataFrame
Via Config File
Add to ~/.tif1rc:
Load Time
Loading lap data from cache:
import time
import tif1
from tif1 import get_config
# Test pandas
get_config().set( "lib" , "pandas" )
start = time.time()
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
laps = session.laps
pandas_time = time.time() - start
# Test polars
get_config().set( "lib" , "polars" )
start = time.time()
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
laps = session.laps
polars_time = time.time() - start
print ( f "pandas: { pandas_time :.3f} s" )
print ( f "polars: { polars_time :.3f} s" )
print ( f "speedup: { pandas_time / polars_time :.2f} x" )
Typical results :
pandas: 150ms
polars: 75ms
Speedup: 2.0x
Filtering
Filtering laps by driver and compound:
# pandas
fast_laps = session.laps[
(session.laps[ "Driver" ] == "VER" ) &
(session.laps[ "Compound" ] == "SOFT" )
]
# 20ms
# polars
fast_laps = session.laps.filter(
(pl.col( "Driver" ) == "VER" ) &
(pl.col( "Compound" ) == "SOFT" )
)
# 8ms - 2.5x faster
Aggregations
Calculating average lap times per driver:
# pandas
avg_times = session.laps.groupby( "Driver" )[ "LapTime" ].mean()
# 50ms
# polars
avg_times = session.laps.group_by( "Driver" ).agg(pl.col( "LapTime" ).mean())
# 20ms - 2.5x faster
Memory Usage
Dataset pandas polars Reduction 1000 laps 10 MB 6 MB 40% 5000 laps 50 MB 30 MB 40% 20000 laps 200 MB 120 MB 40%
Backend Conversion
tif1 provides zero-copy conversion between pandas and polars using Apache Arrow.
Convert to Polars
from tif1.core_utils.backend_conversion import pandas_to_polars
import pandas as pd
pandas_df = pd.DataFrame({ "a" : [ 1 , 2 , 3 ]})
polars_df = pandas_to_polars(pandas_df, rechunk = False )
print ( type (polars_df)) # polars.DataFrame
Convert to Pandas
from tif1.core_utils.backend_conversion import polars_to_pandas
import polars as pl
polars_df = pl.DataFrame({ "a" : [ 1 , 2 , 3 ]})
pandas_df = polars_to_pandas(polars_df, use_pyarrow = True )
print ( type (pandas_df)) # pandas.DataFrame
Generic Conversion
from tif1.core_utils.backend_conversion import convert_backend
# Automatically detect and convert
df = session.laps # Could be pandas or polars
pandas_df = convert_backend(df, "pandas" )
polars_df = convert_backend(df, "polars" )
Zero-Copy Benefits
Zero-copy means no data duplication in memory. The conversion uses Apache Arrow to share the same memory buffer between pandas and polars.
# Traditional copy (slow, 2x memory)
polars_df = pl.from_pandas(pandas_df, rechunk = True ) # Copies data
# Zero-copy (fast, no extra memory)
polars_df = pl.from_pandas(pandas_df, rechunk = False ) # Shares memory
The backend_conversion.py module always uses zero-copy by default:
# From backend_conversion.py:22-44
def pandas_to_polars ( df : pd.DataFrame, * , rechunk : bool = False ) -> pl.DataFrame:
"""Convert pandas DataFrame to polars using zero-copy Arrow conversion.
Args:
df: Pandas DataFrame to convert
rechunk: Whether to rechunk the result. False for zero-copy (default).
"""
return pl.from_pandas(df, rechunk = rechunk) # rechunk=False = zero-copy
Working with Polars
Accessing Data
Polars DataFrames work similarly to pandas:
import tif1
import polars as pl
from tif1 import get_config
get_config().set( "lib" , "polars" )
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
# Access columns
lap_times = session.laps[ "LapTime" ]
print ( type (lap_times)) # polars.Series
# Filter rows
fast_laps = session.laps.filter(pl.col( "LapTime" ) < 90.0 )
# Select columns
subset = session.laps.select([ "Driver" , "LapNumber" , "LapTime" ])
Polars Expressions
Polars uses a powerful expression API:
import polars as pl
# Find fastest lap per driver
fastest = session.laps.group_by( "Driver" ).agg(
pl.col( "LapTime" ).min().alias( "FastestLap" ),
pl.col( "LapNumber" ).first().alias( "FirstLap" ),
pl.col( "Compound" ).n_unique().alias( "Compounds" )
)
# Chain operations
result = (
session.laps
.filter(pl.col( "LapTime" ).is_not_null())
.with_columns([
(pl.col( "LapTime" ) - pl.col( "LapTime" ).mean()).alias( "Delta" )
])
.sort( "LapTime" )
.head( 10 )
)
Lazy Evaluation
Polars supports lazy evaluation for query optimization:
import polars as pl
# Lazy query (not executed yet)
query = (
session.laps.lazy()
.filter(pl.col( "Driver" ) == "VER" )
.select([ "LapNumber" , "LapTime" ])
.sort( "LapTime" )
)
# Execute when needed
result = query.collect()
API Compatibility
tif1 provides a consistent API regardless of backend:
# Same code works with both backends
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
# Pick driver (works for pandas and polars)
ver_laps = session.laps.pick_driver( "VER" )
# Pick lap
fastest = session.laps.pick_fastest()
# Get telemetry
tel = fastest.get_telemetry()
Backend Detection
Check which backend is being used:
import pandas as pd
import polars as pl
laps = session.laps
if isinstance (laps, pd.DataFrame):
print ( "Using pandas backend" )
elif isinstance (laps, pl.DataFrame):
print ( "Using polars backend" )
Categorical Columns
Polars handles categorical data differently than pandas.
pandas Categorical
get_config().set( "lib" , "pandas" )
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
print (session.laps[ "Driver" ].dtype)
# category
print (session.laps[ "Compound" ].dtype)
# category
polars Categorical
Polars uses Utf8 (string) by default. Enable categorical:
get_config().set( "lib" , "polars" )
get_config().set( "polars_lap_categorical" , True )
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
print (session.laps[ "Driver" ].dtype)
# Categorical (if enabled)
Polars categorical is currently opt-in due to compatibility considerations. Set polars_lap_categorical=True to enable.
When to Use Polars
Use Polars When:
Large Datasets Processing >10k laps or multiple sessions
Complex Queries Heavy filtering, grouping, and aggregations
Memory Constrained Need to reduce memory usage by 40%
Performance Critical Every millisecond counts in your workflow
Use Pandas When:
Ecosystem Tools Need matplotlib, seaborn, or other pandas tools
Small Datasets Processing single sessions (<1000 laps)
Familiar API Team is experienced with pandas
Compatibility Integrating with pandas-dependent code
1. Use Lazy Evaluation
# Lazy query - optimizes entire pipeline
result = (
session.laps.lazy()
.filter(pl.col( "Driver" ).is_in([ "VER" , "HAM" ]))
.select([ "LapNumber" , "LapTime" ])
.sort( "LapTime" )
.collect() # Execute optimized query
)
2. Avoid Row Iteration
# Slow - row iteration
for row in session.laps.iter_rows():
process(row)
# Fast - vectorized operations
result = session.laps.with_columns(
pl.col( "LapTime" ).apply(process)
)
3. Use Streaming
For very large datasets:
# Stream processing (low memory)
result = (
session.laps.lazy()
.filter(pl.col( "LapTime" ).is_not_null())
.collect( streaming = True )
)
Limitations
Current Limitations
Plotting : Some plotting libraries expect pandas DataFrames
Categorical : Polars categorical is opt-in (polars_lap_categorical)
Interop : Some third-party libraries may not support polars
Workarounds
Convert to pandas when needed:
# Get data as polars for speed
get_config().set( "lib" , "polars" )
laps = session.laps
# Convert to pandas for plotting
import pandas as pd
laps_pd = laps.to_pandas()
laps_pd.plot( x = "LapNumber" , y = "LapTime" )
Or use convert_backend:
from tif1.core_utils.backend_conversion import convert_backend
laps_pd = convert_backend(laps, "pandas" )
Benchmark Results
Real-world performance comparison on Bahrain 2024 Race (20 drivers, 57 laps, 1140 rows):
Operation pandas polars Speedup Load from cache 152ms 76ms 2.0x Filter by driver 18ms 7ms 2.6x Group by driver 45ms 18ms 2.5x Sort by lap time 12ms 5ms 2.4x Select columns 8ms 3ms 2.7x Unique compounds 15ms 6ms 2.5x
Memory : pandas 89MB, polars 54MB (39% reduction )
Migration Guide
Pandas to Polars
Common operations translated:
# pandas
df.head( 10 )
df[df[ "column" ] > 5 ]
df.groupby( "col" ).mean()
df.sort_values( "col" )
df[[ "col1" , "col2" ]]
# polars
df.head( 10 )
df.filter(pl.col( "column" ) > 5 )
df.group_by( "col" ).agg(pl.all().mean())
df.sort( "col" )
df.select([ "col1" , "col2" ])
Testing Both Backends
import pytest
from tif1 import get_config
@pytest.mark.parametrize ( "backend" , [ "pandas" , "polars" ])
def test_lap_loading ( backend ):
get_config().set( "lib" , backend)
session = tif1.get_session( 2024 , "Bahrain" , "Race" )
assert len (session.laps) > 0
Next Steps
Performance Learn more performance optimization strategies
Polars Docs Official Polars documentation
Circuit Breaker Understand retry and failure handling