Overview

tif1 implements a circuit breaker with exponential-backoff retries to handle network failures gracefully, so your application stays responsive when CDN endpoints have problems.
The circuit breaker prevents cascading failures by “opening” after repeated failures and temporarily blocking requests to give the system time to recover.

Circuit Breaker Pattern

The circuit breaker has three states:
┌─────────────────────────────────────────────────────────┐
│                      CLOSED                             │
│  • Normal operation                                     │
│  • Requests pass through                                │
│  • Failures increment counter                           │
└─────────────────────────────────────────────────────────┘
                    ↓ (threshold failures)
┌─────────────────────────────────────────────────────────┐
│                       OPEN                              │
│  • Requests fail immediately                            │
│  • No network calls made                                │
│  • Timeout timer started                                │
└─────────────────────────────────────────────────────────┘
                    ↓ (after timeout)
┌─────────────────────────────────────────────────────────┐
│                    HALF-OPEN                            │
│  • Test request allowed                                 │
│  • Success → CLOSED                                     │
│  • Failure → OPEN                                       │
└─────────────────────────────────────────────────────────┘

Configuration

Circuit Breaker Settings

from tif1 import get_config

config = get_config()

# Failure threshold before opening (default: 5)
config.set("circuit_breaker_threshold", 10)

# Timeout before trying again in seconds (default: 60)
config.set("circuit_breaker_timeout", 120)
Or via environment variables:
export TIF1_CIRCUIT_BREAKER_THRESHOLD=10
export TIF1_CIRCUIT_BREAKER_TIMEOUT=120
Or in ~/.tif1rc:
{
  "circuit_breaker_threshold": 10,
  "circuit_breaker_timeout": 120
}

Default Values

From config.py:50-51:
"circuit_breaker_threshold": 5,   # Open after 5 failures
"circuit_breaker_timeout": 60,    # Try again after 60 seconds

How It Works

Thread-Safe Implementation

The circuit breaker guards all of its state with a single reentrant lock (threading.RLock), so state changes are atomic across threads:
# From retry.py:17-28
class CircuitBreaker:
    def __init__(self, threshold: int = 5, timeout: int = 60):
        self.threshold = threshold
        self.timeout = timeout
        self._failures = 0
        self.last_failure_time: datetime | None = None
        self._last_failure_monotonic: float | None = None
        self._state = "closed"  # closed, open, half-open
        self._lock = threading.RLock()  # Reentrant lock for nested calls
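To illustrate why every read and write goes through one lock, and why that lock is reentrant, here is a minimal sketch. It is not tif1 code: `LockedCounter` and `hammer` are hypothetical names that stand in for the breaker's failure counter and a multi-threaded caller.

```python
import threading


class LockedCounter:
    """Hypothetical stand-in for the breaker's failure counter (not tif1 code)."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.RLock()  # reentrant, like the breaker's lock

    def increment(self) -> None:
        with self._lock:
            self._value += 1

    def increment_twice(self) -> None:
        # Holds the lock and calls increment(), which acquires it again.
        # A plain threading.Lock would deadlock here; an RLock does not.
        with self._lock:
            self.increment()
            self.increment()

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(counter: LockedCounter, threads: int = 8, rounds: int = 1000) -> int:
    """Increment from several threads at once and return the final count."""

    def work() -> None:
        for _ in range(rounds):
            counter.increment_twice()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


print(hammer(LockedCounter()))  # 16000
```

Because each increment happens while the lock is held, no update is lost and the total is always threads × rounds × 2.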

Recording Failures

Each failure increments the counter under the lock and opens the circuit once the threshold is reached:
# From retry.py:87-104
def record_failure(self) -> None:
    """Record failed request with atomic counter increment."""
    now_dt = datetime.now()
    now_mono = time.monotonic()
    with self._lock:
        # Atomic increment - all reads and writes protected by same lock
        self._failures += 1
        self.last_failure_time = now_dt
        self._last_failure_monotonic = now_mono
        
        # Atomic transition: * -> open (if threshold reached)
        if self._failures >= self.threshold:
            self._state = "open"
            logger.warning(f"Circuit breaker opened after {self._failures} failures")

Recording Success

A success resets the failure counter unless the circuit is open, and closes a half-open circuit:
# From retry.py:76-85
def record_success(self) -> None:
    """Record successful request with atomic state transition."""
    with self._lock:
        # Only reset failures if circuit is not open
        if self._state != "open":
            self._failures = 0
        if self._state == "half-open":
            self._state = "closed"
            logger.info("Circuit breaker closed")

Checking State

Before each request, check whether the circuit allows the call:
# From retry.py:61-74
def check_and_update_state(self) -> tuple[bool, str]:
    """Check circuit breaker state and update if needed.
    
    Returns:
        Tuple of (should_proceed, state)
    """
    with self._lock:
        if self._state == "open":
            if self._is_timeout_elapsed():
                self._state = "half-open"
                logger.info("Circuit breaker entering half-open state")
                return True, "half-open"
            return False, "open"
        return True, self._state
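The three methods above combine into the closed → open → half-open → closed cycle. The following is a simplified, single-threaded model of that cycle rather than tif1's implementation: `MiniBreaker`, its `allow` method, and the injectable `clock` are assumptions made so the example can fake the passage of time.

```python
import time
from typing import Callable


class MiniBreaker:
    """Simplified, single-threaded model of the three states (not tif1 code)."""

    def __init__(self, threshold: int = 5, timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.timeout = timeout
        self.clock = clock  # injectable so the example can fake time
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def allow(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at >= self.timeout:
                self.state = "half-open"  # let one probe request through
                return True
            return False
        return True

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def record_success(self) -> None:
        if self.state != "open":
            self.failures = 0
        if self.state == "half-open":
            self.state = "closed"


now = [0.0]  # fake clock, advanced by hand
cb = MiniBreaker(threshold=3, timeout=60.0, clock=lambda: now[0])

for _ in range(3):
    cb.record_failure()
print(cb.state, cb.allow())   # open False

now[0] = 61.0                 # the timeout has elapsed
print(cb.allow(), cb.state)   # True half-open

cb.record_success()
print(cb.state, cb.failures)  # closed 0
```

Note that a failure during the half-open probe reopens the circuit immediately, because the failure counter is still at or above the threshold.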

Retry Logic

Exponential Backoff

tif1 uses exponential backoff with jitter for retry delays:
delay = backoff_factor ** attempt
if jitter:
    delay *= 0.5 + random.random()

Retry Configuration

config = get_config()

# Maximum retry attempts (default: 3)
config.set("max_retries", 5)

# Backoff multiplier (default: 2.0)
config.set("retry_backoff_factor", 3.0)

# Enable jitter (default: True)
config.set("retry_jitter", True)

# Max jitter amount in seconds (default: 0.0)
config.set("retry_jitter_max", 1.0)

# Maximum retry delay in seconds (default: 60.0)
config.set("max_retry_delay", 120.0)

Retry Decorator

The @retry_with_backoff decorator wraps functions with retry logic:
# From retry.py:191-227
@retry_with_backoff(
    max_retries=3,
    backoff_factor=2.0,
    jitter=True,
    exceptions=(Exception,)
)
def fetch_data():
    # Network request here
    pass

Retry Flow

# From retry.py:204-223
for attempt in range(max_retries):
    try:
        return _circuit_breaker.call(func, *args, **kwargs)
    except exceptions as e:
        last_exception = e
        
        if attempt == max_retries - 1:
            break
        
        # Calculate backoff with jitter
        backoff = backoff_factor ** attempt
        if jitter:
            backoff *= 0.5 + random.random()
        
        logger.warning(f"Retry {attempt + 1}/{max_retries} after {backoff:.2f}s: {e}")
        time.sleep(backoff)
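To estimate how long the loop above can block, the sketch below lists the sleeps it performs. `sleep_schedule` and `sleep_bounds` are hypothetical helpers, not tif1 functions. They consider only the excerpt shown here and ignore max_retry_delay and retry_jitter_max, whose exact application is not shown in this section.

```python
def sleep_schedule(max_retries: int = 3, backoff_factor: float = 2.0) -> list[float]:
    """Un-jittered sleeps between attempts for the retry loop shown above.

    The loop breaks before sleeping on the final attempt, so there are
    max_retries - 1 sleeps in total.
    """
    return [backoff_factor ** attempt for attempt in range(max_retries - 1)]


def sleep_bounds(max_retries: int = 3, backoff_factor: float = 2.0) -> tuple[float, float]:
    """(minimum, upper bound) of total sleep when jitter scales each delay by [0.5, 1.5)."""
    total = sum(sleep_schedule(max_retries, backoff_factor))
    return 0.5 * total, 1.5 * total


print(sleep_schedule())        # [1.0, 2.0]
print(sleep_bounds())          # (1.5, 4.5)
print(sleep_schedule(5, 3.0))  # [1.0, 3.0, 9.0, 27.0]
print(sleep_bounds(5, 3.0))    # (20.0, 60.0)
```

With the defaults, a call that fails on every attempt therefore sleeps for under 4.5 seconds in total, plus the time spent in the requests themselves.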

CDN Fallback

tif1 supports multiple CDN sources with automatic fallback.

CDN Manager

The CDNManager tracks multiple CDN sources:
# From cdn.py:35-77
class CDNManager:
    def __init__(self):
        config = get_config()
        default_sources = [
            "https://cdn.jsdelivr.net/gh/TracingInsights",
        ]
        configured_sources = config.get("cdns", default_sources)
        
        self.sources = []  # List of CDNSource objects
        self._failure_counts = {}
        self._max_failures = 3  # Disable after 3 failures

CDN Source

# From cdn.py:13-32
@dataclass
class CDNSource:
    name: str
    base_url: str
    priority: int = 0
    enabled: bool = True
    use_minification: bool = False
    
    def format_url(self, year: int, gp: str, session: str, path: str) -> str:
        """Format URL for this CDN."""
        if self.use_minification and path.endswith(".json"):
            path = path.replace(".json", ".min.json")
        return f"{self.base_url}/{year}@main/{gp}/{session}/{path}"
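To show what format_url produces, the example below copies the dataclass so that it runs on its own. The arguments "Bahrain", "Race", and "laps.json" and the source names are placeholder values chosen for illustration; the actual names used by tif1 may differ.

```python
from dataclasses import dataclass


@dataclass
class CDNSource:
    """Copy of the dataclass above so the example runs on its own."""
    name: str
    base_url: str
    priority: int = 0
    enabled: bool = True
    use_minification: bool = False

    def format_url(self, year: int, gp: str, session: str, path: str) -> str:
        if self.use_minification and path.endswith(".json"):
            path = path.replace(".json", ".min.json")
        return f"{self.base_url}/{year}@main/{gp}/{session}/{path}"


base = "https://cdn.jsdelivr.net/gh/TracingInsights"
plain = CDNSource(name="jsdelivr", base_url=base)
minified = CDNSource(name="jsdelivr-min", base_url=base, use_minification=True)

# "Bahrain", "Race" and "laps.json" are placeholder values for illustration.
print(plain.format_url(2024, "Bahrain", "Race", "laps.json"))
# https://cdn.jsdelivr.net/gh/TracingInsights/2024@main/Bahrain/Race/laps.json
print(minified.format_url(2024, "Bahrain", "Race", "laps.json"))
# https://cdn.jsdelivr.net/gh/TracingInsights/2024@main/Bahrain/Race/laps.min.json
```

The year becomes part of the repository segment (`2024@main`), so each season resolves to its own repository under the base URL.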

Fallback Logic

When a CDN fails, try the next one:
# From cdn.py:118-147
def try_sources(self, year, gp, session, path, fetch_func):
    sources = self.get_sources()  # Only enabled sources
    
    for source in sources:
        try:
            url = source.format_url(year, gp, session, path)
            result = fetch_func(url)
            self.mark_success(source.name)  # Reset failure count
            return result
        except DataNotFoundError:
            raise  # 404 means data doesn't exist
        except Exception as e:
            logger.warning(f"CDN {source.name} failed: {e}")
            self.mark_failure(source.name)  # Increment failure count
    
    raise NetworkError(...)
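The fallback order can be exercised without any network access. The sketch below is a reduced model of the loop above, not tif1's code: the exception classes are local stand-ins for those in tif1.exceptions, and `try_urls`, `flaky_fetch`, and the example.com-style hosts are hypothetical.

```python
class DataNotFoundError(Exception):
    """Stand-in for tif1.exceptions.DataNotFoundError."""


class NetworkError(Exception):
    """Stand-in for tif1.exceptions.NetworkError."""


def try_urls(urls, fetch_func):
    """Try each URL in order; return the first result and the failures seen."""
    failed = []
    for url in urls:
        try:
            return fetch_func(url), failed
        except DataNotFoundError:
            raise  # a 404 is authoritative, so other CDNs are not tried
        except Exception:
            failed.append(url)  # transient failure: fall through to the next CDN
    raise NetworkError(f"all {len(urls)} sources failed")


def flaky_fetch(url):
    # Hypothetical fetcher: the primary host is down, the mirror works.
    if "primary" in url:
        raise ConnectionError("primary unreachable")
    return {"source": url}


result, failed = try_urls(
    ["https://primary.example/data.json", "https://mirror.example/data.json"],
    flaky_fetch,
)
print(result["source"])  # https://mirror.example/data.json
print(failed)            # ['https://primary.example/data.json']
```

Re-raising DataNotFoundError immediately avoids wasting requests on other CDNs when the data does not exist anywhere.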

Disabling Failed CDNs

# From cdn.py:102-108
def mark_failure(self, source_name: str):
    self._failure_counts[source_name] += 1
    if self._failure_counts[source_name] >= self._max_failures:
        logger.warning(
            f"CDN source '{source_name}' disabled after {self._max_failures} failures"
        )

Configuring CDNs

config = get_config()

# Set custom CDN list
config.set("cdns", [
    "https://cdn.jsdelivr.net/gh/TracingInsights",
    "https://cdn.example.com/f1data",
])

# Enable minification (experimental)
config.set("cdn_use_minification", True)
Warning: Never use raw.githubusercontent.com; it is rate-limited, and tif1 automatically skips these URLs. Use cdn.jsdelivr.net or another proper CDN.

Checking Status

Circuit Breaker State

from tif1.retry import get_circuit_breaker

cb = get_circuit_breaker()

# Check current state
print(f"State: {cb.state}")  # "closed", "open", or "half-open"
print(f"Failures: {cb.failures}")
print(f"Last failure: {cb.last_failure_time}")

# Check if requests will be blocked
should_proceed, state = cb.check_and_update_state()
if not should_proceed:
    print(f"Circuit breaker is {state}, requests blocked")

CDN Health

from tif1.cdn import get_cdn_manager

cdm = get_cdn_manager()

# Get enabled sources
sources = cdm.get_sources()
for source in sources:
    print(f"{source.name}: {source.base_url} (priority {source.priority})")

# Check failure counts
print(cdm._failure_counts)

Reset Circuit Breaker

Manually reset the circuit breaker:
from tif1.retry import reset_circuit_breaker

reset_circuit_breaker()
# Creates fresh circuit breaker with zero failures

Reset CDN Manager

from tif1.cdn import get_cdn_manager

cdm = get_cdn_manager()
cdm.reset()  # Reset all failure counts

Ultra Cold Start Mode

For the fastest possible first request, skip retries:
config = get_config()

# Skip retries on first request (default: True)
config.set("ultra_cold_skip_retries", True)

# This sets max_retries=0 for cold start
config.set("ultra_cold_start", True)  # default
When ultra_cold_skip_retries=True and the cache is cold:
# From async_fetch.py:418-492 (abridged)
if max_retries == 0:
    # Fast path: try first CDN only, no retries
    cdn_source = cdn_sources[0]
    try:
        # Single attempt
        response = await fetch(url)
        return data
    except Exception as e:
        # Try remaining CDNs without delay
        for cdn_source in cdn_sources[1:]:
            ...  # Fallback attempts (elided)
This provides the fastest possible response for the first request at the cost of reduced resilience.

Pool Exhaustion Handling

tif1 detects and handles connection pool exhaustion:
# From async_fetch.py:568-595
is_pool_exhaustion = False
error_msg = str(e).lower()
if any(keyword in error_msg for keyword in 
       ["pool", "connection pool", "max retries", "pool timeout"]):
    is_pool_exhaustion = True
    logger.warning(
        f"Connection pool exhaustion detected: {e}. Will retry with backoff."
    )

if is_pool_exhaustion:
    # Add immediate backoff
    pool_backoff = min(
        pool_backoff_base * (2 ** attempt_num), 
        pool_backoff_max
    )
    if use_jitter:
        pool_backoff += random.uniform(0, pool_backoff_jitter)
    await asyncio.sleep(pool_backoff)
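The two steps above, keyword detection and capped doubling, can be tried in isolation. In this sketch `looks_like_pool_exhaustion` and `pool_backoff` are hypothetical helpers that mirror the excerpt; the defaults 0.01 and 0.5 are the documented defaults, and `attempt_num` is assumed to start at 0.

```python
POOL_KEYWORDS = ("pool", "connection pool", "max retries", "pool timeout")


def looks_like_pool_exhaustion(error: Exception) -> bool:
    """Substring match on the lowercased message, as in the excerpt above."""
    message = str(error).lower()
    return any(keyword in message for keyword in POOL_KEYWORDS)


def pool_backoff(attempt_num: int, base: float = 0.01, cap: float = 0.5) -> float:
    """Un-jittered pool backoff: the base doubles per attempt until it hits the cap."""
    return min(base * (2 ** attempt_num), cap)


print(looks_like_pool_exhaustion(RuntimeError("Connection pool is full")))  # True
print(looks_like_pool_exhaustion(RuntimeError("DNS lookup failed")))        # False
print([pool_backoff(n) for n in range(7)])
# [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.5]
```

In the excerpt the jitter is added after the cap is applied, so an actual sleep can exceed pool_exhaustion_backoff_max by up to pool_exhaustion_backoff_jitter.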

Pool Configuration

config = get_config()

# Pool exhaustion backoff base in seconds (default: 0.01)
config.set("pool_exhaustion_backoff_base", 0.5)

# Max pool backoff in seconds (default: 0.5)
config.set("pool_exhaustion_backoff_max", 5.0)

# Jitter amount for pool backoff (default: 0.01)
config.set("pool_exhaustion_backoff_jitter", 0.5)

Error Handling

Exception Types

tif1 raises specific exceptions for different failure scenarios:
import tif1
from tif1.exceptions import (
    NetworkError,      # Network/connection failures
    DataNotFoundError, # 404 - data doesn't exist
    InvalidDataError,  # Corrupted/invalid data
)

try:
    session = tif1.get_session(2024, "Bahrain", "Race")
except DataNotFoundError:
    print("Session data not found (404)")
except NetworkError as e:
    print(f"Network error: {e}")
    print(f"Status code: {e.status_code}")
except InvalidDataError:
    print("Data validation failed")

Fatal vs Retryable Errors

Fatal (no retry):
  • DataNotFoundError (404) - data doesn’t exist
  • InvalidDataError - corrupted data
Retryable:
  • NetworkError - connection issues
  • TimeoutError - request timeout
  • HTTP 5xx errors - server issues
# From async_fetch.py:547-549
except (DataNotFoundError, InvalidDataError) as e:
    # Fatal errors - don't retry
    return None, e
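The fatal/retryable split can be sketched as a small retry loop. This is a simplified illustration rather than tif1's fetch logic: the exception classes are local stand-ins for those in tif1.exceptions, `fetch_with_retries` is a hypothetical helper, it treats every non-fatal exception as retryable, and it omits the backoff sleep.

```python
class DataNotFoundError(Exception):
    """Stand-in for tif1.exceptions.DataNotFoundError."""


class InvalidDataError(Exception):
    """Stand-in for tif1.exceptions.InvalidDataError."""


class NetworkError(Exception):
    """Stand-in for tif1.exceptions.NetworkError."""


FATAL = (DataNotFoundError, InvalidDataError)


def fetch_with_retries(fetch, max_attempts: int = 3):
    """Return (data, error); stop at the first fatal error, retry anything else."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return fetch(), None
        except FATAL as e:
            return None, e      # fatal: retrying cannot help
        except Exception as e:
            last_error = e      # retryable: try again (sleep omitted here)
    return None, last_error


calls = {"missing": 0, "flaky": 0}


def missing():
    calls["missing"] += 1
    raise DataNotFoundError("404")


def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise NetworkError("connection reset")
    return {"ok": True}


data, err = fetch_with_retries(missing)
print(calls["missing"], type(err).__name__)  # 1 DataNotFoundError

data, err = fetch_with_retries(flaky)
print(calls["flaky"], data, err)             # 3 {'ok': True} None
```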

Best Practices

Use Defaults

The default settings (5 failures, 60-second timeout) work well for most cases.

Monitor Logs

Watch for circuit breaker warnings in the logs.

Handle 404s

DataNotFoundError means the data genuinely doesn’t exist, so retrying will not help.

Don't Disable

Keep the circuit breaker enabled for resilience.

Troubleshooting

Circuit Breaker Keeps Opening

import logging

from tif1 import get_config
from tif1.retry import get_circuit_breaker

logging.basicConfig(level=logging.WARNING)

# Check what's failing
cb = get_circuit_breaker()
print(f"Failures: {cb.failures}/{cb.threshold}")
print(f"Last failure: {cb.last_failure_time}")

# Increase threshold if network is flaky
config = get_config()
config.set("circuit_breaker_threshold", 10)

Retries Taking Too Long

from tif1 import get_config

config = get_config()

# Reduce max retries
config.set("max_retries", 2)

# Reduce max delay
config.set("max_retry_delay", 30.0)

# Reduce backoff factor
config.set("retry_backoff_factor", 1.5)

All CDNs Failing

from tif1.cdn import get_cdn_manager

cdm = get_cdn_manager()
print("Enabled sources:", cdm.get_sources())
print("Failure counts:", cdm._failure_counts)

# Reset and try again
cdm.reset()

Next Steps

Performance

Learn about performance optimization

Validation

Configure data validation

Configuration

Full configuration reference
