Spider Blocked Request Detection
The Spider class has built-in support for detecting and handling blocked requests.
Default Blocked Status Codes
Scrapling automatically treats certain HTTP status codes as blocked responses:
scrapling/spiders/spider.py:16
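The pattern behind this check is a simple set-membership test. The sketch below is illustrative, not Scrapling's code: the actual list of codes lives at the file reference above, and the codes shown here (403, 429, 503 are common block signals) are an assumption.

```python
# Status codes commonly used to signal a block (illustrative assumption --
# the authoritative set lives in scrapling/spiders/spider.py:16).
DEFAULT_BLOCKED_STATUS_CODES = {403, 429, 503}

def looks_blocked(status_code: int) -> bool:
    """Return True if the status code suggests the request was blocked."""
    return status_code in DEFAULT_BLOCKED_STATUS_CODES
```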
Custom Block Detection
Override the is_blocked() method to implement custom detection logic:
scrapling/spiders/spider.py:190-194
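A custom check typically combines status codes with body heuristics, since many sites return 200 with a CAPTCHA page. The sketch below uses a minimal stand-in Response class rather than Scrapling's real response object, so the attribute names are assumptions; the detection logic itself is the point.

```python
class Response:
    """Minimal stand-in for a fetched page (illustrative only)."""
    def __init__(self, status: int, text: str):
        self.status = status
        self.text = text

# Phrases that often appear on block/challenge pages.
BLOCK_MARKERS = ("captcha", "access denied", "unusual traffic")

def is_blocked(response: Response) -> bool:
    """Combine status-code checks with body-content heuristics."""
    if response.status in {403, 429, 503}:
        return True
    body = response.text.lower()
    return any(marker in body for marker in BLOCK_MARKERS)
```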
Retry Configuration
Max Blocked Retries
Control how many times a blocked request is retried:
scrapling/spiders/spider.py:79
Retry with Modified Request
Customize the request before retrying:
scrapling/spiders/spider.py:196-198
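A common way to modify a retried request is to rotate the User-Agent and back off between attempts. This is a generic sketch under assumed names, not Scrapling's hook signature; see the file reference above for the real method.

```python
import random

# Illustrative User-Agent pool (assumption, not Scrapling's list).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

def modify_request_for_retry(request: dict, attempt: int) -> dict:
    """Return a copy of the request tweaked for the next retry attempt."""
    retried = dict(request)
    headers = dict(retried.get("headers", {}))
    headers["User-Agent"] = random.choice(USER_AGENTS)  # rotate identity
    retried["headers"] = headers
    retried["delay"] = 2 ** attempt  # exponential backoff before retrying
    return retried
```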
Proxy Error Detection
Scrapling automatically detects proxy-related errors:
scrapling/engines/toolbelt/proxy_rotation.py:7-15
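Proxy-error detection generally works by matching the exception message against known patterns. The patterns below are illustrative assumptions; the actual list Scrapling uses is at the file reference above.

```python
# Substrings that commonly appear in proxy-related failures
# (illustrative -- see scrapling/engines/toolbelt/proxy_rotation.py:7-15).
PROXY_ERROR_PATTERNS = (
    "proxy",
    "tunnel connection failed",
    "407",
    "connection refused",
    "connection reset",
)

def is_proxy_error(error: Exception) -> bool:
    """Heuristically classify an exception as proxy-related."""
    message = str(error).lower()
    return any(pattern in message for pattern in PROXY_ERROR_PATTERNS)
```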
Automatic Proxy Rotation on Failure
When using ProxyRotator, Scrapling automatically rotates proxies on errors:
scrapling/engines/_browsers/_stealth.py:269-283
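The essence of rotation-on-failure is a round-robin pool that advances whenever the current proxy errors out. This minimal sketch is not Scrapling's ProxyRotator; it only illustrates the mechanic implemented at the file reference above.

```python
from itertools import cycle

class SimpleProxyRotator:
    """Round-robin proxy pool (illustrative sketch, not Scrapling's class)."""
    def __init__(self, proxies):
        self._pool = cycle(proxies)      # endless round-robin iterator
        self.current = next(self._pool)  # proxy used for the next request

    def rotate(self):
        """Advance to the next proxy, e.g. after a proxy error."""
        self.current = next(self._pool)
        return self.current
```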
Fetcher-Level Retry Logic
Automatic Retries
Fetchers automatically retry failed requests:
scrapling/engines/_browsers/_validators.py:88-89
Retry Loop Implementation
Here’s how Scrapling retries internally:
scrapling/engines/_browsers/_stealth.py:478-539
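Scrapling's actual loop lives at the file reference above. As a rough, framework-agnostic sketch of the same pattern (all names here are illustrative), a retry loop catches failures, waits with exponential backoff, and re-raises once attempts are exhausted:

```python
import time

def fetch_with_retries(fetch, url, max_retries=3, base_delay=1.0):
    """Call fetch(url), retrying on exceptions with exponential backoff."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except Exception as error:
            last_error = error
            if attempt < max_retries:
                # 1s, 2s, 4s, ... between attempts
                time.sleep(base_delay * (2 ** attempt))
    raise last_error  # all attempts failed
```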
Error Handling Hooks
On Error Callback
Handle errors in spiders:
scrapling/spiders/spider.py:178-184
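The hook pattern is simple: run the work, and route any exception into a user-supplied callback instead of crashing the crawl. This sketch uses assumed names and is not Scrapling's hook signature; see the file reference above for the real one.

```python
def run_with_error_hook(task, on_error):
    """Run task(); deliver any exception to the on_error callback."""
    try:
        return task()
    except Exception as error:
        on_error(error)   # e.g. log, count, or re-queue the request
        return None
```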
Complete Example
Combining all features:
Session-Level Error Handling
For standalone fetchers:
Best Practices
Implement Custom Block Detection
Don’t rely only on status codes:
Use Escalating Retry Strategy
Start simple, escalate to advanced techniques:
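One way to express an escalating strategy is a plan that maps the attempt number to increasingly aggressive settings: a plain retry first, then a wait, then stealth mode, then stealth plus a proxy. All keys and thresholds below are illustrative assumptions, not Scrapling configuration.

```python
def escalation_plan(attempt: int) -> dict:
    """Pick increasingly aggressive settings per retry attempt (sketch)."""
    if attempt == 0:
        return {"stealth": False, "proxy": False, "delay": 0}    # first try
    if attempt == 1:
        return {"stealth": False, "proxy": False, "delay": 5}    # just wait
    if attempt == 2:
        return {"stealth": True, "proxy": False, "delay": 10}    # stealth mode
    return {"stealth": True, "proxy": True, "delay": 30}         # stealth + proxy
```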
Monitor Error Rates
Track and alert on high error rates:
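A sliding-window monitor is enough for this: record each outcome, compute the error rate over the last N requests, and alert past a threshold. This is a generic sketch, not a Scrapling facility.

```python
from collections import deque

class ErrorRateMonitor:
    """Track recent outcomes; flag when the error rate exceeds a threshold."""
    def __init__(self, window: int = 100, threshold: float = 0.5):
        self._outcomes = deque(maxlen=window)  # True = success, False = error
        self._threshold = threshold

    def record(self, success: bool) -> None:
        self._outcomes.append(success)

    @property
    def error_rate(self) -> float:
        if not self._outcomes:
            return 0.0
        return self._outcomes.count(False) / len(self._outcomes)

    def should_alert(self) -> bool:
        return self.error_rate > self._threshold
```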
Combine with Proxy Rotation
Proxies help avoid IP-based blocking:
Related Documentation
Anti-Bot Bypass
Bypass anti-bot systems
Cloudflare Turnstile
Solve Cloudflare challenges
Error Handling
Complete error handling guide
Performance
Optimize retry performance