Job board websites rate-limit requests from the same IP address. When you exceed their threshold, you receive a 429 Too Many Requests response and scraping stops. Proxies let you rotate IP addresses to reduce this risk.
Which boards need proxies
| Board | Rate limiting | Recommendation |
|---|---|---|
| Indeed | Minimal | Not required |
| ZipRecruiter | Moderate | Optional |
| Glassdoor | Moderate | Optional |
| LinkedIn | Highly restrictive | Strongly recommended |
LinkedIn typically rate-limits around the 10th page of results from a single IP address. If you need more than ~100 LinkedIn results per run, proxies are effectively required.
The proxies parameter
Pass a list of proxy strings to scrape_jobs() via the proxies parameter. Each scraper rotates through the list in round-robin order.
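As an illustration of round-robin order, here is a minimal sketch using `itertools.cycle`. It is not JobSpy's internal code; the `rotations` mapping and `next_proxy` helper are hypothetical names, and the proxy addresses are placeholders from a documentation-only IP range.

```python
from itertools import cycle

# Placeholder proxies (203.0.113.0/24 is reserved for documentation).
proxies = [
    "user:pass@203.0.113.10:8080",
    "user:pass@203.0.113.11:8080",
    "localhost",
]

# Assumption: one independent cycle per site, so each scraper
# advances through the list without affecting the others.
rotations = {site: cycle(proxies) for site in ("linkedin", "indeed")}


def next_proxy(site: str) -> str:
    """Return the next proxy for the given site, wrapping around at the end."""
    return next(rotations[site])


# Four LinkedIn requests walk the list once and wrap back to the start.
linkedin_picks = [next_proxy("linkedin") for _ in range(4)]

# Indeed has its own cycle, so its first request still uses the first proxy.
indeed_pick = next_proxy("indeed")
```

After this runs, `linkedin_picks` ends on the first proxy again, while `indeed_pick` is unaffected by the LinkedIn requests.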
Proxy format
Proxies are strings in the format user:pass@host:port. You can also use "localhost" to represent a direct (no-proxy) connection slot in the rotation.
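A minimal sketch of a call that uses this format might look like the following. The proxy addresses and credentials are placeholders, the wrapper function `scrape_linkedin_with_proxies` is hypothetical, and the `site_name` and `search_term` arguments are assumed from JobSpy's general usage rather than from this page.

```python
# Placeholder proxies in user:pass@host:port form, plus one direct slot.
PROXIES = [
    "user:pass@203.0.113.10:8080",
    "user:pass@203.0.113.11:8080",
    "localhost",  # direct (no-proxy) connection slot in the rotation
]


def scrape_linkedin_with_proxies():
    """Run one LinkedIn search, rotating through PROXIES."""
    # Imported inside the function so this module loads even when
    # the jobspy package is not installed.
    from jobspy import scrape_jobs

    return scrape_jobs(
        site_name=["linkedin"],           # assumed parameter name
        search_term="software engineer",  # assumed parameter name
        results_wanted=50,
        proxies=PROXIES,
    )
```

Including "localhost" in the list means some requests still originate from your own IP address, so leave it out if that address is already rate-limited.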
Single proxy
You can also pass a single proxy as a string instead of a list.
How rotation works
Each scraper instance gets its own rotating proxy session. When a scraper makes a request, it advances to the next proxy in the cycle. If one proxy is blocked, the next request automatically uses a different proxy. This means that if you scrape four sites simultaneously, each site rotates through the proxy list independently.
CA certificate for proxies
Some corporate or SSL-intercepting proxies require a custom CA certificate for HTTPS inspection. Pass the path to the certificate file via ca_cert.
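Because a wrong certificate path tends to surface as an opaque SSL error, it can help to check the file before starting a run. The sketch below uses a hypothetical helper, `ca_cert_option`, which is not part of JobSpy; it only builds the `ca_cert` keyword argument once the file is confirmed to exist.

```python
from pathlib import Path


def ca_cert_option(cert_path: str) -> dict:
    """Return {'ca_cert': path} after confirming the certificate file exists."""
    path = Path(cert_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"CA certificate not found: {path}")
    return {"ca_cert": str(path)}
```

The returned dictionary can then be unpacked into the scrape_jobs() call alongside proxies.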
Overriding the user agent
The default user agent string may become outdated as job boards update their bot detection. Use user_agent to override it with a current browser user agent.
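As a rough sketch, an override could look like this. The user agent string is only an example of the general shape and will itself go stale, the wrapper function is hypothetical, and `site_name` and `search_term` are assumed from JobSpy's general usage rather than from this page.

```python
# Example desktop Chrome user agent; replace it with a string
# copied from a current browser.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)


def scrape_with_custom_user_agent():
    """Run one search with the overridden user agent."""
    # Imported inside the function so this module loads even when
    # the jobspy package is not installed.
    from jobspy import scrape_jobs

    return scrape_jobs(
        site_name=["indeed"],        # assumed parameter name
        search_term="data analyst",  # assumed parameter name
        results_wanted=20,
        user_agent=USER_AGENT,
    )
```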
Handling rate limit errors
A 429 response means the job board has temporarily blocked your IP. When this happens:
- Wait before scraping again. The required wait time depends on the site: LinkedIn may need several minutes, while other boards recover faster.
- Add more proxies to the rotation to distribute requests across more IP addresses.
- Reduce results_wanted to make fewer requests per run.
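The "wait before scraping again" advice can be sketched as a retry loop with exponential backoff. This page does not say how a 429 is surfaced to the caller, so the `RateLimited` exception below is a hypothetical stand-in, and `run_with_backoff` is an illustrative helper rather than part of JobSpy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that a run was blocked with a 429 response."""


def run_with_backoff(
    run: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call run(), waiting base_delay * 2**attempt seconds after each block."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return run()
        except RateLimited:
            # Give up after the final attempt instead of sleeping again.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the defaults, the waits are 60, 120, and 240 seconds. The `sleep` parameter is injectable so the loop can be exercised without actually pausing.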
