Overview
Concurrent requests allow KrakenD to make the same backend call multiple times in parallel, returning the fastest successful response. This improves performance and provides fault tolerance when backends have variable latency.
Example: Market Data
The /market/concurrent endpoint demonstrates concurrent calls to fetch cryptocurrency market data.
Try It
Configuration
With Concurrent Calls
From config/krakend/krakend.json:
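The referenced file isn't reproduced in this page; a minimal sketch of what the endpoint entry looks like with concurrent calls enabled (the host, url_pattern, and timeout values here are illustrative assumptions, not the repository's actual values):

```json
{
  "endpoint": "/market/concurrent",
  "method": "GET",
  "concurrent_calls": 3,
  "timeout": "1s",
  "backend": [
    {
      "host": ["http://market-api:8080"],
      "url_pattern": "/v1/market"
    }
  ]
}
```

concurrent_calls sits at the endpoint level, so KrakenD fires that many copies of each backend request.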
Without Concurrent Calls
The same endpoint can be configured without the concurrent_calls parameter for comparison.
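For comparison, a sketch of the same endpoint shape with the parameter simply omitted (the path and host are again assumptions):

```json
{
  "endpoint": "/market/single",
  "method": "GET",
  "backend": [
    {
      "host": ["http://market-api:8080"],
      "url_pattern": "/v1/market"
    }
  ]
}
```

Without concurrent_calls, each incoming request produces exactly one backend call.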
How It Works
Concurrent Calls Parameter
With concurrent_calls: 3, KrakenD will:
- Make 3 identical requests to the same backend simultaneously
- Race the responses against each other
- Return the first successful response
- Discard the slower responses
Execution Timeline
Without Concurrent Calls:
Performance Benefits
Latency Reduction
With variable backend latency, concurrent calls reduce the 95th percentile response time:
| Metric | Single Call | 3 Concurrent Calls |
|---|---|---|
| Min | 150ms | 150ms |
| Avg | 200ms | 170ms |
| P95 | 300ms | 220ms |
| P99 | 400ms | 250ms |
Fault Tolerance
If one backend call fails, the others can still succeed.
When to Use Concurrent Calls
✅ Good Use Cases
- Variable Latency Backends
  - Cloud services with inconsistent performance
  - Geographically distributed backends
  - Services with occasional slow responses
- Critical Endpoints
  - High-value transactions
  - Real-time data requirements
  - User-facing APIs where latency matters
- Unreliable Networks
  - Mobile backends
  - Cross-region communication
  - Microservices with network jitter
❌ When Not to Use
- Write Operations
  - Creates duplicate records
  - Causes race conditions
  - Wastes backend resources
- Rate-Limited APIs
  - Consumes rate limit quota faster
  - May trigger rate limiting
  - Expensive API calls
- Heavy Operations
  - Database writes
  - File uploads
  - Complex computations
Trade-offs
Advantages
- Lower Latency - Returns fastest response
- Higher Reliability - Tolerates individual failures
- Better User Experience - Faster perceived performance
Disadvantages
- Increased Backend Load - N times more requests
- Higher Network Usage - Multiple redundant calls
- Cost Implications - More API calls = higher costs
Choosing the Right Number
Recommendation
Start with concurrent_calls: 3 and adjust based on:
- Backend latency variance
- Cost sensitivity
- Performance requirements
Comparison Table
| Feature | Concurrent Requests | Concurrent Backends |
|---|---|---|
| Purpose | Same call multiple times | Different calls in parallel |
| Backends | 1 backend | Multiple backends |
| Response | Fastest response | Aggregated response |
| Use Case | Reduce latency | Merge data sources |
| Config | concurrent_calls: 3 | Multiple backend entries |
Combining Strategies
You can use concurrent requests WITH data aggregation: a single endpoint can make 2 concurrent calls each to /users/{id} and /orders/{id}, then aggregate the fastest responses.
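A hedged sketch of such a combined configuration (the endpoint path and hosts are assumptions):

```json
{
  "endpoint": "/users/{id}/summary",
  "method": "GET",
  "concurrent_calls": 2,
  "backend": [
    { "host": ["http://users-api:8080"], "url_pattern": "/users/{id}" },
    { "host": ["http://orders-api:8080"], "url_pattern": "/orders/{id}" }
  ]
}
```

Each backend is raced independently; the fastest response from each race is then merged into the aggregated payload.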
Monitoring
Use Jaeger tracing to visualize concurrent requests.
Best Practices
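If tracing isn't already wired up, a sketch of a KrakenD telemetry block exporting to Jaeger (the endpoint URL and service name are assumptions):

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opencensus": {
      "sample_rate": 100,
      "reporting_period": 1,
      "exporters": {
        "jaeger": {
          "endpoint": "http://jaeger:14268/api/traces",
          "service_name": "krakend"
        }
      }
    }
  }
}
```

In the Jaeger UI, each concurrent attempt should appear as its own backend span under the endpoint's trace, so you can see which copy won.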
1. Use for Read Operations Only
2. Consider Backend Costs
If your backend charges per request, every concurrent call is billed, so weigh the latency gain against the extra cost.
3. Monitor Backend Load
Concurrent calls triple backend load. Ensure your backend can handle it.
4. Set Appropriate Timeouts
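The endpoint-level timeout bounds the whole race: if no copy answers in time, the request fails. A sketch with assumed values to tune for your backends:

```json
{
  "endpoint": "/market/concurrent",
  "concurrent_calls": 3,
  "timeout": "800ms",
  "backend": [
    { "host": ["http://market-api:8080"], "url_pattern": "/v1/market" }
  ]
}
```

Set the timeout just above your backend's healthy P99 so slow outliers are cut off while normal responses still win the race.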
Learn More
- KrakenD Concurrent Requests Docs
- Data Aggregation - Calling multiple different backends
- Backend Caching - Alternative latency reduction strategy