
Overview

Concurrent requests allow KrakenD to make the same backend call multiple times in parallel, returning the fastest successful response. This improves performance and provides fault tolerance when backends have variable latency.

Example: Market Data

The /market/concurrent endpoint demonstrates concurrent calls to fetch cryptocurrency market data.

Try It

# Concurrent endpoint (3 parallel calls)
curl http://localhost:8080/market/concurrent

# Regular endpoint (1 call) for comparison
curl http://localhost:8080/market/simple

The concurrent endpoint should respond faster, especially under load.

Configuration

With Concurrent Calls

From config/krakend/krakend.json:
{
  "endpoint": "/market/concurrent",
  "concurrent_calls": 3,
  "backend": [
    {
      "host": ["https://api.coingecko.com"],
      "url_pattern": "/api/v3/coins/markets?vs_currency=eur&ids=bitcoin%2Cethereum&order=market_cap_desc&per_page=100&page=1&sparkline=false",
      "encoding": "safejson",
      "mapping": {
        "collection": "coins"
      }
    }
  ]
}

Without Concurrent Calls

{
  "endpoint": "/market/simple",
  "backend": [
    {
      "host": ["https://api.coingecko.com"],
      "url_pattern": "/api/v3/coins/markets?vs_currency=eur&ids=bitcoin%2Cethereum&order=market_cap_desc&per_page=100&page=1&sparkline=false",
      "encoding": "safejson",
      "mapping": {
        "collection": "coins"
      }
    }
  ]
}

The only difference between the two endpoints is the concurrent_calls parameter.

How It Works

Concurrent Calls Parameter

"concurrent_calls": 3

This tells KrakenD to:
  1. Fire 3 identical requests to the same backend simultaneously
  2. Return the first successful response as soon as it arrives
  3. Discard the remaining responses (including any failures)
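The steps above can be sketched in Python (a simplified model of the race, not KrakenD's actual Go implementation):

```python
import asyncio
import random

async def backend_call(i: int) -> dict:
    """Simulate one identical backend request with variable latency."""
    latency = random.uniform(0.05, 0.25)
    await asyncio.sleep(latency)
    return {"call": i, "latency_ms": round(latency * 1000)}

async def race(n: int = 3) -> dict:
    """Fire n identical calls, return the first to finish, discard the rest."""
    tasks = [asyncio.create_task(backend_call(i)) for i in range(n)]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()           # slower responses are thrown away
    return done.pop().result()  # fastest response wins

print(asyncio.run(race(3)))
```

The client only ever sees the winning response; the losers never leave the gateway.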

Execution Timeline

Without Concurrent Calls:

Request → Backend (250ms) → Response

With Concurrent Calls (3):

Request → Backend Call 1 (250ms) → ✗ Discarded
       → Backend Call 2 (180ms) → ✓ Returned
       → Backend Call 3 (220ms) → ✗ Discarded

The client receives the response in 180ms instead of 250ms.

Performance Benefits

Latency Reduction

With variable backend latency, concurrent calls cut tail latency (P95/P99) noticeably more than the average:

| Metric | Single Call | 3 Concurrent Calls |
|--------|-------------|--------------------|
| Min    | 150ms       | 150ms              |
| Avg    | 200ms       | 170ms              |
| P95    | 300ms       | 220ms              |
| P99    | 400ms       | 250ms              |
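The tail-latency effect of taking the fastest of several samples can be checked with a quick simulation (illustrative numbers and a made-up latency model, not measurements of a real backend):

```python
import random

random.seed(42)

def latency():
    # Crude model of a variable backend: 150 ms floor, exponential
    # jitter, and a 5% chance of an extra 200 ms slow response.
    return 150 + random.expovariate(1 / 50) + (random.random() < 0.05) * 200

def p95(samples):
    return sorted(samples)[int(len(samples) * 0.95)]

single = [latency() for _ in range(10_000)]
concurrent = [min(latency() for _ in range(3)) for _ in range(10_000)]

print(f"single P95:     {p95(single):.0f} ms")
print(f"3-way race P95: {p95(concurrent):.0f} ms")
```

The minimum of three draws rarely lands in the slow tail, which is why P95 and P99 improve more than the minimum does.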

Fault Tolerance

If one backend call fails, the others can still succeed:
Request → Backend Call 1 (timeout) → ✗ Failed
       → Backend Call 2 (200ms)    → ✓ Returned
       → Backend Call 3 (210ms)    → ✗ Discarded
The request succeeds even though one call timed out.
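The same race also masks individual failures. A small thread-based sketch (simulated delays and outcomes, not real backend calls):

```python
import concurrent.futures
import time

def call(behaviour):
    """Simulate one backend call: sleep, then fail or succeed."""
    delay, outcome = behaviour
    time.sleep(delay)
    if outcome == "timeout":
        raise TimeoutError("backend call timed out")
    return f"ok after {int(delay * 1000)} ms"

# Call 1 fails fastest; calls 2 and 3 succeed later.
behaviours = [(0.05, "timeout"), (0.08, "ok"), (0.09, "ok")]

result = None
with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(call, b) for b in behaviours]
    for fut in concurrent.futures.as_completed(futures):
        if fut.exception() is None:
            result = fut.result()  # first success wins; the failure is skipped
            break

print(result)
```

The failed call finishes first but is ignored; the first success is returned instead.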

When to Use Concurrent Calls

✅ Good Use Cases

  1. Variable Latency Backends
    • Cloud services with inconsistent performance
    • Geographically distributed backends
    • Services with occasional slow responses
  2. Critical Endpoints
    • High-value transactions
    • Real-time data requirements
    • User-facing APIs where latency matters
  3. Unreliable Networks
    • Mobile backends
    • Cross-region communication
    • Microservices with network jitter

❌ When Not to Use

  1. Write Operations
    • Creates duplicate records
    • Causes race conditions
    • Wastes backend resources
  2. Rate-Limited APIs
    • Consumes rate limit quota faster
    • May trigger rate limiting
    • Expensive API calls
  3. Heavy Operations
    • Database writes
    • File uploads
    • Complex computations

Trade-offs

Advantages

  • Lower Latency - Returns fastest response
  • Higher Reliability - Tolerates individual failures
  • Better User Experience - Faster perceived performance

Disadvantages

  • Increased Backend Load - N times more requests
  • Higher Network Usage - Multiple redundant calls
  • Cost Implications - More API calls = higher costs

Choosing the Right Number

"concurrent_calls": 2  // Minimal overhead, some improvement
"concurrent_calls": 3  // Balanced approach (recommended)
"concurrent_calls": 5  // High reliability, higher cost

Recommendation

Start with concurrent_calls: 3 and adjust based on:
  • Backend latency variance
  • Cost sensitivity
  • Performance requirements

Comparison Table

| Feature  | Concurrent Requests      | Concurrent Backends         |
|----------|--------------------------|-----------------------------|
| Purpose  | Same call multiple times | Different calls in parallel |
| Backends | 1 backend                | Multiple backends           |
| Response | Fastest response         | Aggregated response         |
| Use Case | Reduce latency           | Merge data sources          |
| Config   | concurrent_calls: 3      | Multiple backend entries    |

Example of Concurrent Backends (different concept):
{
  "endpoint": "/user-profile",
  "backend": [
    {
      "url_pattern": "/users/{id}",
      "group": "user"
    },
    {
      "url_pattern": "/orders/{id}",
      "group": "orders"
    }
  ]
}
This calls two different endpoints in parallel and merges results. This is data aggregation, not concurrent requests.

Combining Strategies

You can use concurrent requests WITH data aggregation:
{
  "endpoint": "/user-dashboard",
  "concurrent_calls": 2,
  "backend": [
    {
      "url_pattern": "/users/{id}",
      "group": "user"
    },
    {
      "url_pattern": "/orders/{id}",
      "group": "orders"
    }
  ]
}
This makes 2 concurrent calls to /users/{id} AND 2 concurrent calls to /orders/{id}, then aggregates the fastest responses.

Monitoring

Use Jaeger tracing to visualize concurrent requests:
GET /market/concurrent
├── Backend Call 1 (250ms)
├── Backend Call 2 (180ms) ← Used
└── Backend Call 3 (220ms)
Traces show all parallel calls and which was returned.

Best Practices

1. Use for Read Operations Only

// ✓ Safe - GET request
{
  "endpoint": "/product/{id}",
  "concurrent_calls": 3,
  "backend": [{
    "url_pattern": "/products/{id}"
  }]
}

// ✗ Dangerous - POST request
{
  "endpoint": "/orders",
  "method": "POST",
  "concurrent_calls": 3,  // Creates 3 orders!
  "backend": [{
    "url_pattern": "/orders"
  }]
}

2. Consider Backend Costs

If your backend charges per request:

Single call cost: $0.001
Concurrent calls (3): $0.003
1M requests/month: $3,000 vs $1,000

Balance cost against performance needs.
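The arithmetic behind those figures is just a straight multiplication (example prices, not a real billing model):

```python
cost_per_call = 0.001        # $ per backend request (illustrative figure)
monthly_requests = 1_000_000

for n in (1, 3):
    cost = monthly_requests * n * cost_per_call
    print(f"concurrent_calls={n}: ${cost:,.0f}/month")
```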

3. Monitor Backend Load

Concurrent calls multiply backend load by the configured factor. Ensure your backend can absorb it:

Regular load: 1,000 req/s
With concurrent_calls: 3: 3,000 req/s

4. Set Appropriate Timeouts

{
  "endpoint": "/market/concurrent",
  "timeout": "2s",
  "concurrent_calls": 3,
  "backend": [
    {
      "host": ["https://api.example.com"],
      "url_pattern": "/data"
    }
  ]
}
If all 3 calls exceed the timeout, the request fails.
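The timeout interaction can be sketched with the same race model (scaled-down delays, a simplified stand-in for KrakenD's behaviour):

```python
import concurrent.futures
import time

TIMEOUT = 0.1  # like "timeout": "2s", scaled down for the demo

def slow_call(delay):
    time.sleep(delay)
    return "ok"

def race_with_timeout(delays):
    """Return the first success within TIMEOUT, or None if all are too slow."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(slow_call, d) for d in delays]
        try:
            for fut in concurrent.futures.as_completed(futures, timeout=TIMEOUT):
                if fut.exception() is None:
                    return fut.result()
        except concurrent.futures.TimeoutError:
            pass
        return None  # every call exceeded the timeout -> the request fails

print(race_with_timeout([0.2, 0.3, 0.25]))   # all too slow: request fails
print(race_with_timeout([0.2, 0.05, 0.25]))  # one call beats the timeout
```

Only when every parallel call blows the deadline does the client see an error; a single fast call is enough to succeed.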
