
Overview

Concurrent requests allow KrakenD to make the same backend call multiple times in parallel, returning the fastest successful response. This improves performance and provides fault tolerance when backends have variable latency.

Example: Market Data

The /market/concurrent endpoint demonstrates concurrent calls to fetch cryptocurrency market data.

Try It

# Concurrent endpoint (3 parallel calls)
curl http://localhost:8080/market/concurrent

# Regular endpoint (1 call) for comparison
curl http://localhost:8080/market/simple

The concurrent endpoint should respond faster, especially under load.

Configuration

With Concurrent Calls

From config/krakend/krakend.json:
{
  "endpoint": "/market/concurrent",
  "concurrent_calls": 3,
  "backend": [
    {
      "host": ["https://api.coingecko.com"],
      "url_pattern": "/api/v3/coins/markets?vs_currency=eur&ids=bitcoin%2Cethereum&order=market_cap_desc&per_page=100&page=1&sparkline=false",
      "encoding": "safejson",
      "mapping": {
        "collection": "coins"
      }
    }
  ]
}

Without Concurrent Calls

{
  "endpoint": "/market/simple",
  "backend": [
    {
      "host": ["https://api.coingecko.com"],
      "url_pattern": "/api/v3/coins/markets?vs_currency=eur&ids=bitcoin%2Cethereum&order=market_cap_desc&per_page=100&page=1&sparkline=false",
      "encoding": "safejson",
      "mapping": {
        "collection": "coins"
      }
    }
  ]
}

The only difference between the two endpoints is the concurrent_calls parameter.

How It Works

Concurrent Calls Parameter

"concurrent_calls": 3

This tells KrakenD to:
  1. Fire 3 identical requests to the same backend simultaneously
  2. Return the first successful response as soon as it arrives
  3. Discard the remaining responses (including any failures)
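The steps above can be sketched in Python (a simplified model of the race, not KrakenD's actual Go implementation):

```python
import asyncio
import random

async def backend_call(i: int) -> dict:
    """Simulate one identical backend request with variable latency."""
    latency = random.uniform(0.05, 0.25)
    await asyncio.sleep(latency)
    return {"call": i, "latency_ms": round(latency * 1000)}

async def race(n: int = 3) -> dict:
    """Fire n identical calls, return the first to finish, discard the rest."""
    tasks = [asyncio.create_task(backend_call(i)) for i in range(n)]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()           # slower responses are thrown away
    return done.pop().result()  # fastest response wins

print(asyncio.run(race(3)))
```

The client only ever sees the winning response; the losers never leave the gateway.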

Execution Timeline

Without Concurrent Calls:

Request → Backend (250ms) → Response

With Concurrent Calls (3):

Request → Backend Call 1 (250ms) → ✗ Discarded
       → Backend Call 2 (180ms) → ✓ Returned
       → Backend Call 3 (220ms) → ✗ Discarded

The client receives the response in 180ms instead of 250ms.

Performance Benefits

Latency Reduction

With variable backend latency, concurrent calls cut tail latency (P95/P99) noticeably more than the average:

| Metric | Single Call | 3 Concurrent Calls |
|--------|-------------|--------------------|
| Min    | 150ms       | 150ms              |
| Avg    | 200ms       | 170ms              |
| P95    | 300ms       | 220ms              |
| P99    | 400ms       | 250ms              |
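The tail-latency effect of taking the fastest of several samples can be checked with a quick simulation (illustrative numbers and a made-up latency model, not measurements of a real backend):

```python
import random

random.seed(42)

def latency():
    # Crude model of a variable backend: 150 ms floor, exponential
    # jitter, and a 5% chance of an extra 200 ms slow response.
    return 150 + random.expovariate(1 / 50) + (random.random() < 0.05) * 200

def p95(samples):
    return sorted(samples)[int(len(samples) * 0.95)]

single = [latency() for _ in range(10_000)]
concurrent = [min(latency() for _ in range(3)) for _ in range(10_000)]

print(f"single P95:     {p95(single):.0f} ms")
print(f"3-way race P95: {p95(concurrent):.0f} ms")
```

The minimum of three draws rarely lands in the slow tail, which is why P95 and P99 improve more than the minimum does.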

Fault Tolerance

If one backend call fails, the others can still succeed:
Request → Backend Call 1 (timeout) → ✗ Failed
       → Backend Call 2 (200ms)    → ✓ Returned
       → Backend Call 3 (210ms)    → ✗ Discarded
The request succeeds even though one call timed out.
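The same race also masks individual failures. A small thread-based sketch (simulated delays and outcomes, not real backend calls):

```python
import concurrent.futures
import time

def call(behaviour):
    """Simulate one backend call: sleep, then fail or succeed."""
    delay, outcome = behaviour
    time.sleep(delay)
    if outcome == "timeout":
        raise TimeoutError("backend call timed out")
    return f"ok after {int(delay * 1000)} ms"

# Call 1 fails fastest; calls 2 and 3 succeed later.
behaviours = [(0.05, "timeout"), (0.08, "ok"), (0.09, "ok")]

result = None
with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(call, b) for b in behaviours]
    for fut in concurrent.futures.as_completed(futures):
        if fut.exception() is None:
            result = fut.result()  # first success wins; the failure is skipped
            break

print(result)
```

The failed call finishes first but is ignored; the first success is returned instead.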

When to Use Concurrent Calls

✅ Good Use Cases

  1. Variable Latency Backends
    • Cloud services with inconsistent performance
    • Geographically distributed backends
    • Services with occasional slow responses
  2. Critical Endpoints
    • High-value transactions
    • Real-time data requirements
    • User-facing APIs where latency matters
  3. Unreliable Networks
    • Mobile backends
    • Cross-region communication
    • Microservices with network jitter

❌ When Not to Use

  1. Write Operations
    • Creates duplicate records
    • Causes race conditions
    • Wastes backend resources
  2. Rate-Limited APIs
    • Consumes rate limit quota faster
    • May trigger rate limiting
    • Expensive API calls
  3. Heavy Operations
    • Database writes
    • File uploads
    • Complex computations

Trade-offs

Advantages

  • Lower Latency - Returns fastest response
  • Higher Reliability - Tolerates individual failures
  • Better User Experience - Faster perceived performance

Disadvantages

  • Increased Backend Load - N times more requests
  • Higher Network Usage - Multiple redundant calls
  • Cost Implications - More API calls = higher costs

Choosing the Right Number

"concurrent_calls": 2  // Minimal overhead, some improvement
"concurrent_calls": 3  // Balanced approach (recommended)
"concurrent_calls": 5  // High reliability, higher cost

Recommendation

Start with concurrent_calls: 3 and adjust based on:
  • Backend latency variance
  • Cost sensitivity
  • Performance requirements

Comparison Table

| Feature  | Concurrent Requests      | Concurrent Backends         |
|----------|--------------------------|-----------------------------|
| Purpose  | Same call multiple times | Different calls in parallel |
| Backends | 1 backend                | Multiple backends           |
| Response | Fastest response         | Aggregated response         |
| Use Case | Reduce latency           | Merge data sources          |
| Config   | concurrent_calls: 3      | Multiple backend entries    |

Example of Concurrent Backends (different concept):
{
  "endpoint": "/user-profile",
  "backend": [
    {
      "url_pattern": "/users/{id}",
      "group": "user"
    },
    {
      "url_pattern": "/orders/{id}",
      "group": "orders"
    }
  ]
}
This calls two different endpoints in parallel and merges results. This is data aggregation, not concurrent requests.

Combining Strategies

You can use concurrent requests WITH data aggregation:
{
  "endpoint": "/user-dashboard",
  "concurrent_calls": 2,
  "backend": [
    {
      "url_pattern": "/users/{id}",
      "group": "user"
    },
    {
      "url_pattern": "/orders/{id}",
      "group": "orders"
    }
  ]
}
This makes 2 concurrent calls to /users/{id} AND 2 concurrent calls to /orders/{id}, then aggregates the fastest responses.

Monitoring

Use Jaeger tracing to visualize concurrent requests:
GET /market/concurrent
├── Backend Call 1 (250ms)
├── Backend Call 2 (180ms) ← Used
└── Backend Call 3 (220ms)
Traces show all parallel calls and which was returned.

Best Practices

1. Use for Read Operations Only

// ✓ Safe - GET request
{
  "endpoint": "/product/{id}",
  "concurrent_calls": 3,
  "backend": [{
    "url_pattern": "/products/{id}"
  }]
}

// ✗ Dangerous - POST request
{
  "endpoint": "/orders",
  "method": "POST",
  "concurrent_calls": 3,  // Creates 3 orders!
  "backend": [{
    "url_pattern": "/orders"
  }]
}

2. Consider Backend Costs

If your backend charges per request:

Single call cost: $0.001
Concurrent calls (3): $0.003
1M requests/month: $3,000 vs $1,000

Balance cost against performance needs.
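The arithmetic behind those figures is just a straight multiplication (example prices, not a real billing model):

```python
cost_per_call = 0.001        # $ per backend request (illustrative figure)
monthly_requests = 1_000_000

for n in (1, 3):
    cost = monthly_requests * n * cost_per_call
    print(f"concurrent_calls={n}: ${cost:,.0f}/month")
```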

3. Monitor Backend Load

Concurrent calls multiply backend load by the configured factor. Ensure your backend can absorb it:

Regular load: 1,000 req/s
With concurrent_calls: 3: 3,000 req/s

4. Set Appropriate Timeouts

{
  "endpoint": "/market/concurrent",
  "timeout": "2s",
  "concurrent_calls": 3,
  "backend": [
    {
      "host": ["https://api.example.com"],
      "url_pattern": "/data"
    }
  ]
}
If all 3 calls exceed the timeout, the request fails.
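The timeout interaction can be sketched with the same race model (scaled-down delays, a simplified stand-in for KrakenD's behaviour):

```python
import concurrent.futures
import time

TIMEOUT = 0.1  # like "timeout": "2s", scaled down for the demo

def slow_call(delay):
    time.sleep(delay)
    return "ok"

def race_with_timeout(delays):
    """Return the first success within TIMEOUT, or None if all are too slow."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(slow_call, d) for d in delays]
        try:
            for fut in concurrent.futures.as_completed(futures, timeout=TIMEOUT):
                if fut.exception() is None:
                    return fut.result()
        except concurrent.futures.TimeoutError:
            pass
        return None  # every call exceeded the timeout -> the request fails

print(race_with_timeout([0.2, 0.3, 0.25]))   # all too slow: request fails
print(race_with_timeout([0.2, 0.05, 0.25]))  # one call beats the timeout
```

Only when every parallel call blows the deadline does the client see an error; a single fast call is enough to succeed.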
