Overview

Carrier supports dynamic visibility timeouts, allowing your worker application to control when messages are re-delivered from SQS. This enables sophisticated distributed backoff strategies, particularly useful for:
  • Rate limiting and throttling
  • Exponential backoff for external API calls
  • Scheduled retries based on business logic
  • Handling temporary failures gracefully
Unlike traditional retry mechanisms that rely on SQS’s fixed visibility timeout, dynamic timeouts give your application fine-grained control over message re-delivery timing.

How It Works

Headers Sent by Carrier

Carrier includes metadata headers with every webhook request to help you implement intelligent retry logic:
| Header | Description | Source |
| --- | --- | --- |
| X-Carrier-Receive-Count | Number of times this message has been received from SQS | ApproximateReceiveCount attribute |
| X-Carrier-First-Receive-Time | Unix timestamp (seconds) of the first time this message was received | ApproximateFirstReceiveTimestamp attribute |
These headers are defined in transmitter/webhook/transmitter.go:15-22:
const (
    // HeaderPrefix is the prefix used for all HTTP request headers sent by the Transmitter.
    HeaderPrefix = "X-Carrier-"
    // HeaderRetryAfter is the standard Retry-After header.
    HeaderRetryAfter = "Retry-After"
    // HeaderContentType is the standard Content-Type header.
    HeaderContentType = "Content-Type"
)
The headers are populated from SQS message attributes in receiver/sqs/sqs.go:135-155:
func (h *handler) generateAttributes(m *message) transmitter.TransmitAttributes {
    attributes := make(transmitter.TransmitAttributes)
    for k, v := range m.Attributes {
        switch k {
        case SQSAttributeApproxomiteReceiveCount:
            attributes[TransmitAttributeReceiveCount] = v
        case SQSAttributeApproxomiteFirstReceiveTimestamp:
            attributes[TransmitAttributeFirstReceiveTime] = v
        }
    }
    // ... additional attribute handling
    return attributes
}

Setting Visibility Timeout

To delay message re-delivery, your webhook endpoint should:
  1. Return HTTP 429 (Too Many Requests)
  2. Include the Retry-After header with the delay in seconds
Carrier will automatically update the message’s visibility timeout so it won’t be received again until after the specified time.
func webhookHandler(w http.ResponseWriter, r *http.Request) {
    receiveCount, err := strconv.Atoi(r.Header.Get("X-Carrier-Receive-Count"))
    if err != nil {
        receiveCount = 1
    }
    
    // Calculate exponential backoff
    backoffSeconds := int(math.Pow(2, float64(receiveCount))) // 2, 4, 8, 16...
    
    // Set retry delay
    w.Header().Set("Retry-After", strconv.Itoa(backoffSeconds))
    w.WriteHeader(http.StatusTooManyRequests)
}

Implementation Details

Carrier Processing

When Carrier receives a 429 response, it extracts the Retry-After header value and updates the SQS message visibility timeout (transmitter/webhook/transmitter.go:113-132):
switch res.StatusCode {
case http.StatusOK:
    // transmit successful
    return nil
case http.StatusTooManyRequests:
    // return a retryable error with the retry-after header value
    retryAfter := res.Header.Get(HeaderRetryAfter)
    if retryAfter != "" {
        seconds, err := strconv.Atoi(retryAfter)
        if err != nil {
            // cannot retry if we cannot parse the Retry-After header
            return fmt.Errorf("%w: %w: %w", transmitter.ErrTransmitFailed, ErrStatusCode429, err)
        }
        return transmitter.NewTransmitRetryableError(ErrStatusCode429, time.Duration(seconds*int(time.Second)))
    }
    return fmt.Errorf("%w: %w: %w", transmitter.ErrTransmitFailed, ErrStatusCode429, ErrNoRetryAfterHeader)
default:
    // return a non-retryable error
    return fmt.Errorf("%w: %w: %d", transmitter.ErrTransmitFailed, ErrNon200StatusCode, res.StatusCode)
}

SQS Visibility Update

The receiver processes retryable errors and batches visibility timeout updates (receiver/sqs/sqs.go:248-293):
for range len(res.Messages) {
    r := <-p.results
    if r.err != nil {
        var err *transmitter.TransmitRetryableError
        if errors.As(r.err, &err) {
            // update visibility timeout on retryable errors
            retryEntries = append(retryEntries, types.ChangeMessageVisibilityBatchRequestEntry{
                Id:                r.MessageID,
                ReceiptHandle:     r.ReceiptHandle,
                VisibilityTimeout: int32(err.RetryAfter.Seconds()),
            })
        } else {
            p.log.Error("failed to transmit message", "error", r.err)
        }
    } else {
        deleteEntries = append(deleteEntries, types.DeleteMessageBatchRequestEntry{
            Id:            r.MessageID,
            ReceiptHandle: r.ReceiptHandle,
        })
    }
}

if len(retryEntries) > 0 {
    // update visibility timeout on messages that received retryable errors
    _, err := p.client.ChangeMessageVisibilityBatch(p.ctx, &sqs.ChangeMessageVisibilityBatchInput{
        QueueUrl: &p.queueURL,
        Entries:  retryEntries,
    })
    if err != nil {
        p.log.Error("failed to update message visibility", "error", err)
    } else {
        p.log.Debug("updated message visibility", "count", len(retryEntries))
    }
}
If the Retry-After header is missing or cannot be parsed as an integer, Carrier treats the 429 as a non-retryable transmit failure: the message is neither deleted nor given a new visibility timeout, so it becomes visible again only after the queue's default visibility timeout expires.

Common Backoff Strategies

Exponential Backoff

Double the delay with each retry attempt:
func calculateExponentialBackoff(receiveCount int, baseSeconds int) int {
    return baseSeconds * int(math.Pow(2, float64(receiveCount-1)))
}

// With base=2:
// Attempt 1: 2s
// Attempt 2: 4s
// Attempt 3: 8s
// Attempt 4: 16s

Exponential Backoff with Cap

Prevent extremely long delays by capping the maximum backoff:
func calculateCappedBackoff(receiveCount int, baseSeconds int, maxSeconds int) int {
    backoff := baseSeconds * int(math.Pow(2, float64(receiveCount-1)))
    if backoff > maxSeconds {
        return maxSeconds
    }
    return backoff
}

// With base=2, max=300 (5 minutes):
// Attempt 1: 2s
// Attempt 2: 4s
// Attempt 3: 8s
// ...
// Attempt 8: 256s
// Attempt 9+: 300s (capped)

Linear Backoff

Increase delay by a fixed amount:
func calculateLinearBackoff(receiveCount int, incrementSeconds int) int {
    return receiveCount * incrementSeconds
}

// With increment=30:
// Attempt 1: 30s
// Attempt 2: 60s
// Attempt 3: 90s
// Attempt 4: 120s

Fibonacci Backoff

Use Fibonacci sequence for more gradual scaling:
func calculateFibonacciBackoff(receiveCount int) int {
    if receiveCount <= 1 {
        return 1
    }
    a, b := 1, 1
    for i := 2; i < receiveCount; i++ {
        a, b = b, a+b
    }
    return b
}

// Examples:
// Attempt 1: 1s
// Attempt 2: 1s
// Attempt 3: 2s
// Attempt 4: 3s
// Attempt 5: 5s
// Attempt 6: 8s

Real-World Use Cases

Rate Limiting External APIs

func webhookHandler(w http.ResponseWriter, r *http.Request) {
    // Call external API
    resp, err := http.Get("https://api.example.com/data")
    if err != nil {
        // Network failure: back off before retrying
        w.Header().Set("Retry-After", "30")
        w.WriteHeader(http.StatusTooManyRequests)
        return
    }
    defer resp.Body.Close()
    
    if resp.StatusCode == http.StatusTooManyRequests {
        // API is rate limited
        if retryAfter := resp.Header.Get("Retry-After"); retryAfter != "" {
            // Forward the API's Retry-After to Carrier
            w.Header().Set("Retry-After", retryAfter)
            w.WriteHeader(http.StatusTooManyRequests)
            return
        }
        
        // Default exponential backoff if no Retry-After header
        receiveCount, _ := strconv.Atoi(r.Header.Get("X-Carrier-Receive-Count"))
        backoff := int(math.Pow(2, float64(receiveCount)))
        w.Header().Set("Retry-After", strconv.Itoa(backoff))
        w.WriteHeader(http.StatusTooManyRequests)
        return
    }
    
    // Process normally
    w.WriteHeader(http.StatusOK)
}

Time-Based Processing

Schedule processing for a specific time:
func webhookHandler(w http.ResponseWriter, r *http.Request) {
    // Parse message to get scheduled time
    var msg Message
    if err := json.NewDecoder(r.Body).Decode(&msg); err != nil {
        http.Error(w, "invalid message body", http.StatusBadRequest)
        return
    }
    
    scheduledTime := msg.ProcessAt // Unix timestamp
    now := time.Now().Unix()
    
    if now < scheduledTime {
        // Not yet time to process
        delay := scheduledTime - now
        w.Header().Set("Retry-After", strconv.FormatInt(delay, 10))
        w.WriteHeader(http.StatusTooManyRequests)
        return
    }
    
    // Process the message
    processMessage(msg)
    w.WriteHeader(http.StatusOK)
}

Circuit Breaker Pattern

Protect downstream services with automatic backoff:
type CircuitBreaker struct {
    mu           sync.Mutex
    failures     int
    lastFailure  time.Time
    state        string // "closed", "open", "half-open"
    threshold    int
    resetTimeout time.Duration
}

func (cb *CircuitBreaker) webhookHandler(w http.ResponseWriter, r *http.Request) {
    // Serialize access to breaker state; webhook handlers run concurrently.
    // Note: this also serializes processing, which keeps the example simple.
    cb.mu.Lock()
    defer cb.mu.Unlock()
    
    if cb.state == "open" {
        // Circuit is open, calculate backoff
        timeSinceFailure := time.Since(cb.lastFailure)
        if timeSinceFailure < cb.resetTimeout {
            remaining := cb.resetTimeout - timeSinceFailure
            w.Header().Set("Retry-After", strconv.Itoa(int(remaining.Seconds())))
            w.WriteHeader(http.StatusTooManyRequests)
            return
        }
        // Try half-open state
        cb.state = "half-open"
    }
    
    // Attempt to process
    err := processMessage(r.Body)
    if err != nil {
        cb.failures++
        cb.lastFailure = time.Now()
        
        if cb.failures >= cb.threshold {
            cb.state = "open"
        }
        
        // Exponential backoff
        backoff := int(math.Pow(2, float64(cb.failures)))
        w.Header().Set("Retry-After", strconv.Itoa(backoff))
        w.WriteHeader(http.StatusTooManyRequests)
        return
    }
    
    // Success - reset circuit breaker
    cb.failures = 0
    cb.state = "closed"
    w.WriteHeader(http.StatusOK)
}

Best Practices

Cap Maximum Backoff

Prevent indefinite delays by setting a maximum backoff value (e.g., 5-15 minutes).

Use Receive Count

Always base backoff calculations on X-Carrier-Receive-Count for consistency.
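Since the header arrives as a string, parse it defensively and fall back to 1 when it is missing or malformed, so every code path still computes a sane backoff. An illustrative helper (not part of Carrier):

```go
package main

import (
	"fmt"
	"strconv"
)

// receiveCount parses an X-Carrier-Receive-Count header value, defaulting
// to 1 when the header is absent, malformed, or out of range, so backoff
// calculations always start from the first attempt.
func receiveCount(header string) int {
	n, err := strconv.Atoi(header)
	if err != nil || n < 1 {
		return 1
	}
	return n
}

func main() {
	fmt.Println(receiveCount("3")) // 3
	fmt.Println(receiveCount(""))  // 1
}
```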

Add Jitter

Add random variation to prevent thundering herd problems when multiple messages retry simultaneously.

Monitor Retry Patterns

Track receive counts and retry delays to identify systemic issues.

Adding Jitter

Prevent synchronized retries by adding randomness:
import "math/rand"

func calculateBackoffWithJitter(receiveCount int, baseSeconds int) int {
    backoff := baseSeconds * int(math.Pow(2, float64(receiveCount-1)))
    jitter := 0
    if backoff >= 4 {
        jitter = rand.Intn(backoff / 4) // add up to 25% jitter
    }
    return backoff + jitter
}

Limitations

SQS has a maximum visibility timeout of 12 hours (43,200 seconds). Attempting to set a longer timeout will result in an error.
The Retry-After header must be an integer representing seconds. HTTP date formats are not supported.
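Clamping the computed delay before writing the Retry-After header keeps you inside the SQS limit regardless of how large the backoff grows. A minimal sketch:

```go
package main

import "fmt"

// maxSQSVisibilitySeconds is the SQS hard limit of 12 hours (43,200 seconds).
const maxSQSVisibilitySeconds = 43200

// clampVisibility caps a computed backoff so it never exceeds the SQS
// visibility timeout limit and is never negative.
func clampVisibility(seconds int) int {
	if seconds < 0 {
		return 0
	}
	if seconds > maxSQSVisibilitySeconds {
		return maxSQSVisibilitySeconds
	}
	return seconds
}

func main() {
	fmt.Println(clampVisibility(90000)) // 43200
	fmt.Println(clampVisibility(300))   // 300
}
```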

Troubleshooting

  • Verify you’re returning HTTP 429 status code
  • Check that the Retry-After header is present and contains a valid integer
  • Review Carrier logs for “failed to update message visibility” errors
  • Ensure the retry delay is less than 12 hours
  • Verify your backoff calculation logic
  • Check that you’re reading the X-Carrier-Receive-Count header correctly
  • Add logging to see what Retry-After values you’re returning
The receive count is managed by SQS and increments each time the message is received from the queue, i.e. each time it becomes visible again after a visibility timeout expires and is delivered. It will NOT increment once the message has been deleted (successful processing).
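One practical consequence: the receive count can be used to stop retrying after a fixed number of attempts. Returning 200 tells Carrier to delete the message, so pair that with your own record-keeping for failed messages. A sketch where the maxAttempts threshold and helper name are illustrative:

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

// maxAttempts is an illustrative retry budget, not a Carrier setting.
const maxAttempts = 10

// decideRetry returns the status code a handler could respond with after a
// processing failure: 429 (retry with backoff) while attempts remain, or
// 200 so Carrier deletes the message once retries are exhausted. A real
// handler would record the exhausted message somewhere before acknowledging.
func decideRetry(receiveCountHeader string) int {
	count, err := strconv.Atoi(receiveCountHeader)
	if err != nil {
		count = 1
	}
	if count >= maxAttempts {
		return http.StatusOK
	}
	return http.StatusTooManyRequests
}

func main() {
	fmt.Println(decideRetry("3"))  // 429
	fmt.Println(decideRetry("10")) // 200
}
```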
