
Rate Limiting

This guide explains FLUID Network's rate limiting system, including limits for different API types, how to handle rate limit errors, and strategies for high-volume integrations.

Overview

FLUID Network implements rate limiting to ensure fair resource allocation and maintain service quality for all users. Rate limits are enforced per API endpoint and vary based on your account type and the operation being performed.

Rate Limit Tiers

Default Limits

| Endpoint Type | Requests/Minute | Requests/Hour | Burst Allowance |
|---|---|---|---|
| Payment Provider API | 1,000 | 60,000 | 100 |
| Bank API | 200 | 12,000 | 50 |
| General API | 100 | 6,000 | 20 |
| Webhook Delivery | N/A | Unlimited | N/A |

Rate Limit Scopes

Rate limits are enforced at multiple levels:

1. IP-Based Rate Limiting

  • Applied to unauthenticated requests
  • Prevents abuse from single source
  • Error code: 1453

2. Partner-Based Rate Limiting

  • Applied per payment partner account
  • Enforced on payment processing endpoints
  • Error code: 1454

3. Bank-Based Rate Limiting

  • Applied per bank account
  • Enforced on transaction queries and reports
  • Error code: 1455

4. Payment Processing Rate Limiting

  • Special high-priority limit for transaction initiation
  • Stricter enforcement to prevent duplicate processing
  • Error code: 1456

Rate Limit Headers

Every API response includes rate limit information in headers:

```http
HTTP/1.1 200 OK
RateLimit-Limit: 1000
RateLimit-Remaining: 847
RateLimit-Reset: 1672531200
Retry-After: 45
```

Header Definitions

| Header | Description | Example |
|---|---|---|
| RateLimit-Limit | Total requests allowed in window | 1000 |
| RateLimit-Remaining | Requests left in current window | 847 |
| RateLimit-Reset | Unix timestamp when limit resets | 1672531200 |
| Retry-After | Seconds to wait before retrying (on 429) | 45 |

Rate Limit Errors

Error Response Structure

When you exceed a rate limit, you'll receive a 429 Too Many Requests response:

```json
{
  "error": {
    "code": 1454,
    "message": "Rate limit exceeded for payment partner",
    "category": "security",
    "severity": "medium",
    "details": {
      "limit": 1000,
      "window": "1 minute",
      "retry_after": 45
    }
  }
}
```
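
For typed clients, this payload can be modeled directly; a minimal TypeScript sketch with field names taken from the example response above (the interface names themselves are illustrative):

```typescript
// Shape of a 429 rate limit error body, mirroring the example above.
interface RateLimitErrorDetails {
  limit: number;        // requests allowed in the window
  window: string;       // e.g. "1 minute"
  retry_after: number;  // seconds to wait before retrying
}

interface RateLimitError {
  error: {
    code: number;       // 1429, 1453-1456
    message: string;
    category: string;   // e.g. "security"
    severity: string;   // e.g. "medium"
    details: RateLimitErrorDetails;
  };
}
```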

Rate Limit Error Codes

| Code | Error | Scope | Limit | Action |
|---|---|---|---|---|
| 1429 | Too Many Requests | General | 100/min | Wait and retry |
| 1453 | Rate Limit by IP | IP Address | 100/min | Implement backoff |
| 1454 | Rate Limit by Partner | Payment Partner | 1,000/min | Wait or upgrade plan |
| 1455 | Rate Limit by Bank | Bank Account | 200/min | Reduce query frequency |
| 1456 | Payment Rate Limit | Payment Processing | 1,000/min | Critical: implement backoff |

Handling Rate Limits

Basic Implementation

Detect rate limit errors and wait before retrying (a minimal sketch follows the lists below):

Implementation Steps:

  1. Make API request with proper authentication headers
  2. Check response status code for 429 (Too Many Requests)
  3. Extract Retry-After header value (seconds to wait)
  4. Extract error code from response body for specific rate limit type
  5. Log the rate limit event with error code and retry time
  6. Wait for specified duration from Retry-After header
  7. Retry the original request

Monitor Rate Limit Headers:

  • Extract RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset from every response
  • Log current usage: {remaining}/{limit} (resets at {timestamp})
  • Warn when remaining requests < 10% of limit
  • Use reset timestamp to understand when limit refreshes

Key Considerations:

  • Always use Retry-After header value, not arbitrary delays
  • Log rate limit warnings for monitoring and alerting
  • Track remaining requests to proactively throttle
  • Cap retry attempts to avoid infinite loops
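
A minimal sketch of these steps in TypeScript, assuming a fetch-based client; the endpoint URL and token handling are placeholders, not part of the FLUID API:

```typescript
// Hypothetical endpoint and token handling, for illustration only.
const API_URL = "https://api.example.com/v1/transactions";

async function requestWithRetry(token: string): Promise<Response> {
  const res = await fetch(API_URL, {
    headers: { Authorization: `Bearer ${token}` },
  });

  // Monitor the rate limit headers on every response.
  const limit = res.headers.get("RateLimit-Limit");
  const remaining = res.headers.get("RateLimit-Remaining");
  const reset = res.headers.get("RateLimit-Reset");
  console.log(`Rate limit: ${remaining}/${limit} (resets at ${reset})`);

  if (res.status !== 429) return res;

  // On 429: honor Retry-After (seconds), then retry once.
  const retryAfter = Number(res.headers.get("Retry-After") ?? "1");
  const body = await res.json();
  console.warn(`Rate limited (code ${body.error?.code}); retrying in ${retryAfter}s`);
  await new Promise((r) => setTimeout(r, retryAfter * 1000));
  return fetch(API_URL, { headers: { Authorization: `Bearer ${token}` } });
}
```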

Exponential Backoff Strategy

For more resilient handling, implement exponential backoff with jitter (see the sketch after these lists):

Strategy Components:

  1. Maximum Retries: Set limit (e.g., 5 attempts) to prevent infinite loops
  2. Error Detection: Check for HTTP 429 status or rate limit error codes (1429, 1453-1456)
  3. Retry-After Priority: Use Retry-After header value if provided by API
  4. Exponential Calculation: Delay = 2^attempt × 1000ms (2s, 4s, 8s, 16s, 32s)
  5. Jitter: Add random 0-1000ms to prevent thundering herd problem
  6. Maximum Cap: Cap delay at 60 seconds to avoid excessive waits

Backoff Calculation Logic:

  • If Retry-After header present: use that value (in seconds) × 1000
  • Otherwise: calculate exponential delay = 2^attempt × 1000ms
  • Add random jitter: 0-1000ms randomized
  • Apply maximum cap: min(calculatedDelay, 60000ms)

Usage Pattern:

  • Wrap API calls in backoff handler
  • Handler recursively retries on rate limit errors
  • Stops after max retries or permanent error
  • Logs each retry attempt with delay duration
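
A sketch of this strategy, assuming the wrapped call throws an error object carrying a status code and an optional retryAfter value (hypothetical fields; adapt to your HTTP client):

```typescript
// Exponential backoff with jitter, following the calculation above.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 5,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      // Assumed error shape: { status, retryAfter? } (hypothetical).
      const isRateLimit = err?.status === 429;
      if (!isRateLimit || attempt > maxRetries) throw err;

      const exponential = Math.pow(2, attempt) * 1000;      // 2s, 4s, 8s...
      const base = err.retryAfter ? err.retryAfter * 1000 : exponential;
      const jitter = Math.random() * 1000;                  // 0-1000ms
      const delay = Math.min(base + jitter, 60_000);        // cap at 60s
      console.log(`Retry ${attempt}/${maxRetries} in ${Math.round(delay)}ms`);
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}

// Usage: wrap any API call in the backoff handler.
// const result = await withBackoff(() => fetchTransaction("txn_123"));
```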

Proactive Rate Limit Management

Monitor your rate limit usage and throttle requests proactively (a sketch follows the implementation flow below):

Monitoring Strategy:

  • Track current rate limit from response headers
  • Calculate utilization rate: 1 - (remaining / limit)
  • Set target utilization threshold (e.g., 80%)
  • Throttle requests when utilization exceeds threshold

Throttle Calculation:

  • Determine time until rate limit reset (from RateLimit-Reset header)
  • Calculate available requests (RateLimit-Remaining)
  • Distribute remaining requests evenly over time until reset
  • Calculate delay per request: timeUntilReset / remainingRequests
  • Apply minimum delay (e.g., 100ms) to prevent excessive throttling

Implementation Flow:

  1. Update rate limit metrics from every API response header
  2. Before each request, check if throttling is needed
  3. If utilization > target threshold: calculate throttle delay
  4. Wait for calculated delay before making request
  5. Log throttle actions for monitoring
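
One way this flow might look, driven by the RateLimit-* headers documented above; the class name and the 100ms floor are illustrative choices:

```typescript
// Proactive throttle driven by rate limit headers, per the flow above.
class ProactiveThrottle {
  private limit = 1;
  private remaining = 1;
  private resetAt = 0; // Unix seconds, from RateLimit-Reset

  // Step 1: update metrics from every response.
  update(res: Response): void {
    this.limit = Number(res.headers.get("RateLimit-Limit") ?? this.limit);
    this.remaining = Number(res.headers.get("RateLimit-Remaining") ?? this.remaining);
    this.resetAt = Number(res.headers.get("RateLimit-Reset") ?? this.resetAt);
  }

  // Steps 2-4: before each request, wait if utilization exceeds target.
  async beforeRequest(targetUtilization = 0.8): Promise<void> {
    const utilization = 1 - this.remaining / this.limit;
    if (utilization <= targetUtilization) return;

    const msUntilReset = Math.max(this.resetAt * 1000 - Date.now(), 0);
    // Spread the remaining budget evenly until the window resets.
    const delay = Math.max(msUntilReset / Math.max(this.remaining, 1), 100);
    console.log(`Throttling ${Math.round(delay)}ms (utilization ${(utilization * 100).toFixed(0)}%)`);
    await new Promise((r) => setTimeout(r, delay));
  }
}
```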

Benefits:

  • Prevents hitting rate limits by staying under threshold
  • Smoother request distribution over time
  • Reduces retry overhead and delays
  • Better predictability for high-volume operations

Batch Processing Strategies

Queue-Based Processing

Implement a queue to process requests within rate limits (a sketch follows the lists below):

Queue Architecture:

  • Maintain FIFO (first-in-first-out) queue of pending requests
  • Process queue sequentially at controlled rate
  • Calculate request interval: (60 seconds / requestsPerMinute) × 1000ms
  • Example: 1000 req/min = 60ms between requests

Processing Logic:

  1. Add requests to queue with promise handlers (resolve/reject)
  2. Start queue processor if not already running
  3. Dequeue next request and execute
  4. On rate limit error (429): reinsert at front of queue and wait longer
  5. Wait for request interval before processing next request
  6. Continue until queue is empty

Error Handling:

  • On 429 error: extract Retry-After header, wait specified time, requeue request
  • On other errors: reject promise, continue to next request
  • Track queue length for monitoring
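
A sketch of such a queue; the thrown error shape (status, retryAfter) is an assumption, not part of any FLUID SDK:

```typescript
// FIFO queue that executes requests at a fixed interval, as described above.
type Task<T> = {
  run: () => Promise<T>;
  resolve: (v: T) => void;
  reject: (e: unknown) => void;
};

class RequestQueue {
  private queue: Task<any>[] = [];
  private running = false;
  private intervalMs: number;

  constructor(requestsPerMinute: number) {
    this.intervalMs = (60 / requestsPerMinute) * 1000; // 1000/min -> 60ms
  }

  enqueue<T>(run: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.queue.push({ run, resolve, reject });
      if (!this.running) this.process();
    });
  }

  private async process(): Promise<void> {
    this.running = true;
    while (this.queue.length > 0) {
      const task = this.queue.shift()!;
      try {
        task.resolve(await task.run());
      } catch (err: any) {
        if (err?.status === 429) {
          // Requeue at the front and wait out Retry-After (assumed field).
          this.queue.unshift(task);
          await new Promise((r) => setTimeout(r, (err.retryAfter ?? 1) * 1000));
          continue;
        }
        task.reject(err); // non-rate-limit errors: fail and move on
      }
      await new Promise((r) => setTimeout(r, this.intervalMs));
    }
    this.running = false;
  }
}
```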

Benefits:

  • Maintains steady request rate automatically
  • Self-adjusts on rate limit errors
  • Preserves request order
  • Prevents overwhelming the API

Parallel Processing with Concurrency Control

Process multiple requests in parallel while respecting rate limits (see the sketch after the processing flow):

Concurrency Strategy:

  • Set maximum concurrent requests (e.g., 10 simultaneous requests)
  • Set total requests per minute limit (e.g., 1000)
  • Track active requests and request timestamps
  • Split tasks into chunks of max concurrent size
  • Process chunks sequentially, requests within chunks in parallel

Rate Limit Tracking:

  • Record timestamp of each request
  • Clean old timestamps (older than 1 minute)
  • Count recent requests in last 60 seconds
  • Wait if recent requests >= requests per minute limit
  • Calculate wait time: 60000ms - (now - oldestRequestTime)

Processing Flow:

  1. Split all tasks into chunks (size = max concurrent)
  2. For each chunk:
    • Check if rate limit would be exceeded
    • Wait if necessary until window refreshes
    • Execute all tasks in chunk in parallel using Promise.allSettled
    • Record request timestamp for each task
  3. Collect results (fulfilled/rejected) from all chunks
  4. Report success/failure counts
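
A possible implementation under these assumptions (task functions that return promises; defaults matching the example numbers above):

```typescript
// Chunked parallel processing with a sliding one-minute request window.
async function processInParallel<T>(
  tasks: Array<() => Promise<T>>,
  maxConcurrent = 10,
  requestsPerMinute = 1000,
): Promise<PromiseSettledResult<T>[]> {
  const timestamps: number[] = []; // request times within the last minute
  const results: PromiseSettledResult<T>[] = [];

  for (let i = 0; i < tasks.length; i += maxConcurrent) {
    const chunk = tasks.slice(i, i + maxConcurrent);

    // Drop timestamps older than one minute, then wait if at the limit.
    const now = Date.now();
    while (timestamps.length > 0 && now - timestamps[0] > 60_000) timestamps.shift();
    if (timestamps.length > 0 && timestamps.length + chunk.length > requestsPerMinute) {
      await new Promise((r) =>
        setTimeout(r, Math.max(60_000 - (now - timestamps[0]), 0)),
      );
    }

    chunk.forEach(() => timestamps.push(Date.now()));
    results.push(...(await Promise.allSettled(chunk.map((task) => task()))));
  }

  const ok = results.filter((r) => r.status === "fulfilled").length;
  console.log(`Done: ${ok} succeeded, ${results.length - ok} failed`);
  return results;
}
```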

Benefits:

  • Maximizes throughput with parallel processing
  • Respects rate limits by tracking request history
  • Gracefully handles mixed success/failure with allSettled
  • Provides progress visibility with chunk-based processing

Monitoring and Alerting

Track Rate Limit Metrics

Implement metrics tracking to monitor rate limit health (a collector sketch follows the calculations below):

Key Metrics to Track:

  • total_requests - Total API requests made
  • rate_limited - Count of 429 responses received
  • by_error_code - Breakdown by specific error code (1429, 1453-1456)
  • peak_usage - Highest concurrent usage observed
  • current_usage - Current usage calculated from headers

Metric Collection:

  • Record every API response with rate limit headers
  • Extract limit, remaining, reset from headers
  • Calculate current usage: limit - remaining
  • Track peak usage as maximum of all observations
  • Increment rate_limited counter on 429 errors
  • Group rate limit errors by error code

Analysis Calculations:

  • Rate limit rate: rate_limited / total_requests
  • Utilization: current_usage / limit
  • Error breakdown: percentage by error code
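
A minimal collector for these metrics might look like this; the class and method names are illustrative:

```typescript
// Minimal metrics collector for the counters and ratios described above.
class RateLimitMetrics {
  totalRequests = 0;
  rateLimited = 0;
  byErrorCode = new Map<number, number>();
  peakUsage = 0;
  currentUsage = 0;
  private limit = 1;

  record(res: Response, errorCode?: number): void {
    this.totalRequests++;
    this.limit = Number(res.headers.get("RateLimit-Limit") ?? this.limit);
    const remaining = Number(res.headers.get("RateLimit-Remaining") ?? this.limit);
    this.currentUsage = this.limit - remaining;
    this.peakUsage = Math.max(this.peakUsage, this.currentUsage);
    if (res.status === 429) {
      this.rateLimited++;
      if (errorCode !== undefined) {
        this.byErrorCode.set(errorCode, (this.byErrorCode.get(errorCode) ?? 0) + 1);
      }
    }
  }

  // Ratios feeding the alerting thresholds below.
  rateLimitRate(): number { return this.rateLimited / Math.max(this.totalRequests, 1); }
  utilization(): number { return this.currentUsage / this.limit; }
}
```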

Alerting Thresholds:

  • High Priority: Rate limit rate > 5% of requests
  • Warning: Utilization consistently > 80%
  • Critical: Specific error code appears >10 times per minute

Reporting:

  • Log metrics every minute
  • Reset counters after reporting (preserve peak values)
  • Include utilization percentage in logs
  • Send alerts when thresholds exceeded

Dashboard Visualization:

  • Current utilization gauge (0-100%)
  • Rate limit rate trend line
  • Error code distribution pie chart
  • Peak usage vs limit comparison

Best Practices

1. Respect Rate Limit Headers

Always check and respect the Retry-After header:

Good Practice: Extract Retry-After header value (in seconds) from 429 response, convert to milliseconds, wait for exact duration before retry.

Bad Practice: Using arbitrary fixed delays (e.g., always wait 5 seconds) ignores API guidance and may wait too long or retry too soon.
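
For example, a small helper that honors the header (sketch):

```typescript
// Wait exactly as long as the server asks, defaulting to 1s if absent.
async function waitPerRetryAfter(res: Response): Promise<void> {
  const seconds = Number(res.headers.get("Retry-After") ?? "1");
  await new Promise((r) => setTimeout(r, seconds * 1000));
}
```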

2. Implement Exponential Backoff

Use exponential backoff for retries, not fixed delays:

Good Practice: Calculate delay exponentially - delay = 2^attempt × 1000ms (grows as: 2s, 4s, 8s, 16s, 32s).

Bad Practice: Using same delay every time (e.g., always 5 seconds) doesn't adapt to severity or back off appropriately.

3. Cache When Possible

Cache responses to reduce API calls (see the sketch after these lists):

Implementation:

  • Cache frequently accessed data (e.g., account balance, settings)
  • Set appropriate TTL (time-to-live) for each data type
  • Example: Cache balance for 5 minutes (300 seconds)
  • Check cache timestamp before making API call
  • Return cached data if timestamp within TTL
  • Refresh cache on miss or expiration

Cache Storage:

  • Store data with timestamp
  • Use Map or similar key-value structure
  • Structure: { data: responseData, timestamp: Date.now() }
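
A compact TTL cache along these lines; the balance example and TTL value follow the text above, while the class name is illustrative:

```typescript
// TTL cache keyed by endpoint or resource, matching the structure above.
interface CacheEntry<T> {
  data: T;
  timestamp: number; // Date.now() when stored
}

class TtlCache<T> {
  private store = new Map<string, CacheEntry<T>>();

  constructor(private ttlMs: number) {}

  async getOrFetch(key: string, fetcher: () => Promise<T>): Promise<T> {
    const entry = this.store.get(key);
    if (entry && Date.now() - entry.timestamp < this.ttlMs) {
      return entry.data; // fresh: no API call spent
    }
    const data = await fetcher(); // miss or expired: refresh
    this.store.set(key, { data, timestamp: Date.now() });
    return data;
  }
}

// Example: cache account balance for 5 minutes (fetcher is hypothetical).
const balanceCache = new TtlCache<number>(300_000);
```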

4. Batch Operations

Group multiple operations into fewer API calls (sketch below):

Good Practice: Query multiple transactions in single call using array of IDs or filter parameters. One API call returns multiple results.

Bad Practice: Making individual API calls for each transaction ID. N transactions = N API calls (wasteful and slow).

Batching Strategy:

  • Collect IDs or operations to batch
  • Use bulk endpoints when available
  • Send single request with array of items
  • Process all results together
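
A hypothetical bulk call might look like this; the endpoint path and transaction_ids field are placeholders, so check the API reference for the actual bulk endpoints:

```typescript
// One bulk call instead of N individual calls (endpoint is hypothetical).
async function fetchTransactionsBatch(
  ids: string[],
  token: string,
): Promise<unknown[]> {
  const res = await fetch("https://api.example.com/v1/transactions/batch", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ transaction_ids: ids }), // array of IDs, one request
  });
  return res.json(); // all results come back together
}
```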

5. Use Webhooks Over Polling

Rely on webhooks for status updates instead of polling (a minimal receiver sketch follows the lists below):

Good Practice: Configure webhook endpoint to receive transaction status notifications. API pushes updates when status changes.

Bad Practice: Polling transaction status every 5 seconds with repeated API calls. Wastes rate limit quota and adds latency.

Webhook Benefits:

  • Zero API calls for status checks
  • Real-time notifications
  • Preserves rate limit for other operations
  • More efficient and responsive
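
A minimal receiver, sketched with Node's built-in http server; the path and event payload shape are assumptions, not the documented webhook format:

```typescript
import { createServer } from "node:http";

// Minimal webhook receiver (sketch): the API pushes status changes here,
// so no rate limit quota is spent polling.
createServer((req, res) => {
  if (req.method === "POST" && req.url === "/webhooks/fluid") {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      const event = JSON.parse(body); // hypothetical payload shape
      console.log(`Transaction ${event.transaction_id} -> ${event.status}`);
      res.writeHead(200).end(); // acknowledge promptly
    });
  } else {
    res.writeHead(404).end();
  }
}).listen(3000);
```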

6. Monitor and Alert

Set up monitoring for rate limit issues (a small alert sketch follows the lists below):

Monitoring Approach:

  • Set utilization threshold (e.g., 80% of limit)
  • Extract rate limit headers from every response
  • Calculate current utilization: 1 - (remaining / limit)
  • Alert when utilization exceeds threshold
  • Include specific metrics in alert: utilization percentage, remaining requests, reset time

Alert Severity Levels:

  • Warning: >80% utilization
  • High: >90% utilization or rate limit hit
  • Critical: Sustained rate limiting >5% of requests
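
A small threshold check built on the response headers (sketch; the 80% default mirrors the warning level above):

```typescript
// Alert when utilization from response headers crosses the threshold.
function alertOnUtilization(res: Response, threshold = 0.8): void {
  const limit = Number(res.headers.get("RateLimit-Limit") ?? "1");
  const remaining = Number(res.headers.get("RateLimit-Remaining") ?? "1");
  const utilization = 1 - remaining / limit;
  if (utilization > threshold) {
    console.warn(
      `Rate limit alert: ${(utilization * 100).toFixed(0)}% used, ` +
      `${remaining} remaining, resets at ${res.headers.get("RateLimit-Reset")}`,
    );
  }
}
```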

Upgrading Your Rate Limits

If you consistently hit rate limits, contact FLUID Network to discuss upgrading:

Enterprise Plans offer:

  • Higher rate limits (custom based on volume)
  • Dedicated API endpoints
  • Priority support
  • Custom SLAs

Contact: enterprise@fluidnetwork.africa

Testing Rate Limits

Sandbox Testing

Test your rate limit handling implementation in the sandbox environment (a burst-test sketch follows the success criteria below):

Test Scenario Design:

  • Generate batch of requests exceeding rate limit (e.g., 150 requests for 100/min limit)
  • Include proper authentication and valid request parameters
  • Track successful vs rate-limited responses
  • Verify retry mechanism activates correctly

Test Execution:

  1. Send rapid burst of API requests to trigger rate limiting
  2. Monitor response status codes (200 vs 429)
  3. Verify Retry-After header presence in 429 responses
  4. Confirm retry logic waits for specified duration
  5. Validate successful retry after wait period

Success Criteria:

  • Rate limit errors (429) occur as expected when limit exceeded
  • Retry logic correctly extracts and honors Retry-After header
  • Requests eventually succeed after backoff period
  • No infinite retry loops or cascading failures
  • Metrics tracking captures rate limit events accurately
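
A burst test along these lines; the sandbox URL is a placeholder, and 150 requests matches the example scenario above:

```typescript
// Sandbox burst test: send more requests than the limit allows and count
// 200 vs 429 responses, verifying Retry-After is present and honored.
async function burstTest(token: string, total = 150): Promise<void> {
  let ok = 0, limited = 0;
  for (let i = 0; i < total; i++) {
    const res = await fetch("https://sandbox.example.com/v1/ping", {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (res.status === 429) {
      limited++;
      const retryAfter = Number(res.headers.get("Retry-After") ?? "1");
      await new Promise((r) => setTimeout(r, retryAfter * 1000));
    } else if (res.ok) {
      ok++;
    }
  }
  console.log(`Burst test: ${ok} succeeded, ${limited} rate limited`);
}
```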

Quick Reference

Rate Limit Error Handling Checklist

  • ✅ Check Retry-After header in 429 responses
  • ✅ Implement exponential backoff for retries
  • ✅ Log rate limit events for monitoring
  • ✅ Alert when rate limit hit > 5% of requests
  • ✅ Use queue or throttling for bulk operations
  • ✅ Monitor RateLimit-Remaining header proactively
  • ✅ Cache responses when appropriate
  • ✅ Use webhooks instead of polling
  • ✅ Batch operations where possible
  • ✅ Test rate limit handling in sandbox

Common Patterns Summary

| Pattern | When to Use | Implementation |
|---|---|---|
| Simple Retry | Low-volume, occasional limits | Wait for Retry-After, retry once |
| Exponential Backoff | Medium-volume, transient errors | Exponential delay with jitter |
| Queue-Based | High-volume, predictable load | Process requests in queue at steady rate |
| Proactive Throttling | Critical operations | Monitor headers, throttle before limit |
| Parallel with Concurrency | Bulk operations | Process in chunks with rate tracking |

Support

Questions about rate limits?