Rate Limiting Strategies

RateThrottle implements five proven rate limiting algorithms, each suited for different use cases.

Overview

Rate limiting strategies determine how requests are counted and when limits reset. The choice of strategy affects:

How bursts are handled
How smoothly limits are enforced
Memory and computational overhead
Behavior at window boundaries

Available Strategies

Strategy	Best For	Allows Bursts	Complexity
Token Bucket	APIs with variable load	Yes (configurable)	Medium
Leaky Bucket	Constant-rate processing	No	Medium
Fixed Window	Simple high-volume APIs	Yes (at boundaries)	Low
Sliding Window Counter	Accurate rate limiting	Minimal	Low
Sliding Window	Smooth enforcement	Partially	High

Token Bucket Strategy

How It Works

Imagine a bucket that holds tokens. Tokens are added at a constant rate, and each request consumes one token. If no tokens are available, the request is blocked.

rule = RateThrottleRule(
    name="api_burst",
    limit=100,        # Refill rate: 100 tokens per window
    window=60,        # Window: 60 seconds (= ~1.67 tokens/sec)
    strategy="token_bucket",
    burst=150         # Bucket capacity: 150 tokens
)

Key Characteristics:

Tokens refill at rate: limit / window per second
Maximum tokens: burst (defaults to limit)
Allows traffic bursts up to burst capacity
Smooth long-term rate limiting

Example Behavior

Given: limit=100, window=60, burst=150

Time    | Tokens | Request | Result
--------|--------|---------|--------
0s      | 150    | 50      | ✓ (100 remaining)
1s      | 101.67 | 50      | ✓ (51.67 remaining)
2s      | 53.34  | 60      | ✗ Blocked (need 60, have 53.34)
30s     | 103.34 | 100     | ✓ (3.34 remaining)
60s     | 53.34  | 50      | ✓ (3.34 remaining)

Best For

APIs that allow occasional bursts
Services with varying load patterns
User-facing APIs where responsiveness matters
Scenarios where some burst is acceptable

Implementation

from ratethrottle import RateThrottleCore, RateThrottleRule

limiter = RateThrottleCore()

# Standard configuration
rule = RateThrottleRule(
    name="api_standard",
    limit=100,
    window=60,
    strategy="token_bucket",
    burst=120  # 20% burst capacity
)
limiter.add_rule(rule)

# Strict configuration (no burst)
strict_rule = RateThrottleRule(
    name="api_strict",
    limit=100,
    window=60,
    strategy="token_bucket",
    burst=100  # Same as limit = no burst
)

Leaky Bucket Strategy

How It Works

Imagine a bucket with a hole at the bottom that leaks at a constant rate. Requests fill the bucket. If the bucket overflows, requests are rejected.

rule = RateThrottleRule(
    name="api_steady",
    limit=100,        # Leak rate: 100 requests per window
    window=60,        # Window: 60 seconds
    strategy="leaky_bucket"
)

Key Characteristics:

Processes requests at constant rate: limit / window per second
Queue capacity: limit requests
Enforces steady, predictable rate
No bursts allowed

Example Behavior

Given: limit=100, window=60 (leak rate = 1.67/sec)

Time    | Queue  | Request | Result
--------|--------|---------|--------
0s      | 0      | 10      | ✓ (queue: 10)
1s      | 8.33   | 10      | ✓ (queue: 18.33)
5s      | 10     | 95      | ✗ Blocked (overflow)
10s     | 0      | 50      | ✓ (queue: 50)
60s     | 0      | 10      | ✓ (queue: 10)

Best For

Background processing systems
Message queues and workers
Rate-sensitive third-party API calls
Ensuring predictable backend load

Implementation

# Constant-rate API calls
api_rule = RateThrottleRule(
    name="external_api",
    limit=60,          # 60 requests
    window=60,         # per minute = 1 req/sec
    strategy="leaky_bucket"
)

# Database operations
db_rule = RateThrottleRule(
    name="database",
    limit=1000,        # 1000 operations
    window=60,         # per minute
    strategy="leaky_bucket"
)

Fixed Window Strategy

How It Works

Counts requests in fixed time windows. When a window ends, the counter resets to zero.

rule = RateThrottleRule(
    name="api_simple",
    limit=100,
    window=60,
    strategy="fixed_window"
)

Key Characteristics:

Simple counter per time window
Resets at window boundaries
Fast and memory-efficient
Subject to boundary condition issues

Example Behavior

Given: limit=100, window=60

Time      | Window      | Count | Request | Result
----------|-------------|-------|---------|--------
00:00  | 00:00-00:59 | 0     | 60      | ✓ (60/100)
00:30  | 00:00-00:59 | 60    | 40      | ✓ (100/100)
00:45  | 00:00-00:59 | 100   | 10      | ✗ Blocked
01:00  | 00:01-01:59 | 0     | 60      | ✓ (60/100) - Reset!
01:01  | 00:01-01:59 | 60    | 60      | ✓ (120/100)

Boundary Condition Problem

A client can make up to 2 * limit requests in window seconds:

Window 1: [00:00:00 - 00:00:59]
- At 00:00:59: Make 100 requests ✓

Window 2: [00:01:00 - 00:01:59]
- At 00:01:00: Make 100 requests ✓

Total: 200 requests in 2 seconds!

Best For

High-performance scenarios where speed is critical
Internal APIs with trusted clients
Cases where slight overage is acceptable
Simple, straightforward rate limiting

Implementation

# High-volume public API
public_rule = RateThrottleRule(
    name="public_api",
    limit=10000,
    window=60,
    strategy="fixed_window"
)

# Per-user limit
user_rule = RateThrottleRule(
    name="user_api",
    limit=1000,
    window=3600,  # 1 hour
    strategy="fixed_window",
    scope="user"
)

Sliding Window Counter Strategy

How It Works

A hybrid approach that combines the efficiency of Fixed Window with the accuracy of Sliding Window. Uses weighted average of current and previous window counters.

rule = RateThrottleRule(
    name="api_balanced",
    limit=100,
    window=60,
    strategy="sliding_counter"
)

Key Characteristics:

Uses only 2 counters (current + previous window)
Calculates weighted average based on time elapsed
Solves Fixed Window boundary problem
Memory: O(1) like Fixed Window
Accuracy: ~95-98% of true Sliding Window

Formula:

weighted_count = (previous_count × (1 - window_progress)) + current_count

Where window_progress = time_elapsed_in_current_window / window_size

Example Behavior

Given: limit=100, window=60

Previous window (0-60s): 80 requests
Current window (60-120s): 40 requests
Current time: 90s (30 seconds into window = 50% progress)

Weighted count = (80 × (1 - 0.5)) + 40
               = (80 × 0.5) + 40
               = 40 + 40
               = 80 requests

Remaining: 100 - 80 = 20 requests allowed

Time progression example:

Time   | Progress | Previous × Weight | Current | Weighted Total
-------|----------|-------------------|---------|---------------
60s    | 0%       | 80 × 1.0 = 80    | 0       | 80 (block new)
75s    | 25%      | 80 × 0.75 = 60   | 5       | 65
90s    | 50%      | 80 × 0.5 = 40    | 15      | 55
105s   | 75%      | 80 × 0.25 = 20   | 30      | 50
119s   | 98%      | 80 × 0.02 = 1.6  | 50      | 51.6

Solves Boundary Problem

Unlike Fixed Window, Sliding Window Counter prevents burst abuse at boundaries:

Fixed Window Problem:
├─ Window 1 (0-60s): 100 requests at t=59s ✓
├─ Window 2 (60-120s): 100 requests at t=61s ✓
└─ Total: 200 requests in 2 seconds! ✗

Sliding Window Counter Solution:
├─ Window 1 (0-60s): 100 requests at t=59s ✓
├─ At t=61s (1.67% into window 2):
│   └─ Weighted: (100 × 0.983) + 0 = 98.3 ≈ 99
│   └─ Limit: 100
│   └─ Can only make ~1 request ✓
└─ Prevents burst! ✓

Best For

Most production APIs - Best all-around choice
APIs needing better accuracy than Fixed Window
When memory is limited (can’t store all timestamps)
High-traffic applications where performance matters
Recommended default strategy

Advantages over alternatives:

More accurate than Fixed Window (no boundary bursts)
Much faster than Sliding Window (O(1) vs O(n))
Less memory than Sliding Window (2 counters vs n timestamps)
Industry standard (used by Cloudflare, Kong, AWS)

Implementation

from ratethrottle import RateThrottleCore, RateThrottleRule

limiter = RateThrottleCore()

# Standard configuration (recommended)
rule = RateThrottleRule(
    name="api_default",
    limit=100,
    window=60,
    strategy="sliding_counter"
)
limiter.add_rule(rule)

# High-traffic API
high_traffic = RateThrottleRule(
    name="public_api",
    limit=10000,
    window=60,
    strategy="sliding_counter"
)

# Per-user limit
user_rule = RateThrottleRule(
    name="user_api",
    limit=1000,
    window=3600,  # 1 hour
    strategy="sliding_counter",
    scope="user"
)

When to Use vs Other Strategies

Choose Sliding Window Counter when:
├─ You need accuracy better than Fixed Window
├─ You can't afford Sliding Window's memory cost
├─ You want industry-proven algorithm
└─ You want the best general-purpose strategy

Choose Fixed Window instead when:
├─ You need absolute maximum performance
└─ Boundary bursts are acceptable

Choose Sliding Window instead when:
├─ You need perfect accuracy (SLA/legal requirements)
├─ Memory cost is not a concern
└─ You can tolerate slower performance

Sliding Window Strategy

How It Works

Uses a sliding time window that moves with each request. Provides smooth rate limiting without boundary issues.

rule = RateThrottleRule(
    name="api_smooth",
    limit=100,
    window=60,
    strategy="sliding_window"
)

Key Characteristics:

Tracks timestamps of individual requests
Window slides with current time
No boundary condition issues
Higher memory usage

Example Behavior

Given: limit=100, window=60

Current Time: 00:01:00
Looking back 60 seconds: 00:00:00 - 00:01:00

Requests in window:
- 00:00:05: 20 requests
- 00:00:30: 40 requests
- 00:00:55: 30 requests
Total: 90 requests

New request at 00:01:00: ✓ Allowed (90/100)

Current Time: 00:01:30
Looking back 60 seconds: 00:00:30 - 00:01:30

Requests in window:
- 00:00:30: 40 requests (still in window)
- 00:00:55: 30 requests
- 00:01:00: 1 request
Total: 71 requests

Note: Requests at 00:00:05 are now outside window

Best For

Premium APIs requiring fair enforcement
Financial applications
Rate limits with legal/SLA requirements
Scenarios where accuracy is critical

Implementation

# Precise rate limiting
premium_rule = RateThrottleRule(
    name="premium_api",
    limit=5000,
    window=3600,
    strategy="sliding_window",
    scope="user"
)

# Payment endpoints
payment_rule = RateThrottleRule(
    name="payment",
    limit=10,
    window=60,
    strategy="sliding_window"
)

Strategy Comparison

Performance

Strategy	Speed	Memory Usage	Accuracy
Fixed Window	★★★★★	★★★★★	★★★☆☆
Sliding Window Counter	★★★★★	★★★★★	★★★★☆
Token Bucket	★★★★☆	★★★★☆	★★★★☆
Leaky Bucket	★★★★☆	★★★★☆	★★★★☆
Sliding Window	★★★☆☆	★★★☆☆	★★★★★

Burst Handling

# Token Bucket: Full burst support
limiter.check_rate_limit("client", "token_bucket_rule")
# Can burst up to 'burst' parameter

# Leaky Bucket: No bursts
limiter.check_rate_limit("client", "leaky_bucket_rule")
# Strictly enforces constant rate

# Fixed Window: Bursts at boundaries
limiter.check_rate_limit("client", "fixed_window_rule")
# Can burst at window edges (up to 2x limit)

# Sliding Window Counter: Minimal bursts
limiter.check_rate_limit("client", "sliding_counter_rule")
# Prevents boundary bursts, allows small variance

# Sliding Window: Limited bursts
limiter.check_rate_limit("client", "sliding_window_rule")
# Smooth enforcement prevents large bursts

Choosing a Strategy

Decision Tree

Need burst support?
├─ Yes → Do you need precise control?
│        ├─ Yes → Token Bucket
│        └─ No  → Fixed Window (simpler, faster)
│
└─ No  → Need perfect accuracy?
         ├─ Yes → Sliding Window (most accurate)
         └─ No  → Need good balance?
                  ├─ Yes → Sliding Window Counter (recommended)
                  └─ No  → Leaky Bucket (constant rate)

Use Case Matrix

Use Case	Recommended Strategy
Public REST API	Sliding Window Counter (best balance)
Internal microservices	Fixed Window (fast, simple)
Third-party API client	Leaky Bucket (respects their limits)
Payment processing	Sliding Window (perfect accuracy)
Search/autocomplete	Token Bucket (responsive bursts)
File uploads	Leaky Bucket (constant rate)
GraphQL API	Sliding Window Counter (accurate, fast)
WebSocket connections	Token Bucket (bursty traffic)
High-volume API	Sliding Window Counter (efficient, accurate)
SaaS applications	Sliding Window Counter (fair, performant)
Mobile app backend	Sliding Window Counter (handles bursts better)

Combining Strategies

You can use multiple strategies for different endpoints:

from ratethrottle import RateThrottleCore, RateThrottleRule

limiter = RateThrottleCore()

# Public endpoints: Best balance of accuracy and performance
public_rule = RateThrottleRule(
    name="public",
    limit=100,
    window=60,
    strategy="sliding_counter"
)

# Authenticated: Smooth limiting
auth_rule = RateThrottleRule(
    name="authenticated",
    limit=1000,
    window=60,
    strategy="sliding_counter"
)

# Expensive operations: Constant rate
expensive_rule = RateThrottleRule(
    name="expensive",
    limit=10,
    window=60,
    strategy="leaky_bucket"
)

# Background jobs: Simple counting
background_rule = RateThrottleRule(
    name="background",
    limit=10000,
    window=3600,
    strategy="fixed_window"
)

# Burst-friendly endpoints: Allow traffic spikes
burst_rule = RateThrottleRule(
    name="search",
    limit=100,
    window=60,
    strategy="token_bucket",
    burst=150
)

limiter.add_rule(public_rule)
limiter.add_rule(auth_rule)
limiter.add_rule(expensive_rule)
limiter.add_rule(background_rule)
limiter.add_rule(burst_rule)

Advanced Configuration

Token Bucket Fine-Tuning

# Generous burst for good UX
rule = RateThrottleRule(
    name="user_facing",
    limit=100,
    window=60,
    strategy="token_bucket",
    burst=200  # 2x burst capacity
)

# Strict: minimal burst
rule = RateThrottleRule(
    name="api_strict",
    limit=100,
    window=60,
    strategy="token_bucket",
    burst=105  # Only 5% burst
)

# No burst: same as limit
rule = RateThrottleRule(
    name="no_burst",
    limit=100,
    window=60,
    strategy="token_bucket",
    burst=100
)

Testing Strategies

You can test different strategies to find what works best:

import time
from ratethrottle import RateThrottleCore, RateThrottleRule

def test_strategy(strategy_name):
    limiter = RateThrottleCore()
    rule = RateThrottleRule(
        name="test",
        limit=10,
        window=60,
        strategy=strategy_name,
        burst=15 if strategy_name == "token_bucket" else None
    )
    limiter.add_rule(rule)

    # Make burst of requests
    allowed = 0
    for i in range(20):
        status = limiter.check_rate_limit("test_client", "test")
        if status.allowed:
            allowed += 1

    print(f"{strategy_name}: {allowed}/20 requests allowed in burst")

    # Wait and try again
    time.sleep(10)
    status = limiter.check_rate_limit("test_client", "test")
    print(f"  After 10s: {'Allowed' if status.allowed else 'Blocked'}")

# Test all strategies
strategies = [
    "fixed_window",
    "sliding_counter",
    "token_bucket",
    "leaky_bucket",
    "sliding_window"
]

for strategy in strategies:
    test_strategy(strategy)

Next Steps

Configure Storage Backends backends for your strategy
Learn about Configuration options
Explore Analytics and Reporting to monitor strategy performance