Rate Limiting Algo — Sliding Window


This piece follows Rate Limiting Algo — Fixed Window. It was originally published on Medium.
After the Fixed Window failure, the lesson was obvious: time boundaries are artificial; traffic is not.
So we looked for something mathematically correct — something that actually enforces “requests in the last N seconds,” not “requests in the same minute on a clock.”
That led us to Sliding Window rate limiting.
Instead of counting requests in rigid, clock-aligned buckets, sliding window rate limiting answers a simpler question:
How many requests has this client made in the last 60 seconds, right now?
No boundaries. No resets. Just continuous time.
There are two ways to implement this.
Correct — and completely impractical
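The most accurate approach, the Sliding Window Log, is also the simplest to explain: keep a timestamp for every request, discard the ones older than the window, and count what remains.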
Mathematically, this is perfect.
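To make the idea concrete, here is a minimal in-memory sketch of a sliding window log for a single client. The class name, the `allow` method, and the default limit of 100 requests per 60 seconds are assumptions chosen for illustration. A production version would keep this state in a shared store rather than in process memory.

```python
from collections import deque


class SlidingWindowLog:
    """One timestamp per accepted request (hypothetical helper, single client)."""

    def __init__(self, limit=100, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()  # oldest timestamp sits on the left

    def allow(self, now):
        # Evict timestamps that are no longer within the last `window` seconds.
        cutoff = now - self.window
        while self.log and self.log[0] <= cutoff:
            self.log.popleft()
        # Reject if the window is already full; otherwise record the request.
        if len(self.log) >= self.limit:
            return False
        self.log.append(now)
        return True
```

Every accepted request costs one stored timestamp, which is exactly where the trouble starts.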
Now put real numbers on it.
Example
10M × 100 = 1 billion timestamps
That’s a billion list entries sitting in Redis.
At that scale:
This is why Sliding Window Log is academically correct but operationally unusable.
Nobody runs this at scale.
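A rough back-of-the-envelope version of that arithmetic, assuming the 10M figure counts clients, 100 is the per-window limit, and each timestamp costs a bare 8 bytes. The 8-byte figure is an assumption, not a measurement. Real per-entry overhead in a store like Redis is higher, so treat the result as a floor.

```python
clients = 10_000_000        # the 10M from the example above
limit_per_window = 100      # worst case: every client fills its window
bytes_per_timestamp = 8     # ASSUMPTION: raw 64-bit value, no per-entry overhead

timestamps = clients * limit_per_window
raw_gib = timestamps * bytes_per_timestamp / 2**30

print(f"{timestamps:,} timestamps, about {raw_gib:.1f} GiB before overhead")
# prints: 1,000,000,000 timestamps, about 7.5 GiB before overhead
```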
The practical compromise
To fix the explosion problem, we relax precision slightly and introduce buckets.
Instead of storing every timestamp, we keep one counter per small time bucket and sum the counters that fall inside the window.
Example
Buckets:
B1: T-60 → T-50
B2: T-50 → T-40
B3: T-40 → T-30
B4: T-30 → T-20
B5: T-20 → T-10
B6: T-10 → T-0
Allow the request if:
B1 + B2 + B3 + B4 + B5 + B6 ≤ 100
This is why sliding window counters are widely used in real systems.
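The six-bucket check above can be sketched as follows. This is an illustrative single-client version and not a reference implementation. The class name, the dictionary keyed by bucket index, and the defaults (100 requests, 60-second window, 10-second buckets) are assumptions taken from the example.

```python
class SlidingWindowCounter:
    """One counter per 10-second bucket (hypothetical helper, single client)."""

    def __init__(self, limit=100, window_seconds=60, bucket_seconds=10):
        self.limit = limit
        self.bucket_seconds = bucket_seconds
        self.num_buckets = window_seconds // bucket_seconds
        self.counts = {}  # bucket index -> number of accepted requests

    def allow(self, now):
        current = int(now // self.bucket_seconds)
        oldest = current - self.num_buckets + 1
        # Drop buckets that have slid out of the window.
        for index in [i for i in self.counts if i < oldest]:
            del self.counts[index]
        # Allow only while the bucket sum stays within the limit.
        if sum(self.counts.values()) >= self.limit:
            return False
        self.counts[current] = self.counts.get(current, 0) + 1
        return True
```

Storage is now at most six integers per client instead of up to 100 timestamps. The price is precision: the buckets are aligned to 10-second marks, so the window this sketch actually covers is somewhere between 50 and 60 seconds, depending on how far into the current bucket we are.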
The compromise introduces a subtle but dangerous flaw: microburst amplification at bucket boundaries.
Let’s walk through it carefully.
Assume the system can safely handle:
100 RPM ≈ 1.67 RPS
Bucket state:
T-60 → T-50 (B1): 0
T-50 → T-40 (B2): 0
T-40 → T-30 (B3): 0
T-30 → T-20 (B4): 0
T-20 → T-10 (B5): 50
T-10 → T-0 (B6): 50
Now imagine:
From the algorithm’s perspective:
From the system’s perspective:
The limit is respected. The system still takes a massive hit.
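To put a number on that hit, here is a small calculation using only the bucket state shown above. It takes the best case, where the 100 requests are spread evenly across the two busy buckets. Anything tighter than that is worse. The variable names are illustrative.

```python
limit_per_minute = 100
sustained_rps = limit_per_minute / 60     # about 1.67 RPS

# Bucket state from the walkthrough: counts per 10-second bucket, B1..B6.
buckets = [0, 0, 0, 0, 50, 50]
bucket_seconds = 10

window_total = sum(buckets)
busy_seconds = bucket_seconds * sum(1 for count in buckets if count > 0)
# Best case: requests spread evenly over the busy buckets only.
burst_rps = window_total / busy_seconds

print(window_total <= limit_per_minute)   # prints: True (the limiter is satisfied)
print(f"{burst_rps / sustained_rps:.1f}x the sustained rate")
# prints: 3.0x the sustained rate
```

The check passes, yet the backend sees at least three times the rate it was sized for. If the same 100 requests land inside a single second, the gap is far larger.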
Sliding Window Counter:
But it does not smooth traffic.
It enforces how much work is allowed, not how fast it arrives.
That’s why sliding window counters still allow sharp spikes — and why systems that depend on smooth load (DBs, auth, payments) can still fail under microbursts.
Sliding windows improve fairness and accounting, but they do not protect backend stability under bursty traffic.
And that realisation leads directly to the next algorithm.
The follow-up is Rate Limiting — Token Bucket & Leaky Bucket — token bucket, leaky bucket, and why layering them is what production systems actually do.