Rate Limiting Algo — Fixed Window


This piece continues from Rate Limiting: Introduction. It was originally published on Medium.
Two days after the launch incident, things looked calm on the dashboards. CPU was steady. Latency was flat. No alerts.
So we did what most teams do at this stage — we added some rate limiting. Simple. Quick. Safe. Or so it seemed.
We chose the easiest option available: Fixed Window.
The idea behind Fixed Window rate limiting is straightforward:
Count the number of requests in a fixed time window and reject anything beyond the limit.
For example, when we say 100 RPM (Requests Per Minute), the system divides time into rigid windows: 00:00–01:00, 01:00–02:00, 02:00–03:00, and so on. Within each window, requests are counted independently, and the counter resets to zero at every boundary.
At first glance, this feels reasonable. The math is simple. The implementation is trivial. The dashboards look clean.
Now consider what actually happens in real traffic.
A user sends 100 requests in the final second of one window, then 100 more in the first second of the next.
From the rate limiter's perspective, everything is fine: each window counted exactly 100 requests, right at the limit.
From the system's perspective, one continuous burst just arrived.
We declared a limit of 100 RPM, yet the system just accepted 200 requests almost back-to-back.
The contract was technically respected. The intent was completely violated.
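The boundary burst is easy to reproduce. This sketch (the `make_limiter` helper is a hypothetical stand-in for the counting logic described above) drives a 100 RPM fixed-window counter with 100 requests just before the minute boundary and 100 just after:

```python
def make_limiter(limit: int, window: int = 60):
    """Return an `allow(now)` function enforcing `limit` requests per fixed window."""
    state = {"window": None, "count": 0}

    def allow(now: float) -> bool:
        w = int(now // window)
        if w != state["window"]:
            # Window boundary crossed: counter resets to zero
            state["window"], state["count"] = w, 0
        if state["count"] < limit:
            state["count"] += 1
            return True
        return False

    return allow


allow = make_limiter(limit=100)

# 100 requests in the last second of window 0 (t = 59.00 .. 59.99)
late_burst = sum(allow(59.0 + i / 100) for i in range(100))
# 100 more in the first second of window 1 (t = 60.00 .. 60.99)
early_burst = sum(allow(60.0 + i / 100) for i in range(100))

print(late_burst, early_burst)  # → 100 100: all 200 requests accepted
```

Both bursts pass. The service absorbs 200 requests in roughly two seconds while every window stays exactly at its quota.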
That burst doesn’t just affect the API layer.
It ripples outward: connection pools fill up, queues grow, and downstream services provisioned for the declared rate absorb a spike they were never promised.
What looks like “within limits” at the edge becomes overload inside the system.
This is how latency spikes, queue backlogs, and cascading failures begin: all without any single window ever exceeding its quota.
The core problem with Fixed Window is this:
It enforces limits on paper, not in reality.
Fixed windows ignore burst behaviour entirely.
Fixed Window rate limiting is easy to implement, cheap to run, and simple to reason about. It is also unsafe for any serious production API.
If your system has bursty clients, aggressive retries, or shared downstream resources, then Fixed Window limits are not protection; they are false confidence.
The follow-up, Rate Limiting Algo — Sliding Window, looks at how sliding windows try (and partially fail) to fix this exact problem.
Because once you’ve seen a system fail at the window boundary, you never forget it.