#distributed-systems

5 articles with this tag.

52 Minutes a Year: What 99.99% Availability Actually Costs

Everyone puts 99.9% in their SLA. Few know what it actually costs to hit it — and almost nobody distinguishes between a service that's 'up' and one that's actually reliable.

Apr 5, 2026 · 7 min read

Scalability Isn't a Feature, It's a Tax: Lessons from 20K CCU

Everyone talks about scaling. Few talk about the specific moment their system broke, why it broke, and the non-obvious fix that actually worked. Here's ours.

Apr 5, 2026 · 7 min read

Rate Limiting — Token Bucket & Leaky Bucket

Fixed and sliding windows argue about counting; token and leaky buckets argue about bursts. Here’s the contract each one enforces — and why production systems layer both.

Feb 5, 2026 · 6 min read

Rate Limiting Algo — Fixed Window

Fixed window rate limiting looks simple and behaves well on dashboards — until traffic piles up at window boundaries. Here’s why the contract is met while the system still overloads.

Feb 4, 2026 · 3 min read

Rate Limiting Algo — Sliding Window

Sliding windows answer the right question — how many requests in the last N seconds — but log-based precision doesn’t scale, and bucketed counters still miss microbursts.

Feb 4, 2026 · 4 min read