API Rate Limiting: Complete Implementation Guide for Developers
Rate limiting is the unsung hero of API stability. Too lenient, and a single misbehaving client can take down your entire infrastructure. Too strict, and you alienate legitimate users and break integrations.
Whether you're building a public API for thousands of developers or an internal microservice architecture, implementing a robust rate limiting strategy is non-negotiable for production reliability.
What is API Rate Limiting?
Rate limiting is the process of controlling the number of requests a client can make to an API within a specific timeframe. It prevents resource exhaustion, mitigates DoS attacks, and ensures fair usage across all clients.
Common Rate Limiting Algorithms
1. Fixed Window
The simplest approach. A counter is reset at the start of every window (e.g., 1,000 requests per hour).
Pros: Easy to implement.
Cons: The "burst" problem. A client can send 1,000 requests at the end of window A and 1,000 at the start of window B, effectively doubling the rate for a short period.
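As a sketch, the fixed-window counter described above fits in a few lines (class and parameter names here are illustrative, not from any particular library):

```typescript
// Minimal fixed-window counter sketch. Assumes a single process;
// distributed setups need shared state (see the Redis note below).
class FixedWindowLimiter {
  private count = 0;
  private windowStart: number | null = null;

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  allow(now: number = Date.now()): boolean {
    // Reset the counter when a new window begins.
    if (this.windowStart === null || now - this.windowStart >= this.windowMs) {
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count < this.limit) {
      this.count += 1;
      return true;
    }
    return false;
  }
}
```

Passing `now` explicitly makes the limiter easy to unit-test; in production you would simply call `allow()`.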
2. Sliding Window Log
Tracks every request timestamp in a log. When a new request comes in, it filters out timestamps older than the current window.
Pros: Extremely accurate.
Cons: High memory overhead to store every request timestamp.
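A minimal sketch of the log-based approach (this variant records only accepted requests; names are illustrative):

```typescript
// Sliding-window log sketch: one timestamp stored per accepted request.
// Memory grows with the request rate, which is this algorithm's main cost.
class SlidingWindowLog {
  private log: number[] = [];

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  allow(now: number = Date.now()): boolean {
    // Drop timestamps that have fallen out of the current window.
    this.log = this.log.filter((t) => now - t < this.windowMs);
    if (this.log.length < this.limit) {
      this.log.push(now);
      return true;
    }
    return false;
  }
}
```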
3. Token Bucket
Tokens are added to a bucket at a fixed rate. Each request consumes a token. If the bucket is empty, the request is rate-limited.
Pros: Allows for controlled bursts while maintaining a long-term average rate.
Cons: Slightly more complex to implement than fixed windows.
4. Leaky Bucket
Requests enter a bucket and are processed (leak) at a constant, steady rate. If the bucket overflows, requests are dropped.
Pros: Smooths out traffic spikes completely.
Cons: Can add latency to requests even when the system is under-utilized.
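One way to sketch the leaky bucket is as a queue depth that drains at a constant rate (a simplified model; names are illustrative):

```typescript
// Leaky-bucket sketch: the "water level" is the queue depth, drained
// in proportion to elapsed time. An overflowing bucket drops requests.
class LeakyBucket {
  private water = 0;
  private lastLeak: number | null = null;

  constructor(
    private readonly capacity: number,
    private readonly leakRatePerMs: number, // requests drained per millisecond
  ) {}

  allow(now: number = Date.now()): boolean {
    if (this.lastLeak !== null) {
      const leaked = (now - this.lastLeak) * this.leakRatePerMs;
      this.water = Math.max(0, this.water - leaked);
    }
    this.lastLeak = now;
    if (this.water < this.capacity) {
      this.water += 1;
      return true;
    }
    return false; // overflow: drop the request
  }
}
```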
Implementation Example: Token Bucket in TypeScript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillRate: number; // tokens per millisecond

  constructor(capacity: number, refillRate: number) {
    this.capacity = capacity;
    this.refillRate = refillRate;
    this.tokens = capacity; // start with a full bucket
    this.lastRefill = Date.now();
  }

  private refill(): void {
    const now = Date.now();
    const elapsed = now - this.lastRefill;
    // Add tokens for the elapsed time, capped at the bucket's capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }

  async take(): Promise<boolean> {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
// Usage: 10-request burst capacity, refills at 1 token per second
const limiter = new TokenBucket(10, 1 / 1000);
const allowed = await limiter.take();
if (!allowed) {
  // Return 429 Too Many Requests
}

Best Practices for Rate Limiting
- Return Clear Headers: Always include X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers.
- Use HTTP 429: Respond with the 429 Too Many Requests status code when a client exceeds its limit.
- Retry-After Header: Tell the client exactly how long to wait before retrying.
- Tiered Limiting: Implement different limits for free vs. paid users.
- Distributed Rate Limiting: Use Redis or similar for shared state across multiple API nodes.
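The shared-state pattern typically relies on Redis's atomic INCR plus an EXPIRE on the window key. A sketch of the idea, using a hypothetical in-memory stand-in for the Redis client (a production version would use a real client, ideally with a Lua script to make INCR + EXPIRE atomic):

```typescript
// Distributed fixed-window sketch using Redis-style INCR + EXPIRE semantics.
// `MemoryStore` is a hypothetical stand-in for a shared Redis instance.
interface CounterStore {
  incr(key: string): number;              // atomic increment, returns new value
  expire(key: string, ttlMs: number): void;
}

class MemoryStore implements CounterStore {
  private counts = new Map<string, number>();
  incr(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
  expire(_key: string, _ttlMs: number): void {
    // No-op stand-in; Redis would delete the key after ttlMs.
  }
}

function allowRequest(
  store: CounterStore,
  clientId: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(),
): boolean {
  // Every API node derives the same key for the current window,
  // so the counter is shared across all instances.
  const windowKey = `rl:${clientId}:${Math.floor(now / windowMs)}`;
  const count = store.incr(windowKey);
  if (count === 1) store.expire(windowKey, windowMs);
  return count <= limit;
}
```

Because the key encodes the window number, old counters simply expire; no reset logic runs on the application nodes.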