API Caching Strategies: Complete Guide to Faster, Cheaper APIs
Caching is the single most effective technique for improving API performance and reducing infrastructure costs. A well-implemented caching strategy can reduce response times from seconds to milliseconds, cut database load by 90%+, and handle 10x more traffic with the same infrastructure.
This guide covers production-ready caching patterns used by major APIs like Stripe, GitHub, and AWS, including HTTP caching headers, Redis patterns, CDN configuration, cache invalidation strategies, and common mistakes to avoid.
Table of Contents
- Why API Caching Matters
- The Four Layers of API Caching
- HTTP Caching Headers
- Browser Caching
- CDN Caching
- Application-Level Caching with Redis
- Database Query Caching
- Cache Invalidation Strategies
- Cache Key Design
- Monitoring & Debugging
- Real-World Examples
- Common Mistakes
- Production Checklist
Why API Caching Matters
Performance improvements:
- Response times drop from 500ms to 5ms (100x faster)
- 90% reduction in database queries
- Handle 10x more requests per server
Cost savings:
- Reduce server count by 50-80%
- Lower database read units (AWS RDS, DynamoDB)
- Reduced API costs for third-party services
Better user experience:
- Instant page loads (perceived performance)
- Reduced time-to-interactive (TTI)
- Lower bounce rates
Infrastructure resilience:
- Continue serving cached data during database outages
- Graceful degradation under load
- Protection against traffic spikes
Real-world impact example: A typical e-commerce API serving product catalog data sees:
- Without caching: 500ms response time, 1,000 req/sec capacity, $5,000/month infrastructure
- With caching: 10ms response time, 10,000 req/sec capacity, $1,000/month infrastructure
That's 50x faster response times and 80% cost reduction.
The Four Layers of API Caching
Effective caching happens at multiple layers, each with different characteristics:
1. Browser Cache (Client-Side)
- Location: User's browser
- Scope: Single user
- TTL: Minutes to days
- Best for: Static assets (images, CSS, JS), public API responses
- Control: HTTP headers (`Cache-Control`, `ETag`)
2. CDN Cache (Edge Network)
- Location: Global edge servers (Cloudflare, AWS CloudFront)
- Scope: All users in a region
- TTL: Minutes to hours
- Best for: Public API endpoints, GET requests, high-traffic routes
- Control: HTTP headers + CDN config
3. Application Cache (Server-Side)
- Location: In-memory store (Redis, Memcached)
- Scope: All users (shared cache)
- TTL: Seconds to hours
- Best for: Database query results, computed values, session data
- Control: Application code
4. Database Cache (Query Cache)
- Location: Database server memory
- Scope: Database-level
- TTL: Automatic (LRU eviction)
- Best for: Frequently-run queries
- Control: Database configuration
Layering strategy:
- Browser cache serves repeat requests (same user)
- CDN cache serves geographically distributed users
- Application cache reduces database load
- Database cache optimizes query execution
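Conceptually, the layering strategy above reduces to a chain of lookups with backfill: check the fastest layer first, and on a hit repopulate the layers above it. A minimal sketch, where two in-memory `Map`s stand in for the application cache and a database-side cache (the names `CacheLayer`, `layeredGet`, and `mapLayer` are illustrative, not a real library API):

```typescript
interface CacheLayer<T> {
  get(key: string): Promise<T | undefined>;
  set(key: string, value: T): Promise<void>;
}

// An in-memory Map standing in for a real cache tier (browser, CDN, Redis...)
function mapLayer<T>(): CacheLayer<T> {
  const store = new Map<string, T>();
  return {
    get: async (k) => store.get(k),
    set: async (k, v) => { store.set(k, v); },
  };
}

// Walk the layers fastest-first; on a hit, backfill the faster layers above
// it so the next request short-circuits earlier. On a full miss, fetch from
// the origin and populate every layer.
async function layeredGet<T>(
  key: string,
  layers: CacheLayer<T>[],
  origin: (key: string) => Promise<T>
): Promise<T> {
  for (let i = 0; i < layers.length; i++) {
    const hit = await layers[i].get(key);
    if (hit !== undefined) {
      await Promise.all(layers.slice(0, i).map((l) => l.set(key, hit)));
      return hit;
    }
  }
  const value = await origin(key);
  await Promise.all(layers.map((l) => l.set(key, value)));
  return value;
}
```

In a real system the browser and CDN layers are driven by HTTP headers rather than application code, but the hit/miss/backfill flow is the same.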
HTTP Caching Headers
HTTP caching headers tell browsers and CDNs how to cache responses. These are the foundation of any caching strategy.
Cache-Control Header
The Cache-Control header controls caching behavior:
// Example: Cache for 1 hour (public, shareable)
res.setHeader('Cache-Control', 'public, max-age=3600');
// Example: Cache for 5 minutes (private, user-specific)
res.setHeader('Cache-Control', 'private, max-age=300');
// Example: Never cache (always revalidate)
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
Common directives:
| Directive | Meaning | Use Case |
|---|---|---|
| `public` | Can be cached by browsers + CDNs | Public data (product catalog) |
| `private` | Only the browser can cache (not CDNs) | User-specific data (profile) |
| `max-age=3600` | Cache valid for 3600 seconds | Time-based expiration |
| `s-maxage=7200` | CDN-specific max-age (overrides `max-age` for shared caches) | Different TTL for CDN vs browser |
| `no-cache` | Must revalidate with the server before using the cached copy | Ensure freshness |
| `no-store` | Never cache (sensitive data) | Payment info, personal data |
| `must-revalidate` | Once expired, must revalidate (can't serve stale) | Critical data accuracy |
| `stale-while-revalidate=60` | Serve stale content for 60s while fetching fresh data | Performance + freshness balance |
| `stale-if-error=86400` | Serve stale content for 24h if the origin is down | Resilience during outages |
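Assembling these directives by string concatenation across dozens of endpoints invites subtle inconsistencies. One option is a small typed builder (a hypothetical helper, not a standard API) that produces the header value from options:

```typescript
// Hypothetical helper: build a Cache-Control value from typed options so
// directive combinations stay consistent across endpoints.
interface CacheControlOptions {
  visibility?: 'public' | 'private';
  maxAge?: number;               // seconds, browsers and CDNs
  sMaxAge?: number;              // seconds, shared caches (CDNs) only
  staleWhileRevalidate?: number; // seconds of allowed background refresh
  staleIfError?: number;         // seconds of allowed staleness on origin errors
  noStore?: boolean;             // sensitive data: never cache at all
}

function buildCacheControl(opts: CacheControlOptions): string {
  // Sensitive data short-circuits everything else.
  if (opts.noStore) return 'no-cache, no-store, must-revalidate';
  const parts: string[] = [];
  if (opts.visibility) parts.push(opts.visibility);
  if (opts.maxAge !== undefined) parts.push(`max-age=${opts.maxAge}`);
  if (opts.sMaxAge !== undefined) parts.push(`s-maxage=${opts.sMaxAge}`);
  if (opts.staleWhileRevalidate !== undefined) {
    parts.push(`stale-while-revalidate=${opts.staleWhileRevalidate}`);
  }
  if (opts.staleIfError !== undefined) {
    parts.push(`stale-if-error=${opts.staleIfError}`);
  }
  return parts.join(', ');
}
```

For example, `buildCacheControl({ visibility: 'public', maxAge: 3600, staleWhileRevalidate: 60 })` yields `public, max-age=3600, stale-while-revalidate=60`.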
ETag (Entity Tag)
An ETag is a unique identifier for a specific version of a resource. When the resource changes, the ETag changes.
How it works:
- Server generates an ETag (a hash of the content)
- Client stores the ETag with the cached response
- On subsequent requests, the client sends `If-None-Match: <etag>`
- Server compares ETags:
  - If unchanged: return `304 Not Modified` (no body, saving bandwidth)
  - If changed: return `200 OK` with the new content and a new ETag
Implementation:
import { createHash } from 'crypto';
function generateETag(content: string): string {
return createHash('md5').update(content).digest('hex');
}
app.get('/api/product/:id', async (req, res) => {
const product = await db.product.findUnique({
where: { id: req.params.id }
});
const content = JSON.stringify(product);
const etag = generateETag(content);
// Check if client has current version
if (req.headers['if-none-match'] === etag) {
return res.status(304).end(); // Not modified
}
res.setHeader('ETag', etag);
res.setHeader('Cache-Control', 'private, max-age=300');
res.json(product);
});
Benefits:
- Save bandwidth (304 responses have no body)
- Guaranteed freshness (client always has latest version)
- Works with dynamic content
When to use:
- User-specific data that changes frequently
- Resources where staleness is unacceptable
- When you need conditional requests
Last-Modified Header
Similar to ETag but uses timestamps instead of content hashes:
app.get('/api/posts/:id', async (req, res) => {
const post = await db.post.findUnique({
where: { id: req.params.id },
select: { id: true, title: true, content: true, updatedAt: true }
});
const lastModified = post.updatedAt.toUTCString();
// Check if client has current version
if (req.headers['if-modified-since'] === lastModified) {
return res.status(304).end();
}
res.setHeader('Last-Modified', lastModified);
res.setHeader('Cache-Control', 'public, max-age=600');
res.json(post);
});
ETag vs Last-Modified:
- ETag: More accurate (detects any change), higher CPU cost (hashing)
- Last-Modified: Less accurate (only 1-second precision), lower CPU cost
Use Last-Modified when:
- Resources have reliable `updatedAt` timestamps
- 1-second precision is acceptable
- You want to minimize CPU overhead
Use ETag when:
- Content can change multiple times per second
- You need guaranteed freshness
- Resources don't have timestamps
Browser Caching
Browser caching is the first line of defense against unnecessary network requests.
Setting Up Browser Cache
// routes/api/public-data.ts
import { NextResponse } from 'next/server';
export async function GET() {
const data = await fetchPublicData();
return NextResponse.json(data, {
headers: {
'Cache-Control': 'public, max-age=3600, stale-while-revalidate=60',
'CDN-Cache-Control': 'public, max-age=7200', // Vercel/Cloudflare specific
}
});
}
Cache Strategy by Endpoint Type
| Endpoint Type | Cache-Control | Reasoning |
|---|---|---|
| Static content (logo, images) | `public, max-age=31536000, immutable` | Never changes (use versioned URLs) |
| Public product catalog | `public, max-age=3600, stale-while-revalidate=60` | Shared, updates hourly |
| User profile | `private, max-age=300` | User-specific, updates occasionally |
| Real-time data (stock prices) | `private, max-age=10` | Needs to be fresh |
| Sensitive data (payment) | `no-cache, no-store, must-revalidate` | Never cache |
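The table above can also live in code as a single lookup, so header policy is defined in one place instead of repeated per route. The endpoint categories here are this guide's grouping, not a standard taxonomy:

```typescript
// One place for header policy; routes ask for their category instead of
// hand-writing directive strings.
type EndpointType = 'static' | 'catalog' | 'profile' | 'realtime' | 'sensitive';

const CACHE_POLICY: Record<EndpointType, string> = {
  static:    'public, max-age=31536000, immutable',
  catalog:   'public, max-age=3600, stale-while-revalidate=60',
  profile:   'private, max-age=300',
  realtime:  'private, max-age=10',
  sensitive: 'no-cache, no-store, must-revalidate',
};

function cacheHeaderFor(type: EndpointType): string {
  return CACHE_POLICY[type];
}
```

A route handler then sets `res.setHeader('Cache-Control', cacheHeaderFor('profile'))` and the policy can evolve without touching every endpoint.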
Invalidating Browser Cache
You can't directly invalidate browser cache, but you can force revalidation:
1. Change the URL (cache busting):
// Instead of /api/products
// Use /api/products?v=2 when data changes
2. Use versioned assets:
<!-- URL changes when file changes -->
<script src="/bundle.abc123.js"></script>
3. Set no-cache (revalidate every time):
res.setHeader('Cache-Control', 'no-cache'); // Forces revalidation
CDN Caching
CDNs (Content Delivery Networks) like Cloudflare, AWS CloudFront, and Vercel cache responses at edge locations worldwide.
Cloudflare Configuration
// Cloudflare respects Cache-Control by default
export async function GET() {
const data = await getProducts();
return NextResponse.json(data, {
headers: {
'Cache-Control': 'public, max-age=3600',
'Cloudflare-CDN-Cache-Control': 'max-age=7200', // Cloudflare-specific
}
});
}
Cloudflare Page Rules (set in Cloudflare dashboard):
- Cache Level: Standard (HTML not cached) vs Bypass (no cache) vs Cache Everything
- Edge Cache TTL: Override origin Cache-Control
- Browser Cache TTL: Override browser Cache-Control
Vercel Edge Caching
// next.config.js
module.exports = {
async headers() {
return [
{
source: '/api/products',
headers: [
{
key: 'Cache-Control',
value: 'public, s-maxage=3600, stale-while-revalidate=60'
}
]
}
];
}
};
Vercel-specific headers:
- `s-maxage`: How long the Vercel Edge Network caches (overrides `max-age` for the CDN)
- `stale-while-revalidate`: Serve stale while fetching fresh in the background
CDN Cache Purging
When data changes, purge the CDN cache:
Cloudflare:
# Purge everything (use sparingly)
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
-H "Authorization: Bearer {token}" \
-d '{"purge_everything": true}'
# Purge specific URLs
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
-H "Authorization: Bearer {token}" \
-d '{"files": ["https://example.com/api/products/123"]}'
AWS CloudFront:
import { CloudFrontClient, CreateInvalidationCommand } from '@aws-sdk/client-cloudfront';
async function invalidateCDN(paths: string[]) {
const client = new CloudFrontClient({ region: 'us-east-1' });
await client.send(new CreateInvalidationCommand({
DistributionId: process.env.CLOUDFRONT_DISTRIBUTION_ID,
InvalidationBatch: {
CallerReference: Date.now().toString(),
Paths: {
Quantity: paths.length,
Items: paths // ['/api/products/*']
}
}
}));
}
Vercel:
# Vercel automatically purges on deploy
# Or use revalidate in Next.js:
export const revalidate = 3600; // ISR (Incremental Static Regeneration)
Application-Level Caching with Redis
Redis is the gold standard for application-level caching. It's an in-memory data store that's incredibly fast (sub-millisecond latency).
Redis Setup
# Install Redis
npm install ioredis
# Start Redis locally
brew install redis
brew services start redis
# Or use managed Redis (Redis Cloud, AWS ElastiCache, Upstash)
Basic Redis Caching Pattern
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
async function getCachedData<T>(
key: string,
fetchFn: () => Promise<T>,
ttl: number = 3600
): Promise<T> {
// Try to get from cache
const cached = await redis.get(key);
if (cached) {
console.log('Cache hit:', key);
return JSON.parse(cached);
}
// Cache miss - fetch from database
console.log('Cache miss:', key);
const data = await fetchFn();
// Store in cache with TTL
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
// Usage
app.get('/api/products/:id', async (req, res) => {
const productId = req.params.id;
const product = await getCachedData(
`product:${productId}`,
() => db.product.findUnique({ where: { id: productId } }),
3600 // 1 hour TTL
);
res.json(product);
});
Advanced Redis Patterns
1. Cache-Aside (Lazy Loading)
The pattern shown above: check cache first, fetch from database on miss, then populate cache.
Pros: Only cache what's actually requested
Cons: First request is slow (cold cache)
2. Write-Through Cache
Update cache whenever database is updated:
async function updateProduct(id: string, data: Partial<Product>) {
// Update database
const product = await db.product.update({
where: { id },
data
});
// Update cache immediately
await redis.setex(
`product:${id}`,
3600,
JSON.stringify(product)
);
return product;
}
Pros: Cache always fresh
Cons: Wasted writes for rarely-accessed data
3. Write-Behind (Write-Back) Cache
Write to cache first, asynchronously write to database:
async function updateProduct(id: string, data: Partial<Product>) {
// Update cache immediately
const product = { id, ...data, updatedAt: new Date() };
await redis.setex(
`product:${id}`,
3600,
JSON.stringify(product)
);
// Queue database write (async)
await queue.add('updateProduct', { id, data });
return product;
}
Pros: Fastest writes, reduced database load
Cons: Risk of data loss if cache fails before database write
4. Read-Through Cache
Cache handles database fetching transparently:
class ProductCache {
constructor(private redis: Redis, private db: PrismaClient) {}
async get(id: string): Promise<Product> {
const cached = await this.redis.get(`product:${id}`);
if (cached) {
return JSON.parse(cached);
}
// Cache automatically fetches from database
const product = await this.db.product.findUnique({
where: { id }
});
await this.redis.setex(
`product:${id}`,
3600,
JSON.stringify(product)
);
return product;
}
}
// Usage
const productCache = new ProductCache(redis, db);
const product = await productCache.get('123');
Redis Data Structures for Caching
Hash (Key-Value Pairs)
Great for caching objects with multiple fields:
// Store user object as hash
await redis.hset('user:123', {
id: '123',
name: 'John Doe',
email: 'john@example.com'
});
// Get single field
const name = await redis.hget('user:123', 'name');
// Get all fields
const user = await redis.hgetall('user:123');
// Set TTL on the hash
await redis.expire('user:123', 3600);
Sorted Sets (Leaderboards, Rankings)
// Add scores to leaderboard
await redis.zadd('leaderboard', 100, 'user:1', 95, 'user:2', 87, 'user:3');
// Get top 10
const top10 = await redis.zrevrange('leaderboard', 0, 9, 'WITHSCORES');
// Get user rank
const rank = await redis.zrevrank('leaderboard', 'user:1'); // 0 (1st place)
Lists (Recent Activity, Logs)
// Add to recent activity (front of list)
await redis.lpush('recent:user:123', 'viewed_product_456');
// Get last 10 activities
const recent = await redis.lrange('recent:user:123', 0, 9);
// Trim to max 100 items (automatic cleanup)
await redis.ltrim('recent:user:123', 0, 99);
Cache Stampede Prevention
When cache expires under high load, many requests simultaneously fetch from database (stampede).
Solution: Lock-based approach:
async function getCachedDataWithLock<T>(
key: string,
fetchFn: () => Promise<T>,
ttl: number = 3600
): Promise<T> {
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Try to acquire lock
const lockKey = `lock:${key}`;
const lockAcquired = await redis.set(lockKey, '1', 'EX', 10, 'NX');
if (lockAcquired) {
try {
// This request fetches from database
const data = await fetchFn();
await redis.setex(key, ttl, JSON.stringify(data));
return data;
} finally {
// Release lock
await redis.del(lockKey);
}
} else {
// Another request is fetching - wait and retry
await new Promise(resolve => setTimeout(resolve, 100));
return getCachedDataWithLock(key, fetchFn, ttl);
}
}
Alternative: Probabilistic early expiration:
async function getCachedDataWithEarlyExpiration<T>(
key: string,
fetchFn: () => Promise<T>,
ttl: number = 3600
): Promise<T> {
const cached = await redis.get(key);
const ttlRemaining = await redis.ttl(key);
if (cached) {
// Probabilistic early refresh (10% chance in last 10% of TTL)
const refreshThreshold = ttl * 0.1;
const shouldRefresh = ttlRemaining < refreshThreshold && Math.random() < 0.1;
if (shouldRefresh) {
// Async refresh (don't block response)
fetchFn().then(data =>
redis.setex(key, ttl, JSON.stringify(data))
);
}
return JSON.parse(cached);
}
const data = await fetchFn();
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
Database Query Caching
Most databases have built-in query caching.
PostgreSQL Query Cache
PostgreSQL doesn't have query-level caching (by design), but you can:
1. Use materialized views:
-- Create materialized view (pre-computed results)
CREATE MATERIALIZED VIEW popular_products AS
SELECT p.*, COUNT(o.id) as order_count
FROM products p
LEFT JOIN orders o ON o.product_id = p.id
GROUP BY p.id
ORDER BY order_count DESC
LIMIT 100;
-- Refresh periodically (e.g., every hour via cron)
REFRESH MATERIALIZED VIEW popular_products;
2. Use Redis for query result caching (shown in Application-Level Caching section)
MySQL Query Cache
MySQL's query cache was deprecated in 5.7 and removed entirely in 8.0. Even on older versions it was problematic:
Issues:
- Global lock (poor concurrency)
- Invalidated by any write to involved tables
- Difficult to tune
Better approach: Use Redis for application-level caching instead.
MongoDB Query Cache
MongoDB automatically caches frequently-accessed documents in memory.
Tuning:
# Set WiredTiger cache size (default: the larger of 50% of (RAM - 1 GB) or 256 MB)
mongod --wiredTigerCacheSizeGB 4
Best practices:
- Ensure working set fits in cache
- Use indexes to minimize scanned documents
- Monitor cache hit ratio with
db.serverStatus().wiredTiger.cache
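One common way to turn those counters into a hit ratio (an approximation derived from `db.serverStatus().wiredTiger.cache`, not an official single metric) is the share of cache page requests that did not have to be read in from disk:

```typescript
// Counters as reported by db.serverStatus().wiredTiger.cache; the field
// names in that document are longer strings ("pages requested from the
// cache", "pages read into cache") -- shortened here for clarity.
interface WiredTigerCacheCounters {
  pagesRequested: number; // "pages requested from the cache"
  pagesReadIn: number;    // "pages read into cache"
}

// Approximate hit ratio: requests that were served from cache without a
// disk read, as a fraction of all cache page requests.
function wiredTigerHitRatio(c: WiredTigerCacheCounters): number {
  if (c.pagesRequested === 0) return 1; // no traffic yet: nothing missed
  return 1 - c.pagesReadIn / c.pagesRequested;
}
```

A ratio that trends downward usually means the working set no longer fits in the configured cache size.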
Cache Invalidation Strategies
"There are only two hard things in Computer Science: cache invalidation and naming things." – Phil Karlton
Cache invalidation is the process of removing stale data from cache when the underlying data changes.
1. Time-Based Expiration (TTL)
Simplest approach: set a TTL (time-to-live) and let cache expire automatically.
// Cache for 1 hour
await redis.setex('product:123', 3600, JSON.stringify(product));
Pros: Simple, no coordination needed
Cons: Stale data until TTL expires
When to use: Data that's okay to be slightly stale (product descriptions, blog posts)
2. Event-Based Invalidation
Invalidate cache when data changes:
async function updateProduct(id: string, data: Partial<Product>) {
// Update database
const product = await db.product.update({
where: { id },
data
});
// Invalidate cache
await redis.del(`product:${id}`);
// Also invalidate related caches
await redis.del('products:list');
await redis.del(`products:category:${product.categoryId}`);
return product;
}
Pros: Immediate consistency
Cons: More complex, easy to miss invalidation points
When to use: Data that must be immediately fresh (inventory, prices)
3. Tag-Based Invalidation
Tag cache entries with categories, invalidate by tag:
// Store with tags
await redis.set('product:123', JSON.stringify(product));
await redis.sadd('tag:category:electronics', 'product:123');
await redis.sadd('tag:product', 'product:123');
// Invalidate all products in category
const keys = await redis.smembers('tag:category:electronics');
if (keys.length > 0) await redis.del(...keys); // DEL requires at least one key
await redis.del('tag:category:electronics');
Pros: Flexible, granular control
Cons: More storage overhead, complex tracking
When to use: Complex dependencies (e.g., product belongs to multiple categories)
4. Write-Through Invalidation
Update cache immediately when data changes:
async function updateProduct(id: string, data: Partial<Product>) {
const product = await db.product.update({
where: { id },
data
});
// Update cache with new data
await redis.setex(
`product:${id}`,
3600,
JSON.stringify(product)
);
return product;
}
Pros: Cache always fresh, no read after write
Cons: Write overhead for rarely-read data
When to use: Frequently-read data that changes occasionally
5. Lazy Invalidation (Stale-While-Revalidate)
Serve stale data while fetching fresh in background:
async function getCachedDataWithSWR<T>(
key: string,
fetchFn: () => Promise<T>,
ttl: number = 3600,
staleTime: number = 60
): Promise<T> {
const cached = await redis.get(key);
const ttlRemaining = await redis.ttl(key);
if (cached) {
// If within stale-while-revalidate window, refresh in background
if (ttlRemaining < staleTime) {
fetchFn().then(data =>
redis.setex(key, ttl, JSON.stringify(data))
);
}
return JSON.parse(cached);
}
const data = await fetchFn();
await redis.setex(key, ttl, JSON.stringify(data));
return data;
}
Pros: Always fast (serves cached), eventually consistent
Cons: May serve stale data briefly
When to use: Data that changes infrequently but needs fast responses
Invalidation Decision Matrix
| Data Type | Strategy | Reasoning |
|---|---|---|
| Product descriptions | TTL (1 hour) | Changes rarely, okay to be slightly stale |
| Product prices | Event-based | Must be immediately accurate |
| User profile | Write-through | Frequently read after update |
| Search results | TTL (5 min) + Tag-based | Complex dependencies, okay to be slightly stale |
| Real-time metrics | TTL (10 sec) | Constantly changing, short TTL acceptable |
| Static content | TTL (1 year) + Event-based | Never changes, invalidate on rare updates |
Cache Key Design
Good cache keys are:
- Unique: Different data → different keys
- Consistent: Same data → same key
- Readable: Easy to debug
- Hierarchical: Group related keys
Key Naming Conventions
// Good key patterns
'user:{userId}' // User profile
'product:{productId}' // Single product
'products:category:{categoryId}' // Products in category
'cart:{userId}' // User's shopping cart
'session:{sessionId}' // Session data
'leaderboard:global' // Global leaderboard
'api-response:{endpoint}:{hash}' // API response cache
// Bad key patterns (avoid)
'123' // No context
'user-data' // Not unique
'product/123/details' // Inconsistent delimiter
Dynamic Keys with Parameters
function generateCacheKey(
namespace: string,
params: Record<string, any>
): string {
// Sort keys for consistency
const sortedParams = Object.keys(params)
.sort()
.map(key => `${key}:${params[key]}`)
.join(':');
return `${namespace}:${sortedParams}`;
}
// Usage
const key = generateCacheKey('products', {
category: 'electronics',
minPrice: 100,
sort: 'price_asc'
});
// Result: "products:category:electronics:minPrice:100:sort:price_asc"
Hash-Based Keys (for Complex Queries)
import { createHash } from 'crypto';
function generateQueryCacheKey(
sql: string,
params: any[]
): string {
const queryString = `${sql}:${JSON.stringify(params)}`;
const hash = createHash('md5').update(queryString).digest('hex');
return `query:${hash}`;
}
// Usage
const key = generateQueryCacheKey(
'SELECT * FROM products WHERE category = ? AND price > ?',
['electronics', 100]
);
// Result: "query:a3f7b2c8d1e4f5a6b7c8d9e0f1a2b3c4"
Namespacing with Prefixes
// Version your cache schema
const CACHE_VERSION = 'v1';
function getCacheKey(type: string, id: string): string {
return `${CACHE_VERSION}:${type}:${id}`;
}
// When schema changes, bump version (auto-invalidates old cache)
// const CACHE_VERSION = 'v2';
Monitoring & Debugging
Track cache performance to optimize hit rates and identify bottlenecks.
Cache Metrics to Monitor
import { performance } from 'perf_hooks';
class CacheMonitor {
private hits = 0;
private misses = 0;
private errors = 0;
async get<T>(key: string, fetchFn: () => Promise<T>): Promise<T> {
const start = performance.now();
try {
const cached = await redis.get(key);
if (cached) {
this.hits++;
console.log(`Cache hit: ${key} (${performance.now() - start}ms)`);
return JSON.parse(cached);
}
this.misses++;
const data = await fetchFn();
await redis.setex(key, 3600, JSON.stringify(data));
console.log(`Cache miss: ${key} (${performance.now() - start}ms)`);
return data;
} catch (error) {
this.errors++;
console.error(`Cache error: ${key}`, error);
// Fallback to database on cache failure
return fetchFn();
}
}
getStats() {
const total = this.hits + this.misses;
return {
hits: this.hits,
misses: this.misses,
errors: this.errors,
hitRate: total > 0 ? (this.hits / total * 100).toFixed(2) : 0
};
}
}
Redis Monitoring Commands
# Check memory usage
redis-cli INFO memory
# Monitor all commands in real-time
redis-cli MONITOR
# Get slow queries
redis-cli SLOWLOG GET 10
# Check keyspace statistics
redis-cli INFO keyspace
# Memory analysis by key pattern
redis-cli --bigkeys
# Get cache hit rate
redis-cli INFO stats | grep keyspace
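`INFO stats` reports `keyspace_hits` and `keyspace_misses` as raw counters, so the hit rate has to be derived. A small parser for that output (the helper name is illustrative):

```typescript
// Parse the INFO stats text and derive hit rate = hits / (hits + misses).
// Returns null when the counters are absent or there has been no traffic.
function redisHitRate(infoStats: string): number | null {
  const grab = (field: string): number | null => {
    const m = infoStats.match(new RegExp(`^${field}:(\\d+)`, 'm'));
    return m ? Number(m[1]) : null;
  };
  const hits = grab('keyspace_hits');
  const misses = grab('keyspace_misses');
  if (hits === null || misses === null || hits + misses === 0) return null;
  return hits / (hits + misses);
}
```

Since these are cumulative counters since server start, sample them periodically and compute the rate over deltas if you want a recent-window hit rate rather than a lifetime average.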
Logging Cache Performance
import winston from 'winston';
const logger = winston.createLogger({
format: winston.format.json(),
transports: [new winston.transports.Console()]
});
async function getCachedDataWithLogging<T>(
key: string,
fetchFn: () => Promise<T>
): Promise<T> {
const start = Date.now();
const cached = await redis.get(key);
const cacheLatency = Date.now() - start;
if (cached) {
logger.info('cache_hit', {
key,
latency: cacheLatency
});
return JSON.parse(cached);
}
logger.info('cache_miss', { key });
const fetchStart = Date.now();
const data = await fetchFn();
const fetchLatency = Date.now() - fetchStart;
logger.info('cache_populated', {
key,
fetchLatency
});
await redis.setex(key, 3600, JSON.stringify(data));
return data;
}
Alerting Thresholds
Set up alerts for:
- Low hit rate (<70% for frequently-accessed data)
- High memory usage (>80% Redis memory)
- Slow cache operations (>10ms p99 latency)
- High error rate (>1% cache failures)
Example with Datadog:
import { StatsD } from 'hot-shots';
const statsd = new StatsD({
host: 'localhost',
port: 8125,
prefix: 'cache.'
});
async function getCachedDataWithMetrics<T>(
key: string,
fetchFn: () => Promise<T>
): Promise<T> {
const start = Date.now();
const cached = await redis.get(key);
if (cached) {
statsd.increment('hits');
statsd.timing('latency', Date.now() - start);
return JSON.parse(cached);
}
statsd.increment('misses');
const data = await fetchFn();
await redis.setex(key, 3600, JSON.stringify(data));
return data;
}
Real-World Examples
Stripe API Caching
Stripe leans on conditional requests with ETags rather than time-based caching:
GET /v1/customers/cus_123
Authorization: Bearer sk_test_...
HTTP/1.1 200 OK
Cache-Control: no-cache
ETag: "a1b2c3d4"
{
"id": "cus_123",
"name": "John Doe"
}
On subsequent request:
GET /v1/customers/cus_123
Authorization: Bearer sk_test_...
If-None-Match: "a1b2c3d4"
HTTP/1.1 304 Not Modified
Key techniques:
- ETags for versioning
- `no-cache` to force revalidation (ensures freshness)
- 304 responses save bandwidth
GitHub API Caching
GitHub uses conditional requests + rate limit windows:
GET /repos/facebook/react
Authorization: token ghp_...
HTTP/1.1 200 OK
Cache-Control: public, max-age=60, s-maxage=60
ETag: "abc123"
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4999
X-RateLimit-Reset: 1678886400
Key techniques:
- 60-second cache for public endpoints
- ETags for conditional requests
- Rate limit headers to encourage caching
AWS API Gateway Caching
AWS API Gateway offers built-in caching per stage:
// Enable caching in API Gateway
{
"cacheClusterEnabled": true,
"cacheClusterSize": "0.5", // 0.5 GB cache
"methodSettings": {
"*/*/GET": {
"cachingEnabled": true,
"cacheTtlInSeconds": 300,
"cacheDataEncrypted": true
}
}
}
Key techniques:
- Automatic caching at API Gateway level (no application code)
- Cache key based on request parameters
- Per-method cache configuration
Shopify API Caching
Shopify uses pagination cursors + cache warming:
GET /admin/api/2023-01/products.json?limit=250
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store
Link: <...&page_info=abc123>; rel="next"
X-Shopify-Shop-Api-Call-Limit: 40/40
Key techniques:
- `no-cache` to prevent stale inventory data
- Pagination cursors (stable across cache invalidations)
- Rate limit headers encourage client-side caching
Common Mistakes
1. Caching Without TTL
Problem: Cache entries never expire, causing stale data and memory bloat.
// ❌ Bad: No expiration
await redis.set('product:123', JSON.stringify(product));
// ✅ Good: Always set a TTL
await redis.setex('product:123', 3600, JSON.stringify(product));
2. Caching User-Specific Data Globally
Problem: Leaking user data across requests.
// ❌ Bad: All users see the same cart
await redis.set('cart', JSON.stringify(cartItems));
// ✅ Good: Include the user ID in the key
await redis.set(`cart:${userId}`, JSON.stringify(cartItems));
3. Not Handling Cache Failures
Problem: Application crashes when cache is unavailable.
// ❌ Bad: No fallback
const product = JSON.parse(await redis.get('product:123'));
// ✅ Good: Graceful degradation
let product;
try {
const cached = await redis.get('product:123');
product = cached ? JSON.parse(cached) : await db.product.findUnique(...);
} catch (error) {
console.error('Cache error, falling back to database:', error);
product = await db.product.findUnique(...);
}
4. Cache Stampede
Problem: Many requests fetch simultaneously when cache expires.
// ❌ Bad: Thundering herd
const cached = await redis.get(key);
if (!cached) {
// 1000 requests all fetch from database at once
const data = await expensiveDatabaseQuery();
await redis.set(key, JSON.stringify(data));
return data;
}
// ✅ Good: Use locking (see Cache Stampede Prevention section)
5. Caching Errors
Problem: Error responses get cached, breaking application.
// ❌ Bad: Cache errors
const data = await fetchAPI().catch(err => ({ error: err.message }));
await redis.set(key, JSON.stringify(data)); // Caches error!
// ✅ Good: Only cache successful responses
const data = await fetchAPI();
if (data && !data.error) {
await redis.set(key, JSON.stringify(data));
}
6. Over-Caching
Problem: Caching data that changes frequently or is rarely accessed.
// ❌ Bad: Cache real-time stock prices for 1 hour
await redis.setex('stock:AAPL', 3600, price);
// ✅ Good: Short TTL or no cache for real-time data
await redis.setex('stock:AAPL', 10, price); // 10 seconds
7. Inconsistent Cache Keys
Problem: Same data cached under multiple keys.
// ❌ Bad: Different keys for the same data
await redis.set('product_123', ...);
await redis.set('product-123', ...);
await redis.set('products/123', ...);
// ✅ Good: Consistent naming convention
await redis.set('product:123', ...);
8. Not Monitoring Cache Performance
Problem: Can't optimize what you don't measure.
// ❌ Bad: No metrics
const cached = await redis.get(key);
// ✅ Good: Track hit/miss rates, latency, memory usage
statsd.increment(cached ? 'cache.hit' : 'cache.miss');
9. Caching Sensitive Data
Problem: Exposing payment info, passwords, personal data.
// ❌ Bad: Cache sensitive data
await redis.set('payment:123', JSON.stringify(paymentDetails));
// ✅ Good: Never cache sensitive data
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
10. Ignoring Cache Warming
Problem: Cold cache causes slow responses after deploy.
// ❌ Bad: Deploy and hope the cache fills naturally
// ✅ Good: Warm the cache after deploy
async function warmCache() {
const popularProducts = await db.product.findMany({
where: { viewCount: { gt: 1000 } }
});
for (const product of popularProducts) {
await redis.setex(
`product:${product.id}`,
3600,
JSON.stringify(product)
);
}
}
Production Checklist
Before deploying caching to production:
HTTP Caching
- `Cache-Control` headers set on all endpoints
- ETags implemented for dynamic content
- Different TTLs for public vs private data
- `no-cache` for sensitive data
- `stale-while-revalidate` for performance
CDN Configuration
- CDN caching enabled for public endpoints
- Cache purging strategy documented
- CDN-specific headers configured
- Cache warming after deploys
Redis Setup
- Redis connection pooling configured
- TTL set on all cache entries
- Consistent key naming convention
- Cache stampede prevention implemented
- Graceful degradation on Redis failure
- Redis monitoring enabled
Cache Invalidation
- Invalidation strategy documented
- Event-based invalidation for critical data
- Tag-based invalidation for complex dependencies
- Version cache keys for schema changes
Monitoring
- Cache hit/miss rate tracking
- Latency monitoring (p50, p99)
- Memory usage alerts
- Error rate tracking
- Slow query logging
Testing
- Cache hit scenarios tested
- Cache miss scenarios tested
- Cache failure scenarios tested (Redis down)
- Stampede scenarios tested (load testing)
- Invalidation tested (data consistency)
Documentation
- Cache architecture documented
- Key naming conventions documented
- TTL values documented with reasoning
- Invalidation strategy documented
- Runbook for cache issues
Conclusion
Effective API caching is a multi-layered strategy combining HTTP headers, CDN configuration, Redis patterns, and database optimization. The key principles:
- Layer your cache: Browser → CDN → Application → Database
- Choose appropriate TTLs: Balance freshness vs performance
- Invalidate proactively: Don't rely solely on expiration
- Monitor relentlessly: Track hit rates, latency, memory
- Handle failures gracefully: Always have a fallback
- Test under load: Prevent cache stampedes
- Document everything: Future you will thank you
A well-implemented caching strategy can:
- Reduce response times by 100x
- Cut infrastructure costs by 80%
- Handle 10x more traffic
- Improve reliability during outages
Start with simple TTL-based caching, measure performance, then optimize based on real-world usage patterns.
Related Guides
- API Rate Limiting: Complete Implementation Guide
- API Error Handling: Production-Ready Patterns
- Webhook Implementation: Build Reliable Event-Driven APIs
- API Testing: Complete Guide for Production APIs
- Circuit Breaker Pattern: Resilient API Integrations
Status Pages for Caching-Related Services
Monitor the uptime of caching and CDN services:
- Cloudflare Status – Edge CDN and caching
- AWS CloudFront Status – AWS CDN service
- Vercel Status – Edge network and caching
- Redis Cloud Status – Managed Redis service
- AWS ElastiCache Status – AWS Redis/Memcached
- Datadog Status – Monitoring and metrics
- New Relic Status – Application performance monitoring
Last updated: March 11, 2026