API Caching Strategies: Complete Guide to Faster, Cheaper APIs

by API Status Check Team


Caching is the single most effective technique for improving API performance and reducing infrastructure costs. A well-implemented caching strategy can reduce response times from seconds to milliseconds, cut database load by 90%+, and handle 10x more traffic with the same infrastructure.

This guide covers production-ready caching patterns used by major APIs like Stripe, GitHub, and AWS, including HTTP caching headers, Redis patterns, CDN configuration, cache invalidation strategies, and common mistakes to avoid.


Why API Caching Matters

Performance improvements:

  • Response times drop from 500ms to 5ms (100x faster)
  • 90% reduction in database queries
  • Handle 10x more requests per server

Cost savings:

  • Reduce server count by 50-80%
  • Lower database read units (AWS RDS, DynamoDB)
  • Reduced API costs for third-party services

Better user experience:

  • Instant page loads (perceived performance)
  • Reduced time-to-interactive (TTI)
  • Lower bounce rates

Infrastructure resilience:

  • Continue serving cached data during database outages
  • Graceful degradation under load
  • Protection against traffic spikes

Real-world impact example: A typical e-commerce API serving product catalog data sees:

  • Without caching: 500ms response time, 1,000 req/sec capacity, $5,000/month infrastructure
  • With caching: 10ms response time, 10,000 req/sec capacity, $1,000/month infrastructure

That's 50x faster response times and 80% cost reduction.

The Four Layers of API Caching

Effective caching happens at multiple layers, each with different characteristics:

1. Browser Cache (Client-Side)

  • Location: User's browser
  • Scope: Single user
  • TTL: Minutes to days
  • Best for: Static assets (images, CSS, JS), public API responses
  • Control: HTTP headers (Cache-Control, ETag)

2. CDN Cache (Edge Network)

  • Location: Global edge servers (Cloudflare, AWS CloudFront)
  • Scope: All users in a region
  • TTL: Minutes to hours
  • Best for: Public API endpoints, GET requests, high-traffic routes
  • Control: HTTP headers + CDN config

3. Application Cache (Server-Side)

  • Location: In-memory store (Redis, Memcached)
  • Scope: All users (shared cache)
  • TTL: Seconds to hours
  • Best for: Database query results, computed values, session data
  • Control: Application code

4. Database Cache (Query Cache)

  • Location: Database server memory
  • Scope: Database-level
  • TTL: Automatic (LRU eviction)
  • Best for: Frequently-run queries
  • Control: Database configuration

Layering strategy:

  1. Browser cache serves repeat requests (same user)
  2. CDN cache serves geographically distributed users
  3. Application cache reduces database load
  4. Database cache optimizes query execution
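
The layered lookup above can be sketched as a chain of caches consulted in order. This is a toy illustration, not production code: plain `Map`s stand in for the browser, CDN, and Redis tiers, and `origin` stands in for the database.

```typescript
// Sketch: each cache layer is tried in order; the first hit wins.
// Maps stand in for the browser, CDN, and Redis tiers (all hypothetical).
type Layer = { name: string; store: Map<string, string> };

function lookup(
  layers: Layer[],
  key: string,
  origin: () => string // the database, consulted only if every layer misses
): { value: string; servedBy: string } {
  for (const layer of layers) {
    const hit = layer.store.get(key);
    if (hit !== undefined) return { value: hit, servedBy: layer.name };
  }
  const value = origin();
  // Populate every layer on the way back so the next request hits earlier.
  for (const layer of layers) layer.store.set(key, value);
  return { value, servedBy: 'origin' };
}
```

The key property to notice: after one miss, every subsequent request for the same key is served by the outermost layer, and the database is never touched again until the entry expires.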

HTTP Caching Headers

HTTP caching headers tell browsers and CDNs how to cache responses. These are the foundation of any caching strategy.

Cache-Control Header

The Cache-Control header controls caching behavior:

// Example: Cache for 1 hour (public, shareable)
res.setHeader('Cache-Control', 'public, max-age=3600');

// Example: Cache for 5 minutes (private, user-specific)
res.setHeader('Cache-Control', 'private, max-age=300');

// Example: Never cache (always revalidate)
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');

Common directives:

| Directive | Meaning | Use Case |
| --- | --- | --- |
| `public` | Can be cached by browsers + CDNs | Public data (product catalog) |
| `private` | Only browser can cache (not CDNs) | User-specific data (profile) |
| `max-age=3600` | Cache valid for 3600 seconds | Time-based expiration |
| `s-maxage=7200` | CDN-specific max-age (overrides max-age for shared caches) | Different TTL for CDN vs browser |
| `no-cache` | Must revalidate with server before using cached copy | Ensure freshness |
| `no-store` | Never cache (sensitive data) | Payment info, personal data |
| `must-revalidate` | Once expired, must revalidate (can't serve stale) | Critical data accuracy |
| `stale-while-revalidate=60` | Serve stale content for 60s while fetching fresh data | Performance + freshness balance |
| `stale-if-error=86400` | Serve stale content for 24h if origin is down | Resilience during outages |
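
To see how a browser and a shared cache read the same header differently, here is a minimal (non-production) parser that derives each cache's effective TTL. It only handles the directives from the table above; real caches apply more rules.

```typescript
// Sketch: how a browser vs a shared cache (CDN) derives its TTL from the
// same Cache-Control header. s-maxage applies only to shared caches.
function effectiveTtl(header: string, sharedCache: boolean): number {
  const directives = new Map<string, string>();
  for (const part of header.split(',')) {
    const [k, v] = part.trim().split('=');
    directives.set(k.toLowerCase(), v ?? '');
  }
  if (directives.has('no-store')) return 0; // never cached anywhere
  if (sharedCache && directives.has('s-maxage')) {
    return Number(directives.get('s-maxage'));
  }
  return directives.has('max-age') ? Number(directives.get('max-age')) : 0;
}
```

With `public, max-age=3600, s-maxage=7200`, a browser caches for an hour while the CDN caches for two, which is the usual reason to set both.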

ETag (Entity Tag)

An ETag is a unique identifier for a specific version of a resource. When the resource changes, the ETag changes.

How it works:

  1. Server generates ETag (hash of content)
  2. Client stores ETag with cached response
  3. On subsequent request, client sends If-None-Match: <etag>
  4. Server compares ETag:
    • If unchanged: Return 304 Not Modified (no body, save bandwidth)
    • If changed: Return 200 OK with new content + new ETag

Implementation:

import { createHash } from 'crypto';

function generateETag(content: string): string {
  return createHash('md5').update(content).digest('hex');
}

app.get('/api/product/:id', async (req, res) => {
  const product = await db.product.findUnique({
    where: { id: req.params.id }
  });

  const content = JSON.stringify(product);
  const etag = generateETag(content);

  // Check if client has current version
  if (req.headers['if-none-match'] === etag) {
    return res.status(304).end(); // Not modified
  }

  res.setHeader('ETag', etag);
  res.setHeader('Cache-Control', 'private, max-age=300');
  res.json(product);
});

Benefits:

  • Save bandwidth (304 responses have no body)
  • Guaranteed freshness (client always has latest version)
  • Works with dynamic content

When to use:

  • User-specific data that changes frequently
  • Resources where staleness is unacceptable
  • When you need conditional requests

Last-Modified Header

Similar to ETag but uses timestamps instead of content hashes:

app.get('/api/posts/:id', async (req, res) => {
  const post = await db.post.findUnique({
    where: { id: req.params.id },
    select: { id: true, title: true, content: true, updatedAt: true }
  });

  const lastModified = post.updatedAt.toUTCString();

  // Check if client has current version
  if (req.headers['if-modified-since'] === lastModified) {
    return res.status(304).end();
  }

  res.setHeader('Last-Modified', lastModified);
  res.setHeader('Cache-Control', 'public, max-age=600');
  res.json(post);
});

ETag vs Last-Modified:

  • ETag: More accurate (detects any change), higher CPU cost (hashing)
  • Last-Modified: Less accurate (only 1-second precision), lower CPU cost

Use Last-Modified when:

  • Resources have reliable updatedAt timestamps
  • 1-second precision is acceptable
  • You want to minimize CPU overhead

Use ETag when:

  • Content can change multiple times per second
  • You need guaranteed freshness
  • Resources don't have timestamps

Browser Caching

Browser caching is the first line of defense against unnecessary network requests.

Setting Up Browser Cache

// routes/api/public-data.ts
import { NextResponse } from 'next/server';

export async function GET() {
  const data = await fetchPublicData();

  return NextResponse.json(data, {
    headers: {
      'Cache-Control': 'public, max-age=3600, stale-while-revalidate=60',
      'CDN-Cache-Control': 'public, max-age=7200', // Vercel/Cloudflare specific
    }
  });
}

Cache Strategy by Endpoint Type

| Endpoint Type | Cache-Control | Reasoning |
| --- | --- | --- |
| Static content (logo, images) | `public, max-age=31536000, immutable` | Never changes (use versioned URLs) |
| Public product catalog | `public, max-age=3600, stale-while-revalidate=60` | Shared, updates hourly |
| User profile | `private, max-age=300` | User-specific, updates occasionally |
| Real-time data (stock prices) | `private, max-age=10` | Needs to be fresh |
| Sensitive data (payment) | `no-cache, no-store, must-revalidate` | Never cache |
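
One way to keep these headers consistent across handlers is to express the table as a route policy map. The category names here are illustrative, not a standard:

```typescript
// Sketch: the endpoint-type table as a policy map consulted by handlers.
const CACHE_POLICIES: Record<string, string> = {
  static: 'public, max-age=31536000, immutable',
  catalog: 'public, max-age=3600, stale-while-revalidate=60',
  profile: 'private, max-age=300',
  realtime: 'private, max-age=10',
  sensitive: 'no-cache, no-store, must-revalidate',
};

function cacheControlFor(endpointType: string): string {
  // Default to the safest policy when a route is uncategorized.
  return CACHE_POLICIES[endpointType] ?? CACHE_POLICIES.sensitive;
}
```

A handler would then call something like `res.setHeader('Cache-Control', cacheControlFor('profile'))`, so changing a TTL is a one-line edit instead of a hunt through every route.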

Invalidating Browser Cache

You can't directly invalidate browser cache, but you can force revalidation:

1. Change the URL (cache busting):

// Instead of /api/products
// Use /api/products?v=2 when data changes

2. Use versioned assets:

<!-- URL changes when file changes -->
<script src="/bundle.abc123.js"></script>

3. Set no-cache (revalidate every time):

res.setHeader('Cache-Control', 'no-cache'); // Forces revalidation

CDN Caching

CDNs (Content Delivery Networks) like Cloudflare, AWS CloudFront, and Vercel cache responses at edge locations worldwide.

Cloudflare Configuration

// Cloudflare respects Cache-Control by default
export async function GET() {
  const data = await getProducts();

  return NextResponse.json(data, {
    headers: {
      'Cache-Control': 'public, max-age=3600',
      'Cloudflare-CDN-Cache-Control': 'max-age=7200', // Cloudflare-specific
    }
  });
}

Cloudflare Page Rules (set in Cloudflare dashboard):

  • Cache Level: Standard (HTML not cached) vs Bypass (no cache) vs Cache Everything
  • Edge Cache TTL: Override origin Cache-Control
  • Browser Cache TTL: Override browser Cache-Control

Vercel Edge Caching

// next.config.js
module.exports = {
  async headers() {
    return [
      {
        source: '/api/products',
        headers: [
          {
            key: 'Cache-Control',
            value: 'public, s-maxage=3600, stale-while-revalidate=60'
          }
        ]
      }
    ];
  }
};

Vercel-specific headers:

  • s-maxage: How long Vercel Edge Network caches (overrides max-age for CDN)
  • stale-while-revalidate: Serve stale while fetching fresh in background

CDN Cache Purging

When data changes, purge the CDN cache:

Cloudflare:

# Purge everything (use sparingly)
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {token}" \
  -d '{"purge_everything": true}'

# Purge specific URLs
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {token}" \
  -d '{"files": ["https://example.com/api/products/123"]}'

AWS CloudFront:

import { CloudFrontClient, CreateInvalidationCommand } from '@aws-sdk/client-cloudfront';

async function invalidateCDN(paths: string[]) {
  const client = new CloudFrontClient({ region: 'us-east-1' });
  
  await client.send(new CreateInvalidationCommand({
    DistributionId: process.env.CLOUDFRONT_DISTRIBUTION_ID,
    InvalidationBatch: {
      CallerReference: Date.now().toString(),
      Paths: {
        Quantity: paths.length,
        Items: paths // ['/api/products/*']
      }
    }
  }));
}

Vercel:

// Vercel automatically purges on deploy
// Or use revalidate in Next.js:
export const revalidate = 3600; // ISR (Incremental Static Regeneration)

Application-Level Caching with Redis

Redis is the gold standard for application-level caching. It's an in-memory data store that's incredibly fast (sub-millisecond latency).

Redis Setup

# Install the Redis client for Node.js
npm install ioredis

# Start Redis locally
brew install redis
brew services start redis

# Or use managed Redis (Redis Cloud, AWS ElastiCache, Upstash)

Basic Redis Caching Pattern

import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

async function getCachedData<T>(
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number = 3600
): Promise<T> {
  // Try to get from cache
  const cached = await redis.get(key);
  
  if (cached) {
    console.log('Cache hit:', key);
    return JSON.parse(cached);
  }

  // Cache miss - fetch from database
  console.log('Cache miss:', key);
  const data = await fetchFn();

  // Store in cache with TTL
  await redis.setex(key, ttl, JSON.stringify(data));

  return data;
}

// Usage
app.get('/api/products/:id', async (req, res) => {
  const productId = req.params.id;

  const product = await getCachedData(
    `product:${productId}`,
    () => db.product.findUnique({ where: { id: productId } }),
    3600 // 1 hour TTL
  );

  res.json(product);
});

Advanced Redis Patterns

1. Cache-Aside (Lazy Loading)

The pattern shown above: check cache first, fetch from database on miss, then populate cache.

Pros: Only cache what's actually requested
Cons: First request is slow (cold cache)

2. Write-Through Cache

Update cache whenever database is updated:

async function updateProduct(id: string, data: Partial<Product>) {
  // Update database
  const product = await db.product.update({
    where: { id },
    data
  });

  // Update cache immediately
  await redis.setex(
    `product:${id}`,
    3600,
    JSON.stringify(product)
  );

  return product;
}

Pros: Cache always fresh
Cons: Wasted writes for rarely-accessed data

3. Write-Behind (Write-Back) Cache

Write to cache first, asynchronously write to database:

async function updateProduct(id: string, data: Partial<Product>) {
  // Update cache immediately
  const product = { id, ...data, updatedAt: new Date() };
  await redis.setex(
    `product:${id}`,
    3600,
    JSON.stringify(product)
  );

  // Queue database write (async)
  await queue.add('updateProduct', { id, data });

  return product;
}

Pros: Fastest writes, reduced database load
Cons: Risk of data loss if cache fails before database write

4. Read-Through Cache

Cache handles database fetching transparently:

class ProductCache {
  constructor(private redis: Redis, private db: PrismaClient) {}

  async get(id: string): Promise<Product> {
    const cached = await this.redis.get(`product:${id}`);
    
    if (cached) {
      return JSON.parse(cached);
    }

    // Cache automatically fetches from database
    const product = await this.db.product.findUnique({
      where: { id }
    });

    await this.redis.setex(
      `product:${id}`,
      3600,
      JSON.stringify(product)
    );

    return product;
  }
}

// Usage
const productCache = new ProductCache(redis, db);
const product = await productCache.get('123');

Redis Data Structures for Caching

Hash (Key-Value Pairs)

Great for caching objects with multiple fields:

// Store user object as hash
await redis.hset('user:123', {
  id: '123',
  name: 'John Doe',
  email: 'john@example.com'
});

// Get single field
const name = await redis.hget('user:123', 'name');

// Get all fields
const user = await redis.hgetall('user:123');

// Set TTL on the hash
await redis.expire('user:123', 3600);

Sorted Sets (Leaderboards, Rankings)

// Add scores to leaderboard
await redis.zadd('leaderboard', 100, 'user:1', 95, 'user:2', 87, 'user:3');

// Get top 10
const top10 = await redis.zrevrange('leaderboard', 0, 9, 'WITHSCORES');

// Get user rank
const rank = await redis.zrevrank('leaderboard', 'user:1'); // 0 (1st place)

Lists (Recent Activity, Logs)

// Add to recent activity (front of list)
await redis.lpush('recent:user:123', 'viewed_product_456');

// Get last 10 activities
const recent = await redis.lrange('recent:user:123', 0, 9);

// Trim to max 100 items (automatic cleanup)
await redis.ltrim('recent:user:123', 0, 99);

Cache Stampede Prevention

When cache expires under high load, many requests simultaneously fetch from database (stampede).

Solution: Lock-based approach:

async function getCachedDataWithLock<T>(
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number = 3600
): Promise<T> {
  const cached = await redis.get(key);
  
  if (cached) {
    return JSON.parse(cached);
  }

  // Try to acquire lock
  const lockKey = `lock:${key}`;
  const lockAcquired = await redis.set(lockKey, '1', 'EX', 10, 'NX');

  if (lockAcquired) {
    try {
      // This request fetches from database
      const data = await fetchFn();
      await redis.setex(key, ttl, JSON.stringify(data));
      return data;
    } finally {
      // Release lock
      await redis.del(lockKey);
    }
  } else {
    // Another request is fetching - wait and retry
    await new Promise(resolve => setTimeout(resolve, 100));
    return getCachedDataWithLock(key, fetchFn, ttl);
  }
}

Alternative: Probabilistic early expiration:

async function getCachedDataWithEarlyExpiration<T>(
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number = 3600
): Promise<T> {
  const cached = await redis.get(key);
  const ttlRemaining = await redis.ttl(key);

  if (cached) {
    // Probabilistic early refresh (10% chance in last 10% of TTL)
    const refreshThreshold = ttl * 0.1;
    const shouldRefresh = ttlRemaining < refreshThreshold && Math.random() < 0.1;

    if (shouldRefresh) {
      // Async refresh (don't block response)
      fetchFn().then(data =>
        redis.setex(key, ttl, JSON.stringify(data))
      );
    }

    return JSON.parse(cached);
  }

  const data = await fetchFn();
  await redis.setex(key, ttl, JSON.stringify(data));
  return data;
}

Database Query Caching

Most databases have built-in query caching.

PostgreSQL Query Cache

PostgreSQL doesn't cache query results (it caches data pages in shared_buffers, by design), but you can:

1. Use materialized views:

-- Create materialized view (pre-computed results)
CREATE MATERIALIZED VIEW popular_products AS
SELECT p.*, COUNT(o.id) as order_count
FROM products p
LEFT JOIN orders o ON o.product_id = p.id
GROUP BY p.id
ORDER BY order_count DESC
LIMIT 100;

-- Refresh periodically (e.g., every hour via cron)
REFRESH MATERIALIZED VIEW popular_products;

2. Use Redis for query result caching (shown in Application-Level Caching section)
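
The periodic refresh in option 1 is usually driven from application code or cron. A sketch of building the statement safely before handing it to your database client; the helper name is ours, and `CONCURRENTLY` (a real PostgreSQL option) avoids blocking readers during the refresh but requires a unique index on the view:

```typescript
// Sketch: build a REFRESH MATERIALIZED VIEW statement for a scheduled job.
// CONCURRENTLY lets reads continue during refresh (needs a unique index).
function refreshStatement(view: string, concurrently = true): string {
  // Allow only simple identifiers to avoid SQL injection via the view name.
  if (!/^[a-z_][a-z0-9_]*$/i.test(view)) {
    throw new Error(`invalid view name: ${view}`);
  }
  return `REFRESH MATERIALIZED VIEW ${concurrently ? 'CONCURRENTLY ' : ''}${view};`;
}
```

A scheduler (system cron, node-cron, or a queue worker) would execute `refreshStatement('popular_products')` against the database once an hour.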

MySQL Query Cache

MySQL had a query cache (deprecated in 5.7 and removed entirely in 8.0), but it was problematic:

Issues:

  • Global lock (poor concurrency)
  • Invalidated by any write to involved tables
  • Difficult to tune

Better approach: Use Redis for application-level caching instead.

MongoDB Query Cache

MongoDB automatically caches frequently-accessed documents in memory.

Tuning:

# Set WiredTiger cache size (default: 50% of RAM - 1GB)
mongod --wiredTigerCacheSizeGB 4

Best practices:

  • Ensure working set fits in cache
  • Use indexes to minimize scanned documents
  • Monitor cache hit ratio with db.serverStatus().wiredTiger.cache

Cache Invalidation Strategies

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Cache invalidation is the process of removing stale data from cache when the underlying data changes.

1. Time-Based Expiration (TTL)

Simplest approach: set a TTL (time-to-live) and let cache expire automatically.

// Cache for 1 hour
await redis.setex('product:123', 3600, JSON.stringify(product));

Pros: Simple, no coordination needed
Cons: Stale data until TTL expires

When to use: Data that's okay to be slightly stale (product descriptions, blog posts)

2. Event-Based Invalidation

Invalidate cache when data changes:

async function updateProduct(id: string, data: Partial<Product>) {
  // Update database
  const product = await db.product.update({
    where: { id },
    data
  });

  // Invalidate cache
  await redis.del(`product:${id}`);

  // Also invalidate related caches
  await redis.del('products:list');
  await redis.del(`products:category:${product.categoryId}`);

  return product;
}

Pros: Immediate consistency
Cons: More complex, easy to miss invalidation points

When to use: Data that must be immediately fresh (inventory, prices)

3. Tag-Based Invalidation

Tag cache entries with categories, invalidate by tag:

// Store with tags
await redis.set('product:123', JSON.stringify(product));
await redis.sadd('tag:category:electronics', 'product:123');
await redis.sadd('tag:product', 'product:123');

// Invalidate all products in category
const keys = await redis.smembers('tag:category:electronics');
if (keys.length > 0) {
  await redis.del(...keys); // DEL with zero arguments would throw
}
await redis.del('tag:category:electronics');

Pros: Flexible, granular control
Cons: More storage overhead, complex tracking

When to use: Complex dependencies (e.g., product belongs to multiple categories)

4. Write-Through Invalidation

Update cache immediately when data changes:

async function updateProduct(id: string, data: Partial<Product>) {
  const product = await db.product.update({
    where: { id },
    data
  });

  // Update cache with new data
  await redis.setex(
    `product:${id}`,
    3600,
    JSON.stringify(product)
  );

  return product;
}

Pros: Cache always fresh, no read after write
Cons: Write overhead for rarely-read data

When to use: Frequently-read data that changes occasionally

5. Lazy Invalidation (Stale-While-Revalidate)

Serve stale data while fetching fresh in background:

async function getCachedDataWithSWR<T>(
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number = 3600,
  staleTime: number = 60
): Promise<T> {
  const cached = await redis.get(key);
  const ttlRemaining = await redis.ttl(key);

  if (cached) {
    // If within stale-while-revalidate window, refresh in background
    if (ttlRemaining < staleTime) {
      fetchFn().then(data =>
        redis.setex(key, ttl, JSON.stringify(data))
      );
    }

    return JSON.parse(cached);
  }

  const data = await fetchFn();
  await redis.setex(key, ttl, JSON.stringify(data));
  return data;
}

Pros: Always fast (serves cached), eventually consistent
Cons: May serve stale data briefly

When to use: Data that changes infrequently but needs fast responses

Invalidation Decision Matrix

| Data Type | Strategy | Reasoning |
| --- | --- | --- |
| Product descriptions | TTL (1 hour) | Changes rarely, okay to be slightly stale |
| Product prices | Event-based | Must be immediately accurate |
| User profile | Write-through | Frequently read after update |
| Search results | TTL (5 min) + Tag-based | Complex dependencies, okay to be slightly stale |
| Real-time metrics | TTL (10 sec) | Constantly changing, short TTL acceptable |
| Static content | TTL (1 year) + Event-based | Never changes, invalidate on rare updates |
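
These decisions are easiest to audit when they live in one place rather than scattered across handlers. A sketch of the matrix as a config object the cache helpers can consult; the entry names and fallback values are illustrative:

```typescript
// Sketch: the invalidation matrix as data. Helpers look up TTL and
// strategy per data type instead of hard-coding them at each call site.
type Invalidation = 'ttl' | 'event' | 'write-through' | 'tag';

const INVALIDATION_MATRIX: Record<string, { ttl: number; strategy: Invalidation }> = {
  productDescription: { ttl: 3600, strategy: 'ttl' },
  productPrice: { ttl: 3600, strategy: 'event' },
  userProfile: { ttl: 300, strategy: 'write-through' },
  searchResults: { ttl: 300, strategy: 'tag' },
  realtimeMetrics: { ttl: 10, strategy: 'ttl' },
};

function policyFor(dataType: string): { ttl: number; strategy: Invalidation } {
  // Unknown types fall back to a short TTL, the least dangerous default.
  return INVALIDATION_MATRIX[dataType] ?? { ttl: 60, strategy: 'ttl' };
}
```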

Cache Key Design

Good cache keys are:

  • Unique: Different data → different keys
  • Consistent: Same data → same key
  • Readable: Easy to debug
  • Hierarchical: Group related keys

Key Naming Conventions

// Good key patterns
'user:{userId}'                    // User profile
'product:{productId}'              // Single product
'products:category:{categoryId}'   // Products in category
'cart:{userId}'                    // User's shopping cart
'session:{sessionId}'              // Session data
'leaderboard:global'               // Global leaderboard
'api-response:{endpoint}:{hash}'   // API response cache

// Bad key patterns (avoid)
'123'                             // No context
'user-data'                       // Not unique
'product/123/details'             // Inconsistent delimiter

Dynamic Keys with Parameters

function generateCacheKey(
  namespace: string,
  params: Record<string, any>
): string {
  // Sort keys for consistency
  const sortedParams = Object.keys(params)
    .sort()
    .map(key => `${key}:${params[key]}`)
    .join(':');

  return `${namespace}:${sortedParams}`;
}

// Usage
const key = generateCacheKey('products', {
  category: 'electronics',
  minPrice: 100,
  sort: 'price_asc'
});
// Result: "products:category:electronics:minPrice:100:sort:price_asc"

Hash-Based Keys (for Complex Queries)

import { createHash } from 'crypto';

function generateQueryCacheKey(
  sql: string,
  params: any[]
): string {
  const queryString = `${sql}:${JSON.stringify(params)}`;
  const hash = createHash('md5').update(queryString).digest('hex');
  return `query:${hash}`;
}

// Usage
const key = generateQueryCacheKey(
  'SELECT * FROM products WHERE category = ? AND price > ?',
  ['electronics', 100]
);
// Result: "query:a3f7b2c8d1e4f5a6b7c8d9e0f1a2b3c4"

Namespacing with Prefixes

// Version your cache schema
const CACHE_VERSION = 'v1';

function getCacheKey(type: string, id: string): string {
  return `${CACHE_VERSION}:${type}:${id}`;
}

// When schema changes, bump version (auto-invalidates old cache)
// const CACHE_VERSION = 'v2';

Monitoring & Debugging

Track cache performance to optimize hit rates and identify bottlenecks.

Cache Metrics to Monitor

import { performance } from 'perf_hooks';

class CacheMonitor {
  private hits = 0;
  private misses = 0;
  private errors = 0;

  async get<T>(key: string, fetchFn: () => Promise<T>): Promise<T> {
    const start = performance.now();

    try {
      const cached = await redis.get(key);

      if (cached) {
        this.hits++;
        console.log(`Cache hit: ${key} (${performance.now() - start}ms)`);
        return JSON.parse(cached);
      }

      this.misses++;
      const data = await fetchFn();
      await redis.setex(key, 3600, JSON.stringify(data));

      console.log(`Cache miss: ${key} (${performance.now() - start}ms)`);
      return data;

    } catch (error) {
      this.errors++;
      console.error(`Cache error: ${key}`, error);
      // Fallback to database on cache failure
      return fetchFn();
    }
  }

  getStats() {
    const total = this.hits + this.misses;
    return {
      hits: this.hits,
      misses: this.misses,
      errors: this.errors,
      hitRate: total > 0 ? (this.hits / total * 100).toFixed(2) : '0.00'
    };
  }
}

Redis Monitoring Commands

# Check memory usage
redis-cli INFO memory

# Monitor all commands in real-time
redis-cli MONITOR

# Get slow queries
redis-cli SLOWLOG GET 10

# Check keyspace statistics
redis-cli INFO keyspace

# Memory analysis by key pattern
redis-cli --bigkeys

# Get cache hit rate
redis-cli INFO stats | grep keyspace
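
The hit rate can also be computed in application code from the `keyspace_hits` and `keyspace_misses` counters that `INFO stats` reports. A sketch of the parsing (with ioredis, `await redis.info('stats')` returns this text):

```typescript
// Sketch: derive the cache hit rate (as a percentage) from the text that
// Redis INFO stats returns, e.g. "keyspace_hits:900\nkeyspace_misses:100".
function hitRateFromInfo(info: string): number {
  const counter = (field: string): number => {
    const match = info.match(new RegExp(`^${field}:(\\d+)`, 'm'));
    return match ? Number(match[1]) : 0;
  };
  const hits = counter('keyspace_hits');
  const misses = counter('keyspace_misses');
  const total = hits + misses;
  return total === 0 ? 0 : (hits / total) * 100;
}
```

Note these counters are server-wide and cumulative since the last restart, so for per-endpoint hit rates you still need application-level tracking like the `CacheMonitor` above.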

Logging Cache Performance

import winston from 'winston';

const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [new winston.transports.Console()]
});

async function getCachedDataWithLogging<T>(
  key: string,
  fetchFn: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  const cached = await redis.get(key);
  const cacheLatency = Date.now() - start;

  if (cached) {
    logger.info('cache_hit', {
      key,
      latency: cacheLatency
    });
    return JSON.parse(cached);
  }

  logger.info('cache_miss', { key });

  const fetchStart = Date.now();
  const data = await fetchFn();
  const fetchLatency = Date.now() - fetchStart;

  logger.info('cache_populated', {
    key,
    fetchLatency
  });

  await redis.setex(key, 3600, JSON.stringify(data));
  return data;
}

Alerting Thresholds

Set up alerts for:

  • Low hit rate (<70% for frequently-accessed data)
  • High memory usage (>80% Redis memory)
  • Slow cache operations (>10ms p99 latency)
  • High error rate (>1% cache failures)

Example with Datadog:

import { StatsD } from 'hot-shots';

const statsd = new StatsD({
  host: 'localhost',
  port: 8125,
  prefix: 'cache.'
});

async function getCachedDataWithMetrics<T>(
  key: string,
  fetchFn: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  const cached = await redis.get(key);

  if (cached) {
    statsd.increment('hits');
    statsd.timing('latency', Date.now() - start);
    return JSON.parse(cached);
  }

  statsd.increment('misses');
  const data = await fetchFn();
  await redis.setex(key, 3600, JSON.stringify(data));

  return data;
}

Real-World Examples

Stripe API Caching

Stripe uses aggressive caching with ETags:

GET /v1/customers/cus_123
Authorization: Bearer sk_test_...

HTTP/1.1 200 OK
Cache-Control: no-cache
ETag: "a1b2c3d4"

{
  "id": "cus_123",
  "name": "John Doe"
}

On subsequent request:

GET /v1/customers/cus_123
Authorization: Bearer sk_test_...
If-None-Match: "a1b2c3d4"

HTTP/1.1 304 Not Modified

Key techniques:

  • ETags for versioning
  • no-cache to force revalidation (ensures freshness)
  • 304 responses save bandwidth
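
The client side of this flow keeps the last ETag and body per URL and replays the body on a 304. A sketch with an injected transport function (hypothetical, and synchronous to keep the logic self-contained); a real client would wrap `fetch`:

```typescript
// Sketch: a conditional-request client. It remembers the ETag per URL,
// sends If-None-Match, and reuses the cached body when the server says 304.
type TransportResponse = { status: number; etag?: string; body?: string };
type Transport = (url: string, headers: Record<string, string>) => TransportResponse;

class ConditionalClient {
  private etags = new Map<string, string>();
  private bodies = new Map<string, string>();

  constructor(private transport: Transport) {}

  get(url: string): string {
    const headers: Record<string, string> = {};
    const etag = this.etags.get(url);
    if (etag) headers['If-None-Match'] = etag;

    const res = this.transport(url, headers);
    if (res.status === 304) {
      // Server confirmed our copy is current; reuse the cached body.
      return this.bodies.get(url) ?? '';
    }
    if (res.etag) {
      this.etags.set(url, res.etag);
      this.bodies.set(url, res.body ?? '');
    }
    return res.body ?? '';
  }
}
```

The second request still round-trips to the server, but the 304 carries no body, which is where the bandwidth saving comes from.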

GitHub API Caching

GitHub uses conditional requests + rate limit windows:

GET /repos/facebook/react
Authorization: token ghp_...

HTTP/1.1 200 OK
Cache-Control: public, max-age=60, s-maxage=60
ETag: "abc123"
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4999
X-RateLimit-Reset: 1678886400

Key techniques:

  • 60-second cache for public endpoints
  • ETags for conditional requests
  • Rate limit headers to encourage caching

AWS API Gateway Caching

AWS API Gateway offers built-in caching per stage:

// Enable caching in API Gateway
{
  "cacheClusterEnabled": true,
  "cacheClusterSize": "0.5", // 0.5 GB cache
  "methodSettings": {
    "*/*/GET": {
      "cachingEnabled": true,
      "cacheTtlInSeconds": 300,
      "cacheDataEncrypted": true
    }
  }
}

Key techniques:

  • Automatic caching at API Gateway level (no application code)
  • Cache key based on request parameters
  • Per-method cache configuration

Shopify API Caching

Shopify uses pagination cursors + cache warming:

GET /admin/api/2023-01/products.json?limit=250

HTTP/1.1 200 OK
Cache-Control: no-cache, no-store
Link: <...&page_info=abc123>; rel="next"
X-Shopify-Shop-Api-Call-Limit: 40/40

Key techniques:

  • no-cache to prevent stale inventory data
  • Pagination cursors (stable across cache invalidations)
  • Rate limit headers encourage client-side caching

Common Mistakes

1. Caching Without TTL

Problem: Cache entries never expire, causing stale data and memory bloat.

// โŒ Bad: No expiration
await redis.set('product:123', JSON.stringify(product));

// ✅ Good: Always set TTL
await redis.setex('product:123', 3600, JSON.stringify(product));

2. Caching User-Specific Data Globally

Problem: Leaking user data across requests.

// โŒ Bad: All users see same cart
await redis.set('cart', JSON.stringify(cartItems));

// ✅ Good: Include user ID in key
await redis.set(`cart:${userId}`, JSON.stringify(cartItems));

3. Not Handling Cache Failures

Problem: Application crashes when cache is unavailable.

// โŒ Bad: No fallback
const product = JSON.parse(await redis.get('product:123'));

// ✅ Good: Graceful degradation
let product;
try {
  const cached = await redis.get('product:123');
  product = cached ? JSON.parse(cached) : await db.product.findUnique(...);
} catch (error) {
  console.error('Cache error, falling back to database:', error);
  product = await db.product.findUnique(...);
}

4. Cache Stampede

Problem: Many requests fetch simultaneously when cache expires.

// โŒ Bad: Thundering herd
const cached = await redis.get(key);
if (!cached) {
  // 1000 requests all fetch from database at once
  const data = await expensiveDatabaseQuery();
  await redis.set(key, JSON.stringify(data));
  return data;
}

// ✅ Good: Use locking (see Cache Stampede Prevention section)

5. Caching Errors

Problem: Error responses get cached, breaking application.

// โŒ Bad: Cache errors
const data = await fetchAPI().catch(err => ({ error: err.message }));
await redis.set(key, JSON.stringify(data)); // Caches error!

// ✅ Good: Only cache successful responses
const data = await fetchAPI();
if (data && !data.error) {
  await redis.set(key, JSON.stringify(data));
}

6. Over-Caching

Problem: Caching data that changes frequently or is rarely accessed.

// โŒ Bad: Cache real-time stock prices for 1 hour
await redis.setex('stock:AAPL', 3600, price);

// ✅ Good: Short TTL or no cache for real-time data
await redis.setex('stock:AAPL', 10, price); // 10 seconds

7. Inconsistent Cache Keys

Problem: Same data cached under multiple keys.

// โŒ Bad: Different keys for same data
await redis.set('product_123', ...);
await redis.set('product-123', ...);
await redis.set('products/123', ...);

// ✅ Good: Consistent naming convention
await redis.set('product:123', ...);

8. Not Monitoring Cache Performance

Problem: Can't optimize what you don't measure.

// โŒ Bad: No metrics
const cached = await redis.get(key);

// ✅ Good: Track hit/miss rates, latency, memory usage
statsd.increment(cached ? 'cache.hit' : 'cache.miss');

9. Caching Sensitive Data

Problem: Exposing payment info, passwords, personal data.

// โŒ Bad: Cache sensitive data
await redis.set('payment:123', JSON.stringify(paymentDetails));

// ✅ Good: Never cache sensitive data
res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');

10. Ignoring Cache Warming

Problem: Cold cache causes slow responses after deploy.

// โŒ Bad: Deploy and hope cache fills naturally

// ✅ Good: Warm cache after deploy
async function warmCache() {
  const popularProducts = await db.product.findMany({
    where: { viewCount: { gt: 1000 } }
  });

  for (const product of popularProducts) {
    await redis.setex(
      `product:${product.id}`,
      3600,
      JSON.stringify(product)
    );
  }
}

Production Checklist

Before deploying caching to production:

HTTP Caching

  • Cache-Control headers set on all endpoints
  • ETags implemented for dynamic content
  • Different TTLs for public vs private data
  • no-store (plus no-cache) for sensitive data
  • stale-while-revalidate for performance
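
The header bullets above can be centralized in one small helper, so every endpoint picks from the same vocabulary instead of hand-writing header strings. This is a sketch; the option names (`scope`, `maxAge`, `swr`, `noStore`) are my own:

```javascript
// Build a Cache-Control value from intent; route handlers then call
// res.setHeader('Cache-Control', cacheControl({ ... })).
function cacheControl({ scope = 'public', maxAge = 0, swr = 0, noStore = false } = {}) {
  if (noStore) return 'no-cache, no-store, must-revalidate';
  const parts = [scope, `max-age=${maxAge}`];
  if (swr > 0) parts.push(`stale-while-revalidate=${swr}`);
  return parts.join(', ');
}

console.log(cacheControl({ maxAge: 300, swr: 60 }));
// public, max-age=300, stale-while-revalidate=60
console.log(cacheControl({ scope: 'private', maxAge: 30 }));
// private, max-age=30
console.log(cacheControl({ noStore: true }));
// no-cache, no-store, must-revalidate
```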

CDN Configuration

  • CDN caching enabled for public endpoints
  • Cache purging strategy documented
  • CDN-specific headers configured
  • Cache warming after deploys

Redis Setup

  • Redis connection pooling configured
  • TTL set on all cache entries
  • Consistent key naming convention
  • Cache stampede prevention implemented
  • Graceful degradation on Redis failure
  • Redis monitoring enabled
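
A minimal sketch of the graceful-degradation bullet: if Redis is unreachable, fall back to the database rather than failing the request. The `redis` and `db` client shapes (`get`/`setex`, `findProduct`) are assumptions for illustration:

```javascript
// Graceful degradation: a cache outage slows requests, never breaks them.
async function getProductSafe(redis, db, id) {
  const key = `product:${id}`;
  try {
    const cached = await redis.get(key);
    if (cached) return JSON.parse(cached);
  } catch (err) {
    // Redis is down: log and fall through to the database.
    console.warn('cache read failed, falling back to DB:', err.message);
  }
  const product = await db.findProduct(id);
  try {
    await redis.setex(key, 3600, JSON.stringify(product));
  } catch {
    // Best effort: the response still succeeds without the cache write.
  }
  return product;
}
```

Pair this with short Redis connection timeouts, so a down cache adds milliseconds of latency rather than hanging requests.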

Cache Invalidation

  • Invalidation strategy documented
  • Event-based invalidation for critical data
  • Tag-based invalidation for complex dependencies
  • Version cache keys for schema changes
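
Versioning cache keys for schema changes can be as simple as a prefix; bump the version on deploy and old entries are simply never read again, expiring through their TTLs with no mass purge. A sketch (`v2` is an arbitrary example):

```javascript
// Bump CACHE_VERSION whenever the cached shape changes.
const CACHE_VERSION = 'v2';
const versionedKey = (key) => `${CACHE_VERSION}:${key}`;

console.log(versionedKey('product:123')); // v2:product:123
```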

Monitoring

  • Cache hit/miss rate tracking
  • Latency monitoring (p50, p99)
  • Memory usage alerts
  • Error rate tracking
  • Slow query logging

Testing

  • Cache hit scenarios tested
  • Cache miss scenarios tested
  • Cache failure scenarios tested (Redis down)
  • Stampede scenarios tested (load testing)
  • Invalidation tested (data consistency)

Documentation

  • Cache architecture documented
  • Key naming conventions documented
  • TTL values documented with reasoning
  • Invalidation strategy documented
  • Runbook for cache issues

Conclusion

Effective API caching is a multi-layered strategy combining HTTP headers, CDN configuration, Redis patterns, and database optimization. The key principles:

  1. Layer your cache: Browser โ†’ CDN โ†’ Application โ†’ Database
  2. Choose appropriate TTLs: Balance freshness vs performance
  3. Invalidate proactively: Don't rely solely on expiration
  4. Monitor relentlessly: Track hit rates, latency, memory
  5. Handle failures gracefully: Always have a fallback
  6. Test under load: Prevent cache stampedes
  7. Document everything: Future you will thank you

A well-implemented caching strategy can:

  • Reduce response times by 100x
  • Cut infrastructure costs by 80%
  • Handle 10x more traffic
  • Improve reliability during outages

Start with simple TTL-based caching, measure performance, then optimize based on real-world usage patterns.

Last updated: March 11, 2026
