API Caching Strategies: Complete Implementation Guide for High-Performance APIs
Caching is the single most impactful performance optimization for APIs. A well-designed caching strategy can reduce response times from 500ms to 50ms, cut infrastructure costs by 80%, and handle 10x more traffic without scaling servers.
This guide covers production-ready caching strategies used by high-performance APIs at Stripe, GitHub, AWS, and Cloudflare.
Table of Contents
- Why Caching Matters
- HTTP Caching Fundamentals
- Cache-Control Headers
- ETags and Conditional Requests
- CDN Caching
- Application-Level Caching
- Distributed Caching with Redis
- Cache Invalidation Strategies
- Cache Warming
- Cache Key Design
- Real-World Examples
- Common Mistakes
- Production Checklist
Why Caching Matters
Without caching:
Request → Database Query (500ms) → JSON Serialization (50ms) → Response (550ms total)
1,000 requests/min = 1,000 database queries = expensive, slow, fragile
With caching:
Request → Cache Hit (5ms) → Response (5ms total)
1,000 requests/min = 10 database queries + 990 cache hits = cheap, fast, scalable
Impact metrics from production APIs:
- 99% cache hit rate → 100x reduction in database load (GitHub API)
- 50ms → 5ms response time improvement (Stripe API)
- 80% infrastructure cost savings (AWS CloudFront vs origin servers)
- 10x traffic handling without scaling servers (Shopify during Black Friday)
HTTP Caching Fundamentals
HTTP caching happens at multiple layers:
Client Browser (60s) ← HTTP Proxy (5min) ← CDN (1hr) ← Load Balancer (no cache) ← Origin Server (15min) ← Database (source of truth)
Cache Layers
- Browser Cache: Client-side, user-specific (60s-1hr)
- HTTP Proxy: Shared cache for multiple users (5-15min)
- CDN: Geographic edge caching (1hr-1day)
- Application Cache: In-memory server cache (15min-1hr)
- Database Cache: Query result cache (5-15min)
Cache-Control Headers
The Cache-Control header controls caching behavior at all layers.
Common Directives
// Public, cacheable for 1 hour
res.setHeader('Cache-Control', 'public, max-age=3600');
// Private, only browser can cache
res.setHeader('Cache-Control', 'private, max-age=300');
// Never cache (authentication, sensitive data)
res.setHeader('Cache-Control', 'no-store');
// Cache but revalidate (ensure freshness)
res.setHeader('Cache-Control', 'public, max-age=0, must-revalidate');
// Immutable content (versioned assets)
res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
Directive Meanings
| Directive | Meaning | Use Case |
|---|---|---|
| `public` | Any cache can store | Public APIs, static content |
| `private` | Only browser can cache | User-specific data |
| `no-store` | Never cache | Sensitive data, authentication |
| `no-cache` | Cache but revalidate first | Dynamic content with ETags |
| `max-age=N` | Cache for N seconds | Freshness lifetime |
| `s-maxage=N` | CDN/proxy cache time (overrides max-age) | Separate client/CDN lifetimes |
| `must-revalidate` | Revalidate when stale | Ensure consistency |
| `immutable` | Never revalidate (versioned content) | `/assets/app.abc123.js` |
Decision Matrix
// Static content (images, CSS, JS with version hashes)
'public, max-age=31536000, immutable'
// API responses (public data, low change frequency)
'public, max-age=300, s-maxage=3600'
// User-specific API responses
'private, max-age=60'
// Real-time data (stock prices, live scores)
'public, max-age=10, s-maxage=60'
// Authentication endpoints
'no-store, no-cache, must-revalidate'
// Frequently changing but cacheable
'public, max-age=0, must-revalidate' + ETag
ETags and Conditional Requests
ETags enable conditional caching: cache content but validate freshness before serving.
How ETags Work
1. Client: GET /api/users/123
2. Server: 200 OK
ETag: "abc123"
{ "name": "Alice", "email": "alice@example.com" }
3. Client caches response + ETag
4. Later request: GET /api/users/123
If-None-Match: "abc123"
5. Server checks if data changed:
- Same ETag → 304 Not Modified (no body)
- Different ETag → 200 OK + new data + new ETag
Bandwidth savings: 304 response = ~100 bytes vs 200 response = 5KB+ (98% reduction)
ETag Implementation
import crypto from 'crypto';
import express from 'express';
const app = express();
function generateETag(data: any): string {
const hash = crypto.createHash('md5');
hash.update(JSON.stringify(data));
return `"${hash.digest('hex')}"`;
}
app.get('/api/users/:id', async (req, res) => {
const user = await db.user.findUnique({
where: { id: req.params.id }
});
if (!user) {
return res.status(404).json({ error: 'User not found' });
}
const etag = generateETag(user);
// Check If-None-Match header
if (req.headers['if-none-match'] === etag) {
return res.status(304).end();
}
res.setHeader('ETag', etag);
res.setHeader('Cache-Control', 'public, max-age=0, must-revalidate');
res.json(user);
});
Strong vs Weak ETags
// Strong ETag: byte-for-byte identical
ETag: "abc123"
// Weak ETag: semantically equivalent (gzip vs uncompressed)
ETag: W/"abc123"
Use weak ETags when:
- Gzip compression changes bytes but not content
- Whitespace formatting differs
- Case-insensitive content
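Serving a correct 304 hinges on comparing ETags properly. A minimal sketch of the RFC 7232 comparison, assuming the `If-None-Match` header may carry a comma-separated list or `*` (the helper name is illustrative):

```typescript
// Sketch of If-None-Match matching per RFC 7232: weak comparison
// ignores the W/ prefix, so W/"abc" matches both "abc" and W/"abc".
function etagsMatch(ifNoneMatch: string, currentETag: string): boolean {
  const strip = (tag: string) => tag.replace(/^W\//, '');
  // If-None-Match may carry a comma-separated list of ETags, or "*"
  return ifNoneMatch
    .split(',')
    .map(t => t.trim())
    .some(t => t === '*' || strip(t) === strip(currentETag));
}

etagsMatch('W/"abc123"', '"abc123"');          // true  (weak match)
etagsMatch('"abc123", "def456"', '"def456"');  // true  (list)
etagsMatch('"abc123"', '"zzz999"');            // false
```

This is the weak comparison; a strong comparison (required for byte-range requests) would additionally reject any tag carrying the `W/` prefix.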
CDN Caching
CDNs cache content at edge locations near users, reducing latency and origin load.
How CDN Caching Works
User in Tokyo → Tokyo CDN Edge (10ms) → Response
↓ (miss)
US Origin Server (200ms) → Response → Cache at Tokyo Edge
Without CDN: Every request travels 200ms to origin
With CDN: First request 200ms, subsequent requests 10ms (95% improvement)
CDN Cache-Control
// Browser caches for 5 minutes, CDN for 1 hour
res.setHeader('Cache-Control', 'public, max-age=300, s-maxage=3600');
// Cloudflare-specific: cache for 2 hours
res.setHeader('Cloudflare-CDN-Cache-Control', 'max-age=7200');
// Fastly-specific: cache for 1 day
res.setHeader('Surrogate-Control', 'max-age=86400');
CDN Providers
| CDN | Use Case | Notable Users |
|---|---|---|
| Cloudflare | Free tier, DDoS protection | Discord, Shopify |
| AWS CloudFront | AWS integration, Lambda@Edge | Netflix, Slack |
| Fastly | Real-time purging, VCL control | GitHub, Stripe |
| Akamai | Enterprise, largest network | Apple, Microsoft |
Cache Key Customization
By default, CDNs cache by full URL. Customize cache keys to improve hit rates:
// Default cache key (separate cache for each query param)
/api/posts?page=1&sort=date&limit=10
/api/posts?limit=10&sort=date&page=1 ← separate cache entry (query order differs)
// Normalized cache key (ignore query order)
cloudfront.createCacheKey({
queryStringsAllowList: ['page', 'sort', 'limit'],
enableAcceptEncodingGzip: true,
headersAllowList: ['Authorization']
});
// Result: both requests share cache
/api/posts?limit=10&page=1&sort=date ← same cache entry
Best practice: Only include query params that actually change the response.
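The normalization above can be sketched as a pure function: keep only allow-listed params and sort them so query order no longer fragments the cache (names are illustrative; a real CDN applies this at the edge):

```typescript
// Normalize a request URL into a deterministic CDN cache key:
// only allow-listed params survive, sorted so query order is irrelevant.
function normalizeCacheKey(url: string, allowList: string[]): string {
  const u = new URL(url, 'https://example.com'); // base for relative paths
  const kept = [...u.searchParams.entries()]
    .filter(([k]) => allowList.includes(k))
    .sort(([a], [b]) => a.localeCompare(b));
  const query = kept.map(([k, v]) => `${k}=${v}`).join('&');
  return query ? `${u.pathname}?${query}` : u.pathname;
}

const allow = ['page', 'sort', 'limit'];
normalizeCacheKey('/api/posts?page=1&sort=date&limit=10', allow);
// → "/api/posts?limit=10&page=1&sort=date"
normalizeCacheKey('/api/posts?limit=10&sort=date&page=1&utm_source=x', allow);
// → same key: order ignored, tracking param dropped
```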
Application-Level Caching
Cache expensive operations in-memory at the application layer.
In-Memory Cache (Single Server)
// Simple LRU cache with node-cache
import NodeCache from 'node-cache';
const cache = new NodeCache({
stdTTL: 600, // 10 minutes default
checkperiod: 120, // Check for expired keys every 2min
useClones: false // Return references (faster)
});
async function getUser(id: string) {
const cacheKey = `user:${id}`;
// Check cache
const cached = cache.get(cacheKey);
if (cached) {
console.log('Cache hit');
return cached;
}
// Cache miss - fetch from database
console.log('Cache miss');
const user = await db.user.findUnique({ where: { id } });
// Store in cache
cache.set(cacheKey, user, 600); // 10 minutes
return user;
}
app.get('/api/users/:id', async (req, res) => {
const user = await getUser(req.params.id);
res.json(user);
});
Limitation: Memory cache is per-server. With multiple servers, cache hit rates drop (each server has separate cache).
Solution: Use distributed caching with Redis.
Distributed Caching with Redis
Redis provides a shared cache across all servers.
Redis Setup
# Install Redis (macOS)
brew install redis
redis-server
# Install Redis (Docker)
docker run -d -p 6379:6379 redis:7-alpine
import { createClient } from 'redis';
const redis = createClient({
url: process.env.REDIS_URL || 'redis://localhost:6379'
});
await redis.connect();
async function getCachedUser(id: string) {
const cacheKey = `user:${id}`;
// Try cache first
const cached = await redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
// Cache miss - fetch from database
const user = await db.user.findUnique({ where: { id } });
// Store in Redis with 10-minute expiration
await redis.setEx(cacheKey, 600, JSON.stringify(user));
return user;
}
Redis Caching Patterns
1. Cache-Aside (Lazy Loading)
Most common pattern: Application checks cache, falls back to database on miss.
async function getPost(id: string) {
const key = `post:${id}`;
// 1. Check cache
const cached = await redis.get(key);
if (cached) return JSON.parse(cached);
// 2. Cache miss - load from database
const post = await db.post.findUnique({ where: { id } });
// 3. Store in cache
await redis.setEx(key, 3600, JSON.stringify(post));
return post;
}
Pros: Simple, only caches requested data
Cons: Cache misses cause latency spikes
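The pattern generalizes into a small reusable wrapper. A sketch with a `Map` standing in for Redis (the helper, types, and loader are illustrative):

```typescript
// Generic cache-aside wrapper; a Map stands in for Redis here.
type Entry<T> = { value: T; expiresAt: number };

function cacheAside<T>(
  cache: Map<string, Entry<T>>,
  ttlMs: number,
  loader: (key: string) => Promise<T>
) {
  return async (key: string): Promise<T> => {
    const hit = cache.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // fresh hit
    const value = await loader(key);                         // miss: load
    cache.set(key, { value, expiresAt: Date.now() + ttlMs });
    return value;
  };
}

// Usage: the loader runs once; the second read is served from cache.
let loads = 0;
const getPost = cacheAside(new Map(), 60_000, async (id: string) => {
  loads++;
  return { id, title: `Post ${id}` };
});
await getPost('123');
await getPost('123');
// loads === 1
```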
2. Write-Through
Write to cache AND database simultaneously.
async function updatePost(id: string, data: any) {
const key = `post:${id}`;
// 1. Update database
const post = await db.post.update({
where: { id },
data
});
// 2. Update cache
await redis.setEx(key, 3600, JSON.stringify(post));
return post;
}
Pros: Cache always fresh
Cons: Write latency (2 writes per update)
3. Write-Behind (Write-Back)
Write to cache immediately, persist to database asynchronously.
async function updatePost(id: string, data: any) {
const key = `post:${id}`;
// 1. Update cache immediately
await redis.setEx(key, 3600, JSON.stringify(data));
// 2. Queue database write (async)
await queue.add('updateDatabase', { id, data });
return data;
}
Pros: Fastest writes
Cons: Risk of data loss if cache fails before persistence
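The `queue.add` call above assumes a job queue (BullMQ or similar). A minimal in-process sketch of the flush side, with a `Map` standing in for the database; a real deployment would use a durable queue so pending writes survive a crash:

```typescript
// Minimal in-process sketch of the write-behind flush loop.
type WriteJob = { id: string; data: unknown };

const pendingWrites: WriteJob[] = [];
const persisted = new Map<string, unknown>(); // stands in for the database

function queueWrite(job: WriteJob) {
  pendingWrites.push(job); // cache was already updated; persistence is deferred
}

async function flushWrites() {
  while (pendingWrites.length > 0) {
    const job = pendingWrites.shift()!;
    persisted.set(job.id, job.data); // db.post.update(...) in real code
  }
}

queueWrite({ id: 'post:1', data: { title: 'Hello' } });
await flushWrites();
// persisted now holds the write; the queue is drained
```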
Redis Advanced Features
// Atomic increment (counters, rate limiting)
await redis.incr('api:requests:total');
// Hash operations (store objects efficiently)
await redis.hSet('user:123', {
name: 'Alice',
email: 'alice@example.com',
age: '30'
});
const user = await redis.hGetAll('user:123');
// Sorted sets (leaderboards, time-series)
await redis.zAdd('leaderboard', { score: 1500, value: 'user:123' });
const top10 = await redis.zRange('leaderboard', 0, 9, { REV: true });
// Pub/Sub (cache invalidation across servers)
await redis.publish('cache:invalidate', 'user:123');
Cache Invalidation Strategies
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
1. Time-Based Expiration (TTL)
Simplest strategy: Cache expires after N seconds.
// Set with expiration
await redis.setEx('user:123', 600, JSON.stringify(user));
// Check remaining TTL
const ttl = await redis.ttl('user:123'); // 547 seconds remaining
Pros: Simple, predictable
Cons: Stale data until expiration
Best for: Data that changes infrequently (product catalog, blog posts)
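One refinement worth considering: add random jitter to TTLs so keys cached together don't all expire in the same instant, which would otherwise cause a synchronized miss burst. A sketch (the function name is illustrative):

```typescript
// Spread expirations by adding up to `spread` (10% by default) of
// random jitter to the base TTL.
function ttlWithJitter(baseSeconds: number, spread = 0.1): number {
  const jitter = baseSeconds * spread * Math.random();
  return Math.round(baseSeconds + jitter);
}

// e.g. base 600s → a value between 600 and 660 seconds
const ttl = ttlWithJitter(600);
// await redis.setEx('user:123', ttl, JSON.stringify(user));
```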
2. Event-Based Invalidation
Invalidate when data changes (most accurate).
async function updateUser(id: string, data: any) {
// 1. Update database
const user = await db.user.update({
where: { id },
data
});
// 2. Invalidate cache
await redis.del(`user:${id}`);
return user;
}
Pros: Always fresh data
Cons: Requires invalidation logic in all write paths
3. Tag-Based Invalidation
Group related cache entries with tags, invalidate all at once.
// Store with tags
await redis.setEx('post:123', 3600, JSON.stringify(post));
await redis.sAdd('tag:user:456:posts', 'post:123');
await redis.sAdd('tag:category:tech:posts', 'post:123');
// Invalidate all posts by user
async function invalidateUserPosts(userId: string) {
const postKeys = await redis.sMembers(`tag:user:${userId}:posts`);
if (postKeys.length > 0) {
await redis.del(postKeys);
await redis.del(`tag:user:${userId}:posts`);
}
}
Pros: Flexible, invalidate related items
Cons: Complex implementation
4. Versioned Keys
Never invalidate — use versioned cache keys instead.
// Version embedded in the cache key
const version = (await redis.get('user:123:version')) || '1';
const cacheKey = `user:123:v${version}`;
// Bump that user's version to "invalidate" every cached entry for them
async function invalidateUser(id: string) {
  await redis.incr(`user:${id}:version`);
}
Pros: No cache deletion needed, atomic invalidation
Cons: Orphaned cache entries (need cleanup)
5. Surrogate Keys (CDN Invalidation)
Fastly pattern: Group related content with surrogate keys.
// Add surrogate key header
res.setHeader('Surrogate-Key', 'user-123 posts all-posts');
// Purge by surrogate key (invalidates all matching entries)
await fastly.purgeKey('user-123'); // Purges all content tagged with user-123
Pros: Instant CDN purging, flexible grouping
Cons: CDN-specific
Cache Warming
Pre-populate cache before traffic arrives (cold start prevention).
When to Warm Cache
- Application deployment
- Cache server restart
- Scheduled cache expiration
- Known traffic spikes (product launches, sales)
Implementation
// Warm critical data on startup
async function warmCache() {
console.log('Warming cache...');
// Popular posts
const popularPosts = await db.post.findMany({
where: { views: { gt: 10000 } },
take: 100
});
for (const post of popularPosts) {
const key = `post:${post.id}`;
await redis.setEx(key, 3600, JSON.stringify(post));
}
// Homepage data
const homepage = await db.page.findUnique({
where: { slug: 'home' }
});
await redis.setEx('page:home', 600, JSON.stringify(homepage));
console.log(`Cache warmed: ${popularPosts.length} posts + homepage`);
}
// Run on server start
await warmCache();
Progressive Warming
Warm cache gradually to avoid overwhelming database.
import pLimit from 'p-limit';
async function warmCacheProgressive() {
const limit = pLimit(10); // Max 10 concurrent queries
const posts = await db.post.findMany({ take: 1000 });
await Promise.all(
posts.map(post =>
limit(async () => {
const key = `post:${post.id}`;
await redis.setEx(key, 3600, JSON.stringify(post));
})
)
);
}
Cache Key Design
Good cache keys prevent collisions and enable flexible invalidation.
Best Practices
// ❌ Bad: Ambiguous, collision-prone
cache.set('123', user);
cache.set('123', post); // Overwrites user!
// ✅ Good: Namespaced, clear intent
cache.set('user:123', user);
cache.set('post:123', post);
// ✅ Better: Include version/filters
cache.set('user:123:v2', user);
cache.set('posts:category:tech:page:1:limit:10', posts);
// ✅ Best: Hierarchical with separators
cache.set('api:v1:users:123:profile', user);
cache.set('api:v1:posts:category:tech:page:1', posts);
Normalize Query Parameters
function buildCacheKey(baseKey: string, params: Record<string, any>): string {
// Sort params for consistent keys
const sortedParams = Object.keys(params)
.sort()
.map(key => `${key}:${params[key]}`)
.join(':');
return `${baseKey}:${sortedParams}`;
}
// Both produce same key
buildCacheKey('posts', { page: 1, limit: 10, sort: 'date' });
buildCacheKey('posts', { sort: 'date', limit: 10, page: 1 });
// → "posts:limit:10:page:1:sort:date"
Hash Long Keys
import crypto from 'crypto';
function hashKey(key: string): string {
  if (key.length <= 100) return key;
  // A short digest keeps the result under the limit while staying unique
  const hash = crypto.createHash('sha256').update(key).digest('hex').slice(0, 16);
  return `${key.slice(0, 50)}:${hash}`;
}
// Before: "api:users:search:q:long+search+query+with+many+words:page:1:limit:10:sort:relevance"
// After: "api:users:search:q:long+search+query+with+man:abc123..."
Real-World Examples
Stripe API Caching
// Stripe caches idempotent operations with Idempotency-Key header
app.post('/api/charges', async (req, res) => {
const idempotencyKey = req.headers['idempotency-key'];
if (idempotencyKey) {
// Check cache for duplicate request
const cached = await redis.get(`idempotency:${idempotencyKey}`);
if (cached) {
return res.json(JSON.parse(cached));
}
}
// Process charge
const charge = await stripe.charges.create(req.body);
// Cache result for 24 hours
if (idempotencyKey) {
await redis.setEx(`idempotency:${idempotencyKey}`, 86400, JSON.stringify(charge));
}
res.json(charge);
});
Learn more: Stripe API Status
GitHub API Conditional Requests
// GitHub uses ETags + conditional requests
app.get('/api/repos/:owner/:repo', async (req, res) => {
const { owner, repo } = req.params;
const repoData = await db.repo.findUnique({ where: { owner, repo } });
const etag = `"${repoData.updatedAt.getTime()}"`;
if (req.headers['if-none-match'] === etag) {
res.setHeader('X-RateLimit-Remaining', '4999');
return res.status(304).end(); // Doesn't count against rate limit!
}
res.setHeader('ETag', etag);
res.setHeader('Cache-Control', 'private, max-age=60');
res.setHeader('X-RateLimit-Remaining', '4998');
res.json(repoData);
});
Rate limit optimization: 304 responses don't count against GitHub's rate limit.
Learn more: GitHub API Status
Shopify CDN Caching
// Shopify caches product pages at CDN with Vary header
app.get('/products/:id', async (req, res) => {
const product = await db.product.findUnique({
where: { id: req.params.id }
});
// Cache varies by currency
res.setHeader('Vary', 'Accept-Encoding, Cookie');
res.setHeader('Cache-Control', 'public, max-age=300, s-maxage=3600');
res.json(product);
});
Vary header: CDN creates separate cache entries for different cookie/encoding values.
Learn more: Shopify API Status
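Conceptually, a Vary-aware cache derives a secondary key from the listed request headers: one cache entry per distinct combination of their values. An illustrative sketch, not any CDN's actual implementation:

```typescript
// Build a cache key that honors the Vary header: the same URL maps to
// different entries when the varied request headers differ.
function varyCacheKey(
  url: string,
  vary: string,
  requestHeaders: Record<string, string>
): string {
  const parts = vary
    .split(',')
    .map(h => h.trim().toLowerCase())
    .sort()
    .map(h => `${h}=${requestHeaders[h] ?? ''}`);
  return `${url}|${parts.join('|')}`;
}

varyCacheKey('/products/1', 'Accept-Encoding, Cookie', {
  'accept-encoding': 'gzip',
  cookie: 'currency=EUR',
});
// → "/products/1|accept-encoding=gzip|cookie=currency=EUR"
```

A request with `currency=USD` in its cookie produces a different key, so EUR and USD responses never collide.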
Common Mistakes
1. Caching Authenticated Responses Publicly
// ❌ SECURITY RISK: User data cached publicly
app.get('/api/me', authenticate, async (req, res) => {
res.setHeader('Cache-Control', 'public, max-age=300'); // WRONG!
res.json(req.user);
});
// ✅ Correct: Private cache only
app.get('/api/me', authenticate, async (req, res) => {
res.setHeader('Cache-Control', 'private, max-age=300');
res.json(req.user);
});
Impact: User A's data served to User B (data leak!)
2. Not Setting Vary Headers
// ❌ Wrong: Different content, same cache key
app.get('/api/products', async (req, res) => {
const currency = req.headers['x-currency'] || 'USD';
const products = await getProductsInCurrency(currency);
res.setHeader('Cache-Control', 'public, max-age=300');
res.json(products); // USD prices served to EUR users!
});
// ✅ Correct: Vary by currency header
app.get('/api/products', async (req, res) => {
const currency = req.headers['x-currency'] || 'USD';
const products = await getProductsInCurrency(currency);
res.setHeader('Vary', 'X-Currency');
res.setHeader('Cache-Control', 'public, max-age=300');
res.json(products);
});
3. Caching Errors
// ❌ Wrong: 500 errors cached for 1 hour
app.get('/api/posts', async (req, res) => {
res.setHeader('Cache-Control', 'public, max-age=3600');
try {
const posts = await db.post.findMany();
res.json(posts);
} catch (error) {
res.status(500).json({ error: 'Database error' }); // Cached!
}
});
// ✅ Correct: Only cache successful responses
app.get('/api/posts', async (req, res) => {
try {
const posts = await db.post.findMany();
res.setHeader('Cache-Control', 'public, max-age=3600');
res.json(posts);
} catch (error) {
res.setHeader('Cache-Control', 'no-store');
res.status(500).json({ error: 'Database error' });
}
});
4. Not Handling Cache Stampede
Problem: Cache expires → 1,000 concurrent requests → 1,000 database queries → database overwhelmed.
// ❌ Wrong: Every request queries database on cache miss
async function getPopularPosts() {
const cached = await redis.get('popular-posts');
if (cached) return JSON.parse(cached);
// 1,000 concurrent requests all run this query!
const posts = await db.post.findMany({ take: 10 });
await redis.setEx('popular-posts', 600, JSON.stringify(posts));
return posts;
}
// ✅ Correct: Use locking to prevent stampede
import Redlock from 'redlock'; // note: redlock is built around ioredis-style clients
const redlock = new Redlock([redis]);
async function getPopularPosts() {
const cacheKey = 'popular-posts';
const cached = await redis.get(cacheKey);
if (cached) return JSON.parse(cached);
// Acquire lock
const lock = await redlock.acquire([`lock:${cacheKey}`], 5000);
try {
// Double-check cache (another request may have populated it)
const rechecked = await redis.get(cacheKey);
if (rechecked) return JSON.parse(rechecked);
// Only 1 request queries database
const posts = await db.post.findMany({ take: 10 });
await redis.setEx(cacheKey, 600, JSON.stringify(posts));
return posts;
} finally {
await lock.release();
}
}
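Redlock coordinates across servers; within a single process, coalescing concurrent misses onto one in-flight promise achieves the same effect with no extra infrastructure. A sketch of this technique (sometimes called single-flight; names are illustrative):

```typescript
// Concurrent callers for the same key share one in-flight promise,
// so the loader runs once per miss even inside a burst of requests.
const inFlight = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, loader: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;
  const p = loader().finally(() => inFlight.delete(key));
  inFlight.set(key, p);
  return p;
}

// 1,000 concurrent calls → one loader execution
let queries = 0;
const results = await Promise.all(
  Array.from({ length: 1000 }, () =>
    singleFlight('popular-posts', async () => {
      queries++;
      return ['post:1', 'post:2'];
    })
  )
);
// queries === 1
```

This complements the distributed lock: Redlock stops a stampede across servers, single-flight stops it within each server.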
5. Ignoring Cache Size Limits
// ❌ Wrong: Unbounded cache growth
async function cacheSearchResults(query: string, results: any[]) {
await redis.set(`search:${query}`, JSON.stringify(results));
}
// ✅ Correct: Always set a TTL, and cap Redis memory with an eviction policy
async function cacheSearchResults(query: string, results: any[]) {
  await redis.setEx(`search:${query}`, 600, JSON.stringify(results)); // 10-minute TTL
}
# redis.conf: cap memory and evict when full
maxmemory 2gb
maxmemory-policy allkeys-lru # Evict least recently used keys
Eviction policies:
- allkeys-lru: Evict least recently used (recommended for caching)
- volatile-lru: Evict LRU among keys with TTL
- allkeys-lfu: Evict least frequently used
- volatile-ttl: Evict keys with shortest TTL first
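The LRU policy itself is simple to sketch: a `Map` preserves insertion order, so re-inserting a key on read keeps recently used keys at the back, leaving the front entry as the eviction candidate. An illustrative toy, not Redis's actual implementation (Redis approximates LRU by sampling keys):

```typescript
// Minimal LRU cache using Map insertion order.
class LRUCache<K, V> {
  private map = new Map<K, V>();
  constructor(private maxSize: number) {}

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key)!;
    this.map.delete(key);      // move to most-recently-used position
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      const lru = this.map.keys().next().value as K; // oldest entry
      this.map.delete(lru);
    }
  }
}

const cache = new LRUCache<string, number>(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');      // touch 'a', so 'b' is now least recently used
cache.set('c', 3);   // evicts 'b'
cache.get('b');      // → undefined
```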
6. Not Monitoring Cache Hit Rate
// Track cache hits/misses
async function getCached(key: string) {
const value = await redis.get(key);
if (value) {
await redis.incr('cache:hits');
return JSON.parse(value);
} else {
await redis.incr('cache:misses');
return null;
}
}
// Monitor hit rate
app.get('/api/metrics', async (req, res) => {
const hits = parseInt(await redis.get('cache:hits') || '0');
const misses = parseInt(await redis.get('cache:misses') || '0');
const total = hits + misses;
const hitRate = total > 0 ? (hits / total * 100).toFixed(2) : '0.00';
res.json({ hits, misses, hitRate: `${hitRate}%` });
});
Target hit rate: 90%+ for effective caching
Production Checklist
HTTP Caching
- Cache-Control headers set on all routes
- public vs private used correctly
- no-store used for sensitive data (authentication, payments)
- Vary headers set when response varies by header
- ETags implemented for dynamic content
- Conditional requests (If-None-Match) supported
- Immutable directive used for versioned assets
CDN
- CDN deployed (Cloudflare, AWS CloudFront, Fastly)
- s-maxage set for CDN caching
- Cache key normalized (query param order ignored)
- Purge/invalidation API integrated
- Origin shield configured (reduce origin load)
- Custom cache rules tested
Application Cache
- Redis/Memcached deployed
- Cache key naming convention documented
- TTL values tuned per data type
- Cache warming implemented for critical data
- Distributed locking prevents cache stampede
- maxmemory + eviction policy configured
Invalidation
- Event-based invalidation on writes
- Tag-based invalidation for related items
- Surrogate keys for CDN purging
- Invalidation tested in staging
Monitoring
- Cache hit rate tracked (>90% target)
- Cache miss latency monitored
- Memory usage alerted
- Eviction rate tracked
- Slow query logs reviewed
Security
- private used for user-specific data
- no-store used for sensitive responses
- Vary: Cookie prevents cache poisoning
- Cache tested with different users/roles
Testing
- Load tested with realistic traffic
- Cache stampede tested (concurrent cache misses)
- Invalidation tested (data consistency)
- Cold start tested (cache warming)
Conclusion
Effective API caching requires layered strategies:
- HTTP caching (browsers, proxies): 60s-1hr for public data
- CDN caching (Cloudflare, Fastly): 1hr-1day at edge locations
- Application caching (Redis): 5-15min for expensive queries
- Database caching (query cache): 1-5min for frequently accessed data
Start simple:
- Add Cache-Control headers to public routes
- Deploy a CDN (Cloudflare free tier)
- Cache expensive queries in Redis
Measure impact:
- Monitor cache hit rate (>90% target)
- Track response time improvements
- Measure infrastructure cost savings
Monitor real-time API status for Cloudflare, AWS, Redis, Fastly, and 160+ other APIs at API Status Check.