API Error Handling Best Practices: Build Robust Production APIs (2026)
Quick Answer: Effective API error handling requires consistent error response formats (with error codes, messages, and details), proper HTTP status codes, retry logic with exponential backoff for transient failures, comprehensive error logging, graceful degradation when dependencies fail, and user-friendly error messages that guide developers to solutions. Production APIs should handle errors at multiple layers (network, application, business logic) and provide enough context for debugging without exposing sensitive information.
Error handling is what separates hobby projects from production-ready systems. When Stripe's API returns an error, developers know exactly what went wrong and how to fix it. When GitHub's API hits a rate limit, the response includes headers telling you when to retry. This guide shows you how to build that same level of robustness into your APIs.
Table of Contents
- Why Error Handling Matters
- The Three Layers of Error Handling
- Designing Error Responses
- HTTP Status Codes: Choosing the Right One
- Server-Side Error Handling Patterns
- Client-Side Error Handling Strategies
- Retry Logic & Exponential Backoff
- Error Logging & Monitoring
- Graceful Degradation
- Security Considerations
- Real-World Examples
- Common Mistakes to Avoid
- Production Readiness Checklist
Why Error Handling Matters
Every API call can fail. Networks drop packets, databases time out, third-party services go down, and users send malformed requests. The question isn't if errors will happen—it's how your system responds when they do.
The Cost of Poor Error Handling
Consider this real-world scenario:
Before: Generic Error Handling
{
"error": "Something went wrong"
}
- Developer Experience: Developers spend 45 minutes debugging, checking logs, and testing different inputs
- Support Load: 12 support tickets opened asking "What does 'something went wrong' mean?"
- Churn Risk: 3 developers abandon the integration, citing "unclear error messages"
After: Comprehensive Error Handling
{
"error": {
"code": "invalid_card",
"message": "The card number is invalid",
"param": "card_number",
"type": "card_error",
"doc_url": "https://docs.example.com/errors/invalid_card"
}
}
- Developer Experience: Immediate clarity on what's wrong and how to fix it
- Support Load: 73% reduction in error-related support tickets
- Developer Satisfaction: 89% of developers rate error messages as "helpful" or "very helpful"
Key Benefits of Good Error Handling
- Faster Debugging: Clear error messages reduce troubleshooting time from hours to minutes
- Better DX: Developers trust APIs that tell them exactly what went wrong
- Reduced Support Load: Self-service error resolution means fewer support tickets
- System Resilience: Proper error handling prevents cascade failures
- Operational Visibility: Comprehensive error logging enables proactive issue detection
The Three Layers of Error Handling
Production APIs handle errors at three distinct layers:
Layer 1: Network Errors (Infrastructure)
What they are: Failures in the network layer before your application code runs.
Examples:
- DNS resolution failures
- Connection timeouts
- TLS/SSL handshake failures
- Network unreachable
Handling strategy:
- Implement connection pooling with health checks
- Use circuit breakers to prevent cascade failures (see our circuit breaker pattern guide)
- Set appropriate timeouts (connection, read, total)
- Provide fallback mechanisms
Example:
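import axios from 'axios';
import CircuitBreaker from 'opossum';
const apiClient = axios.create({
  timeout: 5000, // Abort any request that takes longer than 5 seconds
});
// Wrap the call so the breaker tracks failures of outbound GET requests
const breaker = new CircuitBreaker((url: string) => apiClient.get(url), {
  timeout: 3000, // Count calls slower than 3 seconds as failures
  errorThresholdPercentage: 50, // Open the circuit when 50% of calls fail
  resetTimeout: 30000, // Probe the dependency again after 30 seconds
});
// getCachedData is a placeholder for your own cache lookup
breaker.fallback(() => {
  return { data: getCachedData(), fromCache: true };
});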
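The "total" timeout from the list above can also be sketched without any HTTP library. This is a minimal illustration, not part of axios or opossum: `withTimeout` and `TimeoutError` are hypothetical names, and the wrapper only stops waiting for the result; it does not cancel the underlying work.

```typescript
// Hypothetical error type so callers can tell a timeout apart from other failures.
class TimeoutError extends Error {
  readonly ms: number;
  constructor(ms: number) {
    super(`Operation timed out after ${ms}ms`);
    this.name = 'TimeoutError';
    this.ms = ms;
    // Keeps `instanceof` working when compiling to ES5.
    Object.setPrototypeOf(this, TimeoutError.prototype);
  }
}

// Rejects with TimeoutError if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A caller would wrap any outbound call, for example `withTimeout(fetchProfile(id), 5000)`, where `fetchProfile` stands in for your own function.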
Layer 2: Application Errors (Request Processing)
What they are: Errors that occur during request validation and processing.
Examples:
- Invalid request parameters
- Authentication failures
- Authorization violations
- Rate limit exceeded
- Malformed JSON
Handling strategy:
- Validate early (fail fast)
- Return specific HTTP status codes
- Provide actionable error messages
- Include field-level validation errors
Example:
import { z } from 'zod';
import { Request, Response, NextFunction } from 'express';
const createUserSchema = z.object({
email: z.string().email(),
password: z.string().min(8),
name: z.string().min(2).max(100),
});
export function validateRequest(schema: z.ZodSchema) {
return (req: Request, res: Response, next: NextFunction) => {
try {
schema.parse(req.body);
next();
} catch (error) {
if (error instanceof z.ZodError) {
return res.status(400).json({
error: {
code: 'validation_error',
message: 'Request validation failed',
details: error.issues.map(err => ({
field: err.path.join('.'),
message: err.message,
code: err.code,
})),
},
});
}
next(error);
}
};
}
// Usage
app.post('/users', validateRequest(createUserSchema), createUserHandler);
Layer 3: Business Logic Errors (Domain Logic)
What they are: Errors related to your business rules and domain constraints.
Examples:
- Insufficient account balance
- Resource not found
- Duplicate resource
- State transition not allowed (e.g., can't cancel a shipped order)
- Business rule violation
Handling strategy:
- Create custom error classes for domain errors
- Map domain errors to appropriate HTTP status codes
- Provide context-specific error messages
- Include recovery suggestions
Example:
// Custom error classes
export class InsufficientBalanceError extends Error {
constructor(
public required: number,
public available: number,
public accountId: string
) {
super(`Insufficient balance: need $${required}, have $${available}`);
this.name = 'InsufficientBalanceError';
}
}
export class ResourceNotFoundError extends Error {
constructor(
public resourceType: string,
public resourceId: string
) {
super(`${resourceType} ${resourceId} not found`);
this.name = 'ResourceNotFoundError';
}
}
// Error mapper
export function mapDomainErrorToResponse(error: Error) {
if (error instanceof InsufficientBalanceError) {
return {
status: 402,
body: {
error: {
code: 'insufficient_balance',
message: error.message,
details: {
required: error.required,
available: error.available,
account_id: error.accountId,
},
recovery: 'Add funds to your account or reduce the transaction amount',
},
},
};
}
if (error instanceof ResourceNotFoundError) {
return {
status: 404,
body: {
error: {
code: 'resource_not_found',
message: error.message,
details: {
resource_type: error.resourceType,
resource_id: error.resourceId,
},
},
},
};
}
// Default 500 for unknown errors
return {
status: 500,
body: {
error: {
code: 'internal_error',
message: 'An unexpected error occurred',
},
},
};
}
// Express error handler middleware
app.use((err: Error, req: Request, res: Response, next: NextFunction) => {
const { status, body } = mapDomainErrorToResponse(err);
// Log error for monitoring
logger.error('Request failed', {
error: err,
status,
path: req.path,
method: req.method,
});
res.status(status).json(body);
});
Designing Error Responses
Consistency is key. Every error response should follow the same structure.
The Standard Error Response Format
{
"error": {
"code": "validation_error",
"message": "Human-readable error message",
"type": "client_error",
"details": {
"field": "email",
"reason": "invalid_format"
},
"request_id": "req_abc123",
"doc_url": "https://docs.example.com/errors/validation_error"
}
}
Essential Error Response Fields
| Field | Purpose | Example |
|---|---|---|
| code | Machine-readable error identifier | "invalid_card", "rate_limit_exceeded" |
| message | Human-readable description | "The card was declined" |
| type | Error category | "card_error", "api_error", "auth_error" |
| details | Context-specific information | { "param": "card_number" } |
| request_id | Unique identifier for this request | "req_abc123" (for support debugging) |
| doc_url | Link to error documentation | "https://docs.example.com/errors/..." |
TypeScript Error Response Types
export interface APIError {
error: {
code: string;
message: string;
type: 'client_error' | 'server_error' | 'auth_error' | 'rate_limit_error';
details?: Record<string, any>;
request_id: string;
doc_url?: string;
};
}
export interface ValidationError extends APIError {
error: {
code: 'validation_error';
message: string;
type: 'client_error';
details: {
fields: Array<{
field: string;
message: string;
code: string;
}>;
};
request_id: string;
};
}
export interface RateLimitError extends APIError {
error: {
code: 'rate_limit_exceeded';
message: string;
type: 'rate_limit_error';
details: {
limit: number;
remaining: 0;
reset_at: string; // ISO 8601 timestamp
};
request_id: string;
};
}
Real-World Error Response Examples
Stripe Card Error:
{
"error": {
"code": "card_declined",
"doc_url": "https://stripe.com/docs/error-codes/card-declined",
"message": "Your card was declined.",
"param": "exp_month",
"type": "card_error"
}
}
GitHub Rate Limit Error:
{
"message": "API rate limit exceeded",
"documentation_url": "https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"
}
AWS API Error:
{
"Error": {
"Code": "InvalidParameterValue",
"Message": "Value (us-east-1x) for parameter AvailabilityZone is invalid."
},
"RequestId": "abc123"
}
HTTP Status Codes: Choosing the Right One
Using the correct HTTP status code is crucial for API consumers to handle errors appropriately.
Quick Reference Table
| Status Code | Use When | Example |
|---|---|---|
| 400 Bad Request | Request is malformed or invalid | Missing required field, invalid JSON |
| 401 Unauthorized | Authentication is missing or invalid | No API key, expired token |
| 403 Forbidden | User authenticated but lacks permission | Accessing another user's resource |
| 404 Not Found | Resource doesn't exist | /users/999999 for non-existent user |
| 409 Conflict | Request conflicts with current state | Creating duplicate resource |
| 422 Unprocessable Entity | Validation failed (well-formed but invalid) | Invalid email format, invalid enum value |
| 429 Too Many Requests | Rate limit exceeded | More than 100 requests per minute |
| 500 Internal Server Error | Unexpected server error | Uncaught exception, database crash |
| 502 Bad Gateway | Upstream service failed | Payment processor down |
| 503 Service Unavailable | Temporary unavailability | Scheduled maintenance, overloaded |
| 504 Gateway Timeout | Upstream service timed out | Third-party API took too long |
Decision Tree for Status Codes
Is the request itself malformed (invalid JSON, missing required field)?
→ 400 Bad Request
Is authentication missing or invalid?
→ 401 Unauthorized
Is the user authenticated but not authorized for this action?
→ 403 Forbidden
Does the resource not exist?
→ 404 Not Found
Is the request valid but conflicts with current state?
→ 409 Conflict
Is the request well-formed but fails validation rules?
→ 422 Unprocessable Entity
Is the rate limit exceeded?
→ 429 Too Many Requests
Did the server encounter an unexpected error?
→ 500 Internal Server Error
Did an upstream dependency fail?
→ 502 Bad Gateway (if it returned an invalid or error response)
→ 503 Service Unavailable (if it's temporarily down)
→ 504 Gateway Timeout (if it timed out)
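The decision tree above can be sketched as a single function. This is an illustration only: the `FailureFacts` shape and the `chooseStatus` name are assumptions made for the example, and the checks run in the same order as the questions in the tree, with 500 as the fallback.

```typescript
// Hypothetical flags describing what went wrong with a request.
interface FailureFacts {
  malformed?: boolean; // body could not be parsed, e.g. invalid JSON
  unauthenticated?: boolean; // credentials missing or invalid
  forbidden?: boolean; // authenticated, but not allowed
  notFound?: boolean; // resource does not exist
  conflict?: boolean; // request conflicts with current state
  invalid?: boolean; // well-formed, but fails validation rules
  rateLimited?: boolean; // caller exceeded its quota
  upstream?: 'failed' | 'unavailable' | 'timeout';
}

// Walks the questions top to bottom and returns the first matching status code.
function chooseStatus(facts: FailureFacts): number {
  if (facts.malformed) return 400;
  if (facts.unauthenticated) return 401;
  if (facts.forbidden) return 403;
  if (facts.notFound) return 404;
  if (facts.conflict) return 409;
  if (facts.invalid) return 422;
  if (facts.rateLimited) return 429;
  if (facts.upstream === 'failed') return 502;
  if (facts.upstream === 'unavailable') return 503;
  if (facts.upstream === 'timeout') return 504;
  return 500; // nothing more specific applies: unexpected server error
}
```

Because the checks are ordered, a request that is both unauthenticated and aimed at a missing resource gets 401 rather than 404, which avoids revealing whether the resource exists.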
Common Status Code Mistakes
❌ Wrong: Using 200 OK with error in response body
HTTP/1.1 200 OK
{
"success": false,
"error": "User not found"
}
✅ Right: Using proper status code
HTTP/1.1 404 Not Found
{
"error": {
"code": "user_not_found",
"message": "User not found"
}
}
❌ Wrong: Using 500 for validation errors
HTTP/1.1 500 Internal Server Error
{
"error": "Email is required"
}
✅ Right: Using 400 or 422 for client errors
HTTP/1.1 422 Unprocessable Entity
{
"error": {
"code": "validation_error",
"message": "Email is required"
}
}
See our complete API error codes guide for detailed status code reference.
Server-Side Error Handling Patterns
Pattern 1: Centralized Error Handler
Create a single error handling middleware that processes all errors.
import { Request, Response, NextFunction } from 'express';
import { logger } from './logger';
interface AppError extends Error {
statusCode?: number;
code?: string;
details?: Record<string, any>;
}
export function errorHandler(
err: AppError,
req: Request,
res: Response,
next: NextFunction
) {
const statusCode = err.statusCode || 500;
const requestId = req.headers['x-request-id'] as string || generateRequestId();
// Log error with context
logger.error('Request failed', {
error: {
name: err.name,
message: err.message,
stack: err.stack,
code: err.code,
},
request: {
id: requestId,
method: req.method,
path: req.path,
query: req.query,
headers: sanitizeHeaders(req.headers),
},
statusCode,
});
// Don't expose internal errors to clients
const isInternalError = statusCode >= 500;
res.status(statusCode).json({
error: {
code: err.code || (isInternalError ? 'internal_error' : 'unknown_error'),
message: isInternalError
? 'An unexpected error occurred. Please try again later.'
: err.message,
...(err.details && { details: err.details }),
request_id: requestId,
...(process.env.NODE_ENV === 'development' && {
stack: err.stack,
}),
},
});
}
// Use in Express app
app.use(errorHandler);
Pattern 2: Custom Error Classes
Create specific error classes for different error scenarios.
export class APIError extends Error {
constructor(
public statusCode: number,
public code: string,
message: string,
public details?: Record<string, any>
) {
super(message);
this.name = 'APIError';
Error.captureStackTrace(this, this.constructor);
}
}
export class ValidationError extends APIError {
constructor(
message: string,
public fields: Array<{ field: string; message: string }>
) {
super(422, 'validation_error', message, { fields });
this.name = 'ValidationError';
}
}
export class AuthenticationError extends APIError {
constructor(message = 'Authentication required') {
super(401, 'authentication_required', message);
this.name = 'AuthenticationError';
}
}
export class AuthorizationError extends APIError {
constructor(message = 'Permission denied') {
super(403, 'permission_denied', message);
this.name = 'AuthorizationError';
}
}
export class NotFoundError extends APIError {
constructor(resource: string, id: string) {
super(404, 'resource_not_found', `${resource} not found`, {
resource,
id,
});
this.name = 'NotFoundError';
}
}
export class RateLimitError extends APIError {
constructor(
public limit: number,
public resetAt: Date
) {
super(429, 'rate_limit_exceeded', 'Rate limit exceeded', {
limit,
reset_at: resetAt.toISOString(),
});
this.name = 'RateLimitError';
}
}
// Usage
async function getUserById(id: string) {
const user = await db.user.findUnique({ where: { id } });
if (!user) {
throw new NotFoundError('User', id);
}
return user;
}
Pattern 3: Async Error Wrapper
Automatically catch async errors without try-catch everywhere.
type AsyncRequestHandler = (
req: Request,
res: Response,
next: NextFunction
) => Promise<any>;
export function asyncHandler(fn: AsyncRequestHandler) {
return (req: Request, res: Response, next: NextFunction) => {
Promise.resolve(fn(req, res, next)).catch(next);
};
}
// Usage - no try-catch needed!
app.get('/users/:id', asyncHandler(async (req, res) => {
const user = await getUserById(req.params.id); // Throws NotFoundError if not found
res.json({ user });
}));
Pattern 4: Result Type (Railway-Oriented Programming)
Use Result types to make errors explicit in function signatures.
type Result<T, E = Error> =
| { success: true; value: T }
| { success: false; error: E };
export async function getUserSafe(id: string): Promise<Result<User, APIError>> {
try {
const user = await db.user.findUnique({ where: { id } });
if (!user) {
return {
success: false,
error: new NotFoundError('User', id),
};
}
return { success: true, value: user };
} catch (error) {
return {
success: false,
error: new APIError(500, 'database_error', 'Database query failed'),
};
}
}
// Usage - explicitly handle success/failure
const result = await getUserSafe(userId);
if (result.success) {
console.log('User:', result.value);
} else {
console.error('Error:', result.error);
}
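The "railway" part of this pattern is chaining: each step runs only if the previous one succeeded, and the first failure is carried through to the end. A minimal sketch, assuming the same `Result` shape as above; `ok`, `err`, `andThen`, `parseAmount`, and `checkBalance` are hypothetical helpers, not from any library.

```typescript
type Result<T, E = Error> =
  | { success: true; value: T }
  | { success: false; error: E };

function ok<T>(value: T): Result<T, never> {
  return { success: true, value };
}

function err<E>(error: E): Result<never, E> {
  return { success: false, error };
}

// Runs `next` only when `result` is a success; otherwise passes the failure through unchanged.
function andThen<T, U, E>(
  result: Result<T, E>,
  next: (value: T) => Result<U, E>,
): Result<U, E> {
  return result.success ? next(result.value) : result;
}

// Step 1: parse user input into a positive amount.
function parseAmount(input: string): Result<number, string> {
  const amount = Number(input);
  return Number.isFinite(amount) && amount > 0 ? ok(amount) : err('invalid_amount');
}

// Step 2: check the amount against the available balance.
function checkBalance(available: number) {
  return (amount: number): Result<number, string> =>
    amount <= available ? ok(amount) : err('insufficient_balance');
}

// The second step is skipped automatically if parsing failed.
const outcome = andThen(parseAmount('50'), checkBalance(100));
```

The error codes here are plain strings to keep the sketch short; in the API above they would be `APIError` instances.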
Client-Side Error Handling Strategies
Pattern 1: Axios Interceptor
Centralize error handling for all HTTP requests.
import axios, { AxiosError } from 'axios';
const apiClient = axios.create({
baseURL: 'https://api.example.com',
timeout: 10000,
});
// Request interceptor - add auth token
apiClient.interceptors.request.use(
(config) => {
const token = getAuthToken();
if (token) {
config.headers.Authorization = `Bearer ${token}`;
}
return config;
},
(error) => Promise.reject(error)
);
// Response interceptor - handle errors globally
apiClient.interceptors.response.use(
(response) => response,
async (error: AxiosError<APIError>) => {
const status = error.response?.status;
const errorData = error.response?.data;
// Token expired - refresh and retry
if (status === 401 && errorData?.error.code === 'token_expired') {
try {
const newToken = await refreshAuthToken();
setAuthToken(newToken);
// Retry original request with new token
error.config!.headers.Authorization = `Bearer ${newToken}`;
return apiClient.request(error.config!);
} catch (refreshError) {
// Refresh failed - redirect to login
redirectToLogin();
return Promise.reject(refreshError);
}
}
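// Rate limit - wait until the reset time, then retry
if (status === 429) {
const resetAt = errorData?.error.details?.reset_at;
const untilReset = resetAt ? new Date(resetAt).getTime() - Date.now() : NaN;
// Fall back to 1 minute if reset_at is missing or unparseable; never wait a negative time
const waitMs = Number.isFinite(untilReset) ? Math.max(0, untilReset) : 60000;
await new Promise(resolve => setTimeout(resolve, waitMs));
return apiClient.request(error.config!);
}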
// Server error - show generic message
if (status && status >= 500) {
showToast('Server error. Please try again later.', 'error');
return Promise.reject(error);
}
// Client error - let component handle it
return Promise.reject(error);
}
);
export default apiClient;
Pattern 2: React Error Boundary
Catch rendering errors in React components.
import React from 'react';
import { logger } from './logger';
interface ErrorBoundaryProps {
children: React.ReactNode;
fallback?: React.ReactNode;
}
interface ErrorBoundaryState {
hasError: boolean;
error: Error | null;
}
export class ErrorBoundary extends React.Component<
ErrorBoundaryProps,
ErrorBoundaryState
> {
constructor(props: ErrorBoundaryProps) {
super(props);
this.state = { hasError: false, error: null };
}
static getDerivedStateFromError(error: Error): ErrorBoundaryState {
return { hasError: true, error };
}
componentDidCatch(error: Error, errorInfo: React.ErrorInfo) {
logger.error('React component error', {
error: {
message: error.message,
stack: error.stack,
},
errorInfo,
});
}
render() {
if (this.state.hasError) {
return this.props.fallback || (
<div className="error-container">
<h1>Something went wrong</h1>
<p>We've been notified and are working on a fix.</p>
<button onClick={() => window.location.reload()}>
Reload page
</button>
</div>
);
}
return this.props.children;
}
}
// Usage
<ErrorBoundary fallback={<ErrorFallback />}>
<App />
</ErrorBoundary>
Pattern 3: Type-Safe API Client
Use TypeScript to enforce error handling.
import { z } from 'zod';
const UserSchema = z.object({
id: z.string(),
email: z.string().email(),
name: z.string(),
});
type User = z.infer<typeof UserSchema>;
const ErrorSchema = z.object({
error: z.object({
code: z.string(),
message: z.string(),
details: z.record(z.string(), z.any()).optional(),
}),
});
export async function fetchUser(id: string): Promise<User> {
const response = await fetch(`https://api.example.com/users/${id}`);
if (!response.ok) {
const errorData = await response.json();
const parsedError = ErrorSchema.parse(errorData);
throw new APIError(
response.status,
parsedError.error.code,
parsedError.error.message,
parsedError.error.details
);
}
const data = await response.json();
return UserSchema.parse(data); // Validates response shape
}
// Usage with error handling
try {
const user = await fetchUser('123');
console.log('User:', user);
} catch (error) {
if (error instanceof APIError) {
if (error.statusCode === 404) {
console.log('User not found');
} else if (error.statusCode === 429) {
console.log('Rate limited');
} else {
console.error('API error:', error.message);
}
} else {
console.error('Unexpected error:', error);
}
}
Retry Logic & Exponential Backoff
Not all errors are permanent. Transient failures (network hiccups, temporary overload) should be retried.
When to Retry
Retry these errors:
- Network timeouts (ETIMEDOUT, ECONNRESET)
- 503 Service Unavailable (temporary overload)
- 429 Too Many Requests (rate limit, with backoff)
- 502 Bad Gateway (upstream temporary failure)
- 504 Gateway Timeout (upstream timeout)
Don't retry these:
- 400 Bad Request (malformed request won't fix itself)
- 401 Unauthorized (auth token won't spontaneously appear)
- 403 Forbidden (permissions won't change mid-request)
- 404 Not Found (resource doesn't exist)
- 422 Unprocessable Entity (validation errors won't fix themselves)
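The two lists above can be condensed into one predicate. This is a sketch under stated assumptions: the `FailureInfo` shape and the `isRetryable` name are made up for the example, and the network codes are the Node.js-style codes named in the list.

```typescript
// Statuses and network error codes taken from the "retry these" list above.
const RETRYABLE_STATUSES = new Set([429, 502, 503, 504]);
const RETRYABLE_NETWORK_CODES = new Set(['ETIMEDOUT', 'ECONNRESET']);

// Hypothetical description of a failed call.
interface FailureInfo {
  status?: number; // HTTP status, if a response was received
  code?: string; // network error code, if no response was received
}

// Returns true only for failures that may succeed on a later attempt.
function isRetryable(failure: FailureInfo): boolean {
  if (failure.status !== undefined) {
    return RETRYABLE_STATUSES.has(failure.status);
  }
  return failure.code !== undefined && RETRYABLE_NETWORK_CODES.has(failure.code);
}
```

Anything not explicitly listed is treated as permanent, which is the safer default for requests that are not idempotent.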
Exponential Backoff Implementation
interface RetryConfig {
maxRetries: number;
baseDelayMs: number;
maxDelayMs: number;
retryableStatuses: number[];
}
const defaultRetryConfig: RetryConfig = {
maxRetries: 3,
baseDelayMs: 1000, // 1 second
maxDelayMs: 32000, // 32 seconds
retryableStatuses: [408, 429, 502, 503, 504],
};
function calculateBackoff(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
// Exponential backoff: 2^attempt * baseDelay
const exponentialDelay = Math.pow(2, attempt) * baseDelayMs;
// Add jitter (random 0-20% variation) to prevent thundering herd
const jitter = exponentialDelay * 0.2 * Math.random();
// Cap at maxDelay
return Math.min(exponentialDelay + jitter, maxDelayMs);
}
async function fetchWithRetry<T>(
fn: () => Promise<T>,
config: Partial<RetryConfig> = {}
): Promise<T> {
const cfg = { ...defaultRetryConfig, ...config };
let lastError: Error;
for (let attempt = 0; attempt <= cfg.maxRetries; attempt++) {
try {
return await fn();
} catch (error: any) {
lastError = error;
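const status = error.response?.status;
const isLastAttempt = attempt === cfg.maxRetries;
// Network failures (no response at all) carry an error code instead of a status
const isNetworkError = !error.response && ['ETIMEDOUT', 'ECONNRESET'].includes(error.code);
const isRetryable = isNetworkError || (status && cfg.retryableStatuses.includes(status));
// Don't retry if not retryable or last attempt
if (!isRetryable || isLastAttempt) {
throw error;
}
// Special handling for 429 (rate limit): honor the server's hints when present
if (status === 429) {
const retryAfterHeader = error.response?.headers['retry-after'];
const resetHeader = error.response?.headers['x-ratelimit-reset'];
// Retry-After as a number of seconds (the HTTP-date form is not handled here)
const retryAfterMs = Number(retryAfterHeader) * 1000;
// X-RateLimit-Reset as Unix epoch seconds, the format GitHub uses
const untilResetMs = Number(resetHeader) * 1000 - Date.now();
const waitMs = retryAfterHeader && Number.isFinite(retryAfterMs)
? retryAfterMs
: resetHeader && Number.isFinite(untilResetMs)
? untilResetMs
: null;
if (waitMs !== null) {
await new Promise(resolve => setTimeout(resolve, Math.max(0, waitMs)));
continue;
}
}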
// Calculate backoff delay
const delayMs = calculateBackoff(attempt, cfg.baseDelayMs, cfg.maxDelayMs);
console.log(`Retry attempt ${attempt + 1}/${cfg.maxRetries} after ${delayMs}ms`);
await new Promise(resolve => setTimeout(resolve, delayMs));
}
}
throw lastError!;
}
// Usage
const user = await fetchWithRetry(
() => apiClient.get<User>('/users/123'),
{ maxRetries: 5 }
);
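The `Retry-After` header can carry either a number of seconds or an HTTP date, so parsing it deserves its own helper. A minimal sketch: `retryAfterToMs` is a hypothetical name, and the current time is passed in as a parameter so the function stays deterministic and easy to test.

```typescript
// Converts a Retry-After header value into a wait time in milliseconds.
// Returns null when the value is neither delay-seconds nor a parseable date.
function retryAfterToMs(value: string, nowMs: number): number | null {
  const trimmed = value.trim();

  // Form 1: delay in whole seconds, e.g. "120"
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) {
    return null;
  }

  // A date in the past means "retry now", never a negative wait.
  return Math.max(0, dateMs - nowMs);
}
```

In a retry loop, a `null` result would fall through to the regular exponential backoff delay.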
Using Libraries
Axios Retry:
import axios from 'axios';
import axiosRetry from 'axios-retry';
const apiClient = axios.create({
baseURL: 'https://api.example.com',
});
axiosRetry(apiClient, {
retries: 3,
retryDelay: axiosRetry.exponentialDelay,
retryCondition: (error) => {
return axiosRetry.isNetworkOrIdempotentRequestError(error) ||
[429, 502, 503, 504].includes(error.response?.status || 0);
},
onRetry: (retryCount, error, requestConfig) => {
console.log(`Retrying request (attempt ${retryCount}):`, requestConfig.url);
},
});
p-retry:
import pRetry from 'p-retry';
const user = await pRetry(
async () => {
const response = await fetch('https://api.example.com/users/123');
if (!response.ok) {
const error: any = new Error('Request failed');
error.statusCode = response.status;
throw error;
}
return response.json();
},
{
retries: 5,
onFailedAttempt: (error) => {
console.log(
`Attempt ${error.attemptNumber} failed. ${error.retriesLeft} retries left.`
);
},
}
);
See our API rate limiting guide for rate limit best practices.
Error Logging & Monitoring
You can't fix errors you don't know about. Comprehensive logging is essential.
What to Log
For every error, log:
- Timestamp (ISO 8601 format)
- Error details (name, message, stack trace)
- Request context (method, path, query params, headers)
- User context (user ID, session ID, IP address)
- Response (status code, response time)
- Environment (Node version, deployment version)
import winston from 'winston';
const logger = winston.createLogger({
level: 'info',
format: winston.format.combine(
winston.format.timestamp(),
winston.format.errors({ stack: true }),
winston.format.json()
),
transports: [
new winston.transports.File({ filename: 'error.log', level: 'error' }),
new winston.transports.File({ filename: 'combined.log' }),
],
});
// Add console in development
if (process.env.NODE_ENV !== 'production') {
logger.add(new winston.transports.Console({
format: winston.format.simple(),
}));
}
export function logError(error: Error, context: Record<string, any>) {
logger.error('Request failed', {
error: {
name: error.name,
message: error.message,
stack: error.stack,
},
...context,
});
}
Structured Logging
Use structured logs (JSON) for easier searching and filtering.
// ❌ Bad: String interpolation
logger.error(`User ${userId} failed to create order ${orderId}: ${error.message}`);
// ✅ Good: Structured data
logger.error('Order creation failed', {
error: {
message: error.message,
stack: error.stack,
},
user_id: userId,
order_id: orderId,
action: 'create_order',
});
Integration with Error Tracking Services
Sentry:
import * as Sentry from '@sentry/node';
Sentry.init({
dsn: process.env.SENTRY_DSN,
environment: process.env.NODE_ENV,
tracesSampleRate: 1.0,
});
export function errorHandler(
err: AppError,
req: Request,
res: Response,
next: NextFunction
) {
// Capture error in Sentry
Sentry.captureException(err, {
tags: {
endpoint: req.path,
method: req.method,
},
extra: {
request_id: req.headers['x-request-id'],
user_id: req.user?.id,
query: req.query,
// req.body is deliberately omitted: it can contain passwords or card data
},
});
// Send error response
res.status(err.statusCode || 500).json({
error: {
code: err.code,
message: err.message,
},
});
}
Datadog:
import { datadogLogs } from '@datadog/browser-logs';
datadogLogs.init({
clientToken: process.env.DD_CLIENT_TOKEN!,
site: 'datadoghq.com',
forwardErrorsToLogs: true,
sessionSampleRate: 100,
});
// Log error
datadogLogs.logger.error('API request failed', {
error: error.message,
status: error.statusCode,
endpoint: '/users/123',
user_id: 'user_abc',
});
Setting Up Alerts
Create alerts for critical error patterns:
Alert on error rate spike:
Error rate > 5% of total requests in last 5 minutes
→ Page DevOps team
Alert on specific errors:
Database connection errors > 3 in last 1 minute
→ Page database team
Alert on 5xx errors:
5xx errors > 10 in last 5 minutes
→ Alert in Slack #incidents channel
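Alert rules like these normally live in your monitoring tool, but the arithmetic behind the first one is simple enough to sketch. This example assumes that "error" means a 5xx response; `RequestSample`, `errorRate`, and `shouldPage` are hypothetical names.

```typescript
// One observed request: when it finished and which status it returned.
interface RequestSample {
  timestampMs: number;
  status: number;
}

// Fraction of requests inside the window (ending at nowMs) that returned 5xx.
function errorRate(samples: RequestSample[], nowMs: number, windowMs: number): number {
  const recent = samples.filter(
    (s) => s.timestampMs > nowMs - windowMs && s.timestampMs <= nowMs,
  );
  if (recent.length === 0) return 0; // no traffic, nothing to alert on
  const failures = recent.filter((s) => s.status >= 500).length;
  return failures / recent.length;
}

// Mirrors the rule "error rate > 5% of total requests in last 5 minutes".
function shouldPage(samples: RequestSample[], nowMs: number): boolean {
  const FIVE_MINUTES_MS = 5 * 60 * 1000;
  return errorRate(samples, nowMs, FIVE_MINUTES_MS) > 0.05;
}
```

A rate-based rule scales with traffic, while the count-based rules above ("> 10 in last 5 minutes") are better suited to errors that should almost never occur.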
Graceful Degradation
When dependencies fail, your API should degrade gracefully rather than crash.
Strategy 1: Fallback to Cached Data
import NodeCache from 'node-cache';
const cache = new NodeCache({ stdTTL: 3600 }); // 1 hour TTL
async function getUserWithFallback(id: string): Promise<User> {
const cacheKey = `user:${id}`;
try {
// Try primary database
const user = await db.user.findUnique({ where: { id } });
if (user) {
// Cache successful response
cache.set(cacheKey, user);
return user;
}
throw new NotFoundError('User', id);
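} catch (error) {
// A missing user is a definitive answer, not an outage: never mask it with stale data
if (error instanceof NotFoundError) {
throw error;
}
// For any other failure, fall back to the cache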
const cachedUser = cache.get<User>(cacheKey);
if (cachedUser) {
console.warn('Returning stale user data from cache', { user_id: id });
return {
...cachedUser,
_stale: true, // Flag as potentially outdated
} as User;
}
// No cache available - propagate error
throw error;
}
}
Strategy 2: Feature Flags
Disable non-essential features when dependencies fail.
import { getFeatureFlag } from './feature-flags';
async function createOrder(orderData: CreateOrderInput): Promise<Order> {
// Core functionality - always runs
const order = await db.order.create({ data: orderData });
// Non-essential features - can be disabled
try {
if (getFeatureFlag('email_notifications')) {
await sendOrderConfirmationEmail(order);
}
} catch (error) {
// Log but don't fail the request
logger.error('Failed to send order email', { error, order_id: order.id });
}
try {
if (getFeatureFlag('analytics_tracking')) {
await trackOrderCreated(order);
}
} catch (error) {
logger.error('Failed to track order', { error, order_id: order.id });
}
return order;
}
Strategy 3: Circuit Breaker Pattern
Prevent repeated calls to failing dependencies. See our complete circuit breaker pattern guide.
import CircuitBreaker from 'opossum';
const paymentBreaker = new CircuitBreaker(processPayment, {
timeout: 3000, // 3 seconds
errorThresholdPercentage: 50, // Open after 50% failures
resetTimeout: 30000, // Try again after 30 seconds
});
// Fallback to payment queue when circuit opens
paymentBreaker.fallback((paymentData) => {
logger.warn('Payment service unavailable, queueing payment', {
payment_id: paymentData.id,
});
return queuePaymentForLater(paymentData);
});
// Use the circuit breaker
const result = await paymentBreaker.fire(paymentData);
Strategy 4: Partial Response
Return partial data when some dependencies fail.
interface DashboardData {
user: User;
orders: Order[] | null;
analytics: Analytics | null;
recommendations: Product[] | null;
errors?: string[];
}
async function getDashboard(userId: string): Promise<DashboardData> {
const errors: string[] = [];
// Required data - fail if missing
const user = await getUserById(userId);
// Optional data - catch errors and continue
let orders: Order[] | null = null;
try {
orders = await getOrders(userId);
} catch (error) {
logger.error('Failed to fetch orders', { error, user_id: userId });
errors.push('orders_unavailable');
}
let analytics: Analytics | null = null;
try {
analytics = await getAnalytics(userId);
} catch (error) {
logger.error('Failed to fetch analytics', { error, user_id: userId });
errors.push('analytics_unavailable');
}
let recommendations: Product[] | null = null;
try {
recommendations = await getRecommendations(userId);
} catch (error) {
logger.error('Failed to fetch recommendations', { error, user_id: userId });
errors.push('recommendations_unavailable');
}
return {
user,
orders,
analytics,
recommendations,
...(errors.length > 0 && { errors }),
};
}
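The version above awaits each optional source in turn, so the slowest path is the sum of all three calls. One possible variation, sketched here with `Promise.allSettled`, starts the optional calls together and then sorts results from failures. `valueOrNull` and `loadOptionalSections` are hypothetical names, and the two loaders stand in for functions like `getOrders` and `getAnalytics`.

```typescript
// Unwraps one settled result: the value on success, or null plus an error marker on failure.
function valueOrNull<T>(
  result: PromiseSettledResult<T>,
  label: string,
  errors: string[],
): T | null {
  if (result.status === 'fulfilled') {
    return result.value;
  }
  errors.push(`${label}_unavailable`);
  return null;
}

// Starts both optional loaders at once; a failure in one never rejects the whole call.
async function loadOptionalSections<A, B>(
  loadOrders: () => Promise<A>,
  loadAnalytics: () => Promise<B>,
) {
  const errors: string[] = [];
  const [ordersResult, analyticsResult] = await Promise.allSettled([
    loadOrders(),
    loadAnalytics(),
  ]);
  return {
    orders: valueOrNull(ordersResult, 'orders', errors),
    analytics: valueOrNull(analyticsResult, 'analytics', errors),
    errors,
  };
}
```

Required data, such as the user record, would still be awaited separately so that its failure rejects the request as before.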
Security Considerations
Error messages can leak sensitive information. Balance helpful debugging with security.
What NOT to Expose
❌ Stack traces in production
{
"error": "Error: Invalid API key\n at validateApiKey (/app/auth.ts:42:11)\n at processRequest (/app/server.ts:89:5)"
}
❌ Internal file paths
{
"error": "ENOENT: no file exists at /var/www/app/uploads/user-123/document.pdf"
}
❌ Database queries
{
"error": "SQL Error: SELECT * FROM users WHERE email = 'attacker@example.com' AND password = 'hunter2'"
}
❌ API keys or tokens
{
"error": "Stripe API key sk_live_abc123 is invalid"
}
❌ Specific software versions
{
"error": "MySQL 5.7.22 connection failed"
}
Safe Error Responses
✅ Production-safe error messages:
function sanitizeError(error: Error, isProduction: boolean) {
  const requestId = generateRequestId();
  // Errors raised deliberately (the APIError class from Pattern 2) carry a safe code and message
  if (error instanceof APIError) {
    return {
      error: {
        code: error.code,
        message: error.message,
        type: error.statusCode >= 500 ? 'server_error' : 'client_error',
        request_id: requestId,
      },
    };
  }
  // Anything else is unexpected: never expose its message or stack in production
  return {
    error: {
      code: 'internal_error',
      message: isProduction ? 'An unexpected error occurred' : error.message,
      type: 'server_error',
      request_id: requestId,
      ...(!isProduction && { stack: error.stack }),
    },
  };
}
Sanitizing Headers
Remove sensitive headers from logs:
function sanitizeHeaders(headers: Record<string, any>): Record<string, any> {
const sanitized = { ...headers };
// Remove sensitive headers
const sensitiveHeaders = [
'authorization',
'cookie',
'x-api-key',
'x-auth-token',
];
sensitiveHeaders.forEach(header => {
if (sanitized[header]) {
sanitized[header] = '[REDACTED]';
}
});
return sanitized;
}
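Headers are not the only place credentials end up. Error messages and log lines can carry them too, as in the Stripe key example above. The following is a minimal sketch of redacting secrets from free text before it is logged or returned. The `redactSecrets` name and the two patterns are assumptions for illustration; adjust them to the credential formats your services actually use.

```typescript
// Hypothetical patterns: extend this list for the credentials your system handles.
const SECRET_PATTERNS: RegExp[] = [
  /\b(?:sk|pk|rk)_(?:live|test)_[A-Za-z0-9]+/g, // Stripe-style API keys
  /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/gi, // bearer tokens
];

// Replaces every match of every pattern with a fixed placeholder.
function redactSecrets(text: string): string {
  return SECRET_PATTERNS.reduce(
    (result, pattern) => result.replace(pattern, '[REDACTED]'),
    text,
  );
}
```

Pattern-based redaction is a safety net, not a guarantee: it only catches formats you have listed, so it works best alongside the header sanitizer rather than instead of it.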
Rate Limiting Error Endpoints
Prevent attackers from using errors to probe your API:
import rateLimit from 'express-rate-limit';
const errorRateLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // Limit each IP to 100 error responses per window
message: {
error: {
code: 'too_many_errors',
message: 'Too many failed requests. Please try again later.',
},
},
// Count only failed requests (status >= 400) toward the limit
skipSuccessfulRequests: true,
});
app.use(errorRateLimiter);
Real-World Examples
Stripe's Error Handling
Stripe is known for excellent error messages that help developers fix issues quickly.
Example: Declined card error
{
"error": {
"type": "card_error",
"code": "card_declined",
"decline_code": "insufficient_funds",
"message": "Your card has insufficient funds.",
"param": "source",
"charge": "ch_abc123"
}
}
What makes it good:
- Specific error type (card_error vs api_error)
- Machine-readable code (card_declined)
- Human-readable message ("Your card has insufficient funds")
- Actionable information (which parameter caused the error)
- Context (charge ID for debugging)
Check Stripe's API status on APIStatusCheck.
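A typed `type` field like this lets client code branch without parsing message text. The sketch below models only the fields shown in the example response. The `NextStep` values and the mapping from error type to action are assumptions chosen for illustration, not Stripe's recommended policy.

```typescript
// Shape modelled on the example response above; optional fields may be absent.
interface PaymentErrorBody {
  error: {
    type: string;
    code?: string;
    decline_code?: string;
    message: string;
    param?: string;
  };
}

// Hypothetical set of actions a checkout UI might take.
type NextStep = 'ask_for_new_card' | 'retry_later' | 'fix_request';

function nextStepFor(body: PaymentErrorBody): NextStep {
  switch (body.error.type) {
    case 'card_error':
      // The card itself was rejected: show the message and ask for another card.
      return 'ask_for_new_card';
    case 'api_error':
      // Problem on the provider's side: the same request may succeed later.
      return 'retry_later';
    default:
      // Anything else is treated as a problem with the request we sent.
      return 'fix_request';
  }
}
```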
GitHub's Rate Limit Error
GitHub provides detailed information about rate limits in headers and error responses.
Response headers:
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1614556800
X-RateLimit-Used: 5000
X-RateLimit-Resource: core
Error response:
{
"message": "API rate limit exceeded for user ID 12345.",
"documentation_url": "https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"
}
What makes it good:
- Headers provide all rate limit info (limit, remaining, reset time)
- Clear error message with user ID
- Link to documentation for more information
- Unix timestamp for exact reset time
Check GitHub's API status on APIStatusCheck.
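Because the reset time is a Unix timestamp in seconds, a client can compute exactly how long to wait instead of guessing. A minimal sketch, assuming the header names have already been lowercased (as Node does for incoming headers); `msUntilReset` is a hypothetical helper name.

```typescript
// Returns how many milliseconds to wait before retrying, or null when the
// response does not indicate an exhausted rate limit.
function msUntilReset(
  headers: Record<string, string>,
  nowMs: number,
): number | null {
  if (headers['x-ratelimit-remaining'] !== '0') return null;

  // X-RateLimit-Reset is in seconds since the Unix epoch.
  const resetSeconds = Number(headers['x-ratelimit-reset']);
  if (!Number.isFinite(resetSeconds)) return null;

  // Clamp to zero in case the reset time has already passed.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

Passing the current time in as a parameter (for example `Date.now()`) keeps the function deterministic and easy to test.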
AWS Error Response
AWS provides structured error responses with request IDs for support.
{
"Error": {
"Code": "InvalidParameterValue",
"Message": "Value (us-east-1x) for parameter AvailabilityZone is invalid. Subnets can currently only be created in the following zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1e, us-east-1f.",
"Type": "Sender"
},
"RequestId": "abc-123-def-456"
}
What makes it good:
- Specific error code (InvalidParameterValue)
- Detailed message explaining what's wrong and what's valid
- Error type (Sender = client error, Receiver = server error)
- Request ID for support cases
Check AWS's service status on APIStatusCheck.
Twilio's Error Response
Twilio includes error codes and links to detailed documentation.
{
"code": 21211,
"message": "The 'To' number +1234567890 is not a valid phone number.",
"more_info": "https://www.twilio.com/docs/errors/21211",
"status": 400
}
What makes it good:
- Numeric error code for easy reference
- Specific message with the problematic value
- Documentation link with more details
- HTTP status code included in body
Check Twilio's API status on APIStatusCheck.
Common Mistakes to Avoid
Mistake 1: Using 200 OK for Errors
❌ Wrong:
app.post('/users', async (req, res) => {
const result = await createUser(req.body);
if (!result.success) {
return res.status(200).json({
success: false,
error: result.error,
});
}
res.status(200).json({ success: true, user: result.user });
});
✅ Right:
app.post('/users', async (req, res, next) => {
try {
const user = await createUser(req.body);
res.status(201).json({ user });
} catch (error) {
if (error instanceof ValidationError) {
return res.status(422).json({
error: {
code: 'validation_error',
message: error.message,
details: error.fields,
},
});
}
// Hand unexpected errors to the centralized error handler
next(error);
}
});
Mistake 2: Swallowing Errors Silently
❌ Wrong:
async function processOrder(orderId: string) {
try {
await sendConfirmationEmail(orderId);
} catch (error) {
// Silent failure - no one knows this failed!
}
}
✅ Right:
async function processOrder(orderId: string) {
try {
await sendConfirmationEmail(orderId);
} catch (error) {
// Log error but don't fail order processing
logger.error('Failed to send confirmation email', {
error,
order_id: orderId,
});
// Optionally queue for retry
await queueEmailForRetry(orderId);
}
}
Mistake 3: Generic Error Messages
❌ Wrong:
{
"error": "Invalid input"
}
✅ Right:
{
"error": {
"code": "validation_error",
"message": "Request validation failed",
"details": {
"fields": [
{
"field": "email",
"message": "Email address is required",
"code": "required"
},
{
"field": "password",
"message": "Password must be at least 8 characters",
"code": "min_length"
}
]
}
}
}
Mistake 4: Exposing Internal Details
❌ Wrong:
{
"error": "MongoError: connection refused to mongodb://internal-db-01:27017/production"
}
✅ Right:
{
"error": {
"code": "database_error",
"message": "A database error occurred. Please try again later.",
"request_id": "req_abc123"
}
}
Mistake 5: No Request ID
❌ Wrong:
{
"error": {
"code": "internal_error",
"message": "Something went wrong"
}
}
✅ Right:
{
"error": {
"code": "internal_error",
"message": "An unexpected error occurred",
"request_id": "req_7h3j8d9f2k"
}
}
Why request IDs matter:
- Support team can look up the exact request in logs
- Developers can correlate frontend errors with backend logs
- Easier debugging in distributed systems
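One way to get these benefits is to reuse an ID supplied by the caller or an upstream proxy and mint a new one only when none is present. The sketch below assumes an `x-request-id` header convention and a `req_` prefix matching the examples above; `newRequestId` and `resolveRequestId` are hypothetical helper names.

```typescript
// Generates an ID such as "req_7h3j8d9f2k". Math.random is not
// cryptographically strong; a UUID is a common alternative in production.
function newRequestId(random: () => number = Math.random): string {
  let suffix = '';
  while (suffix.length < 10) {
    suffix += Math.floor(random() * 36).toString(36);
  }
  return `req_${suffix}`;
}

// Reuses a well-formed incoming ID so one request keeps the same ID across
// services; otherwise creates a fresh one.
function resolveRequestId(headers: Record<string, string | undefined>): string {
  const incoming = headers['x-request-id'];
  // Accept only short, simple IDs to avoid injecting arbitrary text into logs.
  return incoming && /^[\w-]{1,64}$/.test(incoming) ? incoming : newRequestId();
}
```

The resolved ID would then be attached to every log line for the request and echoed in both the error body and a response header.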
Mistake 6: Inconsistent Error Format
❌ Wrong: Different error formats across endpoints
// Endpoint 1
{ "error": "User not found" }
// Endpoint 2
{ "message": "Invalid request", "code": 400 }
// Endpoint 3
{ "errors": ["Email required", "Password too short"] }
✅ Right: Consistent format everywhere
{
"error": {
"code": "user_not_found",
"message": "User not found"
}
}
{
"error": {
"code": "validation_error",
"message": "Request validation failed",
"details": {
"fields": [
{ "field": "email", "message": "Email required" },
{ "field": "password", "message": "Password too short" }
]
}
}
}
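Consistency is easier to keep when every endpoint builds errors through one helper instead of hand-writing JSON. A minimal sketch of such a helper for the format shown above; the `ErrorEnvelope` type and `buildError` function are hypothetical names, not part of any library.

```typescript
interface FieldError {
  field: string;
  message: string;
  code?: string;
}

// The single error shape every endpoint returns.
interface ErrorEnvelope {
  error: {
    code: string;
    message: string;
    request_id?: string;
    details?: { fields: FieldError[] };
  };
}

function buildError(
  code: string,
  message: string,
  options: { requestId?: string; fields?: FieldError[] } = {},
): ErrorEnvelope {
  const envelope: ErrorEnvelope = { error: { code, message } };
  // Optional parts are added only when present, so simple errors stay small.
  if (options.requestId) envelope.error.request_id = options.requestId;
  if (options.fields?.length) envelope.error.details = { fields: options.fields };
  return envelope;
}
```

With this in place, `buildError('user_not_found', 'User not found')` and a validation error with field details share the same top-level structure by construction.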
Mistake 7: Not Differentiating Error Types
❌ Wrong: Using 500 for everything
app.post('/users', async (req, res) => {
try {
const user = await createUser(req.body);
res.json({ user });
} catch (error) {
res.status(500).json({ error: error.message });
}
});
✅ Right: Use appropriate status codes
app.post('/users', async (req, res) => {
try {
const user = await createUser(req.body);
res.status(201).json({ user });
} catch (error) {
if (error instanceof ValidationError) {
return res.status(422).json({ error: formatValidationError(error) });
}
if (error instanceof DuplicateEmailError) {
return res.status(409).json({ error: formatDuplicateError(error) });
}
// Unknown error - 500
res.status(500).json({ error: formatInternalError(error) });
}
});
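As the number of error types grows, the chain of `instanceof` checks can move out of each handler and into the error classes themselves. The sketch below assumes a hypothetical `AppError` base class that carries its own status and code; the `ValidationError` and `DuplicateEmailError` definitions here are illustrative stand-ins for the classes used above.

```typescript
// Base class for errors that are safe to describe to the client.
class AppError extends Error {
  constructor(
    message: string,
    readonly status: number,
    readonly code: string,
  ) {
    super(message);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, 422, 'validation_error');
  }
}

class DuplicateEmailError extends AppError {
  constructor(message: string) {
    super(message, 409, 'duplicate_email');
  }
}

interface HttpError {
  status: number;
  body: { error: { code: string; message: string } };
}

// Known errors map to their own status; everything else becomes a generic 500
// whose message never includes the original error text.
function toHttpError(error: unknown): HttpError {
  if (error instanceof AppError) {
    return {
      status: error.status,
      body: { error: { code: error.code, message: error.message } },
    };
  }
  return {
    status: 500,
    body: { error: { code: 'internal_error', message: 'An unexpected error occurred' } },
  };
}
```

A handler's catch block then reduces to computing `toHttpError(error)` and sending `status` and `body`, and adding a new error type does not require touching any route.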
Mistake 8: Retrying Non-Idempotent Operations
❌ Wrong: Retrying POST without idempotency
// This could create duplicate orders!
await fetchWithRetry(() =>
apiClient.post('/orders', orderData)
);
✅ Right: Use idempotency keys for retries
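import { v4 as uuidv4 } from 'uuid';
// Generate the key once, outside the retry loop, so every attempt sends the same key
const idempotencyKey = uuidv4();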
await fetchWithRetry(() =>
apiClient.post('/orders', orderData, {
headers: {
'Idempotency-Key': idempotencyKey,
},
})
);
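The key only helps if the server honors it. A minimal in-memory sketch of the server side, assuming a hypothetical `IdempotencyStore` class: the first request with a given key runs the handler, and any retry with the same key gets the stored response back instead of creating a second order.

```typescript
interface StoredResponse {
  status: number;
  body: unknown;
}

class IdempotencyStore {
  private readonly responses = new Map<string, StoredResponse>();

  // Runs `handler` at most once per key; later calls replay the first result.
  execute(key: string, handler: () => StoredResponse): StoredResponse {
    const cached = this.responses.get(key);
    if (cached) return cached;

    const response = handler();
    this.responses.set(key, response);
    return response;
  }
}
```

This sketch leaves out what a production version would likely need: shared storage such as a database or cache, expiry of old keys, protection against two concurrent requests with the same key, and a check that a reused key comes with the same request body.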
Production Readiness Checklist
Use this checklist before shipping your API to production:
Error Response Design
- All error responses follow a consistent format
- Error responses include machine-readable code field
- Error responses include human-readable message field
- Request IDs are included in all error responses
- Documentation links are included for common errors
- Field-level validation errors are returned for 422 responses
HTTP Status Codes
- Appropriate status codes are used (not just 200/500)
- 4xx codes are used for client errors
- 5xx codes are used for server errors
- 429 responses include Retry-After or X-RateLimit-Reset headers
- 503 responses include Retry-After header
Error Handling Implementation
- Centralized error handler middleware is implemented
- Custom error classes are used for different error types
- Async errors are caught and handled properly
- Database errors are caught and sanitized
- Third-party API errors are wrapped and normalized
Retry Logic
- Retry logic is implemented for transient failures
- Exponential backoff with jitter is used
- Maximum retry limit is set (typically 3-5)
- Only idempotent operations are retried automatically
- Circuit breaker pattern is used for failing dependencies
Logging & Monitoring
- All errors are logged with full context
- Structured logging (JSON) is used
- Error tracking service (Sentry, Rollbar) is integrated
- Alerts are set up for error rate spikes
- Alerts are set up for specific critical errors
- Request IDs link frontend and backend logs
Security
- Stack traces are not exposed in production
- Internal file paths are not exposed
- Database queries are not exposed
- API keys/tokens are redacted from logs
- Sensitive headers are sanitized before logging
- Rate limiting is applied to error endpoints
Graceful Degradation
- Non-essential features can be disabled via feature flags
- Fallback to cached data when dependencies fail
- Partial responses are returned when some data is unavailable
- Circuit breakers protect against cascade failures
Documentation
- Error codes are documented with examples
- Common error scenarios have troubleshooting guides
- Support contact information is provided
- Rate limit policies are documented
- Retry guidelines are documented
Testing
- Unit tests cover error handling logic
- Integration tests verify error responses
- Load tests verify error handling under stress
- Chaos engineering tests verify graceful degradation
Summary
Effective API error handling requires:
- Consistent error response format across all endpoints
- Proper HTTP status codes that reflect the error type
- Actionable error messages that help developers fix issues
- Retry logic with exponential backoff for transient failures
- Comprehensive error logging for debugging and monitoring
- Graceful degradation when dependencies fail
- Security-conscious error messages that don't leak sensitive information
The APIs developers love—Stripe, GitHub, Twilio—all excel at error handling. They make it easy to understand what went wrong and how to fix it. By following the patterns in this guide, you can build APIs that developers trust and enjoy working with.
Related Guides
- API Error Codes Explained - Complete HTTP status code reference
- Circuit Breaker Pattern - Prevent cascade failures
- API Rate Limiting Guide - Handle rate limits properly
- Complete API Dependency Monitoring Strategy - Monitor and detect failures
- Webhook Implementation Guide - Handle webhook errors and retries