API Error Handling Best Practices: Build Robust Production APIs (2026)

Quick Answer: Effective API error handling requires consistent error response formats (with error codes, messages, and details), proper HTTP status codes, retry logic with exponential backoff for transient failures, comprehensive error logging, graceful degradation when dependencies fail, and user-friendly error messages that guide developers to solutions. Production APIs should handle errors at multiple layers (network, application, business logic) and provide enough context for debugging without exposing sensitive information.

Error handling is what separates hobby projects from production-ready systems. When Stripe's API returns an error, developers know exactly what went wrong and how to fix it. When a client hits GitHub's rate limit, the response headers say when it is safe to retry. This guide shows you how to build that same level of robustness into your APIs.

Why Error Handling Matters
The Three Layers of Error Handling
Designing Error Responses
HTTP Status Codes: Choosing the Right One
Server-Side Error Handling Patterns
Client-Side Error Handling Strategies
Retry Logic & Exponential Backoff
Error Logging & Monitoring
Graceful Degradation
Security Considerations
Real-World Examples
Common Mistakes to Avoid
Production Readiness Checklist

Why Error Handling Matters

Every API call can fail. Networks drop packets, databases time out, third-party services go down, and users send malformed requests. The question isn't if errors will happen—it's how your system responds when they do.

The Cost of Poor Error Handling

Consider this real-world scenario:

Before: Generic Error Handling

{
  "error": "Something went wrong"
}

Developer Experience: Developers spend 45 minutes debugging, checking logs, and testing different inputs
Support Load: 12 support tickets opened asking "What does 'something went wrong' mean?"
Churn Risk: 3 developers abandon the integration, citing "unclear error messages"

After: Comprehensive Error Handling

{
  "error": {
    "code": "invalid_card",
    "message": "The card number is invalid",
    "param": "card_number",
    "type": "card_error",
    "doc_url": "https://docs.example.com/errors/invalid_card"
  }
}

Developer Experience: Immediate clarity on what's wrong and how to fix it
Support Load: 73% reduction in error-related support tickets
Developer Satisfaction: 89% of developers rate error messages as "helpful" or "very helpful"

Key Benefits of Good Error Handling

Faster Debugging: Clear error messages reduce troubleshooting time from hours to minutes
Better DX: Developers trust APIs that tell them exactly what went wrong
Reduced Support Load: Self-service error resolution means fewer support tickets
System Resilience: Proper error handling prevents cascade failures
Operational Visibility: Comprehensive error logging enables proactive issue detection

The Three Layers of Error Handling

Production APIs handle errors at three distinct layers:

Layer 1: Network Errors (Infrastructure)

What they are: Failures in the network layer before your application code runs.

Examples:

DNS resolution failures
Connection timeouts
TLS/SSL handshake failures
Network unreachable

Handling strategy:

Implement connection pooling with health checks
Use circuit breakers to prevent cascade failures (see our circuit breaker pattern guide)
Set appropriate timeouts (connection, read, total)
Provide fallback mechanisms

Example:

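import axios from 'axios';
import CircuitBreaker from 'opossum';

const apiClient = axios.create({
  timeout: 5000, // 5 second timeout per request
  // Note: axios has no built-in retry option; see the retry section for axios-retry
});

// Wrap the call in a function so the breaker does not depend on how axios binds `get`.
// The breaker's own timeout (3s) is shorter than the axios timeout, so it fires first.
const breaker = new CircuitBreaker((url: string) => apiClient.get(url), {
  timeout: 3000,
  errorThresholdPercentage: 50,
  resetTimeout: 30000,
});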

breaker.fallback(() => {
  return { data: getCachedData(), fromCache: true };
});
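
The "total" timeout from the strategy list can also be enforced independently of any HTTP client. The following is a minimal sketch, not part of axios or opossum: `withTimeout` and `TimeoutError` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical error type so callers can tell a timeout apart from other failures.
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms}ms`);
    this.name = 'TimeoutError';
  }
}

// Hypothetical helper: rejects with TimeoutError if `work` has not settled within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  // so a finished call does not leave a pending timer behind.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Note that this only stops waiting: the underlying operation keeps running unless it is cancelled separately (for example with an `AbortController`).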

Layer 2: Application Errors (Request Processing)

What they are: Errors that occur during request validation and processing.

Examples:

Invalid request parameters
Authentication failures
Authorization violations
Rate limit exceeded
Malformed JSON

Handling strategy:

Validate early (fail fast)
Return specific HTTP status codes
Provide actionable error messages
Include field-level validation errors

Example:

import { z } from 'zod';
import { Request, Response, NextFunction } from 'express';

const createUserSchema = z.object({
  email: z.string().email(),
  password: z.string().min(8),
  name: z.string().min(2).max(100),
});

export function validateRequest(schema: z.ZodSchema) {
  return (req: Request, res: Response, next: NextFunction) => {
    try {
      schema.parse(req.body);
      next();
    } catch (error) {
      if (error instanceof z.ZodError) {
        return res.status(400).json({
          error: {
            code: 'validation_error',
            message: 'Request validation failed',
            details: error.issues.map(err => ({
              field: err.path.join('.'),
              message: err.message,
              code: err.code,
            })),
          },
        });
      }
      next(error);
    }
  };
}

// Usage
app.post('/users', validateRequest(createUserSchema), createUserHandler);

Layer 3: Business Logic Errors (Domain Logic)

What they are: Errors related to your business rules and domain constraints.

Examples:

Insufficient account balance
Resource not found
Duplicate resource
State transition not allowed (e.g., can't cancel a shipped order)
Business rule violation

Handling strategy:

Create custom error classes for domain errors
Map domain errors to appropriate HTTP status codes
Provide context-specific error messages
Include recovery suggestions

Example:

// Custom error classes
export class InsufficientBalanceError extends Error {
  constructor(
    public required: number,
    public available: number,
    public accountId: string
  ) {
    super(`Insufficient balance: need $${required}, have $${available}`);
    this.name = 'InsufficientBalanceError';
  }
}

export class ResourceNotFoundError extends Error {
  constructor(
    public resourceType: string,
    public resourceId: string
  ) {
    super(`${resourceType} ${resourceId} not found`);
    this.name = 'ResourceNotFoundError';
  }
}

// Error mapper
export function mapDomainErrorToResponse(error: Error) {
  if (error instanceof InsufficientBalanceError) {
    return {
      status: 402,
      body: {
        error: {
          code: 'insufficient_balance',
          message: error.message,
          details: {
            required: error.required,
            available: error.available,
            account_id: error.accountId,
          },
          recovery: 'Add funds to your account or reduce the transaction amount',
        },
      },
    };
  }

  if (error instanceof ResourceNotFoundError) {
    return {
      status: 404,
      body: {
        error: {
          code: 'resource_not_found',
          message: error.message,
          details: {
            resource_type: error.resourceType,
            resource_id: error.resourceId,
          },
        },
      },
    };
  }

  // Default 500 for unknown errors
  return {
    status: 500,
    body: {
      error: {
        code: 'internal_error',
        message: 'An unexpected error occurred',
      },
    },
  };
}

// Express error handler middleware
app.use((err: Error, req: Request, res: Response, next: NextFunction) => {
  const { status, body } = mapDomainErrorToResponse(err);
  
  // Log error for monitoring
  logger.error('Request failed', {
    error: err,
    status,
    path: req.path,
    method: req.method,
  });

  res.status(status).json(body);
});

Designing Error Responses

Consistency is key. Every error response should follow the same structure.

The Standard Error Response Format

{
  "error": {
    "code": "validation_error",
    "message": "Human-readable error message",
    "type": "client_error",
    "details": {
      "field": "email",
      "reason": "invalid_format"
    },
    "request_id": "req_abc123",
    "doc_url": "https://docs.example.com/errors/validation_error"
  }
}

Essential Error Response Fields

Field	Purpose	Example
code	Machine-readable error identifier	`"invalid_card"`, `"rate_limit_exceeded"`
message	Human-readable description	`"The card was declined"`
type	Error category	`"card_error"`, `"api_error"`, `"auth_error"`
details	Context-specific information	`{ "param": "card_number" }`
request_id	Unique identifier for this request	`"req_abc123"` (for support debugging)
doc_url	Link to error documentation	`"https://docs.example.com/errors/..."`

TypeScript Error Response Types

export interface APIError {
  error: {
    code: string;
    message: string;
    type: 'client_error' | 'server_error' | 'auth_error' | 'rate_limit_error';
    details?: Record<string, any>;
    request_id: string;
    doc_url?: string;
  };
}

export interface ValidationError extends APIError {
  error: {
    code: 'validation_error';
    message: string;
    type: 'client_error';
    details: {
      fields: Array<{
        field: string;
        message: string;
        code: string;
      }>;
    };
    request_id: string;
  };
}

export interface RateLimitError extends APIError {
  error: {
    code: 'rate_limit_exceeded';
    message: string;
    type: 'rate_limit_error';
    details: {
      limit: number;
      remaining: 0;
      reset_at: string; // ISO 8601 timestamp
    };
    request_id: string;
  };
}
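
To keep every response in that shape, it can help to build error bodies in one place. The sketch below is one way to do it, under stated assumptions: `buildErrorBody` and `DOCS_BASE_URL` are hypothetical names, and the `req_` prefix simply mirrors the example IDs above.

```typescript
import { randomUUID } from 'node:crypto';

type ErrorType = 'client_error' | 'server_error' | 'auth_error' | 'rate_limit_error';

interface ErrorBody {
  error: {
    code: string;
    message: string;
    type: ErrorType;
    details?: Record<string, unknown>;
    request_id: string;
    doc_url: string;
  };
}

// Hypothetical docs base URL, mirroring the doc_url shown in the standard format.
const DOCS_BASE_URL = 'https://docs.example.com/errors';

function buildErrorBody(
  code: string,
  message: string,
  type: ErrorType,
  options: { details?: Record<string, unknown>; requestId?: string } = {}
): ErrorBody {
  return {
    error: {
      code,
      message,
      type,
      // Only include `details` when the caller supplied some.
      ...(options.details ? { details: options.details } : {}),
      // Reuse the caller's request ID when there is one; otherwise mint a new one.
      request_id: options.requestId ?? `req_${randomUUID()}`,
      doc_url: `${DOCS_BASE_URL}/${code}`,
    },
  };
}
```

Deriving `doc_url` from `code` means a new error code gets a documentation link automatically, as long as the docs site follows the same URL scheme.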

Real-World Error Response Examples

Stripe Card Error:

{
  "error": {
    "code": "card_declined",
    "doc_url": "https://stripe.com/docs/error-codes/card-declined",
    "message": "Your card was declined.",
    "param": "exp_month",
    "type": "card_error"
  }
}

GitHub Rate Limit Error:

{
  "message": "API rate limit exceeded",
  "documentation_url": "https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"
}

AWS API Error:

{
  "Error": {
    "Code": "InvalidParameterValue",
    "Message": "Value (us-east-1x) for parameter AvailabilityZone is invalid."
  },
  "RequestId": "abc123"
}
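
The GitHub body above carries no timing information; that lives in the response headers. GitHub reports the reset time in `x-ratelimit-reset` as UTC epoch seconds, and the standard `Retry-After` header can be either a number of seconds or an HTTP-date. The function below is a sketch of how a client might turn those headers into a wait time. `retryDelayFromHeaders` is a hypothetical name, and other APIs may use different header formats.

```typescript
// Returns how long to wait (in ms) before retrying, or null if the headers give no usable hint.
// Header names are expected in lower case, as Node and fetch expose them.
function retryDelayFromHeaders(
  headers: Record<string, string | undefined>,
  nowMs: number = Date.now()
): number | null {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) {
    // Form 1: a delay in whole seconds, e.g. "Retry-After: 120"
    if (/^\d+$/.test(retryAfter.trim())) {
      return Number(retryAfter) * 1000;
    }
    // Form 2: an HTTP-date, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
    const dateMs = Date.parse(retryAfter);
    if (!Number.isNaN(dateMs)) {
      return Math.max(0, dateMs - nowMs);
    }
  }

  const reset = headers['x-ratelimit-reset'];
  if (reset !== undefined && /^\d+$/.test(reset.trim())) {
    // Epoch seconds (GitHub's format); clamp to zero if the reset time has already passed.
    return Math.max(0, Number(reset) * 1000 - nowMs);
  }

  return null;
}
```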

HTTP Status Codes: Choosing the Right One

Using the correct HTTP status code is crucial for API consumers to handle errors appropriately.

Quick Reference Table

Status Code	Use When	Example
400 Bad Request	Request is malformed or invalid	Missing required field, invalid JSON
401 Unauthorized	Authentication is missing or invalid	No API key, expired token
403 Forbidden	User authenticated but lacks permission	Accessing another user's resource
404 Not Found	Resource doesn't exist	`/users/999999` for non-existent user
409 Conflict	Request conflicts with current state	Creating duplicate resource
422 Unprocessable Entity	Validation failed (well-formed but invalid)	Password too short, invalid enum value
429 Too Many Requests	Rate limit exceeded	More than 100 requests per minute
500 Internal Server Error	Unexpected server error	Uncaught exception, database crash
502 Bad Gateway	Upstream service failed	Payment processor down
503 Service Unavailable	Temporary unavailability	Scheduled maintenance, overloaded
504 Gateway Timeout	Upstream service timed out	Third-party API took too long

Decision Tree for Status Codes

Is the request itself malformed (invalid JSON, missing required field)?
  → 400 Bad Request

Is authentication missing or invalid?
  → 401 Unauthorized

Is the user authenticated but not authorized for this action?
  → 403 Forbidden

Does the resource not exist?
  → 404 Not Found

Is the request valid but conflicts with current state?
  → 409 Conflict

Is the request well-formed but fails validation rules?
  → 422 Unprocessable Entity

Is the rate limit exceeded?
  → 429 Too Many Requests

Did the server encounter an unexpected error?
  → 500 Internal Server Error

Did an upstream dependency fail?
  → 502 Bad Gateway (if it returned an invalid or error response)
  → 503 Service Unavailable (if it's temporarily down or overloaded)
  → 504 Gateway Timeout (if it timed out)
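
The decision tree can be encoded as a lookup table so that every handler picks status codes the same way. This is a sketch only: the `FailureKind` names are hypothetical categories invented for this example, one per question in the tree.

```typescript
// Hypothetical failure categories, one per branch of the decision tree.
type FailureKind =
  | 'malformed_request'
  | 'unauthenticated'
  | 'forbidden'
  | 'not_found'
  | 'conflict'
  | 'validation_failed'
  | 'rate_limited'
  | 'unexpected'
  | 'upstream_failed'
  | 'upstream_unavailable'
  | 'upstream_timeout';

// Record<FailureKind, number> makes the compiler reject the table if a kind is missing.
const STATUS_BY_KIND: Record<FailureKind, number> = {
  malformed_request: 400,
  unauthenticated: 401,
  forbidden: 403,
  not_found: 404,
  conflict: 409,
  validation_failed: 422,
  rate_limited: 429,
  unexpected: 500,
  upstream_failed: 502,
  upstream_unavailable: 503,
  upstream_timeout: 504,
};

function statusFor(kind: FailureKind): number {
  return STATUS_BY_KIND[kind];
}

// 4xx means the caller must change the request; 5xx means the fault is on the server side.
function isClientFault(status: number): boolean {
  return status >= 400 && status < 500;
}
```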

Common Status Code Mistakes

❌ Wrong: Using 200 OK with error in response body

HTTP/1.1 200 OK
{
  "success": false,
  "error": "User not found"
}

✅ Right: Using proper status code

HTTP/1.1 404 Not Found
{
  "error": {
    "code": "user_not_found",
    "message": "User not found"
  }
}

❌ Wrong: Using 500 for validation errors

HTTP/1.1 500 Internal Server Error
{
  "error": "Email is required"
}

✅ Right: Using 400 or 422 for client errors

HTTP/1.1 422 Unprocessable Entity
{
  "error": {
    "code": "validation_error",
    "message": "Email is required"
  }
}

See our complete API error codes guide for detailed status code reference.

Server-Side Error Handling Patterns

Pattern 1: Centralized Error Handler

Create a single error handling middleware that processes all errors.

import { Request, Response, NextFunction } from 'express';
import { logger } from './logger';

interface AppError extends Error {
  statusCode?: number;
  code?: string;
  details?: Record<string, any>;
}

export function errorHandler(
  err: AppError,
  req: Request,
  res: Response,
  next: NextFunction
) {
  const statusCode = err.statusCode || 500;
  const requestId = req.headers['x-request-id'] as string || generateRequestId();

  // Log error with context
  logger.error('Request failed', {
    error: {
      name: err.name,
      message: err.message,
      stack: err.stack,
      code: err.code,
    },
    request: {
      id: requestId,
      method: req.method,
      path: req.path,
      query: req.query,
      headers: sanitizeHeaders(req.headers),
    },
    statusCode,
  });

  // Don't expose internal errors to clients
  const isInternalError = statusCode >= 500;
  
  res.status(statusCode).json({
    error: {
      code: err.code || (isInternalError ? 'internal_error' : 'unknown_error'),
      message: isInternalError 
        ? 'An unexpected error occurred. Please try again later.'
        : err.message,
      ...(err.details && { details: err.details }),
      request_id: requestId,
      ...(process.env.NODE_ENV === 'development' && {
        stack: err.stack,
      }),
    },
  });
}

// Use in Express app
app.use(errorHandler);
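
The handler above calls `sanitizeHeaders` and `generateRequestId` without defining them. The original implementations are not shown, so the versions below are assumptions: a minimal sketch that redacts an illustrative list of sensitive headers and mints a random ID with a `req_` prefix.

```typescript
import { randomUUID } from 'node:crypto';

// Header names whose values should never reach the logs (illustrative list; extend as needed).
const SENSITIVE_HEADERS = new Set([
  'authorization',
  'proxy-authorization',
  'cookie',
  'set-cookie',
  'x-api-key',
]);

type HeaderMap = Record<string, string | string[] | undefined>;

// Returns a copy with sensitive values replaced; the input object is left untouched.
function sanitizeHeaders(headers: HeaderMap): HeaderMap {
  const safe: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    safe[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? '[REDACTED]' : value;
  }
  return safe;
}

// Assumed ID format: "req_" followed by a random UUID.
function generateRequestId(): string {
  return `req_${randomUUID()}`;
}
```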

Pattern 2: Custom Error Classes

Create specific error classes for different error scenarios.

export class APIError extends Error {
  constructor(
    public statusCode: number,
    public code: string,
    message: string,
    public details?: Record<string, any>
  ) {
    super(message);
    this.name = 'APIError';
    Error.captureStackTrace(this, this.constructor);
  }
}

export class ValidationError extends APIError {
  constructor(
    message: string,
    public fields: Array<{ field: string; message: string }>
  ) {
    super(422, 'validation_error', message, { fields });
    this.name = 'ValidationError';
  }
}

export class AuthenticationError extends APIError {
  constructor(message = 'Authentication required') {
    super(401, 'authentication_required', message);
    this.name = 'AuthenticationError';
  }
}

export class AuthorizationError extends APIError {
  constructor(message = 'Permission denied') {
    super(403, 'permission_denied', message);
    this.name = 'AuthorizationError';
  }
}

export class NotFoundError extends APIError {
  constructor(resource: string, id: string) {
    super(404, 'resource_not_found', `${resource} not found`, {
      resource,
      id,
    });
    this.name = 'NotFoundError';
  }
}

export class RateLimitError extends APIError {
  constructor(
    public limit: number,
    public resetAt: Date
  ) {
    super(429, 'rate_limit_exceeded', 'Rate limit exceeded', {
      limit,
      reset_at: resetAt.toISOString(),
    });
    this.name = 'RateLimitError';
  }
}

// Usage
async function getUserById(id: string) {
  const user = await db.user.findUnique({ where: { id } });
  if (!user) {
    throw new NotFoundError('User', id);
  }
  return user;
}

Pattern 3: Async Error Wrapper

Automatically catch async errors without try-catch everywhere.

type AsyncRequestHandler = (
  req: Request,
  res: Response,
  next: NextFunction
) => Promise<any>;

export function asyncHandler(fn: AsyncRequestHandler) {
  return (req: Request, res: Response, next: NextFunction) => {
    Promise.resolve(fn(req, res, next)).catch(next);
  };
}

// Usage - no try-catch needed!
app.get('/users/:id', asyncHandler(async (req, res) => {
  const user = await getUserById(req.params.id); // Throws NotFoundError if not found
  res.json({ user });
}));

Pattern 4: Result Type (Railway-Oriented Programming)

Use Result types to make errors explicit in function signatures.

type Result<T, E = Error> = 
  | { success: true; value: T }
  | { success: false; error: E };

export async function getUserSafe(id: string): Promise<Result<User, APIError>> {
  try {
    const user = await db.user.findUnique({ where: { id } });
    if (!user) {
      return {
        success: false,
        error: new NotFoundError('User', id),
      };
    }
    return { success: true, value: user };
  } catch (error) {
    return {
      success: false,
      error: new APIError(500, 'database_error', 'Database query failed'),
    };
  }
}

// Usage - explicitly handle success/failure
const result = await getUserSafe(userId);
if (result.success) {
  console.log('User:', result.value);
} else {
  console.error('Error:', result.error);
}

Client-Side Error Handling Strategies

Pattern 1: Axios Interceptor

Centralize error handling for all HTTP requests.

import axios, { AxiosError } from 'axios';

const apiClient = axios.create({
  baseURL: 'https://api.example.com',
  timeout: 10000,
});

// Request interceptor - add auth token
apiClient.interceptors.request.use(
  (config) => {
    const token = getAuthToken();
    if (token) {
      config.headers.Authorization = `Bearer ${token}`;
    }
    return config;
  },
  (error) => Promise.reject(error)
);

// Response interceptor - handle errors globally
apiClient.interceptors.response.use(
  (response) => response,
  async (error: AxiosError<APIError>) => {
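    const status = error.response?.status;
    const errorData = error.response?.data;
    // Track retries on the request config so a persistent 401/429 cannot loop forever
    const original = error.config as
      | (NonNullable<typeof error.config> & { _retried?: boolean })
      | undefined;

    // Token expired - refresh and retry once
    if (
      status === 401 &&
      errorData?.error.code === 'token_expired' &&
      original &&
      !original._retried
    ) {
      original._retried = true;
      try {
        const newToken = await refreshAuthToken();
        setAuthToken(newToken);

        // Retry original request with new token
        original.headers.Authorization = `Bearer ${newToken}`;
        return apiClient.request(original);
      } catch (refreshError) {
        // Refresh failed - redirect to login
        redirectToLogin();
        return Promise.reject(refreshError);
      }
    }

    // Rate limit - wait and retry once
    if (status === 429 && original && !original._retried) {
      original._retried = true;
      const resetAt = errorData?.error.details?.reset_at;
      const untilReset = resetAt ? new Date(resetAt).getTime() - Date.now() : NaN;
      // Default to 1 minute if reset_at is missing or unparseable; never wait a negative time
      const waitMs = Number.isFinite(untilReset) ? Math.max(0, untilReset) : 60000;

      await new Promise(resolve => setTimeout(resolve, waitMs));
      return apiClient.request(original);
    }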

    // Server error - show generic message
    if (status && status >= 500) {
      showToast('Server error. Please try again later.', 'error');
      return Promise.reject(error);
    }

    // Client error - let component handle it
    return Promise.reject(error);
  }
);

export default apiClient;

Pattern 2: React Error Boundary

Catch rendering errors in React components.

import React from 'react';
import { logger } from './logger';

interface ErrorBoundaryProps {
  children: React.ReactNode;
  fallback?: React.ReactNode;
}

interface ErrorBoundaryState {
  hasError: boolean;
  error: Error | null;
}

export class ErrorBoundary extends React.Component<
  ErrorBoundaryProps,
  ErrorBoundaryState
> {
  constructor(props: ErrorBoundaryProps) {
    super(props);
    this.state = { hasError: false, error: null };
  }

  static getDerivedStateFromError(error: Error): ErrorBoundaryState {
    return { hasError: true, error };
  }

  componentDidCatch(error: Error, errorInfo: React.ErrorInfo) {
    logger.error('React component error', {
      error: {
        message: error.message,
        stack: error.stack,
      },
      errorInfo,
    });
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback || (
        <div className="error-container">
          <h1>Something went wrong</h1>
          <p>We've been notified and are working on a fix.</p>
          <button onClick={() => window.location.reload()}>
            Reload page
          </button>
        </div>
      );
    }

    return this.props.children;
  }
}

// Usage
<ErrorBoundary fallback={<ErrorFallback />}>
  <App />
</ErrorBoundary>

Pattern 3: Type-Safe API Client

Use TypeScript to enforce error handling.

import { z } from 'zod';

const UserSchema = z.object({
  id: z.string(),
  email: z.string().email(),
  name: z.string(),
});

type User = z.infer<typeof UserSchema>;

const ErrorSchema = z.object({
  error: z.object({
    code: z.string(),
    message: z.string(),
    details: z.record(z.string(), z.any()).optional(),
  }),
});

export async function fetchUser(id: string): Promise<User> {
  const response = await fetch(`https://api.example.com/users/${id}`);
  
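  if (!response.ok) {
    // Error bodies are not guaranteed to be JSON or to match the schema
    // (for example, a proxy may return an HTML 502 page)
    const errorData = await response.json().catch(() => null);
    const parsedError = ErrorSchema.safeParse(errorData);
    if (!parsedError.success) {
      throw new APIError(
        response.status,
        'unknown_error',
        `Request failed with status ${response.status}`
      );
    }
    throw new APIError(
      response.status,
      parsedError.data.error.code,
      parsedError.data.error.message,
      parsedError.data.error.details
    );
  }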

  const data = await response.json();
  return UserSchema.parse(data); // Validates response shape
}

// Usage with error handling
try {
  const user = await fetchUser('123');
  console.log('User:', user);
} catch (error) {
  if (error instanceof APIError) {
    if (error.statusCode === 404) {
      console.log('User not found');
    } else if (error.statusCode === 429) {
      console.log('Rate limited');
    } else {
      console.error('API error:', error.message);
    }
  } else {
    console.error('Unexpected error:', error);
  }
}

Retry Logic & Exponential Backoff

Not all errors are permanent. Transient failures (network hiccups, temporary overload) should be retried.

When to Retry

Retry these errors:

Network timeouts (ETIMEDOUT, ECONNRESET)
503 Service Unavailable (temporary overload)
429 Too Many Requests (rate limit, with backoff)
502 Bad Gateway (upstream temporary failure)
504 Gateway Timeout (upstream timeout)

Don't retry these:

400 Bad Request (malformed request won't fix itself)
401 Unauthorized (auth token won't spontaneously appear)
403 Forbidden (permissions won't change mid-request)
404 Not Found (resource doesn't exist)
422 Unprocessable Entity (validation errors won't fix themselves)

Exponential Backoff Implementation

interface RetryConfig {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  retryableStatuses: number[];
}

const defaultRetryConfig: RetryConfig = {
  maxRetries: 3,
  baseDelayMs: 1000, // 1 second
  maxDelayMs: 32000, // 32 seconds
  retryableStatuses: [408, 429, 502, 503, 504],
};

function calculateBackoff(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  // Exponential backoff: 2^attempt * baseDelay
  const exponentialDelay = Math.pow(2, attempt) * baseDelayMs;
  
  // Add jitter (random 0-20% variation) to prevent thundering herd
  const jitter = exponentialDelay * 0.2 * Math.random();
  
  // Cap at maxDelay
  return Math.min(exponentialDelay + jitter, maxDelayMs);
}

async function fetchWithRetry<T>(
  fn: () => Promise<T>,
  config: Partial<RetryConfig> = {}
): Promise<T> {
  const cfg = { ...defaultRetryConfig, ...config };
  let lastError: Error;

  for (let attempt = 0; attempt <= cfg.maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      lastError = error;
      
      const status = error.response?.status;
      const isLastAttempt = attempt === cfg.maxRetries;
      // Network failures carry an error code but no HTTP response
      const isNetworkError =
        !error.response && ['ETIMEDOUT', 'ECONNRESET', 'ECONNABORTED'].includes(error.code);
      const isRetryable = isNetworkError || cfg.retryableStatuses.includes(status);

      // Don't retry if not retryable or last attempt
      if (!isRetryable || isLastAttempt) {
        throw error;
      }

      // Special handling for 429 (rate limit): prefer the server's own hint
      if (status === 429) {
        const headers = error.response?.headers ?? {};
        // Assumes x-ratelimit-reset is Unix epoch seconds (GitHub's format); adjust for your API
        const untilResetMs = Number(headers['x-ratelimit-reset']) * 1000 - Date.now();
        // Retry-After as a number of seconds; the HTTP-date form falls through to backoff
        const retryAfterMs = Number(headers['retry-after']) * 1000;
        const waitMs = [untilResetMs, retryAfterMs].find(ms => Number.isFinite(ms) && ms > 0);

        if (waitMs !== undefined) {
          await new Promise(resolve => setTimeout(resolve, waitMs));
          continue;
        }
      }

      // Calculate backoff delay
      const delayMs = calculateBackoff(attempt, cfg.baseDelayMs, cfg.maxDelayMs);
      
      console.log(`Retry attempt ${attempt + 1}/${cfg.maxRetries} after ${delayMs}ms`);
      
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }

  throw lastError!;
}

// Usage
const { data: user } = await fetchWithRetry(
  () => apiClient.get<User>('/users/123'),
  { maxRetries: 5 }
);
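
To see what schedule this formula produces, it helps to make the random source injectable so the output is reproducible. The sketch below restates `calculateBackoff` under that one assumption; `backoffDelay` is a hypothetical name, and the formula is otherwise unchanged.

```typescript
// Same formula as calculateBackoff, with the random source passed in for reproducibility.
function backoffDelay(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random
): number {
  const exponentialDelay = 2 ** attempt * baseDelayMs;
  const jitter = exponentialDelay * 0.2 * random();
  return Math.min(exponentialDelay + jitter, maxDelayMs);
}

// Jitter switched off (random() === 0) with the default 1s base and 32s cap:
const schedule = [0, 1, 2, 3, 4, 5, 6].map(attempt =>
  backoffDelay(attempt, 1000, 32000, () => 0)
);
// schedule is [1000, 2000, 4000, 8000, 16000, 32000, 32000]

// With maxRetries: 3 the client sleeps after attempts 0, 1 and 2,
// so the worst-case added latency before jitter is 1000 + 2000 + 4000 = 7000 ms.
const worstCaseMs = schedule.slice(0, 3).reduce((sum, ms) => sum + ms, 0);
```

The delay doubles until it reaches the cap at attempt 5, after which every retry waits the full 32 seconds. Jitter adds at most 20% on top, and the cap is applied after jitter, so no delay ever exceeds `maxDelayMs`.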

Using Libraries

Axios Retry:

import axios from 'axios';
import axiosRetry from 'axios-retry';

const apiClient = axios.create({
  baseURL: 'https://api.example.com',
});

axiosRetry(apiClient, {
  retries: 3,
  retryDelay: axiosRetry.exponentialDelay,
  retryCondition: (error) => {
    return axiosRetry.isNetworkOrIdempotentRequestError(error) ||
           [429, 502, 503, 504].includes(error.response?.status || 0);
  },
  onRetry: (retryCount, error, requestConfig) => {
    console.log(`Retrying request (attempt ${retryCount}):`, requestConfig.url);
  },
});

p-retry:

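import pRetry, { AbortError } from 'p-retry';

const user = await pRetry(
  async () => {
    const response = await fetch('https://api.example.com/users/123');
    // Permanent client errors: AbortError tells p-retry to stop retrying
    if ([400, 401, 403, 404, 422].includes(response.status)) {
      throw new AbortError(`Request failed with status ${response.status}`);
    }
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  },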
  {
    retries: 5,
    onFailedAttempt: (error) => {
      console.log(
        `Attempt ${error.attemptNumber} failed. ${error.retriesLeft} retries left.`
      );
    },
  }
);

See our API rate limiting guide for rate limit best practices.

Error Logging & Monitoring

You can't fix errors you don't know about. Comprehensive logging is essential.

What to Log

For every error, log:

Timestamp (ISO 8601 format)
Error details (name, message, stack trace)
Request context (method, path, query params, headers)
User context (user ID, session ID, IP address)
Response (status code, response time)
Environment (Node version, deployment version)

import winston from 'winston';

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' }),
  ],
});

// Add console in development
if (process.env.NODE_ENV !== 'production') {
  logger.add(new winston.transports.Console({
    format: winston.format.simple(),
  }));
}

export function logError(error: Error, context: Record<string, any>) {
  logger.error('Request failed', {
    error: {
      name: error.name,
      message: error.message,
      stack: error.stack,
    },
    ...context,
  });
}

Structured Logging

Use structured logs (JSON) for easier searching and filtering.

// ❌ Bad: String interpolation
logger.error(`User ${userId} failed to create order ${orderId}: ${error.message}`);

// ✅ Good: Structured data
logger.error('Order creation failed', {
  error: {
    message: error.message,
    stack: error.stack,
  },
  user_id: userId,
  order_id: orderId,
  action: 'create_order',
});

Integration with Error Tracking Services

Sentry:

import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV,
  tracesSampleRate: 1.0,
});

export function errorHandler(
  err: AppError,
  req: Request,
  res: Response,
  next: NextFunction
) {
  // Capture error in Sentry
  Sentry.captureException(err, {
    tags: {
      endpoint: req.path,
      method: req.method,
    },
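    extra: {
      request_id: req.headers['x-request-id'],
      user_id: req.user?.id,
      query: req.query,
      // Request bodies are deliberately omitted: they can contain passwords, tokens, or card data
    },
  });

  // Send error response; hide internal details for 5xx errors
  const statusCode = err.statusCode || 500;
  res.status(statusCode).json({
    error: {
      code: err.code || 'internal_error',
      message: statusCode >= 500 ? 'An unexpected error occurred' : err.message,
    },
  });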
}

Datadog:

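// Note: @datadog/browser-logs is the browser SDK. The client token is public by design
// and must be injected at build time, since process.env does not exist in the browser.
import { datadogLogs } from '@datadog/browser-logs';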

datadogLogs.init({
  clientToken: process.env.DD_CLIENT_TOKEN!,
  site: 'datadoghq.com',
  forwardErrorsToLogs: true,
  sessionSampleRate: 100,
});

// Log error
datadogLogs.logger.error('API request failed', {
  error: error.message,
  status: error.statusCode,
  endpoint: '/users/123',
  user_id: 'user_abc',
});

Setting Up Alerts

Create alerts for critical error patterns:

Alert on error rate spike:

Error rate > 5% of total requests in last 5 minutes
→ Page DevOps team

Alert on specific errors:

Database connection errors > 3 in last 1 minute
→ Page database team

Alert on 5xx errors:

5xx errors > 10 in last 5 minutes
→ Alert in Slack #incidents channel
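
In practice these rules are configured in a monitoring platform rather than written by hand, but the arithmetic behind the first one is simple. The sketch below shows a sliding-window error-rate check. `ErrorRateWindow` is a hypothetical class, and counting only 5xx responses as errors is an assumption of this example.

```typescript
interface RequestSample {
  timestampMs: number;
  status: number;
}

// Hypothetical in-memory tracker for "error rate > threshold over the last windowMs".
class ErrorRateWindow {
  private samples: RequestSample[] = [];
  private readonly windowMs: number;
  private readonly threshold: number;

  constructor(windowMs: number, threshold: number) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  record(status: number, nowMs: number = Date.now()): void {
    this.samples.push({ timestampMs: nowMs, status });
    this.prune(nowMs);
  }

  // Fraction of requests in the window that returned a 5xx status (0 when the window is empty).
  errorRate(nowMs: number = Date.now()): number {
    this.prune(nowMs);
    if (this.samples.length === 0) return 0;
    const errors = this.samples.filter(sample => sample.status >= 500).length;
    return errors / this.samples.length;
  }

  shouldAlert(nowMs: number = Date.now()): boolean {
    return this.errorRate(nowMs) > this.threshold;
  }

  // Drop samples that have fallen out of the window.
  private prune(nowMs: number): void {
    const cutoff = nowMs - this.windowMs;
    this.samples = this.samples.filter(sample => sample.timestampMs > cutoff);
  }
}

// 5% threshold over a 5-minute window, matching the first alert rule.
const errorWindow = new ErrorRateWindow(5 * 60 * 1000, 0.05);
```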

Check Datadog's status and New Relic's status on APIStatusCheck.

Graceful Degradation

When dependencies fail, your API should degrade gracefully rather than crash.

Strategy 1: Fallback to Cached Data

import NodeCache from 'node-cache';

const cache = new NodeCache({ stdTTL: 3600 }); // 1 hour TTL

async function getUserWithFallback(id: string): Promise<User> {
  const cacheKey = `user:${id}`;

  try {
    // Try primary database
    const user = await db.user.findUnique({ where: { id } });
    
    if (user) {
      // Cache successful response
      cache.set(cacheKey, user);
      return user;
    }
    
    throw new NotFoundError('User', id);
  } catch (error) {
    // A missing user is a definitive answer, not an outage: never mask it with stale data
    if (error instanceof NotFoundError) {
      cache.del(cacheKey);
      throw error;
    }

    // Database failure - check cache
    const cachedUser = cache.get<User>(cacheKey);
    
    if (cachedUser) {
      logger.warn('Returning stale user data from cache', { user_id: id });
      return {
        ...cachedUser,
        _stale: true, // Flag as potentially outdated
      } as User;
    }
    
    // No cache available - propagate error
    throw error;
  }
}

Strategy 2: Feature Flags

Disable non-essential features when dependencies fail.

import { getFeatureFlag } from './feature-flags';

async function createOrder(orderData: CreateOrderInput): Promise<Order> {
  // Core functionality - always runs
  const order = await db.order.create({ data: orderData });

  // Non-essential features - can be disabled
  try {
    if (getFeatureFlag('email_notifications')) {
      await sendOrderConfirmationEmail(order);
    }
  } catch (error) {
    // Log but don't fail the request
    logger.error('Failed to send order email', { error, order_id: order.id });
  }

  try {
    if (getFeatureFlag('analytics_tracking')) {
      await trackOrderCreated(order);
    }
  } catch (error) {
    logger.error('Failed to track order', { error, order_id: order.id });
  }

  return order;
}

Strategy 3: Circuit Breaker Pattern

Prevent repeated calls to failing dependencies. See our complete circuit breaker pattern guide.

import CircuitBreaker from 'opossum';

const paymentBreaker = new CircuitBreaker(processPayment, {
  timeout: 3000, // 3 seconds
  errorThresholdPercentage: 50, // Open after 50% failures
  resetTimeout: 30000, // Try again after 30 seconds
});

// Fallback to payment queue when circuit opens
paymentBreaker.fallback((paymentData) => {
  logger.warn('Payment service unavailable, queueing payment', {
    payment_id: paymentData.id,
  });
  
  return queuePaymentForLater(paymentData);
});

// Use the circuit breaker
const result = await paymentBreaker.fire(paymentData);
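
To make the closed, open, and half-open states concrete, here is a simplified, hand-rolled state machine. It is a sketch for illustration only and is not how opossum is implemented: it trips on a count of consecutive failures rather than an error percentage, and the class name SimpleBreaker, its method names, and the injectable clock are all assumptions made for this example.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class SimpleBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // Returns true if a call may be attempted right now.
  canRequest(): boolean {
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'half-open'; // reset timeout elapsed: let trial calls through
    }
    return this.state !== 'open';
  }

  // A success closes the circuit and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  // A failure in half-open, or too many consecutive failures, opens the circuit.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A caller would check canRequest() before each call to the dependency, use the fallback when it returns false, and report the outcome with recordSuccess() or recordFailure().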

Strategy 4: Partial Response

Return partial data when some dependencies fail.

interface DashboardData {
  user: User;
  orders: Order[] | null;
  analytics: Analytics | null;
  recommendations: Product[] | null;
  errors?: string[];
}

async function getDashboard(userId: string): Promise<DashboardData> {
  const errors: string[] = [];
  
  // Required data - fail if missing
  const user = await getUserById(userId);

  // Optional data - catch errors and continue
  let orders: Order[] | null = null;
  try {
    orders = await getOrders(userId);
  } catch (error) {
    logger.error('Failed to fetch orders', { error, user_id: userId });
    errors.push('orders_unavailable');
  }

  let analytics: Analytics | null = null;
  try {
    analytics = await getAnalytics(userId);
  } catch (error) {
    logger.error('Failed to fetch analytics', { error, user_id: userId });
    errors.push('analytics_unavailable');
  }

  let recommendations: Product[] | null = null;
  try {
    recommendations = await getRecommendations(userId);
  } catch (error) {
    logger.error('Failed to fetch recommendations', { error, user_id: userId });
    errors.push('recommendations_unavailable');
  }

  return {
    user,
    orders,
    analytics,
    recommendations,
    ...(errors.length > 0 && { errors }),
  };
}
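
The getDashboard example above awaits each optional dependency in turn, so the slowest path is the sum of all three calls. One possible variation, sketched below, runs the optional loaders in parallel with Promise.allSettled and collects the same `<name>_unavailable` error strings. The loadOptional helper and the PartialResult type are hypothetical names, and the sketch returns loosely typed data for brevity.

```typescript
interface PartialResult {
  data: Record<string, unknown>;
  errors: string[];
}

// Runs every loader in parallel; a failed loader yields null plus an error entry.
async function loadOptional(
  loaders: Record<string, () => Promise<unknown>>,
): Promise<PartialResult> {
  const entries = Object.entries(loaders);
  const settled = await Promise.allSettled(entries.map(([, load]) => load()));

  const data: Record<string, unknown> = {};
  const errors: string[] = [];

  settled.forEach((result, i) => {
    const [name] = entries[i];
    if (result.status === 'fulfilled') {
      data[name] = result.value;
    } else {
      data[name] = null;
      errors.push(`${name}_unavailable`);
    }
  });

  return { data, errors };
}
```

Required data such as the user record would still be awaited separately, so that its failure fails the whole request.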

Security Considerations

Error messages can leak sensitive information. Balance helpful debugging with security.

What NOT to Expose

❌ Stack traces in production

{
  "error": "Error: Invalid API key\n    at validateApiKey (/app/auth.ts:42:11)\n    at processRequest (/app/server.ts:89:5)"
}

❌ Internal file paths

{
  "error": "ENOENT: no such file or directory, open '/var/www/app/uploads/user-123/document.pdf'"
}

❌ Database queries

{
  "error": "SQL Error: SELECT * FROM users WHERE email = 'attacker@example.com' AND password = 'hunter2'"
}

❌ API keys or tokens

{
  "error": "Stripe API key sk_live_abc123 is invalid"
}

❌ Specific software versions

{
  "error": "MySQL 5.7.22 connection failed"
}

Safe Error Responses

✅ Production-safe error messages:

function sanitizeError(error: Error, isProduction: boolean): APIError {
  // Never expose internal errors in production
  if (isProduction && !(error instanceof APIError)) {
    return {
      error: {
        code: 'internal_error',
        message: 'An unexpected error occurred',
        type: 'server_error',
        request_id: generateRequestId(),
      },
    };
  }

  // `code` exists only on APIError, not on the base Error type
  const code = error instanceof APIError ? error.code : 'unknown_error';

  return {
    error: {
      code,
      message: error.message,
      type: 'server_error',
      request_id: generateRequestId(),
      // Only include the stack trace in development
      ...(process.env.NODE_ENV === 'development' && {
        stack: error.stack,
      }),
    },
  };
}

Sanitizing Headers

Remove sensitive headers from logs:

function sanitizeHeaders(headers: Record<string, unknown>): Record<string, unknown> {
  // Header names are case-insensitive, so compare them in lowercase
  const sensitiveHeaders = new Set([
    'authorization',
    'cookie',
    'x-api-key',
    'x-auth-token',
  ]);

  const sanitized: Record<string, unknown> = {};

  for (const [name, value] of Object.entries(headers)) {
    sanitized[name] = sensitiveHeaders.has(name.toLowerCase()) ? '[REDACTED]' : value;
  }

  return sanitized;
}

Rate Limiting Error Endpoints

Prevent attackers from using errors to probe your API:

import rateLimit from 'express-rate-limit';

const errorRateLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each IP to 100 failed requests per window
  // Count only failed requests (status >= 400); successful ones don't use up the limit.
  // A `skip` callback can't do this: it runs before the response status is known.
  skipSuccessfulRequests: true,
  message: {
    error: {
      code: 'too_many_errors',
      message: 'Too many failed requests. Please try again later.',
    },
  },
});

app.use(errorRateLimiter);

Real-World Examples

Stripe's Error Handling

Stripe is known for excellent error messages that help developers fix issues quickly.

Example: Invalid card error

{
  "error": {
    "type": "card_error",
    "code": "card_declined",
    "decline_code": "insufficient_funds",
    "message": "Your card has insufficient funds.",
    "param": "source",
    "charge": "ch_abc123"
  }
}

What makes it good:

Specific error type (card_error vs api_error)
Machine-readable code (card_declined)
Human-readable message ("Your card has insufficient funds")
Actionable information (which parameter caused the error)
Context (charge ID for debugging)

Check Stripe's API status on APIStatusCheck.

GitHub's Rate Limit Error

GitHub provides detailed information about rate limits in headers and error responses.

Response headers:

X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1614556800
X-RateLimit-Used: 5000
X-RateLimit-Resource: core

Error response:

{
  "message": "API rate limit exceeded for user ID 12345.",
  "documentation_url": "https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"
}

What makes it good:

Headers provide all rate limit info (limit, remaining, reset time)
Clear error message with user ID
Link to documentation for more information
Unix timestamp for exact reset time
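
On the client side, these headers are enough to decide how long to wait before retrying. The following is a minimal sketch, assuming the headers have already been read into a plain object with lowercase names (as Node's HTTP layer provides them). The function name msUntilRateLimitReset is hypothetical.

```typescript
// Returns how many milliseconds to wait before the next request,
// 0 if quota remains, or null if the rate limit headers are missing or malformed.
function msUntilRateLimitReset(
  headers: Record<string, string | undefined>,
  nowMs: number = Date.now(),
): number | null {
  const remaining = Number(headers['x-ratelimit-remaining']);
  const resetSeconds = Number(headers['x-ratelimit-reset']);

  if (!Number.isFinite(remaining) || !Number.isFinite(resetSeconds)) {
    return null;
  }

  if (remaining > 0) {
    return 0; // quota left, no need to wait
  }

  // X-RateLimit-Reset is a Unix timestamp in seconds; convert to milliseconds.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

A retry helper could sleep for the returned duration instead of using a fixed backoff when the value is a positive number.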

Check GitHub's API status on APIStatusCheck.

AWS Error Response

AWS provides structured error responses with request IDs for support.

{
  "Error": {
    "Code": "InvalidParameterValue",
    "Message": "Value (us-east-1x) for parameter AvailabilityZone is invalid. Subnets can currently only be created in the following zones: us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1e, us-east-1f.",
    "Type": "Sender"
  },
  "RequestId": "abc-123-def-456"
}

What makes it good:

Specific error code (InvalidParameterValue)
Detailed message explaining what's wrong and what's valid
Error type (Sender = client error, Receiver = server error)
Request ID for support cases

Check AWS's service status on APIStatusCheck.

Twilio's Error Response

Twilio includes error codes and links to detailed documentation.

{
  "code": 21211,
  "message": "The 'To' number +1234567890 is not a valid phone number.",
  "more_info": "https://www.twilio.com/docs/errors/21211",
  "status": 400
}

What makes it good:

Numeric error code for easy reference
Specific message with the problematic value
Documentation link with more details
HTTP status code included in body

Check Twilio's API status on APIStatusCheck.

Common Mistakes to Avoid

Mistake 1: Using 200 OK for Errors

❌ Wrong:

app.post('/users', async (req, res) => {
  const result = await createUser(req.body);
  
  if (!result.success) {
    return res.status(200).json({
      success: false,
      error: result.error,
    });
  }
  
  res.status(200).json({ success: true, user: result.user });
});

✅ Right:

app.post('/users', async (req, res, next) => {
  try {
    const user = await createUser(req.body);
    res.status(201).json({ user });
  } catch (error) {
    if (error instanceof ValidationError) {
      return res.status(422).json({
        error: {
          code: 'validation_error',
          message: error.message,
          details: error.fields,
        },
      });
    }
    // Pass everything else to the centralized error handler.
    // Re-throwing inside an async handler becomes an unhandled rejection in Express 4.
    next(error);
  }
});

Mistake 2: Swallowing Errors Silently

❌ Wrong:

async function processOrder(orderId: string) {
  try {
    await sendConfirmationEmail(orderId);
  } catch (error) {
    // Silent failure - no one knows this failed!
  }
}

✅ Right:

async function processOrder(orderId: string) {
  try {
    await sendConfirmationEmail(orderId);
  } catch (error) {
    // Log error but don't fail order processing
    logger.error('Failed to send confirmation email', {
      error,
      order_id: orderId,
    });
    
    // Optionally queue for retry
    await queueEmailForRetry(orderId);
  }
}

Mistake 3: Generic Error Messages

❌ Wrong:

{
  "error": "Invalid input"
}

✅ Right:

{
  "error": {
    "code": "validation_error",
    "message": "Request validation failed",
    "details": {
      "fields": [
        {
          "field": "email",
          "message": "Email address is required",
          "code": "required"
        },
        {
          "field": "password",
          "message": "Password must be at least 8 characters",
          "code": "min_length"
        }
      ]
    }
  }
}

Mistake 4: Exposing Internal Details

❌ Wrong:

{
  "error": "MongoError: connection refused to mongodb://internal-db-01:27017/production"
}

✅ Right:

{
  "error": {
    "code": "database_error",
    "message": "A database error occurred. Please try again later.",
    "request_id": "req_abc123"
  }
}

Mistake 5: No Request ID

❌ Wrong:

{
  "error": {
    "code": "internal_error",
    "message": "Something went wrong"
  }
}

✅ Right:

{
  "error": {
    "code": "internal_error",
    "message": "An unexpected error occurred",
    "request_id": "req_7h3j8d9f2k"
  }
}

Why request IDs matter:

Support team can look up the exact request in logs
Developers can correlate frontend errors with backend logs
Easier debugging in distributed systems
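
One way to get request IDs into every response is to resolve an ID once per request and attach it to both the logs and the error body. The sketch below is framework-agnostic and rests on several assumptions: the function name resolveRequestId, the x-request-id header, the req_ prefix, and the validation pattern are all illustrative choices, not something the article prescribes.

```typescript
import { randomUUID } from 'node:crypto';

// Accept only short, URL-safe IDs from callers to avoid log injection.
const SAFE_REQUEST_ID = /^[A-Za-z0-9_-]{8,64}$/;

// Reuses a caller-supplied ID when it looks sane, otherwise generates a new one.
function resolveRequestId(headers: Record<string, string | undefined>): string {
  const incoming = headers['x-request-id'];

  if (incoming !== undefined && SAFE_REQUEST_ID.test(incoming)) {
    return incoming;
  }

  return `req_${randomUUID()}`;
}
```

Reusing an upstream ID lets a single identifier follow a request across services, which is what makes the log correlation described above possible.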

Mistake 6: Inconsistent Error Format

❌ Wrong: Different error formats across endpoints

// Endpoint 1
{ "error": "User not found" }

// Endpoint 2
{ "message": "Invalid request", "code": 400 }

// Endpoint 3
{ "errors": ["Email required", "Password too short"] }

✅ Right: Consistent format everywhere

{
  "error": {
    "code": "user_not_found",
    "message": "User not found"
  }
}

{
  "error": {
    "code": "validation_error",
    "message": "Request validation failed",
    "details": {
      "fields": [
        { "field": "email", "message": "Email required" },
        { "field": "password", "message": "Password too short" }
      ]
    }
  }
}

Mistake 7: Not Differentiating Error Types

❌ Wrong: Using 500 for everything

app.post('/users', async (req, res) => {
  try {
    const user = await createUser(req.body);
    res.json({ user });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

✅ Right: Use appropriate status codes

app.post('/users', async (req, res) => {
  try {
    const user = await createUser(req.body);
    res.status(201).json({ user });
  } catch (error) {
    if (error instanceof ValidationError) {
      return res.status(422).json({ error: formatValidationError(error) });
    }
    if (error instanceof DuplicateEmailError) {
      return res.status(409).json({ error: formatDuplicateError(error) });
    }
    // Unknown error - 500
    res.status(500).json({ error: formatInternalError(error) });
  }
});

Mistake 8: Retrying Non-Idempotent Operations

❌ Wrong: Retrying POST without idempotency

// This could create duplicate orders!
await fetchWithRetry(() => 
  apiClient.post('/orders', orderData)
);

✅ Right: Use idempotency keys for retries

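import { v4 as uuidv4 } from 'uuid';

// Generate the key once, outside the retry loop, so every attempt sends the same key
const idempotencyKey = uuidv4();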

await fetchWithRetry(() => 
  apiClient.post('/orders', orderData, {
    headers: {
      'Idempotency-Key': idempotencyKey,
    },
  })
);
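
The Idempotency-Key header only helps if the server honors it. A minimal sketch of the server side follows, assuming an in-memory map keyed by the header value. The IdempotencyStore class is a hypothetical name for illustration; a real deployment would typically use a shared store such as a database or cache, expire keys after some time, and scope keys per client.

```typescript
class IdempotencyStore<T> {
  // Maps each idempotency key to the in-flight or completed result.
  private readonly results = new Map<string, Promise<T>>();

  // Runs `operation` at most once per key; repeated calls get the first call's result.
  run(key: string, operation: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) {
      return existing;
    }

    const pending = operation().catch((error) => {
      // Forget failed attempts so the client can retry with the same key.
      this.results.delete(key);
      throw error;
    });

    this.results.set(key, pending);
    return pending;
  }
}
```

Storing the promise rather than the resolved value means that two concurrent requests with the same key also share a single execution.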

Production Readiness Checklist

Use this checklist before shipping your API to production:

Error Response Design

All error responses follow a consistent format
Error responses include machine-readable code field
Error responses include human-readable message field
Request IDs are included in all error responses
Documentation links are included for common errors
Field-level validation errors are returned for 422 responses

HTTP Status Codes

Appropriate status codes are used (not just 200/500)
4xx codes are used for client errors
5xx codes are used for server errors
429 responses include Retry-After or X-RateLimit-Reset headers
503 responses include Retry-After header

Error Handling Implementation

Centralized error handler middleware is implemented
Custom error classes are used for different error types
Async errors are caught and handled properly
Database errors are caught and sanitized
Third-party API errors are wrapped and normalized

Retry Logic

Retry logic is implemented for transient failures
Exponential backoff with jitter is used
Maximum retry limit is set (typically 3-5)
Only idempotent operations are retried automatically
Circuit breaker pattern is used for failing dependencies

Logging & Monitoring

All errors are logged with full context
Structured logging (JSON) is used
Error tracking service (Sentry, Rollbar) is integrated
Alerts are set up for error rate spikes
Alerts are set up for specific critical errors
Request IDs link frontend and backend logs

Security

Stack traces are not exposed in production
Internal file paths are not exposed
Database queries are not exposed
API keys/tokens are redacted from logs
Sensitive headers are sanitized before logging
Rate limiting is applied to error endpoints

Graceful Degradation

Non-essential features can be disabled via feature flags
Cached data is served as a fallback when dependencies fail
Partial responses are returned when some data is unavailable
Circuit breakers protect against cascade failures

Documentation

Error codes are documented with examples
Common error scenarios have troubleshooting guides
Support contact information is provided
Rate limit policies are documented
Retry guidelines are documented

Testing

Unit tests cover error handling logic
Integration tests verify error responses
Load tests verify error handling under stress
Chaos engineering tests verify graceful degradation

Summary

Effective API error handling requires:

Consistent error response format across all endpoints
Proper HTTP status codes that reflect the error type
Actionable error messages that help developers fix issues
Retry logic with exponential backoff for transient failures
Comprehensive error logging for debugging and monitoring
Graceful degradation when dependencies fail
Security-conscious error messages that don't leak sensitive information

The APIs developers love—Stripe, GitHub, Twilio—all excel at error handling. They make it easy to understand what went wrong and how to fix it. By following the patterns in this guide, you can build APIs that developers trust and enjoy working with.

Related Guides

API Error Codes Explained - Complete HTTP status code reference
Circuit Breaker Pattern - Prevent cascade failures
API Rate Limiting Guide - Handle rate limits properly
Complete API Dependency Monitoring Strategy - Monitor and detect failures
Webhook Implementation Guide - Handle webhook errors and retries

Recommended Tools

Catch errors before your users do. Better Stack monitors your API endpoints every 30 seconds, correlates errors across services, and alerts your team instantly when error rates spike.

Secure your error reporting pipeline. Error logs often contain sensitive data: API keys, user tokens, connection strings. 1Password ensures credentials in your error handling code stay secure with environment variable injection.
