API Health Check Endpoints: Complete Implementation Guide

A health check endpoint is one of the most valuable additions to any production API. It gives load balancers, orchestrators, and monitoring tools a single URL to determine whether your service is ready to handle traffic — and exactly what's wrong when it isn't. This guide covers what to build, how to build it, and how to monitor it effectively.

What Is a Health Check Endpoint?

A health check endpoint is a dedicated URL, typically /health, /healthz, or /status, that reports the current operational state of your service. When called, it runs its checks and answers with an HTTP status code (200 when healthy, 503 when not) plus a JSON body describing each check.

Several parts of your infrastructure rely on this endpoint. Load balancers use it to decide whether to route traffic. Kubernetes uses it to decide whether to restart containers. Monitoring tools use it to trigger alerts. Your on-call engineer uses it to quickly assess an incident.

Liveness vs Readiness vs Startup Probes

Kubernetes distinguishes three types of health probes — and even outside Kubernetes, this mental model is useful for structuring your health checks:

Liveness Probe: /healthz/live

Answers: "Is this service alive, or is it stuck in a broken state that requires a restart?"

Liveness checks should be fast and simple — they just verify the process is running and not deadlocked. Don't include database checks here. If a database goes down, you don't want Kubernetes to restart your service — the service itself is fine, just a dependency is unavailable.

If the liveness probe fails, Kubernetes restarts the container.

Readiness Probe: /healthz/ready

Answers: "Is this service ready to accept incoming traffic right now?"

Readiness checks can be more comprehensive. Check database connectivity, cache availability, and any other dependencies required to serve requests. If the readiness probe fails, Kubernetes removes the pod from the service's endpoints — traffic stops routing to it — but doesn't restart it.

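Use readiness probes to take an instance out of rotation while a required dependency is unavailable, and to drain traffic ahead of a graceful shutdown.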

Startup Probe: /healthz/startup

Answers: "Has the application finished initializing?"

The startup probe is meant for slow-starting containers. Until it succeeds, Kubernetes holds off on the liveness and readiness probes; once it succeeds, those probes take over. This prevents a slow startup from triggering a liveness restart.
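To make the distinction concrete, here is a minimal, framework-free sketch of the three probes in Python. The class name `ProbeState`, its `started` flag, and the `dependency_checks` mapping are illustrative assumptions, not part of any library. Each method returns the HTTP status code and JSON body a real handler would send.

```python
from typing import Callable, Dict, Tuple


class ProbeState:
    """Holds the facts each probe needs (hypothetical helper for illustration)."""

    def __init__(self, dependency_checks: Dict[str, Callable[[], bool]]):
        self.started = False  # set to True once initialization has finished
        self.dependency_checks = dependency_checks

    def startup(self) -> Tuple[int, dict]:
        # Startup: has the application finished initializing?
        if self.started:
            return 200, {"status": "started"}
        return 503, {"status": "starting"}

    def live(self) -> Tuple[int, dict]:
        # Liveness: the process can answer at all. No dependency checks here.
        return 200, {"status": "alive"}

    def ready(self) -> Tuple[int, dict]:
        # Readiness: started, and every required dependency currently passes.
        results = {}
        for name, check in self.dependency_checks.items():
            try:
                results[name] = bool(check())
            except Exception:
                results[name] = False
        ok = self.started and all(results.values())
        status = "ready" if ok else "not_ready"
        return (200 if ok else 503), {"status": status, "checks": results}
```

With a failing database check, `ready()` returns 503 while `live()` still returns 200, so the pod is taken out of rotation without being restarted.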

What to Include in Your Health Check

A production-grade health check should validate your service's ability to do its job. For most services, that means:

Required Checks

Recommended Checks

Optional Metadata

Health Check Response Format

The response body should be JSON with a consistent structure that monitoring tools and humans can parse:

Healthy response (HTTP 200):

    {
      "status": "healthy",
      "version": "1.4.2",
      "uptime_seconds": 86400,
      "timestamp": "2026-04-29T12:00:00Z",
      "checks": {
        "database": { "status": "healthy", "response_time_ms": 4 },
        "cache": { "status": "healthy", "response_time_ms": 1 },
        "disk": { "status": "healthy", "free_gb": 42.3 }
      }
    }

Unhealthy response (HTTP 503):

    {
      "status": "unhealthy",
      "version": "1.4.2",
      "uptime_seconds": 3600,
      "timestamp": "2026-04-29T12:00:00Z",
      "checks": {
        "database": { "status": "unhealthy", "error": "connection refused: host db:5432" },
        "cache": { "status": "healthy", "response_time_ms": 1 }
      }
    }

Return HTTP 200 for healthy, HTTP 503 for unhealthy or degraded. This lets load balancers and proxies act on the status without parsing the body.
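A consistent structure also makes the response easy to consume. As a rough sketch of the consumer side, the hypothetical helper below (the name `failing_checks` is an assumption for illustration) uses the status code first and only parses the body to find out which checks failed.

```python
import json
from typing import List


def failing_checks(status_code: int, body_text: str) -> List[str]:
    """Return the names of failing checks; an empty list means healthy."""
    if status_code == 200:
        # The status code alone is enough; no need to parse the body.
        return []
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return ["<unparseable body>"]
    checks = body.get("checks") if isinstance(body, dict) else None
    if not isinstance(checks, dict):
        return ["<unparseable body>"]
    return sorted(
        name
        for name, result in checks.items()
        if not (isinstance(result, dict) and result.get("status") == "healthy")
    )
```

Given the unhealthy example above, this returns a list containing only "database".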

Implementation Examples

Node.js (Express)

    const express = require('express');
    const app = express();

    // Assumption: `db` and `redis` are client objects initialized elsewhere
    // (db exposes query(), redis exposes ping()).

    app.get('/health', async (req, res) => {
      const checks = {};
      let isHealthy = true;

      // Database check
      try {
        const start = Date.now();
        await db.query('SELECT 1');
        checks.database = {
          status: 'healthy',
          response_time_ms: Date.now() - start,
        };
      } catch (err) {
        isHealthy = false;
        checks.database = {
          status: 'unhealthy',
          error: err.message,
        };
      }

      // Cache check
      try {
        const start = Date.now();
        await redis.ping();
        checks.cache = {
          status: 'healthy',
          response_time_ms: Date.now() - start,
        };
      } catch (err) {
        isHealthy = false;
        checks.cache = {
          status: 'unhealthy',
          error: err.message,
        };
      }

      // 200 when every check passed, 503 otherwise
      const statusCode = isHealthy ? 200 : 503;
      res.status(statusCode).json({
        status: isHealthy ? 'healthy' : 'unhealthy',
        version: process.env.APP_VERSION || 'unknown',
        uptime_seconds: Math.floor(process.uptime()),
        timestamp: new Date().toISOString(),
        checks,
      });
    });

Python (FastAPI)

    import os
    import time
    from datetime import datetime, timezone

    from fastapi import FastAPI, Response

    app = FastAPI()
    start_time = time.monotonic()

    # Assumption: `db` and `redis` are async clients created elsewhere
    # (db exposes execute(), redis exposes ping()).


    @app.get("/health")
    async def health_check(response: Response):
        checks = {}
        is_healthy = True

        # Database check
        try:
            start = time.perf_counter()
            await db.execute("SELECT 1")
            checks["database"] = {
                "status": "healthy",
                "response_time_ms": int((time.perf_counter() - start) * 1000),
            }
        except Exception as e:
            is_healthy = False
            checks["database"] = {"status": "unhealthy", "error": str(e)}

        # Redis check
        try:
            start = time.perf_counter()
            await redis.ping()
            checks["cache"] = {
                "status": "healthy",
                "response_time_ms": int((time.perf_counter() - start) * 1000),
            }
        except Exception as e:
            is_healthy = False
            checks["cache"] = {"status": "unhealthy", "error": str(e)}

        # 200 when every check passed, 503 otherwise
        response.status_code = 200 if is_healthy else 503
        # datetime.utcnow() is deprecated; use an aware UTC timestamp instead
        now = datetime.now(timezone.utc)
        return {
            "status": "healthy" if is_healthy else "unhealthy",
            "version": os.getenv("APP_VERSION", "unknown"),
            "uptime_seconds": int(time.monotonic() - start_time),
            "timestamp": now.isoformat(timespec="seconds").replace("+00:00", "Z"),
            "checks": checks,
        }

Go

    package main

    import (
        "database/sql"
        "encoding/json"
        "net/http"
        "time"
    )

    var startTime = time.Now()

    // Assumption: db is opened elsewhere (sql.Open) before the handler runs.
    // main(), which opens db and registers the handler with
    // http.HandleFunc("/health", healthHandler), is omitted here.
    var db *sql.DB

    type HealthResponse struct {
        Status string                 `json:"status"`
        Uptime int64                  `json:"uptime_seconds"`
        Checks map[string]CheckResult `json:"checks"`
    }

    type CheckResult struct {
        Status string `json:"status"`
        Error  string `json:"error,omitempty"`
        Ms     int64  `json:"response_time_ms,omitempty"`
    }

    func healthHandler(w http.ResponseWriter, r *http.Request) {
        checks := make(map[string]CheckResult)
        healthy := true

        // Database check
        start := time.Now()
        if err := db.PingContext(r.Context()); err != nil {
            healthy = false
            checks["database"] = CheckResult{Status: "unhealthy", Error: err.Error()}
        } else {
            checks["database"] = CheckResult{
                Status: "healthy",
                Ms:     time.Since(start).Milliseconds(),
            }
        }

        // 200 when every check passed, 503 otherwise
        status := "healthy"
        code := http.StatusOK
        if !healthy {
            status = "unhealthy"
            code = http.StatusServiceUnavailable
        }

        resp := HealthResponse{
            Status: status,
            Uptime: int64(time.Since(startTime).Seconds()),
            Checks: checks,
        }

        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(code)
        json.NewEncoder(w).Encode(resp)
    }

Health Check Timeouts

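Health check endpoints need strict timeouts. A health check that takes 30 seconds is worse than no health check: the load balancer or orchestrator polling it cannot make a timely routing decision, and the probe will usually hit its own timeout first and be counted as a failure.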

Run dependency checks in parallel when possible to keep total health check time low:

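    // Run checks in parallel (Node.js example).
    // Assumption: checkDatabase() and checkCache() are async helpers defined elsewhere.
    // Promise.allSettled never rejects; each result has status 'fulfilled' or 'rejected'.
    const [dbResult, cacheResult] = await Promise.allSettled([
      checkDatabase(),
      checkCache(),
    ]);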
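For comparison, here is a sketch of the same idea in Python that also applies a per-check timeout, so that one slow dependency cannot stall the whole response. The function names `run_check` and `run_all` and the 2-second default are assumptions for illustration; `asyncio.wait_for` and `asyncio.gather` are standard library calls.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict

Check = Callable[[], Awaitable[None]]


async def run_check(check: Check, timeout_s: float) -> dict:
    """Run one dependency check and convert the outcome into a result dict."""
    start = time.perf_counter()
    try:
        await asyncio.wait_for(check(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"status": "unhealthy", "error": f"timed out after {timeout_s}s"}
    except Exception as exc:
        return {"status": "unhealthy", "error": str(exc)}
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    return {"status": "healthy", "response_time_ms": elapsed_ms}


async def run_all(checks: Dict[str, Check], timeout_s: float = 2.0) -> Dict[str, dict]:
    """Run every check concurrently; total time is bounded by roughly timeout_s."""
    names = list(checks)
    results = await asyncio.gather(*(run_check(checks[n], timeout_s) for n in names))
    return dict(zip(names, results))
```

Because `run_check` catches its own exceptions, `asyncio.gather` never sees a failure, and the handler always gets one result per check.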

Security Considerations

What to Expose Publicly vs Privately

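Not all health check details should be publicly accessible. Consider two tiers: a simple public check that reports only overall status, and a detailed internal check that includes per-dependency results.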

Don't Leak Internal Details

Error messages in health check responses can reveal your infrastructure topology. "connection refused: host postgres-primary.internal:5432" tells attackers what database you're using and your internal hostname. In public health endpoints, return generic error messages like "database unavailable" rather than the raw exception.
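One way to do this is to derive the public response from the detailed one and replace raw errors on the way out. The sketch below assumes the response shape shown earlier; the function name `public_view` is hypothetical, and dropping fields such as version from the public tier is a design choice made here, not a requirement.

```python
def public_view(detailed: dict) -> dict:
    """Build a public-safe copy of a detailed health response.

    Keeps only per-check status and swaps raw error text for a generic message.
    """
    checks = {}
    for name, result in detailed.get("checks", {}).items():
        entry = {"status": result.get("status", "unknown")}
        if "error" in result:
            # Never forward the raw exception text to the public endpoint.
            entry["error"] = f"{name} unavailable"
        checks[name] = entry
    return {"status": detailed.get("status", "unknown"), "checks": checks}
```

The detailed response can still be served unchanged on an internal-only route.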

Rate Limit Health Endpoints

Health endpoints that run database queries on every call can be exploited for denial-of-service attacks. Add rate limiting (100 req/min is generous for legitimate monitors) and cache health check results for 5-10 seconds to avoid hammering your database with health check queries.
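The caching half of that advice can be sketched as a small time-to-live wrapper. The class name `CachedHealth` and the injectable `clock` parameter are assumptions for illustration; the sketch is not thread-safe, so a threaded server would need a lock around `get()`.

```python
import time
from typing import Callable, Optional, Tuple


class CachedHealth:
    """Caches the result of an expensive health computation for ttl_seconds."""

    def __init__(
        self,
        compute: Callable[[], dict],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[Tuple[float, dict]] = None

    def get(self) -> dict:
        now = self._clock()
        if self._cached is not None and now - self._cached[0] < self._ttl:
            return self._cached[1]  # still fresh: skip the dependency checks
        result = self._compute()
        self._cached = (now, result)
        return result
```

With a 5-second TTL, any number of requests within that window triggers at most one round of dependency checks.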

Kubernetes Configuration

Here's a complete Kubernetes probe configuration using separate liveness and readiness endpoints:

    apiVersion: apps/v1
    kind: Deployment
    spec:
      template:
        spec:
          containers:
            - name: api
              image: myapp:1.4.2
              livenessProbe:
                httpGet:
                  path: /healthz/live
                  port: 8080
                initialDelaySeconds: 10
                periodSeconds: 15
                failureThreshold: 3
                timeoutSeconds: 5
              readinessProbe:
                httpGet:
                  path: /healthz/ready
                  port: 8080
                initialDelaySeconds: 5
                periodSeconds: 10
                failureThreshold: 3
                timeoutSeconds: 5
              startupProbe:
                httpGet:
                  path: /healthz/startup
                  port: 8080
                failureThreshold: 30
                periodSeconds: 10

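Key settings:

  • initialDelaySeconds: how long Kubernetes waits after the container starts before sending the first probe
  • periodSeconds: how often the probe runs
  • timeoutSeconds: how long Kubernetes waits for a response before counting the probe as failed
  • failureThreshold: how many consecutive failures are needed before Kubernetes acts (a restart for liveness, removal from the service's endpoints for readiness)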
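The values in the manifest above imply rough timing windows, which are worth working out before deploying. The helper below is a back-of-the-envelope sketch (the function name is an assumption); it uses the common approximation periodSeconds × failureThreshold and ignores timeoutSeconds and the initial delay.

```python
def failure_window_seconds(period_seconds: int, failure_threshold: int) -> int:
    """Approximate time of sustained failure before Kubernetes acts on a probe."""
    return period_seconds * failure_threshold


# Values taken from the manifest above.
startup_budget = failure_window_seconds(period_seconds=10, failure_threshold=30)
liveness_window = failure_window_seconds(period_seconds=15, failure_threshold=3)
readiness_window = failure_window_seconds(period_seconds=10, failure_threshold=3)

print(startup_budget, liveness_window, readiness_window)  # 300 45 30
```

So the application gets about 300 seconds to start, a hung process is restarted after roughly 45 seconds, and a pod with a failing dependency leaves rotation after roughly 30 seconds.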

Monitoring Health Endpoints from Outside

Health endpoints are most valuable when monitored continuously by an external system — not just your infrastructure. External monitoring catches issues your internal probes miss (network routing problems, CDN failures, geographic outages).

What to Monitor

Polling Frequency

Poll your health endpoint every 30-60 seconds from external monitors. This gives you a median time-to-detect of 15-30 seconds for outages, which is sufficient for most SLOs. Polling more frequently (every 10 seconds) is appropriate for high-availability systems, but generates more load.
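The arithmetic behind those figures is simple enough to sketch. The function below assumes an outage begins at a uniformly random moment between two polls and ignores request latency. The `confirmations` parameter (alert only after N consecutive failed polls) is an added assumption for illustration; with the default of 1 it reproduces the numbers above.

```python
def detection_delay_seconds(poll_interval: float, confirmations: int = 1) -> dict:
    """Estimate how long an outage goes unnoticed for a given polling interval."""
    if poll_interval <= 0 or confirmations < 1:
        raise ValueError("poll_interval must be > 0 and confirmations >= 1")
    # Each extra confirmation costs one more full polling interval.
    extra = (confirmations - 1) * poll_interval
    return {
        "average": poll_interval / 2 + extra,
        "worst_case": poll_interval + extra,
    }
```

For a 30-second interval this gives an average of 15 seconds and a worst case of 30 seconds; requiring two consecutive failures raises those to 45 and 60 seconds.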

For critical paths, set up monitoring from multiple geographic regions to distinguish regional outages from global ones.
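As a minimal sketch of that distinction, the function below classifies one round of probe results keyed by region. The function name, region names, and labels are illustrative assumptions; a real monitor would likely also require repeated failures before alerting.

```python
from typing import Dict


def classify_outage(results_by_region: Dict[str, bool]) -> str:
    """Classify one polling round; True means the check passed in that region."""
    if not results_by_region:
        return "unknown"
    failing = [region for region, ok in results_by_region.items() if not ok]
    if not failing:
        return "healthy"
    if len(failing) == len(results_by_region):
        return "global"
    return "regional"
```

A failure seen from only one location points at routing or a regional dependency, while failures from every location point at the service itself.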

Health Checks During Deployment

Health checks are critical during rolling deployments. The pattern:

  1. Start new pod: Startup probe runs until it passes
  2. Readiness probe passes: Pod is added to load balancer rotation
  3. Traffic shifts gradually: Old pod gets less traffic as new pod handles more
  4. Graceful shutdown begins on old pod: Readiness probe is deliberately failed to drain traffic
  5. Old pod terminates: After all in-flight requests complete

To implement graceful shutdown, listen for SIGTERM and immediately fail your readiness probe while continuing to handle in-flight requests:

    let isShuttingDown = false;

    // Assumption: `app` is the Express app and `server` is the http.Server
    // returned by app.listen().
    process.on('SIGTERM', () => {
      isShuttingDown = true;
      // Give load balancer time to stop routing to this instance
      setTimeout(() => {
        server.close(() => process.exit(0));
      }, 10000); // 10 second drain window
    });

    app.get('/healthz/ready', (req, res) => {
      if (isShuttingDown) {
        return res.status(503).json({ status: 'shutting_down' });
      }
      // ... normal readiness checks
    });
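The readiness-flip part of this pattern looks much the same in Python. The sketch below uses the standard `signal` and `threading` modules; the class `ShutdownFlag` and the function `readiness` are hypothetical names, and the drain delay before actually stopping the server is left to whatever server you run.

```python
import signal
import threading
from typing import Tuple


class ShutdownFlag:
    """Records that SIGTERM was received so readiness can start failing."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def handle_sigterm(self, signum, frame) -> None:
        # Signal handlers should do as little as possible: just set the flag.
        self._event.set()

    def install(self) -> None:
        # Must be called from the main thread.
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    @property
    def shutting_down(self) -> bool:
        return self._event.is_set()


def readiness(flag: ShutdownFlag, dependencies_ok: bool) -> Tuple[int, dict]:
    """Return (status_code, body) for the readiness endpoint."""
    if flag.shutting_down:
        return 503, {"status": "shutting_down"}
    if not dependencies_ok:
        return 503, {"status": "not_ready"}
    return 200, {"status": "ready"}
```

Call `flag.install()` once at startup; in-flight requests keep being served while the readiness endpoint reports 503.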

Common Health Check Mistakes

Checking Too Much

A health check that calls external payment APIs or runs complex queries makes your service dependent on those external systems for basic availability. If Stripe is down, should Kubernetes restart your containers? Probably not. Keep health checks focused on what's required to serve requests.

No Timeouts

Health checks without timeouts can hang indefinitely when dependencies are slow, causing the health check to appear failed and triggering unnecessary restarts or routing changes. Always set explicit timeouts.

Exposing Too Much Information

Detailed health responses on public endpoints leak infrastructure details. Use tiered endpoints: simple public check, detailed internal check.

Same Endpoint for Liveness and Readiness

Using the same endpoint for both means a database outage (which should only fail readiness) will also fail liveness — causing your pods to restart in a loop even though the application itself is fine. Always separate them.

Health Check Implementation Checklist

Key Takeaways

A well-implemented health check endpoint is invisible when everything is working — and invaluable when something breaks. It's the first thing your monitoring tool checks, the first thing your load balancer consults, and the first thing an on-call engineer looks at during an incident. Get it right once and it pays dividends forever.
