API Health Check Endpoints: Complete Implementation Guide
A health check endpoint is one of the most valuable additions to any production API. It gives load balancers, orchestrators, and monitoring tools a single URL to determine whether your service is ready to handle traffic — and exactly what's wrong when it isn't. This guide covers what to build, how to build it, and how to monitor it effectively.
What Is a Health Check Endpoint?
A health check endpoint is a dedicated URL — typically /health, /healthz, or /status — that reports the current operational state of your service. When called, it:
- Verifies the service process is running
- Checks connectivity to critical dependencies (database, cache, message queue)
- Reports overall service status as a simple pass/fail or detailed breakdown
- Returns an appropriate HTTP status code (200 for healthy, 503 for unhealthy)
Every component in your infrastructure uses this endpoint. Load balancers use it to decide whether to route traffic. Kubernetes uses it to decide whether to restart containers. Monitoring tools use it to trigger alerts. Your on-call engineer uses it to quickly assess an incident.
Liveness vs Readiness vs Startup Probes
Kubernetes distinguishes three types of health probes — and even outside Kubernetes, this mental model is useful for structuring your health checks:
Liveness Probe: /healthz/live
Answers: "Is this service alive, or is it stuck in a broken state that requires a restart?"
Liveness checks should be fast and simple — they just verify the process is running and not deadlocked. Don't include database checks here. If a database goes down, you don't want Kubernetes to restart your service — the service itself is fine, just a dependency is unavailable.
If the liveness probe fails, Kubernetes restarts the container.
Readiness Probe: /healthz/ready
Answers: "Is this service ready to accept incoming traffic right now?"
Readiness checks can be more comprehensive. Check database connectivity, cache availability, and any other dependencies required to serve requests. If the readiness probe fails, Kubernetes removes the pod from the service's endpoints — traffic stops routing to it — but doesn't restart it.
Use readiness probes to:
- Signal when a service has finished warming up (loading caches, establishing connection pools)
- Temporarily remove a pod from rotation when a dependency is unavailable
- Implement graceful shutdown (fail readiness before terminating)
Startup Probe: /healthz/startup
Answers: "Has the application finished initializing?"
Used for slow-starting containers. While the startup probe is running, liveness and readiness probes are disabled; once it succeeds, Kubernetes switches over to them. This prevents a slow startup from triggering liveness restarts.
What to Include in Your Health Check
A production-grade health check should validate your service's ability to do its job. For most services, that means:
Required Checks
- Database connectivity: Can you connect and run a lightweight query (e.g., SELECT 1)?
- Service process: Is the process running? Is memory usage within bounds?
Recommended Checks
- Cache connectivity: Is Redis/Memcached reachable?
- Message queue: Is Kafka/RabbitMQ/SQS reachable?
- Critical external API: Can you reach a payment processor or auth service?
- Disk space: Is there enough free disk space for logs and temporary files?
Optional Metadata
- Version: Current service version or git SHA (useful during deploys)
- Uptime: How long the process has been running
- Environment: Production, staging, etc.
- Check durations: How long each dependency check took
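The database check above can be sketched in a few lines of Python. This example uses sqlite3 purely as a stand-in for your real driver; swap in your actual connection pool:

```python
import sqlite3

def check_database(conn: sqlite3.Connection) -> bool:
    """Run a lightweight SELECT 1 to confirm the connection is usable."""
    try:
        row = conn.execute("SELECT 1").fetchone()
        return row == (1,)
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
print(check_database(conn))  # → True
```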
Health Check Response Format
The response body should be JSON with a consistent structure that monitoring tools and humans can parse:
Return HTTP 200 for healthy, HTTP 503 for unhealthy or degraded. This lets load balancers and proxies act on the status without parsing the body.
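There is no single standard shape, but a response along these lines (field names are illustrative) covers the status, metadata, and per-check details discussed above:

```json
{
  "status": "healthy",
  "version": "1.4.2",
  "uptime_seconds": 86400,
  "checks": {
    "database": { "status": "pass", "duration_ms": 12 },
    "cache": { "status": "pass", "duration_ms": 3 }
  }
}
```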
Implementation Examples
Python (FastAPI)
Health Check Timeouts
Health check endpoints must have strict timeouts. A health check that takes 30 seconds is worse than no health check — it blocks the load balancer from making routing decisions.
- Total health check timeout: 5 seconds maximum
- Per-check timeout: 2 seconds per dependency check
- Database check timeout: 1 second; if the DB takes more than 1 second to respond to SELECT 1, it's effectively unavailable
Run dependency checks in parallel when possible to keep total health check time low:
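With asyncio, per-check timeouts and parallel execution look roughly like this. The sleep-based checks are placeholders for real SELECT 1 / PING calls:

```python
import asyncio

async def check_db() -> bool:
    await asyncio.sleep(0.01)   # placeholder for SELECT 1
    return True

async def check_cache() -> bool:
    await asyncio.sleep(0.01)   # placeholder for a Redis PING
    return True

async def run_health_checks(per_check_timeout: float = 2.0) -> dict:
    async def guarded(name, coro):
        try:
            return name, await asyncio.wait_for(coro, timeout=per_check_timeout)
        except Exception:  # a timeout or connection error counts as a failure
            return name, False
    # Run all checks concurrently: total time ≈ the slowest check, not the sum.
    results = await asyncio.gather(
        guarded("database", check_db()),
        guarded("cache", check_cache()),
    )
    return dict(results)

results = asyncio.run(run_health_checks())
print(results)  # → {'database': True, 'cache': True}
```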
Security Considerations
What to Expose Publicly vs Privately
Not all health check details should be publicly accessible. Consider two tiers:
- Public /health: Returns only { "status": "ok" } or HTTP 200/503. No dependency details. Safe for load balancers and CDN health probes.
- Internal /health/detailed: Returns full component status, versions, and dependency states. Requires authentication or IP allowlist (internal network only).
Don't Leak Internal Details
Error messages in health check responses can reveal your infrastructure topology. "connection refused: host postgres-primary.internal:5432" tells attackers what database you're using and your internal hostname. In public health endpoints, return generic error messages like "database unavailable" rather than the raw exception.
Rate Limit Health Endpoints
Health endpoints that run database queries on every call can be exploited for denial-of-service attacks. Add rate limiting (100 req/min is generous for legitimate monitors) and cache health check results for 5-10 seconds to avoid hammering your database with health check queries.
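A simple time-based cache implements the 5-10 second suggestion above. Names here are illustrative; run_checks stands in for whatever function executes your dependency checks:

```python
import time

CACHE_TTL_SECONDS = 5.0
_cache = {"result": None, "expires_at": 0.0}

def cached_health(run_checks) -> dict:
    """Return a cached health result, re-running checks at most once per TTL."""
    now = time.monotonic()
    if _cache["result"] is None or now >= _cache["expires_at"]:
        _cache["result"] = run_checks()
        _cache["expires_at"] = now + CACHE_TTL_SECONDS
    return _cache["result"]

calls = {"n": 0}
def fake_checks() -> dict:
    calls["n"] += 1
    return {"status": "ok"}

cached_health(fake_checks)
cached_health(fake_checks)
print(calls["n"])  # → 1 (second call served from cache)
```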
Kubernetes Configuration
Here's a complete Kubernetes probe configuration using separate liveness and readiness endpoints:
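A typical container spec fragment might look like the following. Paths, port, and thresholds are illustrative defaults; tune them for your app's startup and response characteristics:

```yaml
livenessProbe:
  httpGet:
    path: /healthz/live
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /healthz/ready
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 2
startupProbe:
  httpGet:
    path: /healthz/live
    port: 8080
  periodSeconds: 5
  failureThreshold: 30   # allows up to 150s of startup time
```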
Key settings:
- initialDelaySeconds: Wait before first probe (allows app to start)
- periodSeconds: How often to probe
- failureThreshold: How many consecutive failures before action
- timeoutSeconds: How long to wait for a probe response
Monitoring Health Endpoints from Outside
Health endpoints are most valuable when monitored continuously by an external system — not just your infrastructure. External monitoring catches issues your internal probes miss (network routing problems, CDN failures, geographic outages).
What to Monitor
- Availability: Is the health endpoint returning 200?
- Response time: Is the endpoint responding quickly? Slow health checks often precede full failures.
- Component status: Parse the JSON to alert on individual dependency failures
- Version changes: Track version field to verify successful deployments
Polling Frequency
Poll your health endpoint every 30-60 seconds from external monitors. This gives you a median time-to-detect of 15-30 seconds for outages, which is sufficient for most SLOs. Polling more frequently (every 10 seconds) is appropriate for high-availability systems, but generates more load.
For critical paths, set up monitoring from multiple geographic regions to distinguish regional outages from global ones.
Health Checks During Deployment
Health checks are critical during rolling deployments. The pattern:
- Start new pod: Startup probe runs until it passes
- Readiness probe passes: Pod is added to load balancer rotation
- Traffic shifts gradually: Old pod gets less traffic as new pod handles more
- Graceful shutdown begins on old pod: Readiness probe is deliberately failed to drain traffic
- Old pod terminates: After all in-flight requests complete
To implement graceful shutdown, listen for SIGTERM and immediately fail your readiness probe while continuing to handle in-flight requests:
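In Python, the flag flip on SIGTERM can be sketched like this; your readiness handler would return 503 whenever ready is False (the handler name and flag are illustrative):

```python
import signal

ready = True  # the readiness endpoint returns 200 while this is True

def handle_sigterm(signum, frame):
    """Flip readiness so the load balancer drains traffic before exit."""
    global ready
    ready = False
    # Keep serving in-flight requests; exit only after the drain period.

signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate receiving SIGTERM during shutdown:
handle_sigterm(signal.SIGTERM, None)
print(ready)  # → False
```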
Common Health Check Mistakes
Checking Too Much
A health check that calls external payment APIs or runs complex queries makes your service dependent on those external systems for basic availability. If Stripe is down, should Kubernetes restart your containers? Probably not. Keep health checks focused on what's required to serve requests.
No Timeouts
Health checks without timeouts can hang indefinitely when dependencies are slow, causing the health check to appear failed and triggering unnecessary restarts or routing changes. Always set explicit timeouts.
Exposing Too Much Information
Detailed health responses on public endpoints leak infrastructure details. Use tiered endpoints: simple public check, detailed internal check.
Same Endpoint for Liveness and Readiness
Using the same endpoint for both means a database outage (which should only fail readiness) will also fail liveness — causing your pods to restart in a loop even though the application itself is fine. Always separate them.
Health Check Implementation Checklist
- ☐ /health returns 200/503 with JSON status body
- ☐ Separate /healthz/live and /healthz/ready endpoints
- ☐ Database connectivity check with 1-2 second timeout
- ☐ Cache connectivity check (if used)
- ☐ Checks run in parallel, not sequentially
- ☐ Total health check timeout < 5 seconds
- ☐ Results cached for 5-10 seconds to prevent DB hammering
- ☐ Sensitive details require authentication or internal network only
- ☐ Version/build info included for deployment verification
- ☐ Graceful shutdown fails readiness probe before terminating
- ☐ External monitoring polling the endpoint every 30-60 seconds
- ☐ Alerts configured for 503 responses and slow response times
Key Takeaways
- Health check endpoints are the foundation of reliable deployments and infrastructure automation
- Separate liveness (is the process alive?) from readiness (can it serve traffic?)
- Check only what's required to serve requests, not every external dependency
- Run checks in parallel with strict timeouts (2 seconds per check, 5 seconds total)
- Return 200 for healthy, 503 for unhealthy — not 500
- Cache health check results to avoid hammering your database
- Use tiered endpoints: simple public check, detailed internal check
- Monitor from external systems — internal probes alone don't catch network routing failures
A well-implemented health check endpoint is invisible when everything is working — and invaluable when something breaks. It's the first thing your monitoring tool checks, the first thing your load balancer consults, and the first thing an on-call engineer looks at during an incident. Get it right once and it pays dividends forever.