How do I monitor Kong API Gateway?

Kong exposes metrics via the Prometheus plugin (/metrics endpoint). Install the Prometheus plugin, scrape metrics into Prometheus/Grafana, and alert on kong_http_requests_total (filtered by status code) and kong_latency_bucket. The Kong Vitals feature (Enterprise) provides built-in dashboards. For cloud Kong, use Kong Konnect's built-in analytics.


API Gateway Monitoring: Complete Guide 2026

Q: What metrics should I monitor for an API gateway?

The essential API gateway metrics are: request rate (total requests per second), error rate (4xx and 5xx response percentages), latency (p50, p95, p99 response times), uptime/availability, and bandwidth consumption. Secondary metrics include cache hit rate, authentication failures, rate limit violations, and upstream service health.

Q: How do I monitor AWS API Gateway?

AWS API Gateway automatically pushes metrics to CloudWatch including Count, Latency, IntegrationLatency, 4XXError, 5XXError, and CacheHitCount. Enable detailed metrics in API Gateway settings for per-resource and per-method granularity. Set CloudWatch alarms on 5XXError rate and P99 latency exceeding thresholds. Use X-Ray for distributed tracing through your gateway to backend services.

Q: What is the difference between API gateway monitoring and API monitoring?

API gateway monitoring focuses on the proxy layer — request routing, rate limiting, authentication, and traffic management metrics. API monitoring focuses on the API endpoints themselves — response correctness, payload validation, uptime, and business logic behavior. You need both: the gateway tells you about infrastructure health, API monitoring tells you if the actual responses are correct.

Your API gateway is the front door to your backend services. Every request goes through it. When it degrades, every service degrades. This guide covers the metrics that matter, how to monitor the major gateways (Kong, AWS, Nginx), and how to build alerting that catches issues before your users do.

What is an API Gateway?

An API gateway is a reverse proxy that sits between clients and your backend services. It handles request routing, authentication, rate limiting, SSL termination, caching, and observability. Common gateways: AWS API Gateway, Kong, Nginx, Traefik, Envoy, Azure API Management, and Apigee.

The 8 Essential API Gateway Metrics

Request Rate

Must track

Total requests per second (RPS) passing through the gateway

Alert threshold: Alert on sudden drops (>50% decrease) or unexpected spikes (>3x baseline)

Why it matters: Sudden drops indicate upstream failures or routing issues; spikes indicate traffic surge or DDoS
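As a rough illustration of the drop/spike thresholds above, the check can be sketched as a comparison against a baseline. The function name and the way the baseline is obtained are assumptions for this example; in practice the baseline would come from your metrics store (for instance, the same hour last week).

```python
def classify_request_rate(current_rps: float, baseline_rps: float) -> str:
    """Return 'drop', 'spike', or 'normal' for the current request rate."""
    if baseline_rps <= 0:
        return "normal"  # no usable baseline yet, nothing to compare against
    ratio = current_rps / baseline_rps
    if ratio < 0.5:   # more than a 50% decrease from baseline
        return "drop"
    if ratio > 3.0:   # more than 3x baseline
        return "spike"
    return "normal"


print(classify_request_rate(40, 100))   # drop
print(classify_request_rate(350, 100))  # spike
print(classify_request_rate(90, 100))   # normal
```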

Error Rate (5xx)

Must track

Percentage of requests returning 5xx server errors

Alert threshold: Alert when 5xx rate exceeds 1% over 5 minutes

Why it matters: Gateway-level 5xx means upstream services are failing or gateway itself is misconfigured
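One way to evaluate "5xx rate exceeds 1% over 5 minutes" is a rolling window of per-minute counts. This is a minimal sketch: the class name and the one-minute bucket size are assumptions, and the counts would normally come from your gateway's metrics rather than being entered by hand.

```python
from collections import deque


class ErrorRateWindow:
    """Rolling 5xx rate over the last `size` one-minute buckets."""

    def __init__(self, size: int = 5):
        self.buckets = deque(maxlen=size)  # each item: (total, errors_5xx)

    def add_minute(self, total: int, errors_5xx: int) -> None:
        self.buckets.append((total, errors_5xx))

    def rate(self) -> float:
        total = sum(t for t, _ in self.buckets)
        errors = sum(e for _, e in self.buckets)
        return errors / total if total else 0.0

    def breached(self, threshold: float = 0.01) -> bool:
        # Only evaluate once the window is full, so one bad minute
        # right after startup does not trigger the alert.
        return len(self.buckets) == self.buckets.maxlen and self.rate() > threshold


window = ErrorRateWindow()
for total, errors in [(1000, 5), (1000, 8), (1000, 20), (1000, 15), (1000, 12)]:
    window.add_minute(total, errors)

print(f"{window.rate():.2%}", window.breached())  # 1.20% True
```

Dividing summed errors by summed requests weights busy minutes more heavily than quiet ones, which avoids a low-traffic minute with a single failure dominating the rate.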

Error Rate (4xx)

Must track

Percentage of requests returning 4xx client errors

Alert threshold: Alert when 4xx rate exceeds 10% (may indicate auth issues or client bugs)

Why it matters: High 401/403 rates suggest auth service problems; high 429 suggests rate limiting is too aggressive

Latency (P95/P99)

Must track

Response time at the 95th and 99th percentile

Alert threshold: Alert when P95 exceeds your SLA target (e.g., >500ms for most APIs)

Why it matters: P99 catches tail latency that averages hide — this is what your worst-case users experience
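To see why averages hide tail latency, here is a small sketch using the nearest-rank percentile method on made-up latency samples. The sample values are invented for illustration; monitoring systems usually estimate percentiles from histograms instead of raw samples.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: most requests are fast, a few are very slow.
latencies_ms = [40] * 90 + [300] * 8 + [2000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(mean_ms)                       # 100.0
print(percentile(latencies_ms, 50))  # 40
print(percentile(latencies_ms, 95))  # 300
print(percentile(latencies_ms, 99))  # 2000
```

The mean of 100 ms looks healthy, yet 1 in 50 requests in this sample takes two seconds, which only the P99 value exposes.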

Upstream Latency

Must track

Time spent waiting for the upstream backend to respond

Alert threshold: Alert when upstream latency exceeds gateway latency by >200% consistently

Why it matters: Isolates whether slowness is in the gateway layer or the backend service

Active Connections

Must track

Number of concurrent connections being handled

Alert threshold: Alert at 80% of configured max connections

Why it matters: Connection pool exhaustion causes new requests to fail immediately with connection refused

Cache Hit Rate

Must track

Percentage of responses served from gateway cache

Alert threshold: Alert if cache hit rate drops >20% from baseline

Why it matters: Sudden cache miss spikes increase load on backend services and increase latency

Rate Limit Violations

Must track

Count of requests rejected due to rate limiting (429 responses)

Alert threshold: Alert on unexpected spikes in rate limiting

Why it matters: Excessive rate limiting may indicate misconfigured limits or a client with a bug making too many calls


Monitoring by Gateway Type

AWS API Gateway

AWS API Gateway automatically sends metrics to CloudWatch. Enable detailed CloudWatch metrics in your API settings for per-resource and per-method granularity (default is aggregate only).

Key CloudWatch metrics:

Count, Latency, IntegrationLatency

4XXError, 5XXError

CacheHitCount, CacheMissCount

✓ Enable X-Ray tracing to trace requests through API Gateway to Lambda/ECS backends

✓ Set CloudWatch alarms on 5XXError and P99 Latency breaching SLA thresholds

✓ Use API Gateway access logs for per-request detail (log to CloudWatch Logs)

✓ Consider AWS Managed Grafana for dashboarding CloudWatch metrics
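The two alarms mentioned above could be described as parameter sets for CloudWatch's PutMetricAlarm call. This sketch only builds the parameters; the API name "orders-api", the stage "prod", and the thresholds are placeholders, and the helper function itself is hypothetical rather than part of any AWS SDK. For the 5XXError metric, the Average statistic represents the error rate.

```python
def api_gateway_alarm(api_name, stage, metric, threshold, *,
                      statistic=None, percentile=None,
                      period=60, evaluation_periods=5):
    """Build keyword arguments for CloudWatch put_metric_alarm (REST API metrics)."""
    if (statistic is None) == (percentile is None):
        raise ValueError("pass exactly one of statistic or percentile")
    params = {
        "AlarmName": f"{api_name}-{stage}-{metric}",
        "Namespace": "AWS/ApiGateway",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "ApiName", "Value": api_name},
            {"Name": "Stage", "Value": stage},
        ],
        "Period": period,                      # seconds per datapoint
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",    # no traffic is not an outage
    }
    if statistic is not None:
        params["Statistic"] = statistic          # e.g. "Average"
    else:
        params["ExtendedStatistic"] = percentile  # e.g. "p99"
    return params


# 5xx rate above 1%, and P99 latency above 500 ms (Latency is reported in ms).
error_alarm = api_gateway_alarm("orders-api", "prod", "5XXError", 0.01, statistic="Average")
latency_alarm = api_gateway_alarm("orders-api", "prod", "Latency", 500, percentile="p99")

# With boto3 installed and credentials configured, the alarm could be created with:
# boto3.client("cloudwatch").put_metric_alarm(**error_alarm)
```

In a real setup you would also add AlarmActions (for example an SNS topic ARN) so the alarm notifies someone.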

Kong Gateway

Kong exposes Prometheus metrics via the Prometheus plugin. Install it globally to get metrics for all services and routes.

# Enable Prometheus plugin globally
curl -X POST http://localhost:8001/plugins \
  --data "name=prometheus"

# Scrape endpoint
curl http://localhost:8001/metrics

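✓ Key metrics: kong_http_requests_total (by status code; named kong_http_status before Kong 3.0) and the latency histograms: kong_latency_bucket (by type: request/upstream/kong) on Kong 2.x, or kong_request_latency_ms, kong_upstream_latency_ms, and kong_kong_latency_ms on Kong 3.x. On Kong 3.x, status-code and latency metrics are off by default and must be enabled in the plugin config (status_code_metrics, latency_metrics)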

✓ Use Grafana with the official Kong dashboard (ID: 7424) for instant visualization

✓ Kong Vitals (Enterprise) adds built-in dashboards without Prometheus setup

✓ Alert on the kong_datastore_reachable gauge dropping to 0, which means the gateway can't reach its database
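To make the /metrics output more concrete, the sketch below parses a small sample of Prometheus text format and derives a 5xx ratio and the datastore check. The sample text is hand-written for illustration (the service and route names, label set, and values are assumptions, not captured from a real gateway), and the parser is deliberately simplified: it ignores timestamps and escaped characters beyond basic quoting.

```python
import re

SAMPLE = '''
# TYPE kong_http_requests_total counter
kong_http_requests_total{service="orders",route="orders-v1",code="200",source="service",consumer=""} 9400
kong_http_requests_total{service="orders",route="orders-v1",code="404",source="service",consumer=""} 450
kong_http_requests_total{service="orders",route="orders-v1",code="502",source="service",consumer=""} 150
# TYPE kong_datastore_reachable gauge
kong_datastore_reachable 1
'''

LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if not match:
            continue  # unsupported line shape (e.g. with a timestamp)
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def five_xx_ratio(samples):
    """Share of kong_http_requests_total with a 5xx status code."""
    total = errors = 0.0
    for name, labels, value in samples:
        if name == "kong_http_requests_total":
            total += value
            if labels.get("code", "").startswith("5"):
                errors += value
    return errors / total if total else 0.0


def datastore_reachable(samples):
    return any(n == "kong_datastore_reachable" and v == 1 for n, _, v in samples)


samples = parse_metrics(SAMPLE)
print(f"{five_xx_ratio(samples):.2%}")  # 1.50%
print(datastore_reachable(samples))     # True
```

These counters are cumulative since startup, so a real alert should compare two scrapes (which is what PromQL's rate() does) instead of using the raw totals as shown here.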

Nginx / Nginx Plus

Open-source Nginx provides basic metrics via stub_status. Nginx Plus adds a full JSON metrics API at /api/ with per-upstream metrics.

# nginx.conf: enable stub_status
location /nginx_status {
    stub_status;
    allow 127.0.0.1;
    deny all;
}

✓ Use nginx-prometheus-exporter to expose stub_status metrics to Prometheus

✓ For access log analysis: ship to Datadog, Grafana Loki, or Elastic for per-route metrics

✓ Nginx Plus /api/ endpoint gives per-upstream server health, active connections, and error counts

✓ Monitor worker process count: a sudden drop means Nginx workers are crashing and restarting
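For reference, the plain-text output of stub_status is simple enough to parse directly. The sketch below works on a sample response with made-up numbers; fetching it from /nginx_status over HTTP is left out, and the dictionary keys are names chosen for this example.

```python
import re

SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse the text returned by an Nginx stub_status location."""
    active = int(re.search(r"Active connections:\s+(\d+)", text).group(1))
    # The counters line is the only line made up of exactly three integers.
    accepts, handled, requests = map(
        int, re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE).groups()
    )
    reading, writing, waiting = map(
        int,
        re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text).groups(),
    )
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


status = parse_stub_status(SAMPLE)
dropped = status["accepts"] - status["handled"]  # > 0 means connections were dropped
print(status["active"], dropped)  # 291 0
```

A growing gap between accepts and handled usually points at a resource limit such as worker_connections being reached, which ties in with the active connections metric described earlier.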

Alerting Strategy: What to Alert On

Most teams either under-alert (miss real incidents) or over-alert (alert fatigue kills response times). Use this tiered approach:

P0 — Page immediately

• 5xx error rate > 5% for 2+ consecutive minutes
• Gateway uptime check failing (gateway unreachable)
• P99 latency > 5s for 5+ minutes
• Active connections > 95% of configured max

P1 — Alert during business hours

• 5xx error rate > 1% sustained over 10 minutes
• P95 latency > your SLA threshold (e.g., 500ms)
• Authentication failure rate > 20%
• Rate limit violations > 2x baseline

P2 — Daily digest / dashboard review

• Cache hit rate trending down week-over-week
• 4xx rate gradually increasing (could indicate client-side bug)
• Traffic volume anomalies (unexpected drops in off-hours)
• Upstream latency creeping up (early warning of backend degradation)
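The P0 and P1 rules above can be sketched as a small classifier over a snapshot of current gateway state. The field names and the snapshot structure are assumptions for this example; P2 items are trend-based and would be reviewed on dashboards, so they are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatewaySnapshot:
    reachable: bool                 # external uptime check result
    error_rate_5xx: float           # fraction, 0.0 to 1.0
    error_minutes: int              # minutes the current 5xx rate has been sustained
    p99_seconds: float
    p99_minutes: int                # minutes the current P99 has been sustained
    p95_ms: float
    connection_utilization: float   # active / configured max
    auth_failure_rate: float        # fraction, 0.0 to 1.0
    rate_limit_vs_baseline: float   # current 429 count / baseline count


def classify(s: GatewaySnapshot, sla_p95_ms: float = 500.0) -> Optional[str]:
    """Return 'P0', 'P1', or None using the tiered thresholds."""
    if (
        not s.reachable
        or (s.error_rate_5xx > 0.05 and s.error_minutes >= 2)
        or (s.p99_seconds > 5 and s.p99_minutes >= 5)
        or s.connection_utilization > 0.95
    ):
        return "P0"
    if (
        (s.error_rate_5xx > 0.01 and s.error_minutes >= 10)
        or s.p95_ms > sla_p95_ms
        or s.auth_failure_rate > 0.20
        or s.rate_limit_vs_baseline > 2.0
    ):
        return "P1"
    return None


healthy = GatewaySnapshot(True, 0.002, 0, 0.8, 0, 180.0, 0.40, 0.03, 1.0)
outage = GatewaySnapshot(True, 0.08, 3, 0.9, 0, 200.0, 0.50, 0.03, 1.0)
degraded = GatewaySnapshot(True, 0.02, 12, 0.9, 0, 200.0, 0.50, 0.03, 1.0)

print(classify(healthy), classify(outage), classify(degraded))  # None P0 P1
```

Checking P0 first means a snapshot that matches both tiers always pages, never downgrades to a business-hours alert.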


API Gateway Monitoring Tools Comparison

Tool	Best For	Gateway Support	Starting Price
Datadog	Enterprise full-stack observability	AWS, Kong, Nginx, Envoy, HAProxy	$15/host/mo
Better Stack	Uptime + log monitoring combo	Any (external health checks)	$24/mo
Grafana + Prometheus	Open-source, self-hosted	Kong, Nginx, Envoy, Traefik	Free (infra costs)
New Relic	APM + infrastructure monitoring	AWS API GW, Nginx, Kong	Free tier available
AWS CloudWatch	AWS API Gateway native monitoring	AWS API Gateway only	Pay per metric/alarm
Dynatrace	AI-powered anomaly detection	Nginx, Kong, AWS, Traefik	$69/host/mo

Frequently Asked Questions

What metrics should I monitor for an API gateway?

The essentials: request rate, 5xx error rate, 4xx error rate, P95/P99 latency, upstream latency, active connections, cache hit rate, and rate limit violations. Start with error rate and P99 latency: these are the two metrics that directly impact user experience.

How do I monitor AWS API Gateway?

Enable detailed CloudWatch metrics in API Gateway settings. Set up alarms on 5XXError (alert >1%) and Latency P99 (alert when exceeding your SLA). Enable X-Ray tracing to see full request traces through to Lambda/backends. Use API Gateway access logs for per-request detail.

What is the difference between API gateway monitoring and API monitoring?

Gateway monitoring covers infrastructure concerns: request routing, rate limiting, auth, and traffic management. API monitoring covers business logic: does the API return correct data, are responses valid, is the service doing what it should? Both are essential — use synthetic API tests (Better Stack, Checkly) alongside gateway metrics.

How do I reduce alert noise in API gateway monitoring?

Use anomaly detection instead of static thresholds for traffic-based alerts. Require 2+ consecutive breach minutes before alerting to filter transient spikes. Tier your alerts (P0/P1/P2) so only critical thresholds page on-call. Group related alerts to prevent alert storms during cascade failures.
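The "2+ consecutive breach minutes" rule can be sketched as a check on the streak of breaches ending at the most recent minute. The function names and the per-minute input list are assumptions for this example.

```python
def consecutive_breaches(per_minute_values, threshold):
    """Length of the breach streak ending at the most recent minute."""
    streak = 0
    for value in reversed(per_minute_values):
        if value <= threshold:
            break
        streak += 1
    return streak


def should_alert(per_minute_values, threshold, min_consecutive=2):
    return consecutive_breaches(per_minute_values, threshold) >= min_consecutive


# 5xx rate in percent, one value per minute, oldest first.
transient_spike = [0.2, 6.1, 0.3, 0.4]
sustained_issue = [0.2, 0.3, 6.1, 7.4]

print(should_alert(transient_spike, threshold=5.0))  # False
print(should_alert(sustained_issue, threshold=5.0))  # True
```

The single-minute spike is ignored because it recovered on its own, while the second series alerts as soon as the breach has lasted two minutes.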

Should I monitor my API gateway externally as well as internally?

Yes, always. Internal metrics tell you what's happening inside the gateway. External synthetic checks (from a monitoring service outside your network) tell you what your users actually experience. A gateway can report healthy internally while users see timeouts caused by network or DNS problems between them and the gateway.
