gRPC Monitoring Guide: Metrics, Tracing & Observability (2026)
gRPC's binary protocol, HTTP/2 multiplexing, and streaming RPC patterns require different monitoring approaches than REST APIs. This guide covers how to instrument gRPC services with OpenTelemetry, expose Prometheus metrics via interceptors, trace distributed calls across polyglot services, and alert on gRPC status codes.
📡 Monitor your APIs — know when they go down before your users do
Better Stack checks uptime every 30 seconds with instant Slack, email & SMS alerts. Free tier available.
Affiliate link — we may earn a commission at no extra cost to you
TL;DR — gRPC Monitoring Checklist
- ✅ Track grpc_server_handled_total by method and status code — your error rate signal
- ✅ Alert on UNAVAILABLE (14) and INTERNAL (13) status codes
- ✅ Add OpenTelemetry gRPC interceptors for distributed trace propagation
- ✅ Monitor grpc_server_handling_seconds for latency histograms
- ✅ Use grpc-health-probe for Kubernetes health checks
- ✅ Track active stream count for server-streaming and bidirectional RPCs
gRPC vs REST: Monitoring Differences
gRPC's architecture creates monitoring challenges that don't exist with REST:
Binary protocol — no HTTP proxy visibility
gRPC uses Protobuf binary encoding over HTTP/2. Reverse proxies speaking HTTP/1.1 can't forward gRPC frames at all, so you need an HTTP/2-capable load balancer (Envoy, Traefik, NGINX 1.13.10+ with grpc_pass, or a cloud load balancer). Even then, a single HTTP/2 connection multiplexes many RPCs, so connection-level metrics alone won't show per-RPC health — you need RPC-level instrumentation.
Status codes vs HTTP codes
gRPC uses its own status codes (0-16), not HTTP status codes. The HTTP/2 response typically carries 200 OK even when the RPC failed — the actual error travels in the grpc-status trailer. Your monitoring must decode gRPC trailers, not just HTTP status codes.
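In Go, for example, a client reads the real status out of the returned error rather than the transport layer — a minimal sketch (client and PlaceOrder are placeholders for your generated stub):
// Sketch: extracting the gRPC status from a call error.
// imports: log, google.golang.org/grpc/codes, google.golang.org/grpc/status
resp, err := client.PlaceOrder(ctx, req)
if err != nil {
	st, _ := status.FromError(err) // decodes the grpc-status trailer, not the HTTP code
	log.Printf("rpc failed: grpc-status=%s msg=%q", st.Code(), st.Message())
	if st.Code() == codes.Unavailable {
		// e.g. increment a client-side error counter here
	}
}
_ = resp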
Streaming RPCs change latency semantics
For unary RPCs, latency is simply the time from request to response. For server-streaming RPCs, two numbers matter: time to first message and total stream duration. These are fundamentally different SLOs — a server-streaming RPC that sends messages for 5 minutes isn't "slow" the way a 5-minute unary RPC is. Define separate SLOs for time-to-first-message and stream duration (a client-side sketch for measuring the former follows).
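One way to measure time-to-first-message on the client is a stream interceptor that times the first RecvMsg — a minimal Go sketch (the histogram name is our own, not a standard metric):
// Sketch: time-to-first-message for streaming RPCs, observed client-side.
// imports: context, sync, time, google.golang.org/grpc,
//          github.com/prometheus/client_golang/prometheus
var ttfm = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{Name: "grpc_client_time_to_first_msg_seconds"},
	[]string{"grpc_method"},
) // remember prometheus.MustRegister(ttfm)

func ttfmInterceptor(ctx context.Context, desc *grpc.StreamDesc, cc *grpc.ClientConn,
	method string, streamer grpc.Streamer, opts ...grpc.CallOption) (grpc.ClientStream, error) {
	cs, err := streamer(ctx, desc, cc, method, opts...)
	if err != nil {
		return nil, err
	}
	return &timedStream{ClientStream: cs, start: time.Now(), method: method}, nil
}

type timedStream struct {
	grpc.ClientStream
	start  time.Time
	method string
	once   sync.Once
}

func (s *timedStream) RecvMsg(m interface{}) error {
	err := s.ClientStream.RecvMsg(m)
	if err == nil {
		// Fires once, on the stream's first message.
		s.once.Do(func() { ttfm.WithLabelValues(s.method).Observe(time.Since(s.start).Seconds()) })
	}
	return err
}
// Install with grpc.WithStreamInterceptor(ttfmInterceptor) on the ClientConn.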
gRPC Status Codes Reference
| Code | Name | Meaning | Alert? |
|---|---|---|---|
| 0 | OK | Success | — |
| 1 | CANCELLED | Client cancelled the call | If spike |
| 2 | UNKNOWN | Unexpected server error | ⚠️ Yes |
| 3 | INVALID_ARGUMENT | Bad client request | Client bug |
| 4 | DEADLINE_EXCEEDED | Client timeout fired | ⚠️ Yes |
| 5 | NOT_FOUND | Resource missing | Usually OK |
| 8 | RESOURCE_EXHAUSTED | Rate limit / overload | 🚨 Critical |
| 13 | INTERNAL | Server-side panic or bug | 🚨 Critical |
| 14 | UNAVAILABLE | Server unreachable | 🚨 Critical |
| 16 | UNAUTHENTICATED | Auth failure | If spike |
Monitor your gRPC service endpoints with Better Stack
Better Stack runs uptime checks on your gRPC health endpoints — with on-call alerting when services go UNAVAILABLE.
Try Better Stack Free →
Prometheus Metrics for gRPC (Go)
The go-grpc-prometheus library adds server and client interceptors that emit standard gRPC metrics:
// go get github.com/grpc-ecosystem/go-grpc-prometheus
package main
import (
	"log"
	"net"
	"net/http"

	grpc_prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"google.golang.org/grpc"
)
func main() {
// Server setup with Prometheus interceptors
srv := grpc.NewServer(
grpc.UnaryInterceptor(grpc_prometheus.UnaryServerInterceptor),
grpc.StreamInterceptor(grpc_prometheus.StreamServerInterceptor),
)
// Register your gRPC services
pb.RegisterMyServiceServer(srv, &MyService{})
// Enable default metrics with histograms for latency
grpc_prometheus.EnableHandlingTimeHistogram()
// Initialize server metrics after registering services
grpc_prometheus.Register(srv)
// Expose Prometheus metrics on /metrics
go func() {
http.Handle("/metrics", promhttp.Handler())
http.ListenAndServe(":9090", nil)
}()
// Serve gRPC on :50051
lis, err := net.Listen("tcp", ":50051")
if err != nil {
	log.Fatalf("failed to listen: %v", err)
}
srv.Serve(lis)
}
// Client setup — transport credentials are required by grpc.Dial
// (insecure shown for brevity; import "google.golang.org/grpc/credentials/insecure")
conn, _ := grpc.Dial(
	"server:50051",
	grpc.WithTransportCredentials(insecure.NewCredentials()),
	grpc.WithUnaryInterceptor(grpc_prometheus.UnaryClientInterceptor),
	grpc.WithStreamInterceptor(grpc_prometheus.StreamClientInterceptor),
)
// This emits:
// grpc_server_handled_total{grpc_code, grpc_method, grpc_service, grpc_type}
// grpc_server_handling_seconds{grpc_method, grpc_service, grpc_type} histogram
// grpc_server_msg_received_total — for streaming
// grpc_server_msg_sent_total
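// go-grpc-prometheus has no built-in "active streams" gauge, but you can
// derive one in PromQL from started minus completed RPCs (label values are
// the library's grpc_type values):
//   sum(grpc_server_started_total{grpc_type=~"server_stream|bidi_stream"})
//     - sum(grpc_server_handled_total{grpc_type=~"server_stream|bidi_stream"})
// A value that only ever grows points to leaked streams.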
// Key alert rules:
// High error rate (note: sum() before dividing, or the grpc_code labels won't match):
//   sum(rate(grpc_server_handled_total{grpc_code!="OK"}[5m]))
//     / sum(rate(grpc_server_handled_total[5m])) > 0.01
// UNAVAILABLE spike — measure on clients: a server that is unreachable never
// records its own failures. Label values use Go code names ("Unavailable"):
//   increase(grpc_client_handled_total{grpc_code="Unavailable"}[1m]) > 10
// Latency regression:
//   histogram_quantile(0.99, rate(grpc_server_handling_seconds_bucket[5m])) > 1.0
OpenTelemetry Distributed Tracing for gRPC
OpenTelemetry propagates trace context across gRPC calls automatically via metadata headers — even across language boundaries:
// Go: otelgrpc interceptors
// go get go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
import "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
// Server
srv := grpc.NewServer(
grpc.StatsHandler(otelgrpc.NewServerHandler()),
)
// Client
conn, _ := grpc.Dial(
	"downstream:50051",
	grpc.WithTransportCredentials(insecure.NewCredentials()), // required; insecure for brevity
	grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
)
// The interceptors automatically:
// 1. Extract W3C Trace Context from incoming metadata (traceparent header)
// 2. Create a span for each RPC with rpc.method, rpc.service, rpc.grpc.status_code
// 3. Inject trace context into outgoing client calls
// 4. Record errors and status codes on spans
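The otelgrpc handlers are no-ops until a global tracer provider is installed. A minimal sketch of the SDK wiring, assuming an OTLP collector reachable at collector:4317:
// go get go.opentelemetry.io/otel go.opentelemetry.io/otel/sdk \
//   go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/propagation"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func initTracing(ctx context.Context) (*sdktrace.TracerProvider, error) {
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("collector:4317"), // example endpoint
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
	otel.SetTracerProvider(tp)
	// The global propagator defaults to a no-op — set W3C Trace Context explicitly.
	otel.SetTextMapPropagator(propagation.TraceContext{})
	return tp, nil // call tp.Shutdown(ctx) on exit to flush spans
}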
// Python: auto-instrumentation
# pip install opentelemetry-instrumentation-grpc
from opentelemetry.instrumentation.grpc import GrpcInstrumentorServer, GrpcInstrumentorClient
GrpcInstrumentorServer().instrument() # instrument all grpc.server instances
GrpcInstrumentorClient().instrument() # instrument all grpc.channel instances
# Java: OpenTelemetry agent (zero code change)
# java -javaagent:opentelemetry-javaagent.jar \
# -Dotel.service.name=my-grpc-service \
# -Dotel.exporter.otlp.endpoint=http://collector:4317 \
# -jar my-service.jar
# What you see in traces:
# - Full RPC call tree across microservices
# - Exact method that failed (rpc.method="OrderService/PlaceOrder")
# - gRPC status code as span attribute
# - Time breakdown: client send, server process, server send, client receive
Alert Pro
14-day free trial
Stop checking — get alerted instantly
Next time one of your gRPC microservices goes down, you'll know in under 60 seconds — not when your users start complaining.
- Email alerts for your gRPC microservices + 9 more APIs
- $0 due today for trial
- Cancel anytime — $9/mo after trial
gRPC Health Protocol for Kubernetes
gRPC defines a standard health checking protocol (grpc.health.v1). Use grpc-health-probe for Kubernetes probes:
# Test health check manually
grpc-health-probe -addr=:50051
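# on success the probe prints: status: SERVING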
# Kubernetes deployment with gRPC health probes
spec:
containers:
- name: grpc-service
livenessProbe:
exec:
command: ["/bin/grpc_health_probe", "-addr=:50051"]
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
exec:
command: ["/bin/grpc_health_probe", "-addr=:50051", "-service=MyService"]
initialDelaySeconds: 5
periodSeconds: 10
# Kubernetes 1.24+ supports native gRPC probes (GA in 1.27):
livenessProbe:
grpc:
port: 50051
service: "" # empty = overall server health
# Go: implement the health server
import "google.golang.org/grpc/health"
import healthpb "google.golang.org/grpc/health/grpc_health_v1"
healthServer := health.NewServer()
healthpb.RegisterHealthServer(srv, healthServer)
// Flip the status as dependency checks pass or fail:
healthServer.SetServingStatus("MyService", healthpb.HealthCheckResponse_SERVING)     // dependencies healthy
healthServer.SetServingStatus("MyService", healthpb.HealthCheckResponse_NOT_SERVING) // e.g. database unreachable
FAQ
What metrics should I monitor for gRPC services?
Core gRPC metrics: grpc_server_handled_total (by method and status code), grpc_server_handling_seconds histogram (latency), grpc_server_msg_received_total and grpc_server_msg_sent_total (streaming throughput), and grpc_client_handled_total on clients. Alert on UNAVAILABLE (14), INTERNAL (13), and DEADLINE_EXCEEDED (4) error rates. Monitor active stream count for server-streaming and bidirectional RPCs.
How do I add OpenTelemetry tracing to gRPC?
Use the OTel gRPC instrumentation library for your language. In Go: otelgrpc.NewServerHandler() and otelgrpc.NewClientHandler() as gRPC StatsHandlers. In Python: GrpcInstrumentorServer().instrument() auto-instruments all grpc.server instances. In Java: use the OpenTelemetry agent (zero code change). OTel propagates W3C Trace Context via gRPC metadata across service boundaries automatically.
How do gRPC status codes map to monitoring alerts?
Alert on: INTERNAL (13) and UNAVAILABLE (14) — server errors, alert at >1% rate. RESOURCE_EXHAUSTED (8) — rate limiting or overload, alert immediately. DEADLINE_EXCEEDED (4) — client timeout, may indicate slow server or tight client timeout config. UNKNOWN (2) — unexpected server error. CANCELLED (1) and NOT_FOUND (5) are usually normal application behavior, not infrastructure alerts.
How do I monitor gRPC streaming RPCs?
Track grpc_server_msg_received_total and grpc_server_msg_sent_total per method for streaming throughput. Monitor active stream count (monotonically increasing count indicates stream leaks). Define separate SLOs for time-to-first-message vs total stream duration. RST_STREAM frames indicate abnormal stream termination — monitor via Envoy access logs if using a service mesh.
How do I debug gRPC UNAVAILABLE errors?
UNAVAILABLE (14) means the server is not accepting connections. Check: (1) server readiness probe is passing, (2) TLS cert CN/SAN matches client dial address, (3) max_concurrent_streams limit — raise it or scale horizontally, (4) load balancer supports HTTP/2, (5) use grpc-health-probe to test manually: grpc-health-probe -addr=:50051. Implement exponential backoff retry on UNAVAILABLE on the client side.
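Client-side retries on UNAVAILABLE can use gRPC's built-in retry policy via a default service config — a minimal sketch ("MyService" is a placeholder for your fully-qualified service name):
// Sketch: exponential backoff retry on UNAVAILABLE via service config.
// import "google.golang.org/grpc/credentials/insecure"
conn, _ := grpc.Dial("server:50051",
	grpc.WithTransportCredentials(insecure.NewCredentials()), // insecure for brevity
	grpc.WithDefaultServiceConfig(`{
	  "methodConfig": [{
	    "name": [{"service": "MyService"}],
	    "retryPolicy": {
	      "maxAttempts": 4,
	      "initialBackoff": "0.1s",
	      "maxBackoff": "2s",
	      "backoffMultiplier": 2.0,
	      "retryableStatusCodes": ["UNAVAILABLE"]
	    }
	  }]
	}`),
)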
🛠 Tools We Use & Recommend
Tested across our own infrastructure monitoring 200+ APIs daily
Uptime Monitoring & Incident Management
Used by 100,000+ websites
Monitors your APIs every 30 seconds. Instant alerts via Slack, email, SMS, and phone calls when something goes down.
“We use Better Stack to monitor every API on this site. It caught 23 outages last month before users reported them.”
Secrets Management & Developer Security
Trusted by 150,000+ businesses
Manage API keys, database passwords, and service tokens with CLI integration and automatic rotation.
“After covering dozens of outages caused by leaked credentials, we recommend every team use a secrets manager.”
Automated Personal Data Removal
Removes data from 350+ brokers
Removes your personal data from 350+ data broker sites. Protects against phishing and social engineering attacks.
“Service outages sometimes involve data breaches. Optery keeps your personal info off the sites attackers use first.”
AI Voice & Audio Generation
Used by 1M+ developers
Text-to-speech, voice cloning, and audio AI for developers. Build voice features into your apps with a simple API.
“The best AI voice API we've tested — natural-sounding speech with low latency. Essential for any app adding voice features.”
SEO & Site Performance Monitoring
Used by 10M+ marketers
Track your site health, uptime, search rankings, and competitor movements from one dashboard.
“We use SEMrush to track how our API status pages rank and catch site health issues early.”
Related Guides
Microservices Monitoring Guide
Service mesh, distributed tracing, and inter-service health.
Distributed Tracing Guide 2026
Jaeger, Zipkin, Tempo — end-to-end tracing for microservices.
OpenTelemetry Guide 2026
OTel collectors, exporters, and backend setup.
Kubernetes Monitoring Guide
Pod metrics, HPA alerting, and cluster observability.