
gRPC Monitoring Guide: Metrics, Tracing & Observability (2026)

gRPC's binary protocol, HTTP/2 multiplexing, and streaming RPC patterns require different monitoring approaches than REST APIs. This guide covers how to instrument gRPC services with OpenTelemetry, expose Prometheus metrics via interceptors, trace distributed calls across polyglot services, and alert on gRPC status codes.

Updated April 2026 · 11 min read · gRPC / Microservices / Protobuf


TL;DR — gRPC Monitoring Checklist

  • ✅ Track grpc_server_handled_total by method and status code — your error rate signal
  • ✅ Alert on UNAVAILABLE (14) and INTERNAL (13) status codes
  • ✅ Add OpenTelemetry gRPC interceptors for distributed trace propagation
  • ✅ Monitor grpc_server_handling_seconds for latency histograms
  • ✅ Use grpc-health-probe for Kubernetes health checks
  • ✅ Track active stream count for server-streaming and bidirectional RPCs

gRPC vs REST: Monitoring Differences

gRPC's architecture creates monitoring challenges that don't exist with REST:

Binary protocol — no HTTP proxy visibility

gRPC uses Protobuf binary encoding over HTTP/2. Reverse proxies that speak HTTP/1.1 to the upstream cannot forward gRPC frames, so you need an HTTP/2-capable load balancer (Envoy, Traefik, NGINX with its gRPC module, or a cloud load balancer). Even then, connection-level metrics are misleading: a single HTTP/2 connection multiplexes many RPCs, so connection counts alone won't show per-RPC health.
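If you front gRPC with NGINX, the grpc_pass directive (NGINX 1.13.10+) proxies the HTTP/2 frames and the upstream trailer variables can surface the per-RPC gRPC status in access logs. A minimal sketch; the port, backend address, and log format name are placeholders for your deployment:

```nginx
# Log the per-RPC gRPC status taken from the upstream trailer
# ($upstream_trailer_* variables require NGINX 1.13.10+).
log_format grpc_log '$remote_addr "$request" '
                    'grpc_status=$upstream_trailer_grpc_status';

server {
    listen 50051 http2;          # plaintext HTTP/2 listener
    access_log /var/log/nginx/grpc.log grpc_log;

    location / {
        grpc_pass grpc://127.0.0.1:50052;   # backend address is a placeholder
    }
}
```

Logging the trailer value matters because, as discussed below, the HTTP-level status is 200 even when the RPC failed.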

Status codes vs HTTP codes

gRPC uses its own status codes (0-16), not HTTP status codes. An HTTP/2 response carries 200 OK even for gRPC errors; the actual result is in the grpc-status trailer. Your monitoring must decode gRPC trailers, not just HTTP status codes.

Streaming RPCs change latency semantics

For unary RPCs, latency = time from request to response. For server-streaming RPCs, latency = time to first message + stream duration. These are fundamentally different SLOs. A slow server-streaming RPC that sends messages for 5 minutes isn't "slow" in the same way a 5-minute unary RPC is. Define separate SLOs for stream duration and time-to-first-message.
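The split can be made concrete in PromQL. The first query below uses the grpc_type label that go-grpc-prometheus attaches; the time-to-first-message histogram is a hypothetical custom metric you would have to record yourself in a stream interceptor:

```promql
# p99 total handling time, server-streaming RPCs only
histogram_quantile(0.99,
  rate(grpc_server_handling_seconds_bucket{grpc_type="server_stream"}[5m]))

# p99 time-to-first-message; "myapp_stream_time_to_first_msg_seconds"
# is an assumed custom histogram, not a standard metric
histogram_quantile(0.99,
  rate(myapp_stream_time_to_first_msg_seconds_bucket[5m]))
```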

gRPC Status Codes Reference

| Code | Name | Meaning | Alert? |
|------|------|---------|--------|
| 0 | OK | Success | No |
| 1 | CANCELLED | Client cancelled the call | If spike |
| 2 | UNKNOWN | Unexpected server error | ⚠️ Yes |
| 3 | INVALID_ARGUMENT | Bad client request | Client bug |
| 4 | DEADLINE_EXCEEDED | Client timeout fired | ⚠️ Yes |
| 5 | NOT_FOUND | Resource missing | Usually OK |
| 8 | RESOURCE_EXHAUSTED | Rate limit / overload | 🚨 Critical |
| 13 | INTERNAL | Server-side panic or bug | 🚨 Critical |
| 14 | UNAVAILABLE | Server unreachable | 🚨 Critical |
| 16 | UNAUTHENTICATED | Auth failure | If spike |
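The table can be folded into a small lookup helper for classifying raw grpc-status values, for example when parsing them out of access logs. This is an illustrative stdlib-only sketch, not an API from any gRPC library; the severity buckets are our own labels:

```go
package main

import "fmt"

// Severity buckets matching the "Alert?" column above.
type Severity int

const (
	SeverityNone     Severity = iota // success or normal app behavior
	SeverityWatch                    // alert only on spikes
	SeverityWarn                     // investigate
	SeverityCritical                 // page immediately
)

// codeInfo maps numeric grpc-status values from the table to a name
// and an alerting severity.
var codeInfo = map[int]struct {
	Name     string
	Severity Severity
}{
	0:  {"OK", SeverityNone},
	1:  {"CANCELLED", SeverityWatch},
	2:  {"UNKNOWN", SeverityWarn},
	3:  {"INVALID_ARGUMENT", SeverityNone},
	4:  {"DEADLINE_EXCEEDED", SeverityWarn},
	5:  {"NOT_FOUND", SeverityNone},
	8:  {"RESOURCE_EXHAUSTED", SeverityCritical},
	13: {"INTERNAL", SeverityCritical},
	14: {"UNAVAILABLE", SeverityCritical},
	16: {"UNAUTHENTICATED", SeverityWatch},
}

// classify returns the name and severity for a code; codes not listed
// in the table above default to watch-only.
func classify(code int) (string, Severity) {
	if info, ok := codeInfo[code]; ok {
		return info.Name, info.Severity
	}
	return "UNLISTED", SeverityWatch
}

func main() {
	name, sev := classify(14)
	fmt.Println(name, sev == SeverityCritical) // UNAVAILABLE true
}
```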

Prometheus Metrics for gRPC (Go)

The go-grpc-prometheus library adds server and client interceptors that emit standard gRPC metrics:

// go get github.com/grpc-ecosystem/go-grpc-prometheus

package main

import (
    "net"
    "net/http"

    grpc_prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    "google.golang.org/grpc"
)

func main() {
    // Server setup with Prometheus interceptors
    srv := grpc.NewServer(
        grpc.UnaryInterceptor(grpc_prometheus.UnaryServerInterceptor),
        grpc.StreamInterceptor(grpc_prometheus.StreamServerInterceptor),
    )

    // Register your gRPC services (pb is your generated protobuf package)
    pb.RegisterMyServiceServer(srv, &MyService{})

    // Enable default metrics with histograms for latency
    grpc_prometheus.EnableHandlingTimeHistogram()

    // Initialize server metrics after registering services
    grpc_prometheus.Register(srv)

    // Expose Prometheus metrics on /metrics
    go func() {
        http.Handle("/metrics", promhttp.Handler())
        http.ListenAndServe(":9090", nil)
    }()

    // Serve gRPC on :50051
    lis, _ := net.Listen("tcp", ":50051")
    srv.Serve(lis)
}

// Client setup
conn, _ := grpc.Dial(
    "server:50051",
    grpc.WithUnaryInterceptor(grpc_prometheus.UnaryClientInterceptor),
    grpc.WithStreamInterceptor(grpc_prometheus.StreamClientInterceptor),
)

// This emits:
// grpc_server_handled_total{grpc_code, grpc_method, grpc_service, grpc_type}
// grpc_server_handling_seconds{grpc_method, grpc_service, grpc_type} histogram
// grpc_server_msg_received_total  — for streaming
// grpc_server_msg_sent_total

// Key alert rules:
// High error rate by status code:
// rate(grpc_server_handled_total{grpc_code!="OK"}[5m])
//   / rate(grpc_server_handled_total[5m]) > 0.01

// UNAVAILABLE spike (server down). Note: go-grpc-prometheus renders
// code labels in CamelCase, so match grpc_code="Unavailable":
// increase(grpc_server_handled_total{grpc_code="Unavailable"}[1m]) > 10

// Latency regression:
// histogram_quantile(0.99, rate(grpc_server_handling_seconds_bucket[5m])) > 1.0
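The alert expressions above can be packaged as Prometheus alerting rules. This is a sketch only: group names, alert names, `for` durations, and thresholds are illustrative, and go-grpc-prometheus renders code labels in CamelCase (grpc_code="Unavailable"):

```yaml
groups:
  - name: grpc-server.rules
    rules:
      - alert: GrpcHighErrorRate
        expr: |
          sum(rate(grpc_server_handled_total{grpc_code!="OK"}[5m]))
            / sum(rate(grpc_server_handled_total[5m])) > 0.01
        for: 5m
        labels: {severity: warning}

      - alert: GrpcUnavailableSpike
        expr: increase(grpc_server_handled_total{grpc_code="Unavailable"}[1m]) > 10
        labels: {severity: critical}

      - alert: GrpcP99LatencyHigh
        expr: |
          histogram_quantile(0.99,
            sum by (le, grpc_method) (rate(grpc_server_handling_seconds_bucket[5m]))) > 1.0
        for: 10m
        labels: {severity: warning}
```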

OpenTelemetry Distributed Tracing for gRPC

OpenTelemetry propagates trace context across gRPC calls automatically via metadata headers — even across language boundaries:

// Go: otelgrpc interceptors
// go get go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc

import "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"

// Server
srv := grpc.NewServer(
    grpc.StatsHandler(otelgrpc.NewServerHandler()),
)

// Client
conn, _ := grpc.Dial(
    "downstream:50051",
    grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
)

// The interceptors automatically:
// 1. Extract W3C Trace Context from incoming metadata (traceparent header)
// 2. Create a span for each RPC with rpc.method, rpc.service, rpc.grpc.status_code
// 3. Inject trace context into outgoing client calls
// 4. Record errors and status codes on spans

# Python: auto-instrumentation
# pip install opentelemetry-instrumentation-grpc
from opentelemetry.instrumentation.grpc import GrpcInstrumentorServer, GrpcInstrumentorClient

GrpcInstrumentorServer().instrument()   # instrument all grpc.server instances
GrpcInstrumentorClient().instrument()   # instrument all grpc.channel instances

# Java: OpenTelemetry agent (zero code change)
# java -javaagent:opentelemetry-javaagent.jar \
#   -Dotel.service.name=my-grpc-service \
#   -Dotel.exporter.otlp.endpoint=http://collector:4317 \
#   -jar my-service.jar

# What you see in traces:
# - Full RPC call tree across microservices
# - Exact method that failed (rpc.method="OrderService/PlaceOrder")
# - gRPC status code as span attribute
# - Time breakdown: client send, server process, server send, client receive


gRPC Health Protocol for Kubernetes

gRPC defines a standard health checking protocol (grpc.health.v1). Use grpc-health-probe for Kubernetes probes:

# Test health check manually
grpc-health-probe -addr=:50051

# Kubernetes deployment with gRPC health probes
spec:
  containers:
  - name: grpc-service
    livenessProbe:
      exec:
        command: ["/bin/grpc_health_probe", "-addr=:50051"]
      initialDelaySeconds: 10
      periodSeconds: 30
    readinessProbe:
      exec:
        command: ["/bin/grpc_health_probe", "-addr=:50051", "-service=MyService"]
      initialDelaySeconds: 5
      periodSeconds: 10

# Kubernetes 1.24+ supports native gRPC probes:
livenessProbe:
  grpc:
    port: 50051
    service: ""        # empty = overall server health

# Go: implement the health server
import "google.golang.org/grpc/health"
import healthpb "google.golang.org/grpc/health/grpc_health_v1"

healthServer := health.NewServer()
healthpb.RegisterHealthServer(srv, healthServer)

// Flip status based on dependency checks: SERVING when healthy...
healthServer.SetServingStatus("MyService", healthpb.HealthCheckResponse_SERVING)
// ...NOT_SERVING when a required dependency fails
healthServer.SetServingStatus("MyService", healthpb.HealthCheckResponse_NOT_SERVING)

FAQ

What metrics should I monitor for gRPC services?

Core gRPC metrics: grpc_server_handled_total (by method and status code), grpc_server_handling_seconds histogram (latency), grpc_server_msg_received_total and grpc_server_msg_sent_total (streaming throughput), and grpc_client_handled_total on clients. Alert on UNAVAILABLE (14), INTERNAL (13), and DEADLINE_EXCEEDED (4) error rates. Monitor active stream count for server-streaming and bidirectional RPCs.

How do I add OpenTelemetry tracing to gRPC?

Use the OTel gRPC instrumentation library for your language. In Go: otelgrpc.NewServerHandler() and otelgrpc.NewClientHandler() as gRPC StatsHandlers. In Python: GrpcInstrumentorServer().instrument() auto-instruments all grpc.server instances. In Java: use the OpenTelemetry agent (zero code change). OTel propagates W3C Trace Context via gRPC metadata across service boundaries automatically.

How do gRPC status codes map to monitoring alerts?

Alert on: INTERNAL (13) and UNAVAILABLE (14) — server errors, alert at >1% rate. RESOURCE_EXHAUSTED (8) — rate limiting or overload, alert immediately. DEADLINE_EXCEEDED (4) — client timeout, may indicate slow server or tight client timeout config. UNKNOWN (2) — unexpected server error. CANCELLED (1) and NOT_FOUND (5) are usually normal application behavior, not infrastructure alerts.

How do I monitor gRPC streaming RPCs?

Track grpc_server_msg_received_total and grpc_server_msg_sent_total per method for streaming throughput. Monitor the active stream count; a count that climbs without ever draining indicates stream leaks. Define separate SLOs for time-to-first-message vs total stream duration. RST_STREAM frames indicate abnormal stream termination; monitor them via Envoy access logs if you run a service mesh.

How do I debug gRPC UNAVAILABLE errors?

UNAVAILABLE (14) means the server is not accepting connections. Check: (1) server readiness probe is passing, (2) TLS cert CN/SAN matches client dial address, (3) max_concurrent_streams limit — raise it or scale horizontally, (4) load balancer supports HTTP/2, (5) use grpc-health-probe to test manually: grpc-health-probe -addr=:50051. Implement exponential backoff retry on UNAVAILABLE on the client side.

