
Redis Monitoring Guide: Key Metrics, Alerts & Tools (2026)

Redis failures are silent killers. Memory fills up, evictions spike, your cache hit ratio tanks — and your database suddenly gets 10× the load. This guide covers the metrics that matter, how to instrument them, and how to set alerts before your application degrades.

Updated April 2026 · 14 min read · SRE / Backend Engineering

TL;DR — Redis Monitoring Checklist

  • ✅ Monitor used_memory vs maxmemory — alert at 80%
  • ✅ Track the evicted_keys rate: the counter is cumulative, so alert whenever it increases
  • ✅ Calculate cache hit ratio: hits / (hits + misses) — alert below 90%
  • ✅ Monitor replication lag — alert above 1MB offset gap
  • ✅ Enable slow log (slowlog-log-slower-than 10000) to catch bad commands
  • ✅ Watch connected_clients — a spike usually means a connection leak
  • ✅ Use redis-exporter for Prometheus/Grafana integration

Why Redis Monitoring Is Different

Redis executes commands on a single thread and keeps its data in RAM. This makes it extremely fast, and extremely unforgiving when you get the configuration wrong. In a disk-backed database a slow query mostly slows down that one query; a Redis problem cascades:

1. Memory fills up → evictions start
   Redis deletes cache keys to free memory. Your hit ratio drops.

2. Hit ratio drops → database gets slammed
   Every cache miss hits the backing database. Load spikes 5-20×.

3. Database overwhelmed → application latency spikes
   DB can't keep up, queries queue, response times blow up.

4. Application latency → timeout errors → users see failures
   What started as Redis running out of memory ends as a user-visible outage.

Good Redis monitoring catches problems at step 1 — before the cascade. The goal is alerting on memory pressure and evictions, not waiting to see database CPU spike.

Core Redis Metrics

Memory Metrics

| Metric | Command | What It Means | Alert Threshold |
| --- | --- | --- | --- |
| used_memory | INFO memory | Bytes allocated by Redis for data | > 80% of maxmemory |
| used_memory_rss | INFO memory | Bytes allocated by the OS (includes fragmentation) | used_memory_rss / used_memory > 1.5 (fragmentation) |
| evicted_keys | INFO stats | Keys deleted to free memory (cumulative) | Rate > 0/sec (any eviction) |
| mem_fragmentation_ratio | INFO memory | used_memory_rss / used_memory: overhead from fragmentation | > 1.5 warning, > 2.0 critical |

Cache Performance Metrics

| Metric | Command | What It Means | Target |
| --- | --- | --- | --- |
| keyspace_hits | INFO stats | Successful key lookups | As high as possible |
| keyspace_misses | INFO stats | Failed key lookups (key not in cache) | Alert if miss rate > 10% |
| hit_rate | Calculated | hits / (hits + misses) × 100 | Alert below 90% |
| expired_keys | INFO stats | Keys removed by TTL expiration (normal) | Normal, just monitor the trend |
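
keyspace_hits and keyspace_misses are cumulative counters, so a ratio taken from the raw totals is a lifetime average that reacts slowly to a sudden drop. A minimal sketch of a windowed ratio computed from two INFO stats samples follows. The helper names parse_info and hit_ratio are hypothetical, and the sample values are made up for illustration.

```python
from typing import Dict, Optional


def parse_info(text: str) -> Dict[str, str]:
    """Parse key:value lines from INFO output, skipping blanks and # headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def hit_ratio(previous: Dict[str, str], current: Dict[str, str]) -> Optional[float]:
    """Hit ratio for the window between two INFO stats samples.

    Returns None when there were no lookups in the window, or when the
    counters went backwards (restart or CONFIG RESETSTAT between samples).
    """
    hits = int(current["keyspace_hits"]) - int(previous["keyspace_hits"])
    misses = int(current["keyspace_misses"]) - int(previous["keyspace_misses"])
    if hits < 0 or misses < 0 or hits + misses == 0:
        return None
    return hits / (hits + misses)


# Illustrative samples taken some seconds apart (values are made up).
earlier = parse_info("keyspace_hits:9000\nkeyspace_misses:1000")
later = parse_info("keyspace_hits:9850\nkeyspace_misses:1150")
ratio = hit_ratio(earlier, later)
print(f"windowed hit ratio: {ratio:.1%}")  # 850 hits out of 1000 lookups
```

In this example the lifetime totals in the second sample still give roughly 89.5%, while the window shows 85%, which is why alerting on the windowed value tends to catch a degradation sooner.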

Connection & Throughput Metrics

| Metric | Command | What It Means | Alert Threshold |
| --- | --- | --- | --- |
| connected_clients | INFO clients | Active client connections right now | > maxclients × 0.9 |
| blocked_clients | INFO clients | Clients waiting on BLPOP/BRPOP/WAIT | Unexpected spike > baseline |
| instantaneous_ops_per_sec | INFO stats | Commands processed per second | Alert on 50%+ drop from baseline |
| rejected_connections | INFO stats | Connections rejected because maxclients reached | Any non-zero value |

The INFO Command — Your First Diagnostic Tool

redis-cli INFO is the fastest way to see Redis health. Run it with a section name for focused output:

# Get all stats
redis-cli INFO

# Memory section only
redis-cli INFO memory

# Stats section (hits, misses, evictions, connections)
redis-cli INFO stats

# Replication section
redis-cli INFO replication

# Keyspace section (key counts per DB)
redis-cli INFO keyspace

# Example memory output:
# used_memory:1234567890
# used_memory_human:1.15G
# used_memory_rss:1456789012
# mem_fragmentation_ratio:1.18
# maxmemory:2147483648
# maxmemory_human:2.00G
# maxmemory_policy:allkeys-lru

Pro tip: Run redis-cli --stat for a live rolling view, one line per second, of key count, used memory, connected and blocked clients, total requests (with the change since the previous line), and connections. It is useful for watching trends in real time during an incident.
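
The memory fields in the example output above can be turned into the two checks from the metrics table with a few lines of Python. This is a sketch under stated assumptions: memory_report is a hypothetical helper, the 80% and 1.5 thresholds are the ones suggested in this guide, and the text would normally come from redis-cli INFO memory rather than a literal string.

```python
from typing import Dict, Optional

# Mirrors the example INFO memory output shown in this section.
SAMPLE = """\
# Memory
used_memory:1234567890
used_memory_rss:1456789012
mem_fragmentation_ratio:1.18
maxmemory:2147483648
maxmemory_policy:allkeys-lru
"""


def memory_report(info_text: str, warn_at: float = 0.80) -> Dict[str, object]:
    """Compute memory usage and fragmentation from INFO memory text."""
    info = {}
    for line in info_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition(":")
            info[key] = value

    used = int(info["used_memory"])
    rss = int(info["used_memory_rss"])
    limit = int(info["maxmemory"])

    # maxmemory:0 means "no limit", so a usage ratio is undefined there.
    usage: Optional[float] = used / limit if limit > 0 else None
    fragmentation = rss / used
    return {
        "usage": usage,
        "fragmentation": fragmentation,
        "memory_alert": usage is not None and usage > warn_at,
        "fragmentation_alert": fragmentation > 1.5,
    }


report = memory_report(SAMPLE)
print(report)
```

For the sample values, usage is a little over 57% of maxmemory and fragmentation is about 1.18, so neither check fires.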

Prometheus Setup with redis-exporter

The oliver006/redis_exporter is the standard Prometheus exporter for Redis. It scrapes INFO and exposes 100+ metrics on port 9121.

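# Docker Compose example (docker-compose.yml)
version: '3'
services:
  # Minimal Redis service so that depends_on resolves; replace with your own
  redis:
    image: redis:7
    command: ["redis-server", "--requirepass", "${REDIS_PASSWORD}"]
  redis-exporter:
    image: oliver006/redis_exporter:v1.62
    environment:
      REDIS_ADDR: "redis://redis:6379"
      REDIS_PASSWORD: "${REDIS_PASSWORD}"
    ports:
      - "9121:9121"
    depends_on:
      - redis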

# Prometheus scrape config
scrape_configs:
  - job_name: 'redis'
    static_configs:
      - targets: ['localhost:9121']
    scrape_interval: 15s

# Key metrics exposed:
# redis_memory_used_bytes
# redis_memory_max_bytes
# redis_keyspace_hits_total
# redis_keyspace_misses_total
# redis_evicted_keys_total
# redis_connected_clients
# redis_blocked_clients
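# redis_master_repl_offset
# redis_connected_slave_offset_bytes (per replica; subtract from the master offset for lag in bytes)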
# redis_up (1 = healthy)

Essential Prometheus Alert Rules

groups:
  - name: redis
    rules:
      # Redis is down
      - alert: RedisDown
        expr: redis_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Redis instance is not responding"

      # Memory pressure (skips instances where maxmemory is 0, meaning no limit)
      - alert: RedisMemoryHigh
        expr: |
          redis_memory_used_bytes / redis_memory_max_bytes > 0.8
          and redis_memory_max_bytes > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis memory usage above 80%"

      # Evictions happening (cache too small)
      - alert: RedisEvictions
        expr: rate(redis_evicted_keys_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis is evicting keys — cache may be undersized"

      # Cache hit ratio below 90%
      - alert: RedisLowHitRate
        expr: |
          rate(redis_keyspace_hits_total[5m]) /
          (rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m])) < 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Redis cache hit rate below 90%"

      # Connection limit approaching
      - alert: RedisTooManyConnections
        expr: redis_connected_clients > (redis_config_maxclients * 0.9)
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis approaching max client connections"

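      # Replication lag: byte gap between the primary's offset and each replica's offset.
      # Metric names assume a recent redis_exporter release; confirm them on your /metrics page.
      - alert: RedisReplicationLag
        expr: |
          (redis_master_repl_offset - on(instance) group_right() redis_connected_slave_offset_bytes)
            > 1048576  # 1MB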
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis replica is lagging behind primary"

Slow Query Log — Finding Bad Commands

Redis is single-threaded. One slow command blocks every other client. The slow log captures commands that exceed your threshold — it's your first stop when Redis latency spikes.

# Configure slow log (10ms threshold, keep 128 entries)
redis-cli CONFIG SET slowlog-log-slower-than 10000
redis-cli CONFIG SET slowlog-max-len 128

# View the 25 most recent slow commands
redis-cli SLOWLOG GET 25

# Output format:
# 1) 1) (integer) 14          # Entry ID
#    2) (integer) 1714500000  # Unix timestamp
#    3) (integer) 28000       # Execution time in microseconds (28ms)
#    4) 1) "KEYS"             # Command + arguments
#       2) "*"
#    5) "127.0.0.1:42321"
#    6) ""

# Reset the slow log
redis-cli SLOWLOG RESET
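
During an incident the slow log can contain dozens of entries, and grouping them by command name shows quickly which command is costing the most time. The sketch below assumes each entry is a dict with "duration" (microseconds) and "command" keys, which is roughly the shape a Python client such as redis-py returns from its slow log call; treat that shape, and the helper name summarize_slowlog, as assumptions and adapt them to your client.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def summarize_slowlog(entries: Iterable[dict]) -> List[Tuple[str, Dict[str, int]]]:
    """Group slow log entries by command name, worst total time first."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"count": 0, "total_us": 0, "max_us": 0}
    )
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):  # some clients return raw bytes
            command = command.decode("utf-8", "replace")
        parts = command.split()
        name = parts[0].upper() if parts else "(empty)"
        stats = totals[name]
        stats["count"] += 1
        stats["total_us"] += entry["duration"]
        stats["max_us"] = max(stats["max_us"], entry["duration"])
    return sorted(totals.items(), key=lambda item: item[1]["total_us"], reverse=True)


# Illustrative entries (values are made up).
sample = [
    {"id": 14, "duration": 28000, "command": "KEYS *"},
    {"id": 13, "duration": 15000, "command": b"LRANGE jobs 0 -1"},
    {"id": 12, "duration": 12000, "command": "keys user:*"},
]
summary = summarize_slowlog(sample)
for name, stats in summary:
    print(name, stats)
```

With the sample entries, KEYS comes first with two calls totalling 40 ms, followed by LRANGE with a single 15 ms call.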

Common Slow Command Offenders

| Command | Problem | Fix |
| --- | --- | --- |
| KEYS * | Full keyspace scan, blocks all other commands | Use SCAN with cursor + COUNT |
| LRANGE key 0 -1 | Reading entire list (could be millions of items) | Use pagination: LRANGE key 0 99 |
| SMEMBERS key | Returns all members of a large set | Use SSCAN for large sets |
| SORT key | Sorting large lists is O(N+M log M) | Pre-sort at write time or use sorted sets (ZADD) |
| HGETALL key | Returns entire hash with hundreds of fields | Use HMGET for specific fields |

Never use KEYS in production. Even with 10,000 keys, KEYS * holds Redis for the entire scan. With 1M keys and a busy server, it can lock Redis for hundreds of milliseconds.
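
The SCAN replacement for KEYS works by following a cursor: each call returns a small batch of keys plus the cursor for the next call, and iteration ends when the cursor comes back as 0. A minimal sketch of that loop is below. It takes any callable shaped like a client's scan(cursor, match=..., count=...) method that returns (next_cursor, keys); fake_scan is a hypothetical stand-in so the example runs without a server, and many clients already ship an equivalent iterator.

```python
import fnmatch
from typing import Callable, Iterator, List, Tuple

ScanFn = Callable[..., Tuple[int, List[str]]]


def iter_keys(scan: ScanFn, match: str = "*", count: int = 100) -> Iterator[str]:
    """Yield keys by following the SCAN cursor until it returns to 0."""
    cursor = 0
    while True:
        cursor, keys = scan(cursor, match=match, count=count)
        yield from keys
        if cursor == 0:
            break


def fake_scan(cursor: int, match: str = "*", count: int = 100) -> Tuple[int, List[str]]:
    """Stand-in for a real client's scan(); pages through a fixed key list."""
    keys = ["user:1", "user:2", "session:9", "user:3", "user:4"]
    page = keys[cursor:cursor + count]
    next_cursor = cursor + count if cursor + count < len(keys) else 0
    # Like real SCAN, the pattern filters each batch after it is fetched.
    return next_cursor, [k for k in page if fnmatch.fnmatch(k, match)]


found = list(iter_keys(fake_scan, match="user:*", count=2))
print(found)
```

Each call does a bounded amount of work, so other clients get served between batches. A real SCAN may return the same key more than once and COUNT is only a hint, so callers should tolerate duplicates and uneven batch sizes.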

Replication Monitoring

In production Redis setups, you typically have one primary and one or more replicas. Monitoring replication health is critical — if a replica falls too far behind and the primary fails, you lose data.

# Check replication status on the primary
redis-cli INFO replication

# Key output fields:
# role: master
# connected_slaves: 2
# slave0: ip=10.0.1.5,port=6379,state=online,offset=1234567,lag=0
# slave1: ip=10.0.1.6,port=6379,state=online,offset=1234500,lag=0
# master_repl_offset: 1234570
# repl_backlog_size: 1048576

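# Lag calculation (bytes):
# replica lag = master_repl_offset - replica offset
# slave0 lag = 1234570 - 1234567 = 3 bytes (healthy)
# slave1 lag = 1234570 - 1234500 = 70 bytes (tiny, normal)
# Note: the lag= field on each slaveN line is seconds since the replica's last ACK, not bytes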

# Check on a replica
redis-cli -h replica-host INFO replication
# role: slave
# master_host: 10.0.1.4
# master_link_status: up   # Should be "up"
# master_sync_in_progress: 0  # 1 = full resync in progress (expensive)

Watch for full resyncs: If master_sync_in_progress is 1, the replica is doing a full resync, meaning it is reloading the entire dataset from the primary. This is expensive because the primary has to produce and transfer a full RDB snapshot. It happens when a replica reconnects after falling further behind than the replication backlog can cover. Increase repl-backlog-size to make full resyncs less likely.
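
The byte-lag calculation from the INFO replication output above is easy to automate. The sketch below parses the primary's output and returns the offset gap per replica; replica_lag_bytes is a hypothetical helper and the sample text reuses the example values from this section.

```python
import re
from typing import Dict

# Same values as the example INFO replication output in this section.
SAMPLE = """\
role:master
connected_slaves:2
slave0:ip=10.0.1.5,port=6379,state=online,offset=1234567,lag=0
slave1:ip=10.0.1.6,port=6379,state=online,offset=1234500,lag=0
master_repl_offset:1234570
"""


def replica_lag_bytes(info_text: str) -> Dict[str, int]:
    """Return {"ip:port": bytes behind the primary} for each listed replica."""
    master_offset = None
    replica_offsets = {}
    for line in info_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "master_repl_offset":
            master_offset = int(value)
        elif re.fullmatch(r"slave\d+", key):
            # value looks like ip=...,port=...,state=...,offset=...,lag=...
            fields = dict(pair.split("=", 1) for pair in value.split(","))
            address = f'{fields["ip"]}:{fields["port"]}'
            replica_offsets[address] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found in INFO output")
    return {addr: master_offset - off for addr, off in replica_offsets.items()}


lags = replica_lag_bytes(SAMPLE)
print(lags)
alerts = [addr for addr, lag in lags.items() if lag > 1024 * 1024]
```

For the sample values the gaps are 3 and 70 bytes, so the 1MB alert list stays empty.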


Redis Sentinel vs. Cluster Monitoring

Redis Sentinel

Monitors primary/replica pairs. Promotes a replica if primary fails.

Key checks:

  • sentinel masters — list monitored masters
  • sentinel slaves <name> — replica health
  • sentinel sentinels <name> — quorum count
  • → Alert on num-other-sentinels < 2 (can't achieve quorum)

Redis Cluster

Shards data across multiple nodes. Built-in HA without Sentinel.

Key checks:

  • CLUSTER INFO — cluster_state must be "ok"
  • cluster_slots_fail must be 0
  • CLUSTER NODES — all nodes connected
  • → Alert if any shard has no healthy replica
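
The two CLUSTER INFO checks listed above can be scripted as a simple health probe. This is a sketch only: cluster_problems is a hypothetical helper, and the sample strings contain just a few of the fields that CLUSTER INFO prints.

```python
from typing import List


def cluster_problems(cluster_info: str) -> List[str]:
    """Return a list of problems found in CLUSTER INFO text (empty if healthy)."""
    fields = {}
    for line in cluster_info.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key] = value

    problems = []
    state = fields.get("cluster_state", "missing")
    if state != "ok":
        problems.append(f"cluster_state is {state}")
    failed = int(fields.get("cluster_slots_fail", "0"))
    if failed != 0:
        problems.append(f"{failed} slots in fail state")
    return problems


# Illustrative outputs; the server separates lines with \r\n.
healthy = "cluster_state:ok\r\ncluster_slots_assigned:16384\r\ncluster_slots_fail:0\r\n"
degraded = "cluster_state:fail\r\ncluster_slots_assigned:16384\r\ncluster_slots_fail:5461\r\n"

print(cluster_problems(healthy))
print(cluster_problems(degraded))
```

Checking replica coverage per shard needs the CLUSTER NODES output as well, which this sketch does not parse.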

Redis Monitoring Tools (2026)

| Tool | Type | Best For | Cost |
| --- | --- | --- | --- |
| redis-cli + INFO | Built-in CLI | Quick manual diagnostics, incident investigation | Free |
| oliver006/redis_exporter | Prometheus exporter | Teams already running Prometheus/Grafana | Free (OSS) |
| Better Stack | SaaS monitoring | TCP/HTTP monitoring + on-call alerting, fast setup | Free tier, $25/mo+ |
| Grafana Cloud | SaaS observability | Full metrics/logs/traces stack, pre-built Redis dashboards | Free tier (10k series), $8/mo+ |
| Datadog | Enterprise APM | Enterprises wanting Redis + app correlation | $15-23/host/mo |
| RedisInsight | Redis GUI | Visual key browser, slow log viewer, memory analysis | Free (by Redis Ltd) |

Frequently Asked Questions

What are the most important Redis metrics to monitor?

The six critical metrics: (1) used_memory vs maxmemory — alert at 80%, (2) evicted_keys rate — any non-zero value means your cache is too small, (3) cache hit ratio — alert below 90%, (4) connected_clients — spike indicates a connection leak, (5) replication_lag — alert above 1MB offset gap, (6) instantaneous_ops_per_sec — drop indicates Redis is struggling.

How do I check Redis memory usage?

Run redis-cli INFO memory. Focus on used_memory (bytes allocated by Redis), used_memory_rss (OS-level allocation including fragmentation), and mem_fragmentation_ratio. A ratio above 1.5 points to significant fragmentation overhead; consider running MEMORY PURGE (it only has an effect on jemalloc builds) or restarting during a maintenance window.

What is a good Redis cache hit ratio?

Aim for 90%+ for most caching workloads. Calculate as: keyspace_hits / (keyspace_hits + keyspace_misses). Below 80% usually means keys are expiring too aggressively, your cache is undersized (evictions before TTL), or keys are being written but never looked up.

How do I monitor Redis replication lag?

Run INFO replication on the primary. Each replica shows its offset — subtract from master_repl_offset to get lag in bytes. Alert when lag exceeds 1MB. Also monitor master_link_status on replicas — "down" means the replica is disconnected.

How do I find slow queries in Redis?

Enable the slow log: CONFIG SET slowlog-log-slower-than 10000 (10ms). Then run SLOWLOG GET 25. The most common offenders: KEYS * (full scan; never use in production), LRANGE on giant lists, SMEMBERS on huge sets. Replace them with SCAN, paginated LRANGE, and SSCAN respectively.

What should I set as my Redis eviction policy?

For pure caching: allkeys-lru. For mixed data (some persistent, some cached): volatile-lru. For data that must never be evicted (queues, counters): noeviction with very aggressive memory alerts. Configure with CONFIG SET maxmemory-policy allkeys-lru. Monitor evicted_keys — any eviction means your cache is undersized.
