Is Grafana Down Right Now?
Dashboards are dark and alerts are silent — but is it Grafana Cloud or your self-hosted instance? This guide separates Cloud outages from infrastructure issues and gives you a step-by-step diagnostic playbook.
Check Grafana Status Now
Grafana Cloud publishes per-region health for all LGTM stack components at status.grafana.com.
Grafana Service Components
Grafana Cloud is composed of multiple independent services. A Loki outage does not mean dashboards are broken — identify which component is affected before escalating.
- Grafana dashboards: Visualization and panel rendering engine
- Alerting: Built-in Grafana alerting with Alertmanager
- Loki: Log aggregation and LogQL query engine
- Mimir: Horizontally scalable Prometheus-compatible metrics
- Tempo: Distributed tracing backend
- Pyroscope: Continuous profiling
- Telemetry collection agent
- OnCall: On-call scheduling and escalation (formerly Amixr)
Diagnostic Playbook
Work through these steps in order to isolate whether the problem is Grafana Cloud, your datasource, or your self-hosted stack.
Check status.grafana.com
Look for active incidents or degraded components in your region (US, EU, AU, AP).
Test the datasource directly
If dashboards are blank, the datasource may be the issue — not Grafana. Test Prometheus: `curl http://prometheus:9090/-/healthy`. Test Loki: `curl http://loki:3100/ready`.
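The two probes above can be wrapped in a small script so they run the same way every time. This is a minimal sketch, assuming `curl` is installed and that the hostnames from the text (`prometheus:9090`, `loki:3100`) resolve from where you run it. The function names `probe` and `check_datasources` are illustrative, not part of any Grafana tooling.

```shell
# Probe one health endpoint and report the result.
# Args: $1 = label (free text), $2 = full health URL.
probe() {
  probe_label=$1
  probe_url=$2
  # -f: treat HTTP >= 400 as failure; -s: quiet;
  # --max-time: do not hang on an unreachable host.
  if curl -fs --max-time 5 -o /dev/null "$probe_url" 2>/dev/null; then
    echo "$probe_label: OK"
    return 0
  fi
  echo "$probe_label: UNREACHABLE or unhealthy ($probe_url)"
  return 1
}

# Defaults are the hostnames used in the text; pass your own base URLs.
check_datasources() {
  prom_base=${1:-http://prometheus:9090}
  loki_base=${2:-http://loki:3100}
  ds_failed=0
  probe "Prometheus" "$prom_base/-/healthy" || ds_failed=1
  probe "Loki" "$loki_base/ready" || ds_failed=1
  return "$ds_failed"
}

# Example: check_datasources http://prometheus:9090 http://loki:3100
```

If both probes report OK while dashboards are still blank, the datasource is probably not the cause and the remaining steps apply.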
Check alert evaluation logs
In self-hosted Grafana, run: `grep "scheduler" /var/log/grafana/grafana.log | tail -50` to see if alert evaluation is happening.
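The same check can be scripted so it distinguishes "log unreadable" from "no scheduler entries". This is a rough sketch that matches only on the word "scheduler", as the command above does. It makes no assumption about the rest of the log format, and whether such entries appear at all may depend on your log level. The function name `scheduler_activity` is hypothetical.

```shell
# Report whether a Grafana log contains "scheduler" entries.
# Args: $1 = log path (default is the path from the text), $2 = lines to show.
scheduler_activity() {
  sched_log=${1:-/var/log/grafana/grafana.log}
  sched_n=${2:-5}
  if [ ! -r "$sched_log" ]; then
    echo "cannot read $sched_log"
    return 2
  fi
  # grep -c prints 0 and exits non-zero when nothing matches; keep the count.
  sched_count=$(grep -c "scheduler" "$sched_log" || true)
  if [ "$sched_count" -eq 0 ]; then
    echo "no scheduler entries in $sched_log"
    return 1
  fi
  echo "$sched_count scheduler entries; last $sched_n:"
  grep "scheduler" "$sched_log" | tail -n "$sched_n"
}

# Example: scheduler_activity /var/log/grafana/grafana.log 50
```

A non-zero count only shows that the scheduler has logged something. Compare the timestamps on the printed lines with the current time to confirm evaluation is still running.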
Verify contact point delivery
Go to Alerting > Contact points > Test. If test notifications fail, the contact point (Slack/PagerDuty webhook) is broken — not Grafana alerting itself.
Hard refresh and clear plugin cache
Some dashboard failures are browser-side. Hard refresh with Ctrl+Shift+R (Cmd+Shift+R on macOS), then check the browser console for plugin JavaScript errors.
Grafana Cloud vs Self-Hosted: Who Owns the Fix?
☁️ Grafana Cloud
- Check status.grafana.com for your region
- Subscribe to incident emails from the status page
- Open a support ticket for account-specific issues
- No fix on your end: wait for Grafana Labs
- SLA: 99.9% for paid tiers
🏗️ Self-Hosted
- Check Grafana process/pod health
- Verify database connection (SQLite by default, or PostgreSQL/MySQL)
- Check reverse proxy (nginx/Traefik) logs
- Verify disk space: Grafana logs can fill a volume quickly
- Restart strategy: `systemctl restart grafana-server`
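Before restarting a self-hosted instance, it can help to confirm the service state and disk headroom first. This is a minimal sketch, assuming a systemd host with the `grafana-server` unit named in the list above. The 90% threshold and the function names are arbitrary illustrative choices.

```shell
# Print the used-space percentage (integer) of the filesystem holding $1.
disk_used_percent() {
  # -P forces the POSIX one-line-per-filesystem format; column 5 is "Capacity".
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage is at or above a limit (default 90, an assumption).
check_disk() {
  disk_path=$1
  disk_limit=${2:-90}
  disk_used=$(disk_used_percent "$disk_path")
  if [ "$disk_used" -ge "$disk_limit" ]; then
    echo "disk: ${disk_used}% used on $disk_path (limit ${disk_limit}%)"
    return 1
  fi
  echo "disk: ${disk_used}% used on $disk_path"
}

# Is the systemd unit from the text running?
check_service() {
  if systemctl is-active --quiet grafana-server; then
    echo "service: active"
  else
    echo "service: not active"
    return 1
  fi
}

# Example: check_service; check_disk /var/log/grafana
```

Restarting with a full disk tends to fail again quickly, so the disk check is worth running first.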
Why Grafana Alerts Stop Firing
Silent alerts are the worst failure mode — the system appears healthy while incidents go undetected.
Frequently Asked Questions
Is Grafana Cloud down right now?
Check the official Grafana Cloud status page at status.grafana.com. It shows per-region health for Grafana dashboards, Loki (logs), Mimir (metrics), Tempo (traces), and alerting. If the status page shows green but your instance is broken, the issue is likely account-specific or regional — open a support ticket.
Why are my Grafana dashboards not loading?
Grafana dashboards fail to load for several reasons: (1) Grafana Cloud infrastructure outage — check status.grafana.com, (2) Datasource connection failure — the underlying Prometheus/Loki/InfluxDB is unreachable, (3) Query timeout — your PromQL or LogQL query is scanning too much data, (4) Browser cached stale session — try a hard refresh or incognito window, (5) Plugin issue — a panel plugin failed to load. Check the browser console for JavaScript errors and the Grafana server logs for datasource errors.
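To tell a slow query (reason 3) apart from an unreachable datasource (reason 2), you can time the query against Prometheus directly, bypassing Grafana. The sketch below uses the Prometheus HTTP API instant-query endpoint `/api/v1/query`. The 30-second default and the function name `time_promql` are assumptions for illustration, so set the timeout to match whatever your Grafana datasource uses.

```shell
# Time an instant query against the Prometheus HTTP API.
# Args: $1 = Prometheus base URL, $2 = PromQL expression, $3 = timeout (seconds).
time_promql() {
  tq_base=$1
  tq_query=${2:-up}
  tq_timeout=${3:-30}
  # -G with --data-urlencode sends the expression as an encoded query string;
  # -w '%{time_total}' prints the total request time in seconds.
  if tq_secs=$(curl -fsG --max-time "$tq_timeout" -o /dev/null \
      -w '%{time_total}' --data-urlencode "query=$tq_query" \
      "$tq_base/api/v1/query" 2>/dev/null); then
    echo "query returned in ${tq_secs}s"
  else
    echo "query failed or exceeded ${tq_timeout}s"
    return 1
  fi
}

# Example: time_promql http://prometheus:9090 'sum(rate(http_requests_total[5m]))' 30
```

A trivial expression such as `up` that returns quickly while the dashboard's own query does not suggests the query is scanning too much data.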
Why are my Grafana alerts not firing?
Grafana alerts fail silently for these reasons: (1) Alert evaluation engine paused: check Alerting > Admin in the Grafana UI, (2) Contact point misconfigured: verify notification channels (Slack, PagerDuty, webhook) in Alerting > Contact points, (3) Alert rule in "Error" state: a broken datasource prevents evaluation, (4) Mimir ruler down (Grafana Cloud): check status.grafana.com, (5) Self-hosted Alertmanager pod crashed: run `kubectl logs -n monitoring alertmanager-0`, adjusting the namespace and pod name to match your cluster. For Cloud, check the Alert History tab to confirm evaluation is running.
What is the difference between Grafana Cloud being down vs my self-hosted Grafana being down?
Grafana Cloud is a SaaS offering run by Grafana Labs, with its health reported at status.grafana.com. Self-hosted Grafana runs on your own infrastructure. If Grafana Cloud is down, tenants in the affected region are impacted and you must wait for Grafana Labs to resolve it. If your self-hosted instance is down, you own the fix: check the Grafana pod/process, database connection (SQLite, MySQL, or PostgreSQL), reverse proxy (nginx/Traefik), and disk space. The telltale sign: if status.grafana.com shows green but your dashboards are dark, the issue is self-hosted or account-specific.
How do I get alerted when Grafana itself goes down?
The irony: Grafana going down disables your alerts. Use an external monitoring service to watch Grafana: (1) Better Stack — monitors your Grafana URL from multiple regions, alerts via SMS/Slack/PagerDuty, (2) API Status Check — tracks Grafana Cloud status page and can alert you, (3) StatusPage subscriptions — subscribe to status.grafana.com email/Slack updates directly. For self-hosted: add a synthetic check from a separate monitoring system that pings your Grafana health endpoint (`/api/health`).
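For the self-hosted case, the synthetic check can be as small as a script run from cron on a machine that does not depend on Grafana. This is a sketch, assuming `curl` is available and that the `/api/health` response is JSON containing a `"database": "ok"` field when healthy. The base URL in the example is hypothetical, and how you deliver the alert (mail, webhook, pager) is left to you.

```shell
# Succeeds if a /api/health response body (on stdin) reports a healthy database.
health_body_ok() {
  grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}

# Args: $1 = Grafana base URL (hypothetical example: https://grafana.example.com).
check_grafana() {
  gf_base=$1
  # -f makes curl fail on HTTP errors, so a non-2xx answer counts as down.
  gf_body=$(curl -fs --max-time 10 "$gf_base/api/health" 2>/dev/null) || {
    echo "DOWN: no healthy HTTP response from $gf_base/api/health"
    return 1
  }
  if printf '%s' "$gf_body" | health_body_ok; then
    echo "UP: $gf_base"
  else
    echo "DEGRADED: $gf_base answered, but the body does not report database ok"
    return 1
  fi
}

# Example: check_grafana https://grafana.example.com || echo "send an alert here"
```

The script exits non-zero on any failure, so it can be chained to whatever notification command you already trust.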