Grafana Cloud Outage History

50 incidents reported. Data sourced from the official Grafana Cloud status page.

Total Incidents: 50
Major/Critical: 22
Minor: 22
Resolved: 49
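The headline figures can be cross-checked against the entries listed on this page. The sketch below is only an illustration: the per-severity tallies were counted by hand from the severity labels in this listing, the "none" bucket is an inferred category that the summary does not show, and `summarize` is a hypothetical helper, not part of any Grafana API.

```python
from collections import Counter

# Severity tallies counted by hand from the incident entries on this page.
# "none" does not appear in the headline figures; it is inferred from the entries.
severity_counts = Counter({"critical": 4, "major": 18, "minor": 22, "none": 6})


def summarize(counts, unresolved=1):
    """Derive the headline figures from per-severity tallies (hypothetical helper)."""
    total = sum(counts.values())
    return {
        "total": total,
        "major_critical": counts["major"] + counts["critical"],
        "minor": counts["minor"],
        # One incident on this page is still in the "monitoring" state.
        "resolved": total - unresolved,
    }


summary = summarize(severity_counts)
print(summary)
```

The six incidents with severity "none" account for the gap between the Major/Critical plus Minor counts and the total.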

June 2026

IRM Degraded Performance

minor
Jun 8, 10:45 AM to Jun 8, 08:30 PM (resolved)
Jun 8, 08:30 PM
resolved: This incident has been resolved.
Jun 8, 05:11 PM
monitoring: We've released a fix to the IRM app that should restore service for affected customers with issues related to labels. Thanks for your patience while we investigated. We're continuing to monitor as we co...
Jun 8, 03:21 PM
identified: We are continuing to work on a fix for this. To further clarify, this issue is not about accessing IRM or alert ingestion/notification/delivery, but rather with handling labels.
+5 more updates

Brief Rule Evaluation Failures in prod-eu-west-3

major
Jun 7, 01:39 AM (monitoring)
Jun 8, 09:36 PM
monitoring: The incident has been mitigated, and services are operating normally. We continue to monitor the service to ensure full stability.
Jun 7, 11:00 AM
monitoring: The incident has been mitigated, and services are operating normally. We are currently monitoring the service to ensure full stability.
Jun 7, 06:00 AM
investigating: We’re making ongoing progress on the investigation alongside our upstream provider.
+3 more updates

Permissions Issues with IRM

critical
Jun 5, 04:11 PM to Jun 5, 09:35 PM (resolved)
Jun 5, 09:35 PM
resolved: This incident has been resolved.
Jun 5, 08:10 PM
monitoring: Continuing to monitor progress. Most customers affected should have all services restored, with a few remaining customers receiving updates as the rollout finishes. Thanks again for your patience.
Jun 5, 07:34 PM
monitoring: A fix has been released to prod and is rolling out across the fleet for IRM, restoring access to affected customers. Thanks for your patience through this work. We're continuing to monitor to confirm we'...
+5 more updates

Silences not Working as Expected

major
Jun 5, 02:02 PM to Jun 5, 03:01 PM (resolved)
Jun 5, 03:01 PM
resolved: This incident has been resolved.
Jun 5, 02:02 PM
identified: We have identified an issue causing Silences to not work as expected in the Cloud (Mimir) Alertmanager. Grafana Alertmanager is working normally; this only affects data source-managed alerts.

Grafana Assistant Skills Page Blank

major
Jun 4, 04:44 PM to Jun 4, 06:28 PM (resolved)
Jun 4, 06:28 PM
resolved: This incident has been resolved.
Jun 4, 05:00 PM
identified: The issue has been identified, and we are working on a fix.
Jun 4, 04:56 PM
investigating: We are continuing to investigate this issue.
+1 more updates

K6 Test Runs Degraded Performance

minor
Jun 3, 08:40 PM to Jun 4, 05:47 AM (resolved)
Jun 4, 05:47 AM
resolved: This incident has been resolved.
Jun 3, 09:23 PM
monitoring: We have applied a fix, and are monitoring the results.
Jun 3, 08:40 PM
investigating: We are currently investigating an issue causing k6 test runs to take longer than expected to complete, or to time out within Grafana Cloud.

Synthetic Scripted/Browser checks failure

major
Jun 3, 11:38 AM to Jun 3, 06:22 PM (resolved)
Jun 3, 06:22 PM
resolved: This incident has been resolved.
Jun 3, 05:07 PM
identified: We are in the process of deploying a fix for this issue.
Jun 3, 11:38 AM
investigating: We’re currently investigating an issue affecting Synthetic Monitoring where updates for Scripted/Browser checks might fail. Our team is actively working to identify the cause. Thank you for your patie...

tempo prod-25 write-path-down

minor
Jun 2, 11:15 PM to Jun 3, 07:46 AM (resolved)
Jun 3, 07:46 AM
resolved: This incident has been resolved.
Jun 2, 11:15 PM
identified: Between 21:20 and 22:40 UTC, writes to tempo-prod-25 failed due to an outage. tempo-prod-24 was also affected during an overlapping window from 22:32 to 22:40 UTC.

Alert manager unavailable in prod-us-central-0

minor
Jun 1, 07:53 PM to Jun 1, 08:26 PM (resolved)
Jun 1, 08:26 PM
resolved: This incident has been resolved.
Jun 1, 08:02 PM
monitoring: A fix has been implemented and we are monitoring the results.
Jun 1, 07:53 PM
identified: Starting at 18:30 UTC, we noticed alert manager unavailability limited to prod-us-central-0 which affects grafana-managed and datasource-managed alerting, causing disruption to updating alertmanager c...

May 2026

Grafana Loki Log Query Issues

major
May 29, 09:03 AM to May 29, 10:59 AM (resolved)
May 29, 10:59 AM
resolved: This incident has been resolved.
May 29, 10:28 AM
monitoring: We have identified the cause of this incident and a fix has been applied. Normal functions are returning. We are currently monitoring the recovery process.
May 29, 09:03 AM
investigating: We’re currently investigating an issue affecting Loki queries in Grafana. We have had reports from customers of logs not loading or of missing logs. Our team is actively working to i...

Prometheus Datasource Errors/Outage in prod-us-east-0

major
May 27, 08:22 PM to May 27, 10:59 PM (resolved)
May 27, 10:59 PM
resolved: This incident has been resolved. Thank you for your patience.
May 27, 09:57 PM
investigating: We are seeing recovery across affected Prometheus datasources, and error rates have significantly improved. The service is recovering without any required customer action, and our team continues to m...
May 27, 09:36 PM
investigating: We continue to investigate an issue affecting Prometheus datasources causing intermittent timeouts and unexpected errors, primarily impacting alert rule evaluations. Our team is actively working to ...
+1 more updates

Grafana K6 metrics processing and test runs degradation

minor
May 18, 08:24 AM to May 18, 03:42 PM (resolved)
May 18, 03:42 PM
resolved: This incident has been resolved.
May 18, 02:30 PM
monitoring: We've stabilized the system and test runs no longer result in timeout. There is a small delay (a few minutes) in processing metrics at the end of the test run, but most users shouldn't be too negative...
May 18, 10:27 AM
investigating: We have identified that test runs are timing out as a result of the issue. This issue first occurred on 05/15/2026 at 8:00 PM UTC.
+1 more updates

Intermittent Errors and High latency Writing to Cloud Metrics, Cloud Logs and Cloud Traces

minor
May 13, 08:50 AM to May 14, 07:06 AM (resolved)
May 14, 07:06 AM
resolved: We continue to observe an extended period of recovery and we're marking the incident as resolved at this point in time.
May 13, 09:10 PM
monitoring: We continue to see signs of recovery and improved stability across impacted services. Our teams continue to closely monitor the situation while working with the cloud provider.
May 13, 03:41 PM
monitoring: We continue to see signs of recovery and improved stability across impacted services. Our teams continue to closely monitor the situation while working with the cloud provider.
+4 more updates

"Failed to Load Dashboard" Errors

major
May 11, 09:38 PM to May 12, 04:35 PM (resolved)
May 12, 04:35 PM
resolved: This incident has been resolved. Thank you for your patience.
May 12, 02:13 PM
identified: The fix is currently being rolled out to all impacted environments.
May 12, 11:11 AM
identified: Our teams continue working on a fix for this issue. We do not have additional information to share at this time, but we will continue to provide updates as progress is made.
+2 more updates

SSL/TLS Connectivity Issues

major
May 11, 08:49 PM to May 11, 10:40 PM (resolved)
May 11, 10:40 PM
resolved: This incident has been resolved. Thank you for your patience.
May 11, 08:49 PM
investigating: We are currently investigating reports of service disruption affecting a subset of customers. Customers may experience intermittent connectivity issues, degraded performance, or SSL/TLS certificate v...

Cloud Metrics - High Write Latency and Errors in prod-us-central-7

minor
May 8, 09:16 PM to May 8, 10:30 PM (resolved)
May 8, 10:30 PM
resolved: We have continued to observe stability. This incident is now considered resolved. Thank you for your patience.
May 8, 09:16 PM
monitoring: From approximately 20:40-21:00 UTC, we experienced an issue affecting Grafana Cloud Metrics in prod-us-central-7. Affected users may have experienced high latency and/or errors during ingestion and ru...

Metrics read errors in prod-ap-south-1 region

critical
May 7, 07:18 AM to May 7, 07:56 AM (resolved)
May 7, 07:56 AM
resolved: At this time, we have confirmed that the query errors have stopped and we are considering this issue resolved.
May 7, 07:53 AM
monitoring: Engineering has released a fix and as of 07:50 UTC, customers should no longer experience errors when querying metrics. We will continue to monitor for recurrence and provide updates accordingly.
May 7, 07:18 AM
investigating: From approximately 06:24 UTC, we were alerted to an issue with read errors in mimir-prod-43. Users with instances hosted in the prod-ap-south-1 region experiencing this issue may encounter an error me...

Hardware failure on CSP within prod-us-west-0

major
May 6, 05:30 AM to May 6, 05:30 AM (resolved)
May 8, 11:04 AM
resolved: We observed an underlying hardware failure on our CSP which triggered an automatic live VM migration. The situation caused a degradation in write performance for Grafana Cloud Metrics on prod-us-west-...

Elevated Error Rate of Browser Checks in PoP Oregon

minor
May 5, 04:11 PM to May 5, 08:13 PM (resolved)
May 5, 08:13 PM
resolved: This incident has been resolved. Thank you for your patience.
May 5, 07:44 PM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.
May 5, 06:13 PM
identified: We’ve identified the cause of the issue impacting browser checks. Our team is currently implementing a fix.
+1 more updates

k6 Partial Outage

major
May 4, 10:58 PM to May 5, 02:09 AM (resolved)
May 5, 02:09 AM
resolved: This incident has been resolved. Thank you for your patience.
May 5, 12:04 AM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.
May 4, 11:23 PM
investigating: After further investigation, this issue may also be affecting Synthetic Monitoring. We continue to identify the cause and will update as soon as we have more information.
+1 more updates

Ingestion Errors for AWS Cloud Provider Observability Metric Streams in prod-us-central-7

major
May 1, 09:14 AM to May 1, 10:27 AM (resolved)
May 1, 10:27 AM
resolved: This incident has been resolved.
May 1, 09:43 AM
monitoring: A fix has been implemented and we are monitoring the results.
May 1, 09:42 AM
investigating: We are continuing to investigate this issue.
+1 more updates

April 2026

Gateway Slowness Detected in Prod (US-East-1)

minor
Apr 28, 09:20 AM to Apr 30, 03:11 PM (resolved)
Apr 30, 03:11 PM
resolved: After further review, this was a false alarm and should not have affected any users. This incident has been resolved. Thank you for your patience.
Apr 28, 09:20 AM
investigating: Successful requests have dropped; users may not be able to access their instances. The issue is under investigation.

Investigating Issues Saving SQL Datasource Credentials

minor
Apr 28, 06:46 PM to Apr 29, 01:37 PM (resolved)
Apr 29, 01:37 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 28, 06:59 PM
monitoring: We’ve identified the cause of the issue impacting SQL datasources. Our team is currently implementing a fix and is monitoring the results to confirm the issue is fully resolved. Services may start to...
Apr 28, 06:46 PM
investigating: We are currently investigating reports of issues affecting SQL-based data sources where users are unable to save credentials. This appears to impact a subset of customers and may be occurring across ...

Performance Testing – Degraded Service (Resolved)

none
Apr 29, 12:00 PM to Apr 29, 12:00 PM (resolved)
Apr 29, 01:51 PM
resolved: We experienced degraded performance affecting Performance Testing from 13:10 UTC to 13:20 UTC. During this time, users may not have been able to start new test runs. The issue has been resolved, and ...

Elevated write latency for AWS Metrics Streaming integration in us-east-3 region.

minor
Apr 29, 10:30 AM to Apr 29, 10:30 AM (resolved)
Apr 29, 12:57 PM
resolved: We experienced an incident with the AWS Metrics Streaming integration in the us-east-3 region, manifesting as elevated ingestion latency. The incident started at around 10:45 UTC and was resolved at around 12:...

InfluxDB Datasource - Intermittent Failures

major
Apr 27, 05:08 PM to Apr 27, 11:24 PM (resolved)
Apr 27, 11:24 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 27, 11:13 PM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.
Apr 27, 06:01 PM
identified: We’ve identified the cause of the issue impacting the InfluxDB datasource. Our team is currently implementing a fix.
+1 more updates

Restrictions on Alerts & Reports for Grafana Cloud Free/Trial Users

minor
Apr 20, 09:12 PM to Apr 24, 03:04 PM (resolved)
Apr 24, 03:04 PM
resolved: Grafana Labs has taken steps to safeguard the Grafana Cloud platform against the distribution of unauthorized emails. We have implemented the following changes to new Grafana Cloud Free and Trial acco...
Apr 22, 03:03 PM
monitoring: Grafana Labs is implementing measures to safeguard the Grafana Cloud platform against ongoing unauthorized use while preserving the capabilities relied upon by our community. Effective immediately, we...
Apr 20, 10:07 PM
monitoring: We are continuing to monitor for any further issues.
+1 more updates

Cloudwatch Datasource Outage

major
Apr 23, 02:26 PM to Apr 23, 08:01 PM (resolved)
Apr 23, 08:01 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 23, 02:39 PM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.
Apr 23, 02:26 PM
investigating: We’re currently investigating an issue affecting Cloudwatch datasources. Our team is actively working to identify the cause. Thank you for your patience.

Elevated 429 Errors Impacting Metrics Querying Across Multiple Regions

critical
Apr 20, 02:09 PM to Apr 20, 02:30 PM (resolved)
Apr 20, 02:30 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 20, 02:21 PM
investigating: The issue is now confirmed to be widespread, affecting Prometheus across all regions. Customers may continue to experience elevated 429 (rate limit) errors, particularly when querying metrics, with f...
Apr 20, 02:09 PM
investigating: We are currently experiencing a major incident causing elevated 429 (rate limit) errors across multiple regions, primarily impacting metrics querying. This is a high-priority issue, and our engineeri...

Query Caching - Degraded Performance

minor
Apr 17, 09:23 PM to Apr 17, 10:58 PM (resolved)
Apr 17, 10:58 PM
resolved: This incident has been resolved.
Apr 17, 10:09 PM
monitoring: Currently prod-us-east-0 and prod-eu-west-3 have recovered, and we are continuing to monitor prod-us-central-0, which is in the process of recovery.
Apr 17, 09:23 PM
investigating: As of 20:52 UTC, we are currently investigating degraded Query Caching performance in multiple regions. For datasources where query caching is configured, some queries may take longer than usual. Our...

Issues on Stack creation

minor
Apr 16, 12:52 PM to Apr 16, 02:02 PM (resolved)
Apr 16, 02:02 PM
resolved: This incident has been resolved.
Apr 16, 01:19 PM
monitoring: The issue is fixed and we are currently monitoring the service.
Apr 16, 12:52 PM
identified: Since ~12:11 UTC today (the 16th), we have been seeing issues with stack creation across all our regions. Customers will see an error message when attempting to create a stack. Our engineering team has identif...

Degraded Ticket Visibility in Support System

minor
Apr 15, 04:07 PM to Apr 15, 04:25 PM (resolved)
Apr 15, 04:25 PM
resolved: This incident has been resolved and our ticketing system is fully operational. Thank you for your patience.
Apr 15, 04:07 PM
monitoring: We are currently experiencing an issue with our ticketing system provider that is affecting how tickets appear within our internal support views. We are continuing to receive all new tickets successf...

K6 Sporadic DNS Issues

minor
Apr 14, 09:22 AM to Apr 15, 12:59 PM (resolved)
Apr 15, 12:59 PM
resolved: This incident is now resolved. We had intermittent issues with a flaky DNS server that caused random tests to not start properly. Since the DNS server was fixed, we haven't been seeing the issue anymo...
Apr 14, 02:29 PM
monitoring: Our engineering team has deployed a fix and we are currently monitoring the behaviour of the system until full resolution.
Apr 14, 02:29 PM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time.
+1 more updates

k6 Cloud Service Disruption

none
Apr 14, 11:30 AM to Apr 14, 11:30 AM (resolved)
Apr 14, 01:44 PM
resolved: Between approximately 12:30 UTC and 13:15 UTC, k6 Cloud experienced a service disruption due to issues introduced in a recent API release. During this time, users were unable to access the k6 Cloud ap...

Loki write instability in prod-eu-west-2.loki-prod-012

none
Apr 13, 11:30 AM to Apr 13, 11:30 AM (resolved)
Apr 14, 12:02 PM
resolved: There was a period of write instability yesterday, between ~13:30 and 17:30 UTC. This was due to scheduled maintenance.

Grafana Cloud Logs - Write degradation in us-east-3

major
Apr 10, 11:53 PM to Apr 11, 12:36 AM (resolved)
Apr 11, 12:36 AM
resolved: This incident has been resolved.
Apr 11, 12:10 AM
monitoring: A fix has been implemented and we are monitoring the results.
Apr 10, 11:53 PM
investigating: We are seeing issues on the write path for Loki in a cluster in us-east-3, and we are actively investigating this issue.

Tempo Write Outage

major
Apr 10, 07:42 PM to Apr 10, 09:02 PM (resolved)
Apr 10, 09:02 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 10, 07:53 PM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time. We’ll update again within an hour.
Apr 10, 07:42 PM
investigating: We are currently investigating a write outage affecting prod-us-east-3. The issue began at 18:50 UTC. Users may experience errors, timeouts, or unavailability while we work to identify the cause and r...

K6 Browser Testing/Timeline Not Available

minor
Apr 9, 05:34 PM to Apr 9, 06:50 PM (resolved)
Apr 9, 06:50 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 9, 06:39 PM
identified: We’ve identified the cause of the issue impacting k6 browser testing/timeline. Our team is currently implementing a fix. We’ll provide another update in two hours or sooner if the situation changes.
Apr 9, 05:34 PM
investigating: We’re currently investigating an issue affecting browser testing. Users running browser tests will not be able to see the browser timeline. Our team is actively working to identify the cause and wi...

Stability Issues for Some Customers in the prod-gb-south-1 Region.

minor
Apr 8, 05:00 PM to Apr 8, 05:00 PM (resolved)
Apr 8, 05:00 PM
resolved: We had a stability issue for a subset of customers in the prod-gb-south-1 region. The impact occurred between 15:20 and 16:30 UTC and affected roughly 30% of queries and rule evaluations. We've applied miti...

Unable to Edit Notification Policies

minor
Apr 7, 03:17 PM to Apr 7, 08:17 PM (resolved)
Apr 7, 08:17 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 7, 06:03 PM
identified: We’ve identified the cause of the issue impacting notification policies. Our team is currently implementing a fix. We’ll provide another update in 2 hours or sooner if the situation changes.
Apr 7, 04:52 PM
identified: We’ve identified the cause of the issue impacting notification policies. Our team is currently implementing a fix. We’ll provide another update in 2 hours or sooner if the situation changes.
+1 more updates

Notification Policies and Contact Points Missing in UI on the Slow Release Channel

minor
Apr 6, 02:48 PM to Apr 7, 12:26 PM (resolved)
Apr 7, 12:26 PM
resolved: This incident has been resolved.
Apr 6, 11:58 PM
monitoring: We’ve implemented a fix and are monitoring the results to confirm the issue is fully resolved. Services may start to recover during this time. We’ll update again within 2 hours.
Apr 6, 09:04 PM
identified: We’ve identified the cause of the issue impacting the Notification Policy and Contact Point UI. Our team is currently implementing a fix. We’ll provide another update when the fix is deployed and we...
+2 more updates

Partial K6 Test Run Outage

major
Apr 3, 03:29 PM to Apr 3, 05:38 PM (resolved)
Apr 3, 05:38 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 3, 03:29 PM
investigating: We're experiencing an outage affecting test runs that use k6 extensions. The issue prevents users from executing these types of test runs both locally and in Grafana Cloud. Test runs that do not use ...

Query degradation and possible rule evaluation failure on prod-eu-west-0.cortex-prod-01

minor
Apr 1, 09:56 AM to Apr 1, 09:13 PM (resolved)
Apr 1, 09:13 PM
resolved: This incident has been resolved.
Apr 1, 10:12 AM
monitoring: A fix has been implemented and we are monitoring the results.
Apr 1, 10:11 AM
investigating: We are continuing to investigate this issue.
+1 more updates

AWS integration Degraded Performance

minor
Apr 1, 08:17 PM to Apr 1, 09:03 PM (resolved)
Apr 1, 09:03 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 1, 08:17 PM
investigating: We are investigating a noticeable drop in active series for the AWS integration that began around 18:15 UTC. This issue may cause scrapes to hit rate limits, which can result in individual data point...

March 2026

Prometheus writes in prod-eu-west-3 are degraded

critical
Mar 25, 02:11 PM to Apr 23, 08:07 PM (resolved)
Apr 23, 08:07 PM
resolved: This incident has been resolved. Thank you for your patience.
Apr 20, 03:08 PM
monitoring: We are continuing to monitor for any further issues.
Apr 14, 08:11 PM
monitoring: We have deployed mitigation and seen improvement in write failures over the past week. We are still seeing intermittent spikes in latency and continue to monitor.
+7 more updates

k6 Cloud Degradation

none
Mar 31, 02:59 PM to Mar 31, 02:59 PM (resolved)
Mar 31, 02:59 PM
resolved: From approximately 11:00 UTC - 15:00 UTC we had a degradation that caused test start errors for a large percentage of Cloud runs managed as scripts in the GCK6 app. This has since been resolved.

Synthetic Monitoring: Some Check Creations & Updates Might be Blocked.

none
Mar 31, 02:32 PM to Mar 31, 02:32 PM (resolved)
Mar 31, 02:32 PM
resolved: This is a retroactive status page linked to the following incident: https://status.grafana.com/incidents/38wwbz50ggrp This retroactive status page is meant to clarify the time of impact. This issue f...

Synthetic Monitoring: Some Check Creations & Updates Might be Blocked.

major
Mar 31, 02:01 PM to Mar 31, 02:25 PM (resolved)
Mar 31, 02:25 PM
resolved: This incident has been resolved.
Mar 31, 02:01 PM
identified: Synthetic Monitoring check creation/update for scripted and browser checks might be blocked in the plugin app for some probes. The issue only impacts creating/updating checks from the plugin app. It d...

Some of the CloudWatch queries are failing

major
Mar 31, 09:48 AM to Mar 31, 10:24 AM (resolved)
Mar 31, 10:24 AM
resolved: This incident has been resolved.
Mar 31, 09:49 AM
monitoring: We are continuing to monitor for any further issues.
Mar 31, 09:48 AM
monitoring: Some CloudWatch queries were failing. Started at 08:37 UTC; monitoring since 09:21 UTC.

Tempo Reads Outage for Small Subset of Customers

none
Mar 30, 04:30 PM to Mar 30, 04:30 PM (resolved)
Mar 30, 06:34 PM
resolved: We encountered an issue impacting only a small subset of customers in the prod-us-central-0 region. The incident occurred between 16:20 and 17:50 UTC on 3/30/26. This incident is now resolved.
