Grafana Cloud Outage History

50 incidents reported. Data sourced from the official Grafana Cloud status page.

Total Incidents: 50
Major/Critical: 21
Minor: 23
Resolved: 49

March 2026

Authentication API Database Down in prod-eu-west-2 and prod-eu-west-4

major
Mar 20, 03:00 PM to Mar 20, 03:41 PM (resolved)
Mar 20, 03:41 PM
resolved: This incident has been resolved.
Mar 20, 03:08 PM
investigating: We have observed impact in prod-eu-west-4 as well.
Mar 20, 03:00 PM
investigating: We are currently investigating an issue impacting the main database for Authentication APIs in the prod-eu-west-2 region. Writes are currently failing, but reads are operational.

Various Datasource Issues

major
Mar 19, 04:46 PM to Mar 19, 06:44 PM (resolved)
Mar 19, 06:44 PM
resolved: This incident has been resolved.
Mar 19, 05:56 PM
monitoring: We are continuing to monitor for any further issues.
Mar 19, 05:56 PM
monitoring: We have observed recovery for the Cloudwatch Datasource. We are now seeing failures for the following Datasources: Aurora, Opensearch, X-Ray, Timestream, Redshift, Sitewise. A fix for the above is being...
+2 more updates

Degraded performance of Grafana Cloud k6 test runs

major
Mar 19, 11:17 AM to Mar 19, 06:11 PM (resolved)
Mar 19, 06:11 PM
resolved: Our engineering team has deployed a fix and we continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Mar 19, 11:17 AM
investigating: Some customers are seeing degraded performance and errors from certain v6 API endpoints. We are investigating the issue.

Grafana Cloud Logs - Write degradation in Azure Netherlands (eu-west-3)

minor
Mar 13, 10:28 AM to Mar 18, 07:13 AM (resolved)
Mar 18, 07:13 AM
resolved: We have been observing stability for a period of time and will mark the incident as resolved at this time.
Mar 13, 09:22 PM
investigating: We are continuing to investigate this issue with our CSP, and will provide updates as they become available.
Mar 13, 10:28 AM
investigating: We are seeing issues on the write path for Loki in cluster Azure Netherlands (eu-west-3). The impact will show as degraded log ingestion on that cluster. Our engineering team is already working ...

Rule Evaluation Outage in prod-us-west-0

major
Mar 11, 05:10 PM to Mar 13, 06:15 PM (resolved)
Mar 13, 06:15 PM
resolved: This incident has been resolved.
Mar 11, 06:02 PM
monitoring: A fix has been implemented and we are monitoring the results.
Mar 11, 05:10 PM
investigating: We are currently investigating an issue impacting rule evaluation for a subset of customers in the prod-us-west-0 region. We will provide updates as they become available.

Increased number of Aborted-by-System test runs with k6 binary build errors

major
Mar 13, 07:41 AM to Mar 13, 06:11 PM (resolved)
Mar 13, 06:11 PM
resolved: This incident has been resolved.
Mar 13, 12:49 PM
monitoring: A fix has been implemented and we are monitoring the results.
Mar 13, 08:45 AM
identified: The issue has been identified and a fix is being implemented.
+1 more updates

Grafana Cloud Logs - Write degradation in Azure Netherlands (eu-west-3)

minor
Mar 11, 08:31 AM to Mar 12, 01:18 PM (resolved)
Mar 12, 01:18 PM
resolved: This incident has been resolved.
Mar 11, 09:13 AM
investigating: We are also reporting impact to Faro performance in the same region. We are continuing to investigate this issue.
Mar 11, 08:31 AM
investigating: We are seeing issues on the write path for Loki in cluster Azure Netherlands (eu-west-3). The impact will show as degraded log ingestion on that cluster. Our engineering team is already working ...

Some Write Failures in prod-eu-west-3.

major
Mar 10, 06:00 PM to Mar 11, 09:48 PM (resolved)
Mar 11, 09:48 PM
resolved: This incident has been resolved.
Mar 11, 03:51 PM
monitoring: Things have been stable, and we have a potential mitigation should this issue arise again. We are monitoring the issue in the meantime.
Mar 11, 01:35 AM
identified: There are ongoing intermittent elevated transient write failures. We will continue to provide additional updates as more information becomes available.
+2 more updates

Metrics write path outage in prod-us-central-0 and prod-us-central-5

minor
Mar 9, 06:03 PM to Mar 10, 09:17 PM (resolved)
Mar 10, 09:17 PM
resolved: This incident has been resolved.
Mar 9, 06:03 PM
monitoring: From 15:30 to 15:45 UTC and from 16:53 to 17:03 UTC, the prod-us-central-0 and prod-us-central-5 regions saw elevated latency and error rates on the write path. We're monitoring now.

Fleet Management Elevated Rate of Errors

minor
Mar 9, 02:20 PM to Mar 10, 08:54 PM (resolved)
Mar 10, 08:54 PM
resolved: This incident has been resolved.
Mar 10, 06:11 PM
investigating: Our engineering team continues to work towards a resolution for this issue.
Mar 9, 02:20 PM
investigating: Some users in prod-us-central-0 may be seeing an elevated rate of errors when fetching configurations. Our engineers are currently investigating this issue.

Service degradation on Logs Read path in AWS US West (us-west-0)

minor
Mar 10, 03:26 PM to Mar 10, 08:39 PM (resolved)
Mar 10, 08:39 PM
resolved: This incident has been resolved.
Mar 10, 03:26 PM
identified: There has been a recurrence of the issues on the read path of Loki services on AWS US West since yesterday (the 9th) around 17:15 UTC. The issue has been identified, and resolution steps have been taken t...

Various Issues with HG Pages

major
Mar 10, 06:06 PM to Mar 10, 07:17 PM (resolved)
Mar 10, 07:17 PM
resolved: This incident has been resolved.
Mar 10, 06:06 PM
investigating: We are noticing issues with various HG pages. Our engineering team is actively looking into it.

Outage for prod-eu-central-0 due to AWS S3 outage.

none
Mar 7, 08:07 PM to Mar 9, 08:59 AM (resolved)
Mar 9, 08:59 AM
resolved: This incident has been resolved.
Mar 8, 11:30 AM
monitoring: Since about 20:03 UTC we have seen AWS S3 recover, and our services are also recovering. We are monitoring.
Mar 7, 08:10 PM
investigating: Since about 20:03 UTC we have seen AWS S3 recover, and our services are also recovering. We are monitoring.
+2 more updates

Service degradation on Logs Read path in AWS US West (us-west-0)

minor
Mar 8, 02:17 PM to Mar 8, 08:31 PM (resolved)
Mar 8, 08:31 PM
resolved: We continue to observe a continued period of stability since 19:40 UTC. At this time, we are considering this issue resolved.
Mar 8, 06:29 PM
monitoring: Since 16:35 UTC we have experienced stability and services are recovering. We are actively monitoring and working to fully stabilize.
Mar 8, 02:17 PM
investigating: Our engineering team is investigating issues on the read path of Loki services on AWS US West since today around 13:25 UTC. These issues can cause timeouts and 5xx errors when querying logs for customers...

Some Grafana Instances Unavailable

critical
Mar 6, 03:03 PM to Mar 6, 04:31 PM (resolved)
Mar 6, 04:31 PM
resolved: This incident has been resolved.
Mar 6, 03:03 PM
identified: We have identified an issue which is causing some instances to become unavailable. Our engineering team is actively working on mitigating the issue. We will continue to share updates as they become a...

Write failures in prod-eu-west-0

major
Mar 5, 10:27 PM to Mar 5, 11:36 PM (resolved)
Mar 5, 11:36 PM
resolved: We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Mar 5, 10:41 PM
monitoring: Engineering has released a fix and as of 22:00 UTC, customers should no longer experience write failures and delays in rule evaluation. We will continue to monitor for recurrence and provide updates...
Mar 5, 10:27 PM
investigating: A recent incident affecting the data read path and rule execution within prod-eu-west-0 began at ~21:05 UTC on March 5, 2026. Customers with instances in this region may experience write failures and ...

Grafana Cloud Logs - Write degradation in Azure Netherlands (eu-west-3)

minor
Mar 3, 12:07 PM to Mar 5, 06:31 PM (resolved)
Mar 5, 06:31 PM
resolved: This incident has been resolved.
Mar 4, 10:41 PM
identified: We continue to monitor mitigation efforts and work with our CSP.
Mar 3, 10:19 PM
identified: The impact has been reduced to slight intermittency. We continue to work with our CSP to reach a complete resolution.
+2 more updates

Elevated rate of errors for Fleet Management in prod-us-central-0

minor
Mar 4, 07:47 AM to Mar 4, 09:29 AM (resolved)
Mar 4, 09:29 AM
resolved: This incident has been resolved.
Mar 4, 08:46 AM
monitoring: A fix has been implemented and we are monitoring the results.
Mar 4, 07:47 AM
investigating: We are currently experiencing an issue with Fleet Management in prod-us-central-0. Users in prod-us-central-0 may observe an elevated rate of errors when fetching configurations.

Test Run Browser Screenshot Upload Failing

none
Mar 3, 01:00 PM to Mar 3, 01:00 PM (resolved)
Mar 3, 06:35 PM
resolved: Test run browser screenshot upload experienced failures from 13:12 to 14:51 UTC. The issue has been resolved.

Write outage for logs in prod-eu-west-3

critical
Mar 2, 07:37 AM to Mar 2, 03:48 PM (resolved)
Mar 2, 03:48 PM
resolved: This incident has been resolved.
Mar 2, 08:08 AM
investigating: We are now experiencing a write outage for logs in prod-eu-west-3. Our Engineering team is aware and currently investigating this. We will provide further updates accordingly.
Mar 2, 07:37 AM
investigating: We are experiencing increased write latency for logs in prod-eu-west-3. Our Engineering team is aware and currently investigating this. We will provide further updates accordingly.

Complete outage in prod-me-central-1

critical
Mar 2, 06:43 AM (investigating)
Mar 19, 12:13 PM
investigating: We have not received any further updates from AWS at this time. However, we are actively monitoring the outage and will provide additional information as it becomes available. Also, please continue t...
Mar 4, 10:22 PM
investigating: We are actively monitoring the situation, but at this time there are no new updates to share. The next update will be provided once we have more information to share. Please reach out to our Support t...
Mar 4, 10:28 AM
investigating: We are continuing to investigate this issue.
+7 more updates

February 2026

Grafana Cloud Metrics - Intermittent Write Latency in prod-us-central-0, prod-us-central-5, and prod-eu-west-0

minor
Feb 25, 07:54 PM to Mar 17, 06:22 PM (resolved)
Mar 17, 06:22 PM
resolved: This incident is now resolved. During the incident the Cloud Metrics platform experienced intermittent latency spikes communicating with a backend cloud service in the prod-us-central-0 and prod-us-c...
Mar 6, 09:44 PM
monitoring: We are rolling out a mitigation across the environments in these regions, and preemptively where possible to ensure it doesn’t spread elsewhere.
Mar 6, 08:53 PM
monitoring: We have seen an increase in latency in our cloud provider's services, and are rolling out a change to mitigate the issue. We are monitoring.
+5 more updates

Trace querying issue in all Tempo clusters

minor
Feb 27, 01:46 PM to Feb 27, 11:38 PM (resolved)
Feb 27, 11:38 PM
resolved: This incident has been resolved.
Feb 27, 07:27 PM
identified: Our team has identified the issue and is in the process of testing a fix.
Feb 27, 01:46 PM
investigating: We're currently working on an issue where portions of data may be temporarily unretrievable, affecting a small percentage of tenants in all Tempo clusters.

Increased Latency for Small Subset of Customers

minor
Feb 27, 04:25 PM to Feb 27, 04:25 PM (resolved)
Feb 27, 04:25 PM
resolved: A recent rollout caused the AuthZ (RBAC) service to perform many redundant folder-tree fetches for each authorization check. For a small number of tenants in the prod-us-east-0 and prod-eu-west-2 regi...

Incorrect pipeline assignment after custom attributes are assigned

minor
Feb 27, 12:57 PM to Feb 27, 03:24 PM (resolved)
Feb 27, 03:24 PM
resolved: This incident has been resolved.
Feb 27, 01:39 PM
identified: The issue has been identified and we are working on a fix.
Feb 27, 12:57 PM
investigating: We are investigating issues with incorrect pipeline assignment after custom attributes are assigned.

Grafana Cloud Faro slowness of listing and uploading sourcemaps in all regions.

minor
Feb 26, 01:00 PM to Feb 27, 02:49 AM (resolved)
Feb 27, 02:49 AM
resolved: This incident has been resolved.
Feb 26, 02:43 PM
identified: Uploads should work without an issue now. However, listing might still result in occasional timeouts - we're actively addressing this problem.
Feb 26, 01:00 PM
identified: We're experiencing an issue in all Grafana Cloud regions, which manifests as slowness when uploading and listing sourcemaps. The issue most significantly affects users who have large sourcemap files...

Issues Loading Dashboards and Alert Folders in Hosted Grafana

major
Feb 25, 05:44 PM to Feb 25, 07:51 PM (resolved)
Feb 25, 07:51 PM
resolved: This incident has been resolved.
Feb 25, 06:46 PM
monitoring: A fix has been implemented, and we are observing recovery across all impacted regions. We will continue to monitor progress.
Feb 25, 06:31 PM
identified: The issue has been identified, and we are in the process of rolling out a fix.
+2 more updates

Partial Write & Rule Evaluation Outage in prod-eu-west-3

major
Feb 25, 03:05 PM to Feb 25, 05:20 PM (resolved)
Feb 25, 05:20 PM
resolved: This incident has been resolved.
Feb 25, 03:55 PM
monitoring: A fix has been implemented and we are monitoring the results.
Feb 25, 03:05 PM
investigating: We are currently investigating an issue which is causing a partial write and rule evaluation outage in the specified region. We will continue to provide updates as they become available.

Grafana Cloud Traces prod-eu-west-6 region (AWS Ireland) wrong URL endpoint shown for traces ingestion.

major
Feb 25, 12:41 PM to Feb 25, 03:05 PM (resolved)
Feb 25, 03:05 PM
resolved: This incident has been resolved.
Feb 25, 12:53 PM
monitoring: The fix was deployed to all affected existing tenants. Newly created tenants will not face the issue either. We're monitoring the incident, but it should be resolved by now.
Feb 25, 12:41 PM
identified: We identified an issue with the incorrect URL endpoint being shown for traces ingestion in prod-eu-west-6 region (AWS Ireland). Using the displayed URL will result in traces not being able to be inges...

Some Alert Rule Evaluations Failing

major
Feb 24, 02:31 PM to Feb 24, 05:09 PM (resolved)
Feb 24, 05:09 PM
resolved: This incident has been resolved.
Feb 24, 04:27 PM
monitoring: A fix has been implemented, and we are monitoring results.
Feb 24, 02:31 PM
investigating: We are currently investigating an issue impacting a subset of users in the prod-us-east-0 region. Impacted customers will receive a "failed to execute query" error when evaluating alert rules.

Degraded performance of Grafana Cloud k6 test runs

minor
Feb 18, 08:27 AM to Feb 18, 09:17 PM (resolved)
Feb 18, 09:17 PM
resolved: This incident has been resolved.
Feb 18, 12:57 PM
monitoring: A fix has been implemented and we are monitoring the results.
Feb 18, 10:20 AM
investigating: We are continuing to investigate this issue.
+1 more updates

Brief Disruption in Azure prod-us-central-7

minor
Feb 18, 02:00 PM to Feb 18, 02:00 PM (resolved)
Feb 18, 02:56 PM
resolved: We experienced an issue impacting a cell within the Azure prod-us-central-7 region, which occurred between 14:26 and 14:36. Affected users may have noticed increased errors with rule evaluations, as w...

Grafana Cloud metrics degradation

minor
Feb 18, 03:43 AM to Feb 18, 05:31 AM (resolved)
Feb 18, 05:31 AM
resolved: This incident has been resolved.
Feb 18, 03:47 AM
investigating: We are continuing to investigate this issue.
Feb 18, 03:43 AM
investigating: We've been alerted to issues querying and are investigating.

Maintenance task for Synthetic Monitoring ProbeFailedExecutionsTooHigh alert rule

minor
Feb 17, 02:53 PM to Feb 17, 04:27 PM (resolved)
Feb 17, 04:27 PM
resolved: This incident has been resolved.
Feb 17, 02:53 PM
monitoring: Alert instances for the Synthetic Monitoring ProbeFailedExecutionsTooHigh provisioned alert rule that are firing during this maintenance might resolve and fire again in the next evaluation. Only the API i...

Degradation of service on Synthetic Monitoring Public Probe AWS Canada (Calgary)

none
Feb 17, 12:47 PM to Feb 17, 12:47 PM (resolved)
Feb 17, 12:50 PM
resolved: There was a service degradation today from ~12:09 UTC until ~12:35 UTC on the Public Probe of Calgary for Synthetic Monitoring. Impact may include SM check failures where the probe was used.

Self-Serve Users Unable to Sign Up

major
Feb 13, 06:40 PM to Feb 13, 07:08 PM (resolved)
Feb 13, 07:08 PM
resolved: This incident has been resolved.
Feb 13, 06:40 PM
investigating: We are currently investigating an issue which is preventing users from signing up for self-serve Grafana. We will continue to update with more information as we progress our investigation.

Loki Delete Endpoint Bug

critical
Feb 12, 11:16 PM to Feb 13, 05:03 PM (resolved)
Feb 13, 05:03 PM
resolved: This incident has been resolved.
Feb 13, 07:56 AM
identified: We are continuing to work on a fix for this issue.
Feb 13, 07:56 AM
identified: A fix is being made to mitigate the issue. We will provide further updates accordingly.
+1 more updates

Loki writes outage in prod-ca-east-0

critical
Feb 13, 06:59 AM to Feb 13, 07:29 AM (resolved)
Feb 13, 07:29 AM
resolved: We continue to observe a continued period of recovery. At this time, we are considering this issue resolved.
Feb 13, 07:09 AM
monitoring: We have scaled up to handle the increased traffic and are seeing marked improvement. We will continue to monitor and provide updates.
Feb 13, 06:59 AM
investigating: We have been alerted to an ongoing Loki writes outage in the prod-ca-east-0 region. Our Engineering team is actively investigating this.

Essential Maintenance for Faro Services

maintenance
Feb 12, 04:09 PM to Feb 12, 07:07 PM (resolved)
Feb 12, 07:07 PM
resolved: This incident has been resolved.
Feb 12, 04:09 PM
monitoring: We are undergoing essential maintenance for Faro services. Users may experience a short outage for the service of <1 minute during this time. We expect this to be finished within an hour.

Grafana Cloud Metrics elevated write and rule evaluation latency in prod-eu-west-2 region.

minor
Feb 12, 12:33 PM to Feb 12, 02:30 PM (resolved)
Feb 12, 02:30 PM
resolved: We no longer observe any problems with our services - this incident has been resolved.
Feb 12, 12:50 PM
monitoring: The fix has been implemented and services are back to normal. We're currently monitoring the health of the services before resolving this incident.
Feb 12, 12:40 PM
identified: The issue has been identified and our team is currently working on a fix.
+1 more updates

Unable to Install Slack Integration

major
Feb 11, 02:21 PM to Feb 11, 09:47 PM (resolved)
Feb 11, 09:47 PM
resolved: This incident has been resolved.
Feb 11, 06:20 PM
monitoring: We are in the process of rolling out the fix.
Feb 11, 04:22 PM
identified: We have identified the issue, and are working on a fix.
+1 more updates

Loki error response rate spike on prod-ap-southeast-1

minor
Feb 11, 06:51 AM to Feb 11, 07:25 AM (resolved)
Feb 11, 07:25 AM
resolved: This incident has been resolved.
Feb 11, 06:54 AM
monitoring: We have deployed temporary measures to mitigate the issue, but there was a write outage from 06:26 to 06:37 UTC.
Feb 11, 06:51 AM
investigating: Cloud logging is facing write issues in this region. Our team is looking into this.

Write failures in prod-us-central-0

major
Feb 10, 12:39 AM to Feb 10, 01:45 AM (resolved)
Feb 10, 01:45 AM
resolved: We continue to observe a continued period of recovery. At this time, we are considering this issue resolved.
Feb 10, 12:39 AM
investigating: As of 00:10, we are currently experiencing write failures in a single cell affecting customers in prod-us-central-0. Impacted customers may see failed or dropped writes. Engineering is actively engag...

Athena Queries Broken

minor
Feb 9, 03:35 PM to Feb 9, 07:07 PM (resolved)
Feb 9, 07:07 PM
resolved: This incident has been resolved.
Feb 9, 05:01 PM
monitoring: We are seeing recovery in impacted environments. We will continue to monitor the progress.
Feb 9, 04:23 PM
investigating: Our engineering team is still investigating this issue.
+1 more updates

Grafana Cloud Logs – Write Ingestion Degradation

minor
Feb 9, 10:32 AM to Feb 9, 11:21 AM (resolved)
Feb 9, 11:21 AM
resolved: This incident has been resolved.
Feb 9, 10:36 AM
monitoring: We are continuing to monitor for any further issues.
Feb 9, 10:32 AM
monitoring: Between 09:47 and 10:14 UTC, Grafana Cloud Logs within a single cell residing in the prod-ap-southeast-1 region experienced an issue affecting write ingestion only. During this time, some log writes m...

Multiple free tier customers are getting "no fields to display" when viewing logs instead of labels and structured metadata

minor
Feb 6, 11:29 AM to Feb 6, 05:57 PM (resolved)
Feb 6, 05:57 PM
resolved: This incident has been resolved.
Feb 6, 11:29 AM
investigating: We are currently investigating this issue.

Grafana Cloud Metrics – Write Ingestion Degradation

none
Feb 5, 06:30 PM to Feb 5, 06:30 PM (resolved)
Feb 5, 09:10 PM
resolved: Between 18:32 and 18:46 UTC, Grafana Cloud Metrics within a single cell residing in the prod-us-west-0 region experienced an issue affecting write ingestion only. During this time, some metric writes ...

Tempo write path degradation in prod-us-west-0

none
Feb 5, 06:00 PM to Feb 5, 06:00 PM (resolved)
Feb 10, 11:44 AM
resolved: From 17:43 UTC to 18:05 UTC, a subset of customers experienced elevated latency and a peak error rate of approximately 22% for trace ingestion.

Hosted Metrics partial outage of read path in us-central-0 region.

major
Feb 5, 02:14 PM to Feb 5, 05:41 PM (resolved)
Feb 5, 05:41 PM
resolved: This incident has been resolved.
Feb 5, 02:40 PM
monitoring: Services recovered and there's no active issue anymore. We're still monitoring the overall health.
Feb 5, 02:14 PM
investigating: We're experiencing an issue in the us-central-0 region for the Hosted Metrics offering. The issue manifests as failing rule evaluations and the possibility of queries returning stale data. We're actively investi...

Inconsistent threshold check results reported intermittently

minor
Feb 4, 05:20 PM to Feb 5, 03:31 PM (resolved)
Feb 5, 03:31 PM
resolved: This incident has been resolved.
Feb 5, 09:36 AM
monitoring: The issue causing the incident has been identified, and the fix has been deployed. All new test runs work consistently.
Feb 4, 07:57 PM
identified: We are continuing to work on a fix for this issue.
+1 more updates
