Fly.io Outage History
50 incidents reported. Data sourced from the official Fly.io status page.
Total Incidents: 50
Major/Critical: 15
Minor: 23
Resolved: 50
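The totals above can be re-derived from the per-incident header lines that follow (severity, start time, end time, status). The snippet below is a minimal sketch of that tally, not the status page's own tooling. The names `HEADER`, `parse_header`, and `tally` are hypothetical. The timestamps carry no year or time zone, so the year is passed in as an assumption taken from the month headings, and durations are only meaningful when start and end fall in the same year.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical pattern for a header line such as
# "majorMar 7, 02:42 PM→Mar 7, 03:56 PMresolved".
# Optional " | " separators between the fields are tolerated as well.
HEADER = re.compile(
    r"^(?P<severity>critical|major|minor|none)\s*\|?\s*"
    r"(?P<start>.+?)\s*→\s*(?P<end>.+?)\s*\|?\s*"
    r"(?P<status>investigating|identified|monitoring|resolved)$"
)


def parse_header(line, year):
    """Return severity, status, and duration in minutes, or None if no match."""
    m = HEADER.match(line.strip())
    if m is None:
        return None
    fmt = "%b %d, %I:%M %p %Y"  # the year is appended because the source omits it
    start = datetime.strptime(f"{m['start']} {year}", fmt)
    end = datetime.strptime(f"{m['end']} {year}", fmt)
    minutes = int((end - start).total_seconds() // 60)
    return {"severity": m["severity"], "status": m["status"], "minutes": minutes}


def tally(lines, year):
    """Count incidents per severity across all lines that parse as headers."""
    parsed = (parse_header(line, year) for line in lines)
    return Counter(p["severity"] for p in parsed if p is not None)


sample = [
    "majorMar 7, 02:42 PM→Mar 7, 03:56 PMresolved",
    "noneMar 5, 07:24 PM→Mar 5, 07:50 PMresolved",
    "Routing issues in NA regions",  # a title line, ignored by the parser
]
print(parse_header(sample[0], 2026))
print(tally(sample, 2026))
```

Lines that are not headers (titles, update text) simply fail to match and are skipped, so the full page text can be fed in unfiltered.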
March 2026
Private networking issues in SYD region
major | Mar 7, 02:42 PM → Mar 7, 03:56 PM | resolved
Mar 7, 03:56 PM
resolved — This incident has been resolved.
Mar 7, 03:10 PM
monitoring — A fix has been implemented and we are monitoring the results.
Mar 7, 02:42 PM
investigating — We are investigating a private networking failure between SYD and other regions. Apps continue to run, and private networking within SYD is unaffected.
Routing issues in NA regions
none | Mar 5, 07:24 PM → Mar 5, 07:50 PM | resolved
Mar 5, 07:50 PM
resolved — This incident has been resolved. Due to a BGP issue, we saw some North American traffic routed to edges in Singapore (sin). Users in North America would have seen additional request latency during thi...
Mar 5, 07:38 PM
monitoring — A fix has been implemented and we are monitoring the results.
Mar 5, 07:24 PM
investigating — We're aware of routing issues affecting some customers in North America regions, and we're actively investigating.
Elevated GraphQL API errors
major | Mar 3, 08:18 PM → Mar 3, 09:15 PM | resolved
Mar 3, 09:15 PM
resolved — This incident was caused by a failed Redis node that powers our GraphQL API. We were able to recreate the Redis node and restore service.
We are still investigating the root cause of the failure. In ...
Mar 3, 08:36 PM
monitoring — A fix has been implemented and we are monitoring the results.
Mar 3, 08:18 PM
investigating — We're investigating elevated GraphQL errors that affect some API endpoints.
Cost Explorer fails to load
minor | Mar 3, 10:50 AM → Mar 3, 12:10 PM | resolved
Mar 3, 12:10 PM
resolved — This incident has been resolved.
Mar 3, 10:50 AM
investigating — We are currently investigating this issue.
The page currently displays: "We’re having trouble loading the cost breakdown."
Certificates issues affecting API and proxy
none | Mar 3, 12:54 AM → Mar 3, 12:54 AM | resolved
Mar 3, 02:05 AM
resolved — Between 19:54 and 20:06 UTC, our Vault cluster serving app certificates was unavailable. This caused various API requests to fail, mainly operations on certificates but also app creates and IP assignm...
Machines failing to boot in EWR
major | Mar 2, 05:42 PM → Mar 2, 10:49 PM | resolved
Mar 2, 10:49 PM
resolved — This incident has been resolved.
Mar 2, 08:35 PM
monitoring — A fix has been implemented and we are monitoring the results.
Mar 2, 06:21 PM
identified — The issue has been identified and a fix is being implemented.
+1 more update
Issues with the Machines API
minor | Mar 2, 09:19 PM → Mar 2, 09:50 PM | resolved
Mar 2, 09:50 PM
resolved — This incident has been resolved.
Mar 2, 09:47 PM
monitoring — A fix has been implemented and we are monitoring the results.
Mar 2, 09:39 PM
identified — The issue has been identified and a fix is being implemented.
+1 more update
February 2026
Slow API requests
major | Feb 27, 06:50 PM → Feb 27, 08:21 PM | resolved
Feb 27, 08:21 PM
resolved — This incident has been resolved. All platform and API operations are working normally.
Feb 27, 08:05 PM
monitoring — API and platform operations have normalized. We are continuing to monitor to ensure full and stable recovery.
Background jobs are almost fully caught up. Users may still see slightly slower requests...
Feb 27, 07:41 PM
identified — A second fix has been deployed and database load has returned to normal, resulting in API response times beginning to normalize. Most Machines API requests should succeed as normal, and deploys to exi...
+6 more updates
Capacity issues in iad and dfw
minor | Feb 27, 03:34 PM → Feb 27, 05:54 PM | resolved
Feb 27, 05:54 PM
resolved — This incident has been resolved.
Feb 27, 05:31 PM
monitoring — We have provisioned additional capacity in dfw and iad and are monitoring to ensure machine and builder starts are succeeding consistently.
Feb 27, 03:34 PM
identified — These regions (Dallas, TX dfw and Ashburn, VA iad) are currently low on capacity. New machine creates in these regions might fail temporarily, and Depot builders may be unavailable, causing deploys to...
Capacity issues in iad and dfw
none | Feb 26, 05:00 PM → Feb 26, 10:28 PM | resolved
Feb 26, 10:28 PM
resolved — This incident has been resolved.
Feb 26, 08:19 PM
monitoring — We're continuing to monitor after having added more capacity to our DFW and IAD regions.
Deploys or machine starts using existing volumes in these regions may still hit a capacity issue. Users shoul...
Feb 26, 06:57 PM
identified — We have added additional capacity in DFW and IAD regions and are monitoring the impact.
New machine creates and deploys without volumes are seeing improved success rates. Deploys using depot builde...
+3 more updates
Sprites API degradation
none | Feb 24, 05:23 PM → Feb 24, 05:51 PM | resolved
Feb 24, 05:51 PM
resolved — This incident has been resolved.
Feb 24, 05:24 PM
identified — A slow deploy is causing Sprites API degradation. We are implementing a fix.
Feb 24, 05:23 PM
identified — A slow deploy is causing Sprites API degradation. We are implementing a fix.
Metrics are degraded
minor | Feb 24, 04:33 AM → Feb 24, 11:06 AM | resolved
Feb 24, 11:06 AM
resolved — Metrics processing has caught up, and we don't see any data loss.
Feb 24, 09:35 AM
monitoring — Delayed metrics are still being processed.
Feb 24, 06:46 AM
monitoring — Metrics are coming back online, but it will take a little time to process what's backed up in the queues.
+2 more updates
Sprite creations failing
minor | Feb 24, 09:39 AM → Feb 24, 10:44 AM | resolved
Feb 24, 10:44 AM
resolved — This incident has been resolved.
Feb 24, 10:25 AM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 24, 09:39 AM
investigating — We are currently investigating issues creating new Sprites.
Degraded Managed Postgres Control Plane
none | Feb 23, 03:00 PM → Feb 23, 08:30 PM | resolved
Feb 24, 12:31 AM
resolved — This incident has been resolved as of 20:30 UTC.
Feb 23, 03:00 PM
investigating — We are currently investigating issues with the MPG control plane. Users may experience delays or hanging when creating or deleting databases via the dashboard or CLI.
Deploys hanging at waiting for Depot Builder
minor | Feb 20, 04:14 PM → Feb 20, 08:49 PM | resolved
Feb 20, 08:49 PM
resolved — This incident has been resolved.
Feb 20, 07:38 PM
monitoring — The fix has been rolled out and we are seeing deploys using depot builder succeeding normally. We continue to monitor to ensure full recovery.
Depot builders have been reenabled as the default optio...
Feb 20, 05:59 PM
identified — A fix is being rolled out. Fly builders continue to be the default while this is deployed.
+2 more updates
Networking issues for users connecting through lhr
minor | Feb 20, 10:52 AM → Feb 20, 11:57 AM | resolved
Feb 20, 11:57 AM
resolved — Network traffic in LHR has been stable for some time now, and we are not seeing any further issues.
Feb 20, 11:21 AM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 20, 10:52 AM
investigating — We’re currently investigating this issue.
Investigating registry issues affecting deploys
minor | Feb 19, 09:14 PM → Feb 20, 12:05 AM | resolved
Feb 20, 12:05 AM
resolved — This incident has been resolved.
Feb 19, 10:24 PM
identified — While we have seen some improvement from the previous fix, we are still seeing elevated rates of Registry connection issues. Users may continue to see slower machine creates and deploys due to slow im...
Feb 19, 09:49 PM
monitoring — A fix has been implemented and we are monitoring the results.
+2 more updates
Control plane state delayed on some hosts possibly causing network or deployment disruption
major | Feb 18, 04:22 PM → Feb 18, 04:44 PM | resolved
Feb 18, 04:44 PM
resolved — This incident has been resolved.
Feb 18, 04:28 PM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 18, 04:23 PM
identified — We are continuing to work on a fix for this issue.
+1 more update
flyctl deploy timeouts
major | Feb 17, 01:06 PM → Feb 17, 02:24 PM | resolved
Feb 17, 02:24 PM
resolved — Earlier today, an issue caused elevated rate limiting and some deployment timeouts. A fix is in place and deployments are back to normal.
Feb 17, 01:42 PM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 17, 01:06 PM
identified — We’re investigating elevated 429 errors from flaps causing deployment timeouts. Affected deploys are failing with:
✖ Failed: error waiting for release_command machine XX to finish running: timeout rea...
Degraded Managed Postgres Control Plane in ORD
major | Feb 14, 11:33 AM → Feb 14, 02:27 PM | resolved
Feb 14, 02:27 PM
resolved — This incident has been resolved.
Feb 14, 02:07 PM
monitoring — A fix has been implemented and we are seeing full recovery of the control plane in ORD. With that recovery we are seeing impacted replicas catching up and clusters returning to normal health. We're co...
Feb 14, 01:47 PM
identified — We are continuing to work on a fix for this issue.
+2 more updates
Issues with deploying apps using Depot builders for new accounts
minor | Feb 11, 08:44 PM → Feb 11, 09:30 PM | resolved
Feb 11, 09:30 PM
resolved — This incident has been resolved.
Feb 11, 09:24 PM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 11, 08:57 PM
identified — The issue has been identified and a fix is being implemented.
+1 more update
Creating new sprites is degraded
minor | Feb 11, 06:07 AM → Feb 11, 07:22 AM | resolved
Feb 11, 07:22 AM
resolved — This incident has been resolved.
Feb 11, 06:57 AM
monitoring — Sprite creation appears to be back to normal operation now.
Feb 11, 06:52 AM
identified — We've identified the cause of the delay following creates and we're deploying a fix.
+3 more updates
Degraded MPG clusters in IAD
minor | Feb 10, 07:00 PM → Feb 10, 08:44 PM | resolved
Feb 10, 08:44 PM
resolved — This incident has been resolved.
Feb 10, 08:00 PM
monitoring — We've rolled out a fix for the remaining impacted clusters, and we're now monitoring the results.
Feb 10, 07:53 PM
identified — We've rolled out a fix for some additional impacted clusters, and we're continuing to work on the remaining clusters.
+2 more updates
Issue creating new Sprites in IAD
minor | Feb 9, 08:29 PM → Feb 9, 09:38 PM | resolved
Feb 9, 09:38 PM
resolved — This incident has been resolved.
Feb 9, 09:19 PM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 9, 08:45 PM
identified — The issue has been identified and a fix is being implemented.
+1 more update
Degraded network in AMS
major | Feb 9, 07:17 AM → Feb 9, 10:55 AM | resolved
Feb 9, 10:55 AM
resolved — This incident has been resolved.
Feb 9, 09:47 AM
monitoring — A fix has been implemented and we are monitoring the results.
Feb 9, 08:58 AM
identified — We are still working on restoring the MPG clusters. Most of them should be operational already.
+3 more updates
Machines API issues
major | Feb 7, 04:23 PM → Feb 7, 06:13 PM | resolved
Feb 7, 06:13 PM
resolved — This incident has been resolved.
Feb 7, 05:17 PM
monitoring — A fix has been implemented and we are seeing Machines API connectivity improve in APAC regions. We continue monitoring for full recovery.
Feb 7, 04:40 PM
identified — The issue has been identified and we are seeing Machines API performance improve in most regions since ~16:20 UTC.
Machines API calls in the SYD, NRT, and SIN regions may continue to see 5xx errors or hi...
+1 more update
Private Networking and Certificate Resolution Issues in SYD
major | Feb 7, 03:19 PM → Feb 7, 06:12 PM | resolved
Feb 7, 06:12 PM
resolved — This incident has been resolved.
Feb 7, 04:24 PM
monitoring — A fix has been implemented and we are seeing private networking / certificates in SYD improving. We are continuing to monitor for full recovery.
Feb 7, 03:46 PM
investigating — Private Networking (6PN) is degraded in SYD region. Communication between Machines in SYD region and Machines in other regions may fail at this time. Newly created Machines in SYD may fail to sync to ...
+2 more updates
Network issues on newly-created machines
minor | Feb 5, 09:52 PM → Feb 6, 07:07 AM | resolved
Feb 6, 07:07 AM
resolved — This issue is now resolved.
Feb 6, 02:49 AM
monitoring — We have successfully run a fix to re-sync our global and regional state stores in order to bring machines back to a healthy state, and we're monitoring the situation to confirm that there are no more ...
Feb 5, 09:52 PM
identified — Machines created after the delayed machine registration incident (https://status.flyio.net/incidents/3npj6935byt4) may have incomplete networking configurations and could be unable to receive traffic....
MPG Degraded clusters in AMS, IAD and SIN regions
major | Feb 5, 05:03 PM → Feb 6, 04:58 AM | resolved
Feb 6, 04:58 AM
resolved — All MPG clusters are back to full, normal operations.
Feb 6, 04:22 AM
monitoring — All MPG clusters are reachable.
Feb 6, 03:36 AM
identified — We are still continuing cleanup on some clusters.
+5 more updates
Delayed Machine Registration + Token Errors
major | Feb 5, 04:43 PM → Feb 5, 08:53 PM | resolved
Feb 5, 08:53 PM
resolved — This incident has been resolved.
Feb 5, 08:11 PM
monitoring — A fix has been deployed across all impacted hosts. We are seeing a sharp reduction in Token errors since 20:00 UTC and other metrics are recovering as well. We are continuing to monitor closely.
Feb 5, 07:03 PM
identified — We saw some improvement from the previous fix, however errors remained elevated on some hosts.
We have identified the root cause of the remaining errors as a communication issue between the hosts and...
+4 more updates
Network maintenance in YYZ
minor | Feb 5, 05:46 AM → Feb 5, 09:22 AM | resolved
Feb 5, 09:22 AM
resolved — Network maintenance has concluded.
Feb 5, 09:01 AM
monitoring — Managed Postgres clusters in YYZ should be operating normally.
Feb 5, 05:46 AM
identified — An upstream network provider is performing an emergency network maintenance in the YYZ region. Machines in YYZ may see some packet loss.
Managed Postgres clusters in YYZ are experiencing management p...
IPv6 Issues in YYZ
major | Feb 3, 03:33 PM → Feb 3, 03:53 PM | resolved
Feb 3, 03:53 PM
resolved — This incident has been resolved.
Feb 3, 03:44 PM
monitoring — A fix has been implemented and we're seeing IPv6 networking return to normal in YYZ. We'll continue to monitor to ensure full recovery.
Feb 3, 03:33 PM
investigating — We are currently investigating degraded IPv6 networking in the YYZ (Toronto) region.
Users with machines in this region may see issues connecting to their machines over IPv6. Users with static egre...
Elevated latency and packet loss in North American regions
minor | Feb 3, 02:56 AM → Feb 3, 03:43 AM | resolved
Feb 3, 03:43 AM
resolved — This incident has been resolved.
Feb 3, 03:26 AM
monitoring — Network performance issues between North American regions have resolved and we're continuing to monitor.
Feb 3, 02:56 AM
investigating — We are currently investigating intermittent spikes of increased latency and packet loss between North American regions over the past hour. Users may see degraded network performance on traffic in and ...
Congestion in CDG and FRA
minor | Feb 1, 08:15 PM → Feb 1, 09:37 PM | resolved
Feb 1, 09:37 PM
resolved — This incident has been resolved.
Feb 1, 08:15 PM
investigating — We are experiencing elevated weekend congestion in CDG (France) and FRA (Germany).
Sprites are returning not found or unauthorized when they shouldn't be.
none | Feb 1, 02:16 AM → Feb 1, 05:48 AM | resolved
Feb 1, 05:48 AM
resolved — This incident has been resolved.
Feb 1, 05:32 AM
monitoring — We've been able to restore missing sprites and tokens. We're monitoring for any additional issues.
Feb 1, 04:52 AM
identified — We're working on a fix to restore missing sprites and tokens.
+3 more updates
January 2026
Grafana Log Search Display Issue
none | Jan 31, 05:35 PM → Jan 31, 06:29 PM | resolved
Jan 31, 06:29 PM
resolved — This has been resolved. If you are still experiencing any issues, you may need to log out and then back in.
Jan 31, 05:49 PM
investigating — No logs are displayed in Grafana Log Search when using the default `*` query.
You can try the following workarounds:
1. Replace the default `*` query with `NOT ""`
2. Viewing logs from the “fly app” ...
Jan 31, 05:35 PM
investigating — No logs are displayed in Grafana Log Search when using the default `*` query.
As a temporary workaround, please replace `*` with `NOT ""` query. Thank you for your kind understanding as we work throu...
Delayed metric reporting in NRT and SIN regions
minor | Jan 27, 07:40 PM → Jan 29, 08:04 PM | resolved
Jan 29, 08:04 PM
resolved — This incident has been resolved. All hosts in SIN and NRT are reporting up to date metrics.
Jan 29, 03:15 PM
identified — One host in SIN is still working through its metrics backlog and is reporting delayed metrics. Other hosts in NRT and SIN are reporting metrics correctly.
If needed, users with i...
Jan 28, 03:27 PM
identified — Most hosts in NRT and SIN have completed backfilling their metrics and are up to date in fly-metrics.net.
Four hosts are still working through the backlog; machines on those hosts are still reporting...
+2 more updates
Congestion in CDG and FRA
minor | Jan 24, 04:00 PM → Jan 24, 04:00 PM | resolved
Jan 24, 11:06 PM
resolved — We are experiencing elevated weekend congestion in CDG (France) and FRA (Germany).
Delays issuing certificates
minor | Jan 21, 03:21 AM → Jan 21, 05:24 AM | resolved
Jan 21, 05:24 AM
resolved — This incident has been resolved.
Jan 21, 04:59 AM
monitoring — We have identified the congestion and released a fix. We'll continue to monitor while the jobs catch up.
Jan 21, 03:21 AM
investigating — We are currently investigating possible delays issuing ACME certificates for new hostnames.
Errors creating new Sprites
minor | Jan 20, 10:08 PM → Jan 20, 10:15 PM | resolved
Jan 20, 10:15 PM
resolved — This incident has been resolved.
Jan 20, 10:10 PM
investigating — We are continuing to investigate this issue.
Jan 20, 10:08 PM
investigating — We're currently investigating an issue that's preventing new Sprites from being created.
MPG network instability in LAX
none | Jan 19, 02:34 PM → Jan 19, 08:07 PM | resolved
Jan 19, 08:07 PM
resolved — This incident has been resolved.
Jan 19, 03:11 PM
monitoring — Connections are back to normal. We'll keep monitoring the region.
Jan 19, 02:34 PM
investigating — We identified network partitions in the LAX region. We are investigating the problem.
Machines errors in JNB region
major | Jan 19, 12:17 PM → Jan 19, 12:29 PM | resolved
Jan 19, 12:29 PM
resolved — This incident has been resolved.
Jan 19, 12:17 PM
identified — A bad deploy of an internal service in JNB region may cause Machines API requests for JNB region machines to fail. At this time, it may not be possible to create or update machines in JNB region, but ...
Delayed app logs
minor | Jan 18, 01:08 PM → Jan 18, 03:12 PM | resolved
Jan 18, 03:12 PM
resolved — This incident has been resolved.
Jan 18, 01:08 PM
monitoring — App logs after 12:00 UTC may be delayed to show up on fly-metrics.net. We are monitoring log insert rate and will update once log insertion is current.
Log streaming via `flyctl logs` or NATS is curre...
Network issues in SIN region
minor | Jan 17, 05:12 PM → Jan 17, 05:46 PM | resolved
Jan 17, 05:46 PM
resolved — This incident has been resolved.
Jan 17, 05:12 PM
investigating — We are investigating network issues in SIN region. Apps running in this region may experience elevated latency or packet loss.
Elevated Machines errors in ORD
none | Jan 15, 05:52 PM → Jan 16, 01:35 AM | resolved
Jan 16, 01:35 AM
resolved — This incident has been resolved. Machine placement improvements are now deployed in all regions.
Jan 15, 08:19 PM
monitoring — The fix has been deployed to ORD and we are monitoring results. New Machines will be placed using stricter memory thresholds. Existing Machines on impacted hosts will be migrated to new hosts on start...
Jan 15, 07:04 PM
identified — We are deploying a fix to enforce stricter memory thresholds for Machine placements. This will steer new and migrated workloads towards hosts with optimal capacity.
+1 more update
MPG clusters in SIN experiencing network issues
none | Jan 15, 10:51 PM → Jan 16, 12:43 AM | resolved
Jan 16, 12:43 AM
resolved — Connections normal for all clusters in SIN.
Jan 15, 11:57 PM
monitoring — Cluster connections are stable. We'll keep monitoring.
Jan 15, 11:50 PM
monitoring — Network performance in SIN has normalized and MPG clusters are working as expected. We're continuing to monitor to ensure continued stability, but customers should not see an impact on their cluster a...
+1 more update
MPG instability in LAX
none | Jan 15, 02:21 AM → Jan 15, 04:54 PM | resolved
Jan 15, 04:54 PM
resolved — Clusters are stable.
Jan 15, 03:12 AM
investigating — The root cause is a temporary network degradation in LAX; multiple clusters lost contact with the DCS. Connections are being reestablished.
Jan 15, 02:23 AM
investigating — We are continuing to investigate this issue.
+1 more update
App logs unavailable
major | Jan 15, 01:55 AM → Jan 15, 04:18 AM | resolved
Jan 15, 04:18 AM
resolved — This incident has been resolved.
Jan 15, 03:47 AM
monitoring — App log services are functioning normally. We are continuing to monitor.
Jan 15, 02:57 AM
identified — We have identified the issue and applied a mitigation. App logs should now be available again through Grafana. Some logs may be delayed or missing. We are still working to address the root cause.
+2 more updates
Authentication token issues
none | Jan 14, 01:02 PM → Jan 14, 01:25 PM | resolved
Jan 14, 01:25 PM
resolved — This incident has been resolved.
Jan 14, 01:02 PM
investigating — We are investigating intermittent issues with authentication. Apps continue to run, and APIs are accessible with existing tokens, but operations such as creating new tokens may fail.
Metrics display issue in Mumbai (BOM) region
minor | Jan 9, 12:53 PM → Jan 12, 04:29 AM | resolved
Jan 12, 04:29 AM
resolved — This incident has been resolved and all hosts in BOM are accurately reporting metrics.
Jan 11, 06:43 PM
identified — One host in BOM is still reporting delayed metrics as it continues to catch up. All other hosts in BOM are reporting metrics correctly.
If needed, users with impacted machines can use `fly machine cl...
Jan 11, 05:36 AM
identified — Metrics have returned to normal for most hosts in BOM. Two hosts are still reporting delayed metrics, but are continuing to catch up.
Users with impacted machines can use `fly machine clone` to add ...
+6 more updates