GitHub Outage History

Past incidents and downtime events

Complete history of GitHub outages, incidents, and service disruptions. Showing 50 most recent incidents.

May 2026 (16 incidents)

minormonitoringMay 26, 03:44 PM

Disruption with some GitHub services

3 updates
monitoringMay 26, 04:24 PM

The degradation affecting Copilot has been mitigated. We are monitoring to ensure stability.

investigatingMay 26, 03:48 PM

Copilot is experiencing degraded performance. We are continuing to investigate.

investigatingMay 26, 03:44 PM

We are investigating reports of impacted performance for some GitHub services.

criticalresolvedMay 26, 10:57 AM — Resolved May 26, 01:18 PM

Incident with Actions and Pages

8 updates
resolvedMay 26, 01:18 PM

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

monitoringMay 26, 01:01 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringMay 26, 01:00 PM

The degradation affecting Actions and Pages has been mitigated. We are monitoring to ensure stability.

investigatingMay 26, 12:37 PM

We have identified the cause of the authentication issues affecting GitHub Actions and are actively working on mitigation.

investigatingMay 26, 12:17 PM

Actions is experiencing degraded performance. We are continuing to investigate.

investigatingMay 26, 11:53 AM

We are investigating authentication issues leading to failures in starting Actions runs and downloading actions. At this time, the majority of Actions runs are impacted.

investigatingMay 26, 11:19 AM

Actions is experiencing degraded availability. We are continuing to investigate.

investigatingMay 26, 10:57 AM

We are investigating reports of degraded performance for Actions and Pages

minorresolvedMay 23, 04:00 PM — Resolved May 23, 07:32 PM

Intermittent errors with app installation token authentication

8 updates
resolvedMay 23, 07:32 PM

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

investigatingMay 23, 07:32 PM

This is fully mitigated. We will continue to monitor to ensure it does not recur.

investigatingMay 23, 07:06 PM

We have identified and are applying additional mitigation and will continue to monitor for complete mitigation.

investigatingMay 23, 06:42 PM

We see significant signs of mitigation and are monitoring for full mitigation.

investigatingMay 23, 05:41 PM

We are seeing signs of mitigation and are continuing to monitor for complete mitigation. Next update in one hour.

investigatingMay 23, 04:35 PM

We are continuing to investigate an elevated error rate of authentication failures for app installation tokens. Next update in one hour.

investigatingMay 23, 04:01 PM

We are seeing an increased rate of authentication failures for app installation tokens, affecting approximately 1% of tokens. We are continuing to investigate.

investigatingMay 23, 04:00 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedMay 20, 04:58 PM — Resolved May 20, 08:14 PM

Incident with Actions

6 updates
resolvedMay 20, 08:14 PM

On May 20, 2026, between 16:00 UTC and 17:45 UTC, GitHub Actions customers experienced run start delays exceeding 5 minutes. Approximately 4.5% of all runs were delayed during the impact window, with scale set jobs disproportionately affected. 30% of scale set jobs were delayed and 4% failed to start entirely. The incident was caused by a misconfigured health check on an internal service that assigns jobs to runners. A brief latency spike in an upstream dependency triggered health check failures across several pods, removing them from service and concentrating load on the remaining capacity. The added load drove memory pressure that escalated into a cascading failure in one regional cluster, leaving it unable to self-recover. Responders mitigated the incident by scaling capacity in the healthy regional clusters and draining traffic away from the impaired one, after which run start latency recovered. To prevent recurrence, we are strengthening our health check configuration to avoid cascading failure scenarios and evaluating automated mitigations to rebalance traffic when a region is degraded.
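
The failure mode described above (a brief latency spike fails health checks on several pods at once, and their removal concentrates load on the rest) is commonly guarded against with a consecutive-failure threshold plus a cap on how much of the fleet may be ejected at the same time. The following is a minimal sketch of that idea only. `HealthTracker`, `failure_threshold`, and `max_eject_fraction` are hypothetical names and values chosen for illustration; they do not describe GitHub's internal service or its actual fix.

```python
from dataclasses import dataclass, field


@dataclass
class HealthTracker:
    """Tracks probe results per pod and decides which pods to eject."""

    pods: list[str]
    failure_threshold: int = 3       # assumed: consecutive failures before ejection
    max_eject_fraction: float = 0.3  # assumed: never eject more than this share
    _failures: dict[str, int] = field(default_factory=dict)
    ejected: set[str] = field(default_factory=set)

    def record(self, pod: str, healthy: bool) -> None:
        if healthy:
            # One good probe resets the streak and readmits the pod.
            self._failures[pod] = 0
            self.ejected.discard(pod)
            return
        self._failures[pod] = self._failures.get(pod, 0) + 1
        if self._failures[pod] < self.failure_threshold:
            return  # a brief spike is tolerated
        max_ejected = int(len(self.pods) * self.max_eject_fraction)
        if pod in self.ejected or len(self.ejected) < max_ejected:
            self.ejected.add(pod)
        # Otherwise the pod stays in service: shedding more capacity would
        # push extra load onto the pods that are still serving.


tracker = HealthTracker(pods=[f"pod-{i}" for i in range(10)])
for pod in tracker.pods:
    tracker.record(pod, healthy=False)  # one slow probe on every pod
print(sorted(tracker.ejected))  # [] : a single failed probe ejects nothing
```

With these assumed settings, even a sustained failure across all ten pods ejects at most three of them, so the remaining capacity is degraded rather than removed.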

monitoringMay 20, 07:41 PM

Customer impact has fully subsided. We are maintaining yellow status while we deploy a permanent fix to prevent recurrence.

monitoringMay 20, 06:17 PM

We've applied a mitigation to fix the issues with queuing and running Actions jobs. We are seeing improvements in telemetry and are monitoring for full recovery.

monitoringMay 20, 05:52 PM

The degradation affecting Actions has been mitigated. We are monitoring to ensure stability.

investigatingMay 20, 05:46 PM

A subset of runners are taking longer than expected to connect, which may delay some jobs from beginning execution. We are actively working to mitigate the issue.

investigatingMay 20, 04:58 PM

We are investigating reports of degraded performance for Actions

criticalresolvedMay 15, 08:13 AM — Resolved May 15, 08:48 AM

Actions is experiencing degraded availability

7 updates
resolvedMay 15, 08:48 AM

On May 15, 2026, from approximately 07:43 UTC to 08:48 UTC, GitHub Actions experienced a degradation that caused workflow runs to fail or experience delayed starts for a subset of customers. The incident was triggered by a planned failover of supporting infrastructure used by GitHub Actions. During that operation, an automated service discovery update did not propagate correctly, which caused traffic to be routed incorrectly and increased request timeouts in a core dependency for workflow orchestration. At peak impact, 42% of Actions runs failed. Downstream services that depend on Actions workflow execution were also impacted, including GitHub Pages and Copilot cloud services. At 08:12 UTC, responders manually corrected the service discovery routing issue. Timeout and failure rates recovered shortly after, and we continued monitoring until full stabilization was confirmed across all affected services. The incident was marked resolved at 08:48 UTC. To prevent recurrence, we are implementing failover guardrails that validate service discovery state before completing failover operations, strengthening pre-flight and post-flight verification checks, and improving dependency resilience to reduce timeout cascades during infrastructure events.
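
A failover guardrail of the kind mentioned above can be pictured as a gate that refuses to finalize until service discovery resolves to the expected endpoints. The sketch below is an illustration under assumptions, not GitHub's tooling: `wait_for_discovery`, `failover`, and the `resolve`, `finalize`, and `rollback` callbacks are hypothetical names standing in for whatever the real system uses.

```python
import time
from typing import Callable


class DiscoveryMismatch(RuntimeError):
    """Raised when service discovery does not point at the expected endpoints."""


def wait_for_discovery(
    expected: set[str],
    resolve: Callable[[], set[str]],
    attempts: int = 5,
    delay_s: float = 0.0,
) -> None:
    """Poll `resolve` until it returns exactly `expected`, or raise."""
    seen: set[str] = set()
    for _ in range(attempts):
        seen = resolve()
        if seen == expected:
            return
        time.sleep(delay_s)
    raise DiscoveryMismatch(f"expected {sorted(expected)}, got {sorted(seen)}")


def failover(
    expected: set[str],
    resolve: Callable[[], set[str]],
    finalize: Callable[[], None],
    rollback: Callable[[], None],
) -> bool:
    """Finalize the failover only after discovery state has been validated."""
    try:
        wait_for_discovery(expected, resolve)
    except DiscoveryMismatch:
        rollback()
        return False
    finalize()
    return True
```

The design choice here is that a discovery update that fails to propagate turns into an explicit rollback instead of misrouted traffic.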

monitoringMay 15, 08:41 AM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingMay 15, 08:29 AM

We are monitoring an issue that was affecting GitHub Actions and causing downstream issues in GitHub Coding Agent and GitHub Code Review Agent. The issue has resolved now but we are closely monitoring our systems for full recovery.

investigatingMay 15, 08:27 AM

The degradation affecting Pages has been mitigated. We are monitoring to ensure stability.

investigatingMay 15, 08:26 AM

The degradation affecting Actions has been mitigated. We are monitoring to ensure stability.

investigatingMay 15, 08:14 AM

Pages is experiencing degraded availability. We are continuing to investigate.

investigatingMay 15, 08:13 AM

We are investigating reports of degraded availability for Actions

noneresolvedMay 15, 01:30 AM — Resolved May 15, 02:30 AM

[Retroactive] Incident with GitHub.com

1 update
resolvedMay 15, 03:57 AM

Beginning at 02:49 UTC on May 15, 2026 and lasting until 03:04 UTC, GitHub.com was unavailable for a subset of customers. This impact has been mitigated and normal service resumed. The issue was rooted in a sudden spike in traffic, with intermittent impact. We've identified the source of the traffic and prevented further disruption.

minorresolvedMay 13, 02:41 PM — Resolved May 13, 04:03 PM

Incident with CodeQL

6 updates
resolvedMay 13, 04:03 PM

On May 13, 2026, between 14:31 and 16:03 UTC, the Code Scanning service experienced processing delays and 12% of check runs took over 15 minutes to complete. The delays were caused by replication lag due to an internal database migration, resulting in insufficient worker capacity for our high rate of job enqueues. We mitigated the impact by scaling our processing workers by 34%. Code Scanning results returned to normal processing times after the mitigation was applied. The capacity increases are permanent, and we are looking into more ways to decrease the load on our workers to help prevent this in the future.

monitoringMay 13, 03:30 PM

CodeQL impact has been mitigated. We are continuing to monitor for durable recovery.

monitoringMay 13, 03:26 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingMay 13, 02:58 PM

We have applied a mitigation to increase processing capacity. We are continuing to monitor to confirm full recovery. We will provide another update by 15:30 UTC.

investigatingMay 13, 02:43 PM

We are investigating delays affecting CodeQL, the code analysis engine used by Code Scanning. Some users may experience delayed or incomplete code scanning results. Our engineering team is investigating. We will provide another update by 15:15 UTC.

investigatingMay 13, 02:41 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedMay 12, 02:38 PM — Resolved May 12, 05:43 PM

Incident with CodeQL, Webhooks, Notifications, and Slack Integration

10 updates
resolvedMay 12, 05:43 PM

On May 12, 2026, between 13:41 and 17:43 UTC, some services experienced delays in processing. For the Code Scanning service, 53% of check runs took over 15 minutes to complete. Additionally, notifications took an average of 22 minutes to be delivered and Slack integration webhooks took an average of 20 minutes to be delivered. The delays were caused by replication lag due to an internal database migration, resulting in insufficient worker capacity for our high rate of job enqueues. We mitigated the impact by scaling our processing workers to handle the increased load. All services returned to normal processing times after the mitigation was applied. We are working to create dedicated worker pools for some of our high usage shared queues to help prevent this in the future.

investigatingMay 12, 05:43 PM

All services have fully recovered.

investigatingMay 12, 04:59 PM

CodeQL has fully recovered. We're continuing to work on recovery for the remaining impacted services.

investigatingMay 12, 04:29 PM

Webhooks have fully recovered. Continuing to work on recovery for the other services.

investigatingMay 12, 04:28 PM

Webhooks is operating normally.

investigatingMay 12, 04:18 PM

We've established that most delays are related to a queuing service and are working to scale out. Early signals from the scale-out are showing signs of recovery for some services. We'll provide an update when services are fully recovered.

investigatingMay 12, 03:44 PM

Webhooks is experiencing degraded performance. We are continuing to investigate.

investigatingMay 12, 03:42 PM

We're continuing to investigate issues with CodeQL actions workflows. We're additionally seeing delays for notifications, webhooks, and the Slack integration.

investigatingMay 12, 03:13 PM

CodeQL actions are currently experiencing delays, which may result in those actions being stuck in a pending state or having failed due to a timeout.

investigatingMay 12, 02:38 PM

We are investigating reports of degraded performance for CodeQL

minorresolvedMay 11, 02:25 PM — Resolved May 11, 02:33 PM

Incident with high errors on Git Operations

2 updates
resolvedMay 11, 02:33 PM

On May 11th, 2026, between 14:00 UTC and 14:33 UTC, HTTP-based Git read operations were degraded. On average, the error rate was 2.8% and peaked at 7.5% of requests to the service. This was due to resource exhaustion in a networking gateway between GitHub.com’s frontend service for Git operations and a dependency service that performs authentication and authorization. Following the initial spike, the frontend service became stuck in a degraded state in one of our data centers, increasing time to mitigation. We mitigated the incident by scaling the networking gateway and re-deploying the frontend service. To reduce our time to detection and mitigation in the future, we are adding auto-scaling to the networking gateway, and resolving a bug which caused the frontend service to remain degraded.

investigatingMay 11, 02:25 PM

We are investigating reports of degraded performance for Git Operations

criticalresolvedMay 7, 05:02 AM — Resolved May 7, 06:56 AM

CCR and CCA failing to start for PR comments

4 updates
resolvedMay 7, 06:56 AM

On May 7, 2026, between 04:12 UTC and 06:13 UTC, Copilot Cloud Agent and Copilot Code Review Agent sessions for pull requests were delayed or failed to start. The issue was caused by follow-up recovery work from a separate Pull Requests incident (https://www.githubstatus.com/incidents/f5pb5d5mr9yh). As part of that recovery, we ran a large database migration, which caused replication delays on several replica hosts. Although those replicas were not serving user traffic, our safeguards correctly treated the elevated replication lag as a signal to slow down writes to the affected database cluster. As a result, some pull request background processing was temporarily delayed. That processing is responsible for sending the internal events that Copilot agents use to begin work, so affected agents did not start until the database replicas caught up. The system recovered once replication lag returned to normal and pull request processing resumed. We are reviewing how this safeguard interacts with recovery migrations so we can reduce the chance of similar secondary impact during future incident recovery work.
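
The safeguard described above, slowing writes when replication lag rises, can be sketched as a function that turns the worst observed replica lag into a pause for background writers. This is a simplified illustration, not GitHub's implementation: `write_delay_s` and all threshold values are assumptions. It takes the maximum lag across every replica, which mirrors the behavior in the postmortem where replicas that served no user traffic still slowed writes.

```python
def write_delay_s(
    replica_lags_s: list[float],
    soft_limit_s: float = 1.0,   # assumed: lag below this is ignored
    hard_limit_s: float = 10.0,  # assumed: lag at or above this gets the full pause
    max_delay_s: float = 5.0,    # assumed: longest pause between write batches
) -> float:
    """Return how long a background writer should pause before its next batch.

    Below the soft limit writes run at full speed, between the limits the
    pause grows linearly, and at or above the hard limit it is capped.
    """
    worst = max(replica_lags_s, default=0.0)
    if worst <= soft_limit_s:
        return 0.0
    if worst >= hard_limit_s:
        return max_delay_s
    return max_delay_s * (worst - soft_limit_s) / (hard_limit_s - soft_limit_s)
```

One way to reduce secondary impact of the kind seen here would be to pass in only the lag of replicas that matter for the workload, but whether that is appropriate depends on why the safeguard exists in the first place.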

monitoringMay 7, 06:14 AM

Copilot code review and cloud agents are starting again for pull requests. We are monitoring for full recovery.

monitoringMay 7, 06:13 AM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingMay 7, 05:02 AM

We are investigating reports of impacted performance for some GitHub services.

criticalresolvedMay 6, 03:25 PM — Resolved May 6, 07:04 PM

Incident with Pull Requests

8 updates
resolvedMay 6, 07:04 PM

On May 6, 2026, between 15:12 and 19:02 UTC, creation of new pull request review threads on GitHub.com failed. This included new line comments and file comments on pull requests. Existing PRs and previously created comments were unaffected. This incident was caused by a 32-bit integer key reaching its maximum value in a Vitess lookup table used during PR thread creation. The primary table had been migrated to a 64-bit integer key, but the Vitess lookup table remained 32-bit. Once the values in the primary table passed the available 32-bit ID space in the lookup table, attempts to create new review threads began failing, resulting in a near 100% failure rate for new thread creation requests. We mitigated the issue by updating the impacted lookup table definitions across all shards to use 64-bit integer column types, increasing the available ID range and restoring normal operation. Service was fully restored once the schema changes completed globally. To help prevent similar incidents, we are expanding existing monitoring of database columns to include Vitess lookup tables, so that we are notified in advance of any table that is approaching a column size limit. This work is intended to provide earlier detection of columns approaching size limits before customer impact occurs.
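
The kind of monitoring described above comes down to comparing each key column's current maximum value with the largest value its integer type can hold. The sketch below assumes MySQL-style integer types and an 80% alert threshold. The postmortem does not say whether the affected key was signed or unsigned, and the table names in the usage example are hypothetical.

```python
# Largest value each MySQL-style integer column type can store.
INT_LIMITS = {
    "int": 2**31 - 1,              # signed 32-bit: 2,147,483,647
    "int unsigned": 2**32 - 1,     # unsigned 32-bit: 4,294,967,295
    "bigint": 2**63 - 1,           # signed 64-bit
    "bigint unsigned": 2**64 - 1,  # unsigned 64-bit
}


def id_space_used(column_type: str, current_max_id: int) -> float:
    """Fraction of the column's ID space already consumed."""
    return current_max_id / INT_LIMITS[column_type]


def columns_near_limit(
    columns: dict[str, tuple[str, int]], threshold: float = 0.8
) -> list[str]:
    """Names of columns whose max ID has reached `threshold` of the type limit."""
    return sorted(
        name
        for name, (column_type, max_id) in columns.items()
        if id_space_used(column_type, max_id) >= threshold
    )


# Hypothetical inventory: the primary table was widened, the lookup table was not.
inventory = {
    "review_threads.id": ("bigint", 2_100_000_000),
    "review_threads_lookup.id": ("int", 2_100_000_000),
}
print(columns_near_limit(inventory))  # ['review_threads_lookup.id']
```

The point of the example is that the same ID value is harmless in the 64-bit column and nearly exhausted in the 32-bit one, so every table that stores the key has to be included in the check.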

investigatingMay 6, 07:04 PM

Mitigations have been fully applied and we are seeing full recovery of functionality on Pull Request threads. We are continuing to monitor to ensure sustained recovery.

investigatingMay 6, 05:52 PM

Creation of new Pull Request threads (including line and file comments) continues to be affected, although we are seeing partial recovery. A mitigation is being applied to continue to accelerate recovery, with complete recovery expected by 8:00pm UTC. Top-level comments on pull requests still function and should remain usable during recovery. Opening and merging pull requests, actions, and other pull request operations remain functional.

investigatingMay 6, 04:20 PM

Creation of new Pull Request threads (including line and file comments) continues to be affected. Top-level comments on pull requests still function and should remain usable during recovery. Opening and merging pull requests, actions, and other pull request operations remain functional. A mitigation is being applied. Recovery is expected to be gradual, with complete recovery expected by 8:00pm UTC.

investigatingMay 6, 04:07 PM

Pull Requests is experiencing degraded availability. We are continuing to investigate.

investigatingMay 6, 03:55 PM

Creation of new Pull Request threads (including line and file comments) continues to be affected. We have identified the cause of the issue and have started taking steps to mitigate this issue.

investigatingMay 6, 03:28 PM

We are investigating failures for new thread creation on Pull Requests. Responses to existing pull request threads are unaffected.

investigatingMay 6, 03:25 PM

We are investigating reports of degraded performance for Pull Requests

criticalresolvedMay 6, 11:21 AM — Resolved May 6, 11:59 AM

Disruption with some GitHub services

4 updates
resolvedMay 6, 11:59 AM

On May 6, 2026 between 11:02 UTC and 11:13 UTC, users were unable to start or view Copilot Cloud Agent or remote sessions. During this time, requests to the session API returned errors, preventing users from creating new sessions or viewing existing ones. The issue was caused by a configuration change to the service's network routing that inadvertently removed the ingress path for the service. The team reverted the change at 11:13 UTC which restored service. The incident remained open until 11:59 UTC while the team verified full recovery. We are taking steps to improve our deployment validation process to prevent similar configuration changes from impacting production traffic in the future.

investigatingMay 6, 11:59 AM

We have applied a mitigation and Copilot services have recovered.

investigatingMay 6, 11:25 AM

We are investigating issues with the ability to start Copilot Cloud Agent sessions and view them.

investigatingMay 6, 11:21 AM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedMay 6, 07:19 AM — Resolved May 6, 09:44 AM

Incident with Actions, we are investigating reports of degraded availability

6 updates
resolvedMay 6, 09:44 AM

On May 6, 2026, from approximately 06:45 UTC to 09:15 UTC, GitHub Actions Standard Ubuntu hosted runners were degraded. 17.1% of jobs requesting a standard runner failed. This was caused by an unexpected data shape in the allocation configuration data for standard runners. That data was introduced as part of post-incident remediation work for an incident the previous day and caused new allocations to be blocked as load ramped up for the day. Removing that data at 08:51 allowed allocations to proceed and hosted runner pools to scale up and recover. We are updating the filter logic for this allocation data to be resilient to abnormal data shapes and improving monitoring to alert when allocations are blocked, allowing the team to respond before customer impact starts.
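
Filter logic that is "resilient to abnormal data shapes" generally means that one malformed entry is set aside and reported rather than blocking every allocation. The sketch below shows that pattern under stated assumptions: the entry format with `pool` and `count` fields is invented for illustration, since the postmortem does not describe the real allocation data.

```python
from typing import Any


def valid_allocations(entries: list[Any]) -> tuple[list[dict], list[Any]]:
    """Split entries into well-formed allocations and rejected ones.

    Assumed shape of a well-formed entry: {"pool": str, "count": int >= 0}.
    """
    accepted: list[dict] = []
    rejected: list[Any] = []
    for entry in entries:
        count = entry.get("count") if isinstance(entry, dict) else None
        if (
            isinstance(entry, dict)
            and isinstance(entry.get("pool"), str)
            and isinstance(count, int)
            and not isinstance(count, bool)  # bool is a subclass of int
            and count >= 0
        ):
            accepted.append(entry)
        else:
            rejected.append(entry)  # surfaced for alerting, never fatal
    return accepted, rejected
```

Returning the rejected entries alongside the accepted ones gives monitoring something to alert on, which matches the second remediation item in the postmortem.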

monitoringMay 6, 09:44 AM

Actions wait times have fully recovered.

monitoringMay 6, 09:19 AM

The degradation affecting Actions has been mitigated. We are monitoring to ensure stability.

investigatingMay 6, 09:08 AM

We've applied a mitigation to fix the issues with queuing and running Actions jobs. We are seeing improvements in telemetry and are monitoring for full recovery.

investigatingMay 6, 08:00 AM

Actions is experiencing issues with ubuntu standard hosted runners leading to high wait times. We are actively investigating the issue.

investigatingMay 6, 07:19 AM

We are investigating reports of degraded availability for Actions

minorresolvedMay 5, 04:49 PM — Resolved May 5, 06:35 PM

Increased Latency and Failures for SSH Git Operations

7 updates
resolvedMay 5, 06:35 PM

Between approximately 14:00 and 16:10 UTC on May 5, 2026, SSH-based Git operations experienced elevated latency and intermittent failures. On average, the error rate was 0.46% and peaked at 0.6% of SSH write requests. HTTP-based Git operations, including web UI and HTTPS clones, were not affected. The impact was caused by reduced SSH capacity at one of our data center sites. During a period of high traffic, the remaining hosts became overloaded, leading to connection exhaustion and some failures for SSH-based operations. Additional capacity was provisioned to expand SSH capacity and resolve the incident. The expanded capacity was fully online by 18:18 UTC. To reduce the likelihood of similar incidents, we will implement faster scaling solutions for SSH infrastructure and improved alerting for host availability and capacity thresholds.

monitoringMay 5, 06:35 PM

We've completed our mitigation to prevent further impact. At this time the incident is considered resolved.

monitoringMay 5, 06:25 PM

The degradation affecting Git Operations has been mitigated. We are monitoring to ensure stability.

investigatingMay 5, 05:26 PM

We're continuing to work on preventing further impact from the earlier issue. No SSH-based impact is expected at this time. We'll post new updates if impact recurs or once our mitigation is in place.

investigatingMay 5, 05:23 PM

Git Operations is experiencing degraded performance. We are continuing to investigate.

monitoringMay 5, 04:54 PM

Between approximately 14:00 and 16:10 UTC, customers using SSH-based Git operations may have experienced elevated latency and failures. HTTP-based operations were not impacted. We've identified a suspected root cause and are working to implement a mitigation to prevent further impact.

investigatingMay 5, 04:49 PM

We are investigating reports of impacted performance for some GitHub services.

criticalresolvedMay 5, 01:37 PM — Resolved May 5, 05:26 PM

Incident with Actions

9 updates
resolvedMay 5, 05:26 PM

On May 5, 2026, from approximately 13:22 UTC to 17:05 UTC, GitHub Actions hosted runners in the East US region were degraded. 13.5% of jobs requesting a standard runner failed and ~16% of requested Larger Runners with private networking pinned to East US failed or were delayed by more than 5 minutes. Copilot Code Review requests were also impacted. Approximately 8,500 code review requests timed out during this window. Affected users saw an error comment on their pull requests and were able to retry by re-requesting a review. Most runner requests were picked up by other regions automatically, but a portion of requests still routing to East US were impacted. This was triggered by a scale-up operation for hosted runner VMs in the East US region. This is a regular operation, but the VM create load hit an internal rate limit when VM creates pull images from storage. Existing backoff logic was not triggered because of the response code returned in this case. The rate limiting and VM creation failures were mitigated by reducing load to allow for recovery and allowing queued work to be processed. By 15:34 UTC, queued and failed job assignments were mostly mitigated, with less than 0.5% of runner assignments impacted between 15:34 and full recovery at 17:05. We are improving our system’s throttling behavior when limits occur, improving our controls to more quickly mitigate similar situations in the future, and reviewing all limits end-to-end for similar operations. We also immediately paused all scale and similar operations until these changes are in place and validated.
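
The detail that backoff "was not triggered because of the response code returned" points at a common gap: retry logic keyed to one status code misses throttling that arrives under another. The sketch below shows exponential backoff with full jitter driven by a configurable set of codes. It is an illustration only. `THROTTLE_CODES`, `call_with_backoff`, and the chosen codes and timings are assumptions, and the postmortem does not say which code the storage layer actually returned.

```python
import random
import time
from typing import Callable

# Assumption: which status codes signal throttling is service-specific.
THROTTLE_CODES = frozenset({429, 503})


def call_with_backoff(
    request: Callable[[], int],
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
    max_attempts: int = 5,
    base_s: float = 0.5,
    cap_s: float = 30.0,
) -> int:
    """Call `request` until it returns a non-throttled status or attempts run out."""
    status = request()
    for attempt in range(1, max_attempts):
        if status not in THROTTLE_CODES:
            return status
        # Full jitter: sleep a random amount up to the exponential ceiling.
        ceiling = min(cap_s, base_s * 2 ** (attempt - 1))
        sleep(rng() * ceiling)
        status = request()
    return status
```

A throttling response whose code is missing from `THROTTLE_CODES` is returned immediately with no pause, which is the shape of the gap described in the postmortem.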

investigatingMay 5, 05:11 PM

Actions is experiencing degraded performance. We are continuing to investigate.

investigatingMay 5, 05:11 PM

Standard hosted runners have now reached full recovery. Hosted Runners with Private Networking in the East US region remain degraded as we continue working with our compute provider to restore capacity. Hosted Runners with private networking can fail over to a different Region to mitigate the issue.

investigatingMay 5, 04:33 PM

We've seen signs of recovery for Standard Hosted Runners and are continuing to monitor for full recovery. Hosted Runners with Private Networking in the East US region remain affected as we continue working with our compute provider to restore capacity.

investigatingMay 5, 03:54 PM

We've applied a mitigation for long queue times and failures on Standard Hosted Runners and are monitoring for full recovery. Hosted Runners with Private Networking in the East US region remain affected as we continue working with our compute provider to restore capacity.

investigatingMay 5, 03:12 PM

We are working with our compute provider to alleviate elevated queue times and failures for Actions Jobs running on Hosted Runners in the East US region affecting 10% of runs. Hosted Runners with private networking can fail over to a different Region to mitigate the issue.

investigatingMay 5, 02:14 PM

We are investigating elevated queue times and failures on Actions Jobs running on Hosted Runners in East US affecting 8% of runs. Hosted Runners with private networking can fail over to a different Azure region to mitigate the issue.

investigatingMay 5, 01:48 PM

We are investigating elevated queue times on Actions Jobs running on Standard Hosted Runners in East US affecting 10% of runs

investigatingMay 5, 01:37 PM

We are investigating reports of degraded availability for Actions

criticalresolvedMay 4, 03:45 PM — Resolved May 4, 04:40 PM

Incident with Issues and Webhooks

19 updates
resolvedMay 4, 04:40 PM

On 2026-05-04 at 3:37:17 PM UTC we detected increased latency on issues resulting in timeouts, and elevated 500 errors on webhooks. A scheduled workload drove high utilization on the primary host of a critical datastore, saturating the connection pool. We paused the job to mitigate the problem at 4:40:05 PM UTC and have implemented measures to prevent recurrence.

monitoringMay 4, 04:36 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingMay 4, 04:35 PM

Webhooks is operating normally.

investigatingMay 4, 04:35 PM

The degradation affecting Codespaces has been mitigated. We are monitoring to ensure stability.

investigatingMay 4, 04:34 PM

The degradation affecting Issues has been mitigated. We are monitoring to ensure stability.

investigatingMay 4, 04:32 PM

Pull Requests is operating normally.

investigatingMay 4, 04:29 PM

Pages is operating normally.

investigatingMay 4, 04:29 PM

Latency across services has normalized. We are continuing to investigate the root cause and prevent reoccurrence.

investigatingMay 4, 04:28 PM

Actions and Packages are operating normally.

investigatingMay 4, 04:25 PM

Git Operations is operating normally.

investigatingMay 4, 04:06 PM

Pages is experiencing degraded performance. We are continuing to investigate.

investigatingMay 4, 04:05 PM

Codespaces is experiencing degraded performance. We are continuing to investigate.

investigatingMay 4, 03:56 PM

Pull Requests is experiencing degraded performance. We are continuing to investigate.

investigatingMay 4, 03:51 PM

Actions is experiencing degraded performance. We are continuing to investigate.

investigatingMay 4, 03:51 PM

Pull Requests is experiencing degraded availability. We are continuing to investigate.

investigatingMay 4, 03:50 PM

Packages is experiencing degraded performance. We are continuing to investigate.

investigatingMay 4, 03:48 PM

Git Operations is experiencing degraded performance. We are continuing to investigate.

investigatingMay 4, 03:48 PM

We are investigating increased latency and timeouts across multiple GitHub services.

investigatingMay 4, 03:45 PM

We are investigating reports of degraded performance for Issues and Webhooks

April 2026 (27 incidents)

minorresolvedApr 28, 02:17 PM — Resolved May 1, 04:15 AM

Incomplete pull request results in repositories

10 updates
resolvedMay 1, 04:15 AM

On April 28, 2026, at approximately 14:07 UTC, GitHub received reports that pull requests were missing from search results across global and repository /pulls pages. The issue was caused by a manually invoked repair job intended for a single repository, which was executed without the required safety flags. During execution of the repair job, the database query remained correctly scoped to the repo’s PR IDs. However, the Elasticsearch reconciliation logic did not apply the same scope. It interpreted the min and max PR IDs as a continuous range, causing unrelated PR documents across other repos to be marked for deletion. This resulted in the removal of 1,789,756,838 PR documents from the search index, approximately 49% of indexed PR documents. Customer impact was limited to PR search and list discoverability. Primary storage was unaffected, and there was no impact to opening, updating, or merging PRs. The issue was identified ~10 minutes after initial customer reports. Because it affected search index completeness rather than service availability, it was not caught by existing monitoring. The root cause was a flaw in the search document repair framework: it allowed a scoped reconciliation to run without enforcing a matching Elasticsearch query scope. This created a destructive mismatch between the source-of-truth and the index. The issue was compounded by the ability to trigger the job from the production console without safety defaults. Prior testing focused only on safe backfill scenarios and did not cover this reconciliation path. Additionally, there was no automated detection for large-volume deletions in Elasticsearch.
We mitigated the incident through three parallel actions: (1) deployed a MySQL-backed search fallback for the most active repos by traffic to restore PR visibility for highly impacted users; (2) initiated a snapshot restore and reindex process to repopulate missing pull request documents in Elasticsearch; (3) added a degradation notice on PR pages to inform users of incomplete search results while recovery was in progress. The incident was resolved on May 1, 2026 at 04:15 UTC, following completion and validation of the reindex process. To prevent recurrence, we are prioritizing improvements to the repair framework and safeguards. These include enforcing scoped query alignment between primary storage and Elasticsearch, preventing destructive operations without explicit opt-in, strengthening guardrails for manual repair jobs, and evaluating restrictions on production console access. In parallel, we are expanding automated test coverage for reconciliation safety invariants and introducing detection for anomalous deletion patterns in Elasticsearch so similar issues can be identified or blocked earlier. We are committed to improving the safety and reliability of our repair systems and ensuring that operational workflows are resilient to both software defects and manual invocation risks.
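
The scope mismatch at the heart of this incident can be reproduced in a few lines. In the toy model below, the index maps PR document IDs to the repository that owns them, and a reconciliation step decides which index documents are "stale" and should be deleted. The function names, the in-memory dictionary, and the sample IDs are all hypothetical. This is a simplified model of the described bug, not GitHub's repair framework.

```python
def stale_ids_by_range(index_docs: dict[int, str], repo_pr_ids: set[int]) -> set[int]:
    """Buggy variant: treats min..max of the repo's PR IDs as one continuous range.

    Every indexed document inside that range that is not one of the repo's
    PRs gets marked for deletion, including documents owned by other repos.
    """
    lo, hi = min(repo_pr_ids), max(repo_pr_ids)
    in_range = {doc_id for doc_id in index_docs if lo <= doc_id <= hi}
    return in_range - repo_pr_ids


def stale_ids_scoped(
    index_docs: dict[int, str], repo: str, repo_pr_ids: set[int]
) -> set[int]:
    """Scoped variant: only documents owned by `repo` are candidates at all."""
    owned = {doc_id for doc_id, owner in index_docs.items() if owner == repo}
    return owned - repo_pr_ids


# PR IDs are allocated globally, so one repo's IDs interleave with others'.
index = {101: "repo-a", 102: "repo-b", 103: "repo-b", 104: "repo-a", 105: "repo-c"}
database_ids_for_a = {101, 104}

print(sorted(stale_ids_by_range(index, database_ids_for_a)))          # [102, 103]
print(sorted(stale_ids_scoped(index, "repo-a", database_ids_for_a)))  # []
```

Because IDs from many repositories fall between one repository's lowest and highest PR ID, the range interpretation deletes other repositories' documents, while the scoped comparison leaves them alone.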

investigatingMay 1, 04:11 AM

This incident has been resolved. Search and indexing functionality for pull requests are now fully restored. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

investigatingApr 30, 03:49 AM

We have repaired the missing search records for affected Pull Requests and are working to identify and repair records left in a stale state after the recovery.

investigatingApr 29, 10:22 PM

We have restored search/indexing functionality for over 99% of impacted pull requests. We are continuing to address the remaining affected pull requests and are reviewing outstanding gaps as part of the restoration process.

investigatingApr 29, 12:40 AM

Mitigation is in progress, with full recovery of impacted pull request listings expected within approximately 24 hours.

investigatingApr 28, 10:46 PM

We have made an interim mitigation to improve availability for some impacted repositories while reindexing continues, and we are actively monitoring the indexing progress.

investigatingApr 28, 09:43 PM

Elasticsearch reindexing of pull requests is continuing. All data is preserved, but may not be available on pages relying on Elasticsearch until the reindex is complete. Pages and APIs that do not rely on Elasticsearch, including the GitHub CLI (gh pr list) and API (/repos/{owner}/{repo}/pulls), are not impacted and can be used to retrieve pull request data in the interim.
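As a rough illustration of the interim workaround mentioned in this update, the sketch below builds a request against the REST endpoint `/repos/{owner}/{repo}/pulls`. The helper names `pulls_url` and `list_pull_titles` are made up for this example, and the owner and repository values are placeholders.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def pulls_url(owner, repo, state="open", per_page=30):
    """Build the URL for listing a repository's pull requests."""
    path = f"/repos/{urllib.parse.quote(owner)}/{urllib.parse.quote(repo)}/pulls"
    query = urllib.parse.urlencode({"state": state, "per_page": per_page})
    return f"{API_ROOT}{path}?{query}"


def list_pull_titles(owner, repo, token=None):
    """Fetch the first page of open pull requests and return their titles."""
    request = urllib.request.Request(
        pulls_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        # A token is optional for public repositories.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return [pr["title"] for pr in json.load(response)]
```

From a terminal, `gh pr list --repo OWNER/REPO` returns the same kind of listing through the GitHub CLI.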

investigatingApr 28, 03:58 PM

We are actively reindexing the remaining Elasticsearch indexes. Our priority is ensuring correctness and avoiding further impact. We are taking a measured approach to safely backfill data and will share additional updates as progress continues.

investigatingApr 28, 02:51 PM

After yesterday’s incident, we are investigating cases where /pulls and /repo/pulls pages are not showing all indexed pull requests. This is because our Elasticsearch cluster does not currently contain all indexed documents. No pull request data has been lost. As pull requests are updated, they will be reindexed. We are also working on accelerating a full reindex so these pages return complete results again.

investigatingApr 28, 02:17 PM

We are investigating reports of degraded performance for Pull Requests

minorresolvedApr 28, 01:59 PM — Resolved Apr 28, 05:09 PM

Disruption with some GitHub services

9 updates
resolvedApr 28, 05:09 PM

On April 28, 2026, from approximately 12:41 UTC to 17:09 UTC, GitHub Actions jobs using Standard Ubuntu 22 and Ubuntu 24 hosted runners experienced run start delays. Approximately 8% of hosted runner jobs using Ubuntu 22 and Ubuntu 24 experienced delays greater than 5 minutes or failures. Larger and self-hosted runners were not impacted. This was caused by a performance regression introduced in the VM reimage process. That reimage delay lowered the overall capacity of runners available to pick up new jobs. This was mitigated with a rollback to a known good image version. We are addressing the core issue with reimage performance and improving the granularity of reimage telemetry across our services and our compute provider to more quickly diagnose similar issues in the future. Finally, we are evaluating other rollout changes to automatically detect similar regressions.

monitoringApr 28, 05:08 PM

Actions is operating normally.

investigatingApr 28, 04:36 PM

Less than 1% of hosted ubuntu-latest runs are delayed. We’re working through remaining steps to restore runner capacity.

investigatingApr 28, 03:41 PM

Currently less than 2% of hosted ubuntu-latest and ubuntu-24.04 runs are delayed or failing. We are continuing to monitor for full recovery.

investigatingApr 28, 03:20 PM

We've applied a mitigation to unblock running Actions. We're continuing to monitor.

investigatingApr 28, 02:49 PM

We're still investigating the root cause of run start delays and failures for Actions hosted Ubuntu jobs. Around 5% of jobs are currently impacted.

investigatingApr 28, 02:02 PM

Actions is experiencing degraded performance. We are continuing to investigate.

monitoringApr 28, 01:59 PM

Actions is experiencing capacity constraints with hosted ubuntu-latest and ubuntu-24.04, leading to high wait times. Other hosted labels and self-hosted runners are not impacted.

investigatingApr 28, 01:59 PM

We are investigating reports of impacted performance for some GitHub services.

criticalresolvedApr 27, 04:31 PM — Resolved Apr 27, 10:46 PM

GitHub search is degraded

15 updates
resolvedApr 27, 10:46 PM

On April 27, 2026 between 16:15 UTC and 22:46 UTC, GitHub search services experienced degraded connectivity due to saturation of the load balancing tier deployed in front of our search infrastructure. This resulted in intermittent failures for services relying on our search data including Issues, Pull Requests, Projects, Repositories, Actions, Package Registry and Dependabot Alerts. The impact varied by search target, with services seeing up to 65% of searches timing out or returning an error between 16:15 UTC and 18:00 UTC. We detected the drop in search results through our ongoing monitoring and declared an incident at 16:21 UTC when we determined the issues would not self-heal. We tracked the incident as mitigated as of 21:33 UTC and monitored the systems until 22:46 UTC when we declared the incident resolved. The saturation was caused by a large influx of anonymous distributed scraping traffic that was crafted to avoid our public API rate limits. This scraping traffic made up 30% of the day’s total search traffic, but it was concentrated within a four-hour period. The traffic originated from over 600,000 unique IP addresses, with matching actor information across the board. Our existing monitoring did not classify the increased scraping as a risk and this dimension of the incident was only discovered while working to mitigate. To mitigate, we immediately focused on relieving pressure from the load balancers while simultaneously working on scaling the load balancing tier, blocking the anomalous traffic and applying tuning to the balancers to fully resolve the incident. Looking ahead, we’ve not only scaled the load balancer tier, but also applied optimizations to improve our connection handling and re-use to reduce the possibility that a saturation event like this can recur. We’ve also added new monitors and controls within the platform to allow us to restrict anonymous traffic to mitigate the impact to our registered users.
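To get a feel for what "30% of the day's total search traffic within a four-hour period" means for hourly load, the back-of-the-envelope calculation below may help. It assumes, purely for illustration, that the remaining 70% of traffic was spread evenly over 24 hours and that the scraping was spread evenly over its four hours; the real traffic shape is not given in the summary.

```python
def scraping_rate_multiplier(scrape_share, window_hours, day_hours=24.0):
    """Scraping's hourly rate expressed as a multiple of the baseline hourly rate.

    Assumes uniform baseline traffic across the day and uniform scraping
    traffic across the window (a simplification).
    """
    scrape_rate = scrape_share / window_hours
    baseline_rate = (1.0 - scrape_share) / day_hours
    return scrape_rate / baseline_rate


extra = scraping_rate_multiplier(0.30, 4.0)
print(f"scraping ~{extra:.2f}x baseline, total ~{1 + extra:.2f}x baseline")
# prints: scraping ~2.57x baseline, total ~3.57x baseline
```

Under those assumptions the load balancing tier would have seen roughly three and a half times its usual hourly search traffic during the window.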

monitoringApr 27, 10:44 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 27, 10:35 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 27, 09:33 PM

The degradation affecting Actions, Issues, Packages and Pull Requests has been mitigated. We are monitoring to ensure stability.

investigatingApr 27, 09:32 PM

We've applied a mitigation and are continuing to monitor.

investigatingApr 27, 08:06 PM

Pull Requests is experiencing degraded performance. We are continuing to investigate.

investigatingApr 27, 07:50 PM

We have identified the source of the additional load causing stress on our Elasticsearch clusters. We have disabled the source of that load and are seeing signs of recovery.

investigatingApr 27, 06:19 PM

Pull Requests is experiencing degraded availability. We are continuing to investigate.

investigatingApr 27, 06:17 PM

We're continuing to see connectivity issues reaching Elasticsearch. Impact on downstream services will be intermittent as we find the root cause.

investigatingApr 27, 05:35 PM

Users are experiencing intermittent failures to view issues, pull requests, projects and Actions workflow runs. We are still investigating and attempting mitigations. We will provide further updates.

investigatingApr 27, 04:53 PM

Pull Requests is experiencing degraded performance. We are continuing to investigate.

investigatingApr 27, 04:39 PM

Packages is experiencing degraded performance. We are continuing to investigate.

investigatingApr 27, 04:36 PM

Issues is experiencing degraded performance. We are continuing to investigate.

investigatingApr 27, 04:33 PM

Customers across GitHub are experiencing failures with searches. Examples include: workflow run failures, projects failing to load, and timed out search requests. This is due to an ongoing infrastructure issue that we have been investigating.

investigatingApr 27, 04:31 PM

We are investigating reports of degraded performance for Actions

minorresolvedApr 27, 04:48 PM — Resolved Apr 27, 07:02 PM

Disruption with some GitHub services

5 updates
resolvedApr 27, 07:02 PM

On April 22, 2026 from 18:49 to 19:32 UTC, the Copilot Cloud Agent service began failing during session execution for users running the Agent HQ Codex agent. Codex agent sessions failed to start for all entry points (issue assignment, @copilot comment mentions). 0.5% of total Copilot Cloud Agent jobs were impacted (~2,000 failed jobs). Copilot and other agent sessions were unaffected. This was caused by a model resolution mismatch in Codex agent sessions, resulting in an incompatible model being used at runtime. A mitigation was deployed to select a stable default model for Codex agent sessions. We are working to harden the underlying model-resolution path so it correctly scopes to the requesting agent's supported models to prevent similar failure modes in the future.

monitoringApr 27, 07:01 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 27, 05:26 PM

We've found the issue and are working on deploying a solution to get Codex agent runs working again.

investigatingApr 27, 05:01 PM

Copilot Cloud Agent (CCA) jobs using the Codex agent are failing after starting. To avoid this issue, please choose a different agent. We are investigating the cause and working towards remediation.

investigatingApr 27, 04:48 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 24, 07:02 PM — Resolved Apr 25, 12:36 AM

Delays with Actions Jobs for Larger Runners using VNet Injection in the East US region

4 updates
resolvedApr 25, 12:36 AM

On April 24, 2026, from approximately 11:39 UTC to April 25, 2026 at 00:15 UTC, GitHub Actions experienced delays and timeouts for Larger Hosted Runner jobs using VNet injection in the East US region without a failover region configured. Standard and Self-hosted runners were not impacted. This was caused by backend failures in our compute provider’s provisioning, scaling, and update operations for VMs in the East US region and mitigated by a rollback across all affected Availability Zones. More detail is available at https://azure.status.microsoft/en-us/status/history/?trackingId=5GP8-W0G. We are working to improve the reliability of our annotations for jobs impacted by regional issues and are adding system log notifications as an additional customer communication channel alongside annotations. VNet Failover is also now in public preview, allowing customers to evacuate Larger Hosted Runners using VNet injection in cases like this.

investigatingApr 24, 07:14 PM

This is related to the public impact, "Multiservice impact for Azure Workloads in East US" shared at https://azure.status.microsoft/

investigatingApr 24, 07:09 PM

We are investigating reports of degraded performance for Larger Runners with vnet injection in East US and we are working with our service provider on mitigation.

investigatingApr 24, 07:02 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 23, 07:50 PM — Resolved Apr 23, 09:43 PM

Incident with Pull Requests

5 updates
resolvedApr 23, 09:43 PM

On April 23, 2026, between 16:05 UTC and 20:43 UTC, the Pull Requests service experienced a regression affecting merge queue operations. PRs merged via merge queue using the squash merge method produced incorrect merge commits when the merge group contained more than one PR. In affected cases, changes from previously merged PRs and prior commits were inadvertently reverted by subsequent merges. During the impact window 2,092 pull requests were affected. The issue did not affect pull requests merged outside of merge queue, nor merge queue groups using the merge or rebase methods. It took approximately 3 hours and 33 minutes to identify the issue. The change completed deployment at approximately 16:05 UTC, and we became aware at 19:38 UTC following an increase in customer support inquiries. Because the issue affected merge commit correctness rather than availability, it was not detected by existing automated monitoring and was identified through customer reports. The regression was introduced by a new code path that adjusted merge base computation for merge queue ref updates. This code path was intended to be gated behind a feature flag for an unreleased feature, but the gating was incomplete. As a result, the new behavior was inadvertently applied to squash merge groups, producing an incorrect three-way merge. This caused subsequent squash merges to revert changes from earlier pull requests and, in some cases, changes between their starting points. We mitigated the incident by reverting the code change and force-deploying the fix across all environments. After resolution, we identified affected repositories and sent targeted remediation instructions to repository administrators with step-by-step recovery guidance. The regression was not identified during internal validation. Existing test coverage primarily exercised single-PR merge queue groups, which did not exhibit the faulty base-reference calculation. Because automated checks did not validate merge correctness for multi-PR squash groups, the defect surfaced only in production. To prevent recurrence, GitHub is expanding test coverage for merge correctness validation. We are broadening automated coverage for merge queue operations, including regression checks that validate resulting Git contents across supported configurations, so issues affecting merge correctness are caught before reaching production. We are committed to ensuring the correctness and reliability of merge queue operations. These actions will reduce the risk of similar regressions and improve confidence in future changes to the Pull Requests service.
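The summary attributes the reverts to an incorrect merge base in a three-way merge. The toy model below shows one way a wrong base can have that effect; it is not GitHub's merge queue code. Files are modeled as a dictionary from path to content, and `three_way_merge` and the sample snapshots are hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (path -> content) relative to a common base."""
    merged = {}
    for path in base.keys() | ours.keys() | theirs.keys():
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == b:
            result = t  # only "theirs" changed this path (or nobody did)
        elif t == b or o == t:
            result = o  # only "ours" changed it, or both made the same change
        else:
            raise ValueError(f"conflict in {path}")
        if result is not None:  # None means the path was deleted
            merged[path] = result
    return merged


main = {"a.txt": "v1", "b.txt": "v1"}
pr1_head = {"a.txt": "v2", "b.txt": "v1"}  # PR 1 edits a.txt
pr2_head = {"a.txt": "v1", "b.txt": "v2"}  # PR 2 edits b.txt, branched from main

after_pr1 = three_way_merge(main, main, pr1_head)

# Correct base: the commit PR 2 actually branched from.
correct = three_way_merge(main, after_pr1, pr2_head)
# Wrong base: the tip that already contains PR 1. PR 2's old a.txt now
# looks like a deliberate change, so PR 1's edit is reverted.
wrong = three_way_merge(after_pr1, after_pr1, pr2_head)

print(correct)
print(wrong)
```

With a single-PR group there is no earlier change to lose, which is consistent with the summary's note that single-PR tests did not exhibit the problem. A regression check of the kind described would compare the resulting contents (here `correct` versus `wrong`) rather than only checking that the merge completed.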

investigatingApr 23, 09:18 PM

We have resolved a regression present when using merge queue with either squash merges or rebases. If you use merge queue in this configuration, some pull requests may have been merged incorrectly between 2026-04-23 16:05-20:43 UTC. This behavior is still present in GitHub Enterprise Cloud with Data Residency, and we are rolling out the same fix.

investigatingApr 23, 08:47 PM

Pull Requests is operating normally.

investigatingApr 23, 07:58 PM

We have identified a regression in merge queue behavior present when squash merging or rebasing. We have identified the root-cause and are in the process of reverting the change.

investigatingApr 23, 07:50 PM

We are investigating reports of degraded performance for Pull Requests

minorresolvedApr 23, 07:28 PM — Resolved Apr 23, 07:42 PM

Disruption with users unable to start Claude and Codex agent task from the web

3 updates
resolvedApr 23, 07:42 PM

Between 18:45 and 19:42 UTC on April 23, users were unable to start new agent tasks using either Claude or Codex agent on github.com. This was caused by a code change to how Copilot mission control routes task creation requests. Ongoing agent tasks and other Copilot agent features were not affected. We mitigated the impact by reverting the breaking change. We are adding extra monitoring and integration test coverage for the task creation path to prevent future recurrence.

investigatingApr 23, 07:33 PM

We have identified the root cause of the issue and are working on mitigation.

investigatingApr 23, 07:28 PM

We are investigating reports of impacted performance for some GitHub services.

criticalresolvedApr 23, 04:12 PM — Resolved Apr 23, 05:30 PM

Incident with multiple GitHub services

8 updates
resolvedApr 23, 05:30 PM

On April 23, 2026, between 16:03 UTC and 17:27 UTC, multiple GitHub services experienced elevated error rates and degraded performance due to DNS resolution failures originating from our DNS infrastructure in our VA3 datacenter. Approximately 5–7% of overall traffic was affected during the impact window:
- Webhooks: ~0.35% of API requests returned 5xx (peak ~0.39%). ~0.88% of requests exceeded 3s latency; at peak, >3s responses represented ~10% of Webhooks API traffic.
- Copilot Metrics: ~9% of Copilot Insights dashboard requests returned 5xx.
- Copilot cloud agents: ~10% of cloud agent sessions were affected and failing.
- Octoshift: 0.88% of active repo migrations failed and 79% saw elevated durations (avg. 5.2 min) during this period.
- Git Operations: averaged 1.25% errors over the duration of the incident, with a peak of 2.07% errors.
- Actions: Workflow run status updates experienced delays of up to ~8s over the duration of the incident window.
Our DNS infrastructure in VA3 entered a degraded state and began intermittently returning NXDOMAIN responses and timing out on lookups for both internal service discovery and external endpoints. This caused a cascading impact across the dependent services listed above. We identified a specific load pattern under which our DNS resolvers began failing. The evidence points to a recently introduced traffic-balancing mechanism, rolled out progressively to support our growth, as the root cause. We have since reverted this change. We are immediately prioritizing investments in a more controlled rollout and validation process, including a dedicated environment to safely shadow production DNS traffic and detect these failure modes before they can affect production.
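The idea of shadowing production DNS traffic can be pictured as replaying the same lookups against a candidate resolver and comparing its answers with the production answers, without ever serving the candidate's result. The sketch below is only a conceptual illustration: `shadow_compare` and the resolver callables are hypothetical stand-ins, and a resolver is modeled as a function that returns an address, returns `None` for NXDOMAIN, or raises `TimeoutError`.

```python
from collections import Counter


def shadow_compare(names, primary, candidate):
    """Replay lookups against a candidate resolver and tally how it differs."""
    outcomes = Counter()
    for name in names:
        expected = primary(name)  # the answer production actually served
        try:
            observed = candidate(name)  # shadow answer, never served to callers
        except TimeoutError:
            outcomes["timeout"] += 1
            continue
        if observed == expected:
            outcomes["match"] += 1
        elif observed is None:
            outcomes["unexpected_nxdomain"] += 1
        else:
            outcomes["mismatch"] += 1
    return outcomes
```

A rollout gate could then require the `unexpected_nxdomain` and `timeout` counts to stay at zero under a replayed load pattern before a resolver change reaches production.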

investigatingApr 23, 05:10 PM

Webhooks is operating normally.

investigatingApr 23, 05:04 PM

Many services are mitigated, and we are validating the remaining services.

investigatingApr 23, 05:03 PM

The degradation affecting Actions and Copilot has been mitigated. We are monitoring to ensure stability.

investigatingApr 23, 04:52 PM

We have identified the root problem and are working on mitigation.

investigatingApr 23, 04:34 PM

Actions is experiencing degraded performance. We are continuing to investigate.

investigatingApr 23, 04:19 PM

We are investigating multiple unavailable services.

investigatingApr 23, 04:12 PM

We are investigating reports of degraded availability for Copilot and Webhooks

minorresolvedApr 23, 02:40 PM — Resolved Apr 23, 03:18 PM

Investigating errors on GitHub

8 updates
resolvedApr 23, 03:18 PM

On April 23, 2026 between 14:30 UTC and 15:18 UTC multiple services were degraded on github.com. During this time approximately 1.5% of all web requests resulted in a 5xx status and unicorn pages for github.com users. We also saw elevated error rates across Actions workflow runs, Copilot, Codespaces and Packages, leading to degraded experiences during this timeframe. Codespaces impact peaked at 45% failures for create requests and 65% failures for resume requests. Packages impact was mainly Maven related with 50% failure rates in downloads and 70% failure rates in uploads. Actions experienced a peak of 8% of failed jobs and up to 85% of jobs impacted by run start delays of more than 5 minutes. This was due to a configuration change to an internal billing service that led to a cache being overwhelmed and causing requests to time out. These timeouts cascaded across multiple services and eventually caused requests to queue up and exhaust web request workers. This configuration change was reverted at 14:42 UTC and following this, all services began to see recovery immediately. To prevent this situation in the future, we are taking steps to ensure that failures and timeouts in the billing service don’t cascade to other services causing impact. This includes implementing more aggressive timeouts on callers of these billing services, adding circuit breaker configurations for cache timeouts and using more resilient cache options. We have also decreased max request timeouts within the billing service that caused impact and added more capacity to our cache to prevent traffic spikes from having the same impact.
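The circuit breaker pattern mentioned in this summary stops callers from waiting on a dependency that is already timing out, so slow calls cannot pile up and exhaust workers. Below is a minimal, generic sketch of the pattern, not GitHub's configuration; the class name, thresholds, and injectable `clock` are assumptions chosen for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of tying up a worker on a sick dependency.
                raise CircuitOpenError("dependency is failing; call skipped")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each cache or billing lookup in `breaker.call(...)` and serve a degraded response when `CircuitOpenError` is raised, combined with a short per-call timeout inside the wrapped function.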

monitoringApr 23, 03:02 PM

The degradation affecting Actions, Codespaces, Copilot and Packages has been mitigated. We are monitoring to ensure stability.

investigatingApr 23, 03:02 PM

A mitigation was applied and services have recovered. Actions is working through queued work before fully recovering.

investigatingApr 23, 02:51 PM

Users are experiencing errors loading various web pages on github.com. Actions and Copilot Cloud Agent runs will be delayed.

investigatingApr 23, 02:44 PM

Copilot is experiencing degraded performance. We are continuing to investigate.

investigatingApr 23, 02:42 PM

Codespaces is experiencing degraded performance. We are continuing to investigate.

investigatingApr 23, 02:41 PM

Packages is experiencing degraded performance. We are continuing to investigate.

investigatingApr 23, 02:40 PM

We are investigating reports of degraded performance for Actions

minorresolvedApr 22, 07:53 PM — Resolved Apr 22, 10:43 PM

Disruption with some GitHub services

6 updates
resolvedApr 22, 10:43 PM

On April 22, 2026, between 09:00 UTC and 22:05 UTC, the Copilot coding agent and pull request comment event processing were degraded. During this period, approximately 0.5% of total pull request and issue comments (~23,000 invocations) mentioned @copilot and explicitly requested work from the Copilot coding agent but were not acted upon. Creating, viewing, and replying to pull request comments was unaffected, and other Copilot functionality continued to operate normally. The impact was limited to @copilot mentions on pull request comments not triggering Copilot coding agent runs, and to some downstream systems not receiving new pull request comment events during the impact window. The cause was a serialization error that prevented pull request comment events from being published to downstream consumers, including the Copilot coding agent. This was related to the same class of issue as incident #4295 on April 20, affecting another event type. We mitigated the incident by deploying a fix that restored event publishing, after which the Copilot coding agent and other downstream consumers resumed processing pull request comment events normally. We are working to complete our audit of related event schemas, migrate remaining consumers to use the updated identifier fields, and improve monitoring to detect drops in publishing on critical event topics, to reduce our time to detection and mitigation of issues like this one in the future.
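Monitoring for "drops in publishing" can be as simple as comparing recent publish counts on a topic with a baseline. The sketch below shows that idea under stated assumptions: `publish_drop_detected`, the per-interval counts, and the 50% threshold are hypothetical and would need tuning for real traffic patterns.

```python
def publish_drop_detected(recent_counts, baseline_counts, min_ratio=0.5):
    """Return True when recent publish volume falls below min_ratio of baseline.

    Both arguments are per-interval event counts for one topic
    (for example, events published per minute).
    """
    if not recent_counts or not baseline_counts:
        return False  # not enough data to judge
    baseline = sum(baseline_counts) / len(baseline_counts)
    if baseline == 0:
        return False  # a topic that is normally silent cannot "drop"
    recent = sum(recent_counts) / len(recent_counts)
    return recent < min_ratio * baseline
```

A volume check like this can catch a case where events stop being published entirely, a failure that produces no errors for downstream consumers to report.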

investigatingApr 22, 10:09 PM

We have identified the root cause of the disruption affecting Copilot Coding Agent and Issues. A fix is being deployed.

investigatingApr 22, 08:37 PM

We have identified the root cause of the disruption affecting Copilot Coding Agent and Issues. Copilot @-mentions on pull requests are not being processed, and some issue-related functionality may be degraded. A fix has been developed and is being applied.

investigatingApr 22, 08:02 PM

Copilot @-mentions on pull requests are currently not being processed by Copilot Cloud Agent. We have found the issue and are investigating remediations.

investigatingApr 22, 07:55 PM

Issues is experiencing degraded performance. We are continuing to investigate.

investigatingApr 22, 07:53 PM

We are investigating reports of impacted performance for some GitHub services.

criticalresolvedApr 22, 03:35 PM — Resolved Apr 22, 07:18 PM

Disruption with Copilot chat and Copilot Coding Agent

8 updates
resolvedApr 22, 07:18 PM

On April 22, 2026, between 15:16 UTC and 19:18 UTC, users experienced errors when interacting with Copilot Chat on github.com and Copilot Cloud Agent. During this time, users were unable to use Copilot Chat or Copilot Cloud Agent. Copilot Memory (in preview) was not available to Copilot agent sessions during this time. The issue was caused by an infrastructure configuration change that resulted in connectivity issues with our databases. The team identified the cause and restored connectivity to the database. Copilot Chat and Cloud Agent for github.com were restored by 18:16 UTC. Remaining regional deployments were restored incrementally, with full resolution at 19:18 UTC. We have taken steps to prevent similar infrastructure changes from causing these kinds of database operations in the future.

investigatingApr 22, 06:05 PM

Copilot cloud agent and chat are mitigated for github.com.

investigatingApr 22, 05:49 PM

We are now seeing recovery for Copilot cloud agent.

investigatingApr 22, 05:40 PM

Mitigation is progressing for Copilot chat and cloud agent recovery.

investigatingApr 22, 04:58 PM

Mitigation is progressing for Copilot chat and cloud agent.

investigatingApr 22, 04:24 PM

We continue to work on mitigation for Copilot chat and cloud agent.

investigatingApr 22, 03:43 PM

We are aware of users seeing errors interacting with Copilot chat on github.com and Copilot cloud agent. We have identified the cause and are investigating remediations.

investigatingApr 22, 03:35 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 21, 03:03 PM — Resolved Apr 22, 01:24 AM

Disruption with projects service

11 updates
resolvedApr 22, 01:24 AM

On April 21, 2026, between 13:35 UTC and 01:24 UTC the following day the projects service was degraded. During this time period, projects may have been out of sync and users may have experienced delays in changes to projects and their items. Delays in reflected changes peaked at approximately 45 minutes. The delays were caused by serialization errors that failed events and triggered a flood of resyncs, overloading our event processing layers. We mitigated the incident by speeding up processing time for incoming changes and otherwise waiting for all changes to be processed. We are working to increase our capacity for processing updates to projects to reduce our time to mitigation of issues like this one in the future.

monitoringApr 22, 12:00 AM

The issue remains mitigated. Users may still experience small delays in changes to projects while we process the backlog of events. We expect a full recovery in approximately two hours.

monitoringApr 21, 10:49 PM

The issue remains mitigated. Users may still experience delays in changes to projects while we process the backlog of events. We expect a full recovery in approximately three hours.

monitoringApr 21, 09:15 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 21, 07:41 PM

Recovery from the delays affecting GitHub Projects continues to progress. We have deployed additional mitigations that are accelerating processing of the backlog. Users may still experience delays where changes to projects are not reflected immediately. We expect full recovery within approximately six hours.

investigatingApr 21, 05:21 PM

The queues are continuing to decrease and we are working to accelerate the rate of processing through the queues.

investigatingApr 21, 04:45 PM

The mitigation is deployed and we are seeing recovery in the queues. We will provide an update on when full recovery is expected.

investigatingApr 21, 04:18 PM

We are deploying a fix to relieve the queue of delayed data. Some users may still experience delays with GitHub Projects where changes are not reflected immediately as remaining backlogs are processed.

investigatingApr 21, 03:42 PM

We continue to investigate delays with GitHub Projects where changes may not be reflected immediately. Our team has identified the cause and applied mitigations to address the issue. We are seeing initial signs of recovery, though some delays may persist as the system works through a backlog of pending updates.

investigatingApr 21, 03:07 PM

We are investigating reports of delays with GitHub Projects. Users may notice that changes made to projects are not reflected immediately. Our team has identified the source of the delays and is actively working to resolve the issue.

investigatingApr 21, 03:03 PM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 20, 01:28 PM — Resolved Apr 21, 05:04 AM

Partial degradation for code scanning default setup and for code quality

15 updates
resolvedApr 21, 05:04 AM

On April 20, 2026 between 10:28 UTC and 15:04 UTC GitHub experienced degraded service for code scanning default setup, code quality, and project boards. Repair of affected project boards additionally lasted until April 21, 05:04 UTC. During this time, code scanning default setup and code quality analyses were not triggered on newly opened pull requests. Additionally, newly created issues were not appearing on project boards. The cause was a serialization error that prevented proper triggering of code scanning, code quality analyses, and project board updates. We mitigated the issue by deploying a fix, restoring event publishing for code scanning and code quality. For project boards, an additional code change was deployed to update event consumers, followed by a reindex of affected project items. We are working to prevent recurrence by strengthening our schema validations and improving monitoring for drops in publishing on critical hydro topics.
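Schema validation before publishing is one general way to surface a serialization problem as an explicit, countable error rather than a silent gap. The sketch below is a generic illustration, not the schema or tooling used here: `REQUIRED_FIELDS`, its field names, and `validate_event` are invented for the example.

```python
# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"event_type": str, "repository_id": int, "actor_id": int}


def validate_event(event, schema=REQUIRED_FIELDS):
    """Return a list of problems; an empty list means the event can be published."""
    problems = []
    for field, expected_type in schema.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            actual = type(event[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems
```

Running the same check in tests against every producer and consumer of a topic would flag an identifier field that changed type before the change ships.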

monitoringApr 21, 04:18 AM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 21, 03:10 AM

The issue remains mitigated. Issues that were linked to projects during the incident may take approximately three more hours to render correctly while we complete a re-index.

monitoringApr 20, 09:36 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 20, 06:21 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 20, 06:20 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 20, 06:20 PM

The issue has been mitigated. Newly created issues linked to projects should now function as expected. Issues that were linked to projects during the incident may take approximately five hours to render correctly while we complete a re-index.

investigatingApr 20, 06:08 PM

A deployment to fix this issue of new issues not showing up in projects is underway.

investigatingApr 20, 05:32 PM

We continue to work on mitigation regarding new issues not showing on project boards.

investigatingApr 20, 04:48 PM

We continue to work on mitigation regarding new issues not showing on project boards.

investigatingApr 20, 04:16 PM

Code scanning default setup and Code Quality triggers are back up and running. PRs not processed before or during this incident will require a new push to trigger code scanning or code quality analysis. We are seeing problems with new issues not showing on project boards and are working on mitigation.

investigatingApr 20, 03:20 PM

We are continuing to work on a mitigation to unblock code scanning default setup and code quality features on pull requests.

investigatingApr 20, 02:38 PM

We are currently deploying mitigations that should unblock code scanning default setup and code quality features on pull requests.

investigatingApr 20, 01:57 PM

We are actively working to mitigate an issue affecting code scanning default setup and code quality features on pull requests. Users may experience pull request code scanning and code quality analyses not being triggered on new pull requests. Our engineering team has identified the root cause and is working on mitigating the issue.

investigatingApr 20, 01:28 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 17, 02:56 PM — Resolved Apr 17, 03:18 PM

Disruption with some GitHub services

6 updates
resolvedApr 17, 03:18 PM

On April 17, 2026, between 14:46 UTC and 15:12 UTC, users experienced a degraded web experience on GitHub.com. During this time, approximately 1.5% of web requests resulted in errors, with some users encountering slow page loads or failed requests. The issue was caused by capacity saturation of a caching component in one of our data center regions. We mitigated the issue by redirecting traffic to an unaffected region and rolling back a recent deployment. The incident was fully resolved at 15:18 UTC. We are taking steps to provide appropriate capacity for this caching path to prevent recurrence.

monitoringApr 17, 03:18 PM

The degradation affecting Issues has been mitigated. We are monitoring to ensure stability.

investigatingApr 17, 03:08 PM

We have isolated a problematic component in our infrastructure and are working to mitigate. We will continue to post updates as we work toward resolution.

investigatingApr 17, 02:57 PM

We are experiencing an issue that impacts approximately 10% of traffic to the web, resulting in slow and failed calls. We are investigating and will continue to post updates as we work toward mitigation.

investigatingApr 17, 02:56 PM

Issues is experiencing degraded performance. We are continuing to investigate.

investigatingApr 17, 02:56 PM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 16, 03:06 PM — Resolved Apr 16, 06:28 PM

Incident with Codespaces

7 updates
resolvedApr 16, 06:28 PM

On April 16, 2026 between 09:30 UTC and 17:15 UTC, users experienced failures when attempting to connect to GitHub Codespaces via the VS Code editor. During this time, approximately 40% of codespace start operations failed. Users connecting via SSH were not impacted. The issue was caused by a failure in an upstream download service that prevented the VS Code Server from being retrieved during codespace startup. The impact was mitigated by implementing a workaround to use an alternative download path when the primary endpoint is degraded. We are working with the upstream dependency to address the root cause of the download service failure, and we are improving our fallback mechanisms to reduce the impact of similar upstream failures in the future.
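The workaround described above, using an alternative download path when the primary endpoint is degraded, can be illustrated with a minimal Python sketch. The function name `fetch_with_fallback` and the two stand-in sources are hypothetical and are not taken from the Codespaces implementation; the sketch only shows the general ordered-fallback pattern.

```python
from typing import Callable, Sequence


def fetch_with_fallback(sources: Sequence[Callable[[], bytes]]) -> bytes:
    """Try each download source in order and return the first success.

    `sources` is a hypothetical list of zero-argument callables, for
    example a primary endpoint followed by an alternative path.
    """
    errors = []
    for source in sources:
        try:
            return source()
        except OSError as exc:  # network-style failures only
            errors.append(exc)
    raise RuntimeError(f"all {len(sources)} download sources failed: {errors}")


def primary() -> bytes:
    # Stand-in for a degraded upstream download service.
    raise ConnectionError("primary endpoint degraded")


def alternative() -> bytes:
    # Stand-in for the alternative download path.
    return b"server-archive"


payload = fetch_with_fallback([primary, alternative])
```

Only `OSError` subclasses trigger the fallback here, so programming errors in a source still surface immediately instead of being masked by the retry.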

monitoringApr 16, 06:22 PM

The degradation affecting Codespaces has been mitigated. We are monitoring to ensure stability.

investigatingApr 16, 04:37 PM

Our provider is implementing a mitigation and we are seeing signs of recovery.

investigatingApr 16, 03:49 PM

We found an issue that impacts 70% of Codespaces. We are engaged with the provider and working towards mitigation.

investigatingApr 16, 03:41 PM

Codespaces is experiencing degraded availability. We are continuing to investigate.

investigatingApr 16, 03:08 PM

We are experiencing degraded performance in Codespaces related to creating a new Codespace or starting an existing Codespace from the VS Code editor. SSH connections to Codespaces are not impacted. We are working toward mitigation and will continue to keep you updated on progress.

investigatingApr 16, 03:06 PM

We are investigating reports of degraded performance for Codespaces

minorresolvedApr 14, 01:57 AM — Resolved Apr 14, 06:08 AM

Disruption with some GitHub services

8 updates
resolvedApr 14, 06:08 AM

On April 14, between 00:58 UTC and 06:08 UTC, GitHub Enterprise Cloud customers experienced 500 errors when attempting to access Copilot Insights pages which was caused by an authentication failure in our metrics pipeline. We fully mitigated the issue and validated the fix in production. Approximately 709 users were impacted. The total impact duration was approximately 5 hours and 10 minutes. Our investigation determined the incident was caused by a change in a tenant credential which caused authentication errors to retrieve the required data needed on our Copilot Insights pages. We understand this disruption impacted customers' ability to access the Copilot Insights page. To prevent similar issues and reduce resolution time in the future, we are investing in improved diagnostics tooling to quickly identify the root cause of failures, enhanced monitoring, and alerting to detect issues at a more granular level. GitHub is a critical infrastructure for your work, your teams, and your businesses. We are focused on these remediations and continued reliability improvements for Copilot Insights and related metrics experiences.

monitoringApr 14, 06:07 AM

This incident has been resolved. We will continue to monitor to ensure stability. Thank you for your patience and understanding as we addressed this issue.

monitoringApr 14, 06:07 AM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 14, 04:40 AM

We identified an issue that impacts the Copilot Dashboard on the Insights tab and are working on mitigation. We will continue to keep you updated on progress.

investigatingApr 14, 03:47 AM

The team continues to investigate issues accessing the Copilot Dashboard on the Insights tab. We will continue providing updates on the progress towards mitigation.

investigatingApr 14, 02:40 AM

The Copilot Dashboard on the Insights tab is not accessible and we are continuing to investigate.

investigatingApr 14, 02:37 AM

Degradation of Service - Insights Page

investigatingApr 14, 01:57 AM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 13, 07:56 PM — Resolved Apr 13, 08:35 PM

Incident with Pages

5 updates
resolvedApr 13, 08:35 PM

On Monday April 13th, 2026, between 18:53 UTC and 20:30 UTC, the GitHub Pages service experienced elevated error rates. On average, the error rate was 10.58% and peaked at 12.77% of requests to the service, resulting in approximately 17.5 million failed requests returning HTTP 500 errors. This was due to an automated DNS management tool (octodns) erroneously deleting a DNS record for a Pages backend storage host after its upstream data source intermittently failed to return the record, causing the tool to treat it as stale and remove it. We mitigated the incident by re-creating the deleted DNS record. To prevent future incidents, we are implementing availability-zone-tolerant routing in the Pages frontend so that an unresolvable backend host triggers failover to healthy hosts rather than returning errors, adding safeguards to prevent automated deletion of DNS records owned by other systems, and improving logging and alerting for DNS resolution failures in the Pages serving path.
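One way to picture the kind of safeguard mentioned above is a deletion planner that refuses to treat a record as stale unless the upstream source answered reliably and the record belongs to the tool doing the cleanup. The Python sketch below is illustrative only: it does not use the octodns API, and the `Record` type, the `owner` tag, and the `source_healthy` flag are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    owner: str  # hypothetical ownership tag, e.g. the managing system


def plan_deletions(existing, desired_names, source_healthy, managed_owner):
    """Return the records that are safe to delete.

    A record is deleted only if the upstream source answered reliably,
    the record is absent from the desired set, and this tool owns it.
    """
    if not source_healthy:
        # A missing record may only mean the source failed to return it.
        return []
    return [
        r for r in existing
        if r.name not in desired_names and r.owner == managed_owner
    ]


existing = [
    Record("pages-storage-1", "other-system"),
    Record("old-host", "dns-tool"),
]
safe = plan_deletions(existing, {"web"}, True, "dns-tool")
skipped = plan_deletions(existing, set(), False, "dns-tool")
```

In this example `safe` contains only `old-host`, because `pages-storage-1` is owned by another system, and `skipped` is empty because the source was flagged as unhealthy.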

monitoringApr 13, 08:32 PM

We have mitigated the issue with Pages.

monitoringApr 13, 08:30 PM

The degradation affecting Pages has been mitigated. We are monitoring to ensure stability.

investigatingApr 13, 07:57 PM

We are investigating reports of issues with Pages. We will continue to keep users updated on progress towards mitigation.

investigatingApr 13, 07:56 PM

We are investigating reports of degraded availability for Pages

minorresolvedApr 13, 04:41 PM — Resolved Apr 13, 05:40 PM

Disruption with some GitHub services

3 updates
resolvedApr 13, 05:40 PM

On April 13, 2026, between 14:41 UTC and 17:29 UTC, the Copilot service experienced degraded performance. All Copilot users were impacted by increased latency, and approximately 20% experienced request failures when interacting with Copilot Cloud Agent (CCA). On average, request latency increased to approximately 950ms. The GitHub User Dashboard also displayed intermittent errors loading Copilot quota information. CCA and the User Dashboard were impacted for approximately 2 hours and 56 minutes. This was due to an infrastructure change that reduced the available compute capacity for a backend service responsible for Copilot rate limiting and quota management. The reduced capacity caused resource exhaustion under normal traffic load, leading to cascading failures in downstream request processing. We mitigated the incident by increasing compute resources allocated to the affected service and scaling out the number of service instances to distribute load more effectively. We are working to improve proactive capacity monitoring to detect resource degradation before it impacts users, reviewing retry and timeout configurations across dependent services to reduce amplification during degraded states, and evaluating connection management strategies to improve resilience under constrained resources.

investigatingApr 13, 04:59 PM

We have identified the root cause and are rolling out a fix for Copilot. The services should now be in recovery, with expected full recovery in 5 to 10 minutes.

investigatingApr 13, 04:41 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 10, 01:07 PM — Resolved Apr 10, 01:28 PM

Problems with third-party Claude and Codex Agent sessions not being listed in the agents tab dashboard

3 updates
resolvedApr 10, 01:28 PM

On April 9, 2026, between 22:59 UTC and April 10, 2026, 13:24 UTC, the Copilot Mission Control service was degraded and did not display Claude and Codex Cloud Agent sessions in the agents tab dashboard. Customers were unable to see, list, or manage their third party agent sessions during this period. The underlying agent sessions continued to function normally. This was a visibility and management issue only, and no HTTP errors were generated. The API returned successful responses with incomplete results, with an average error rate of 0% and a maximum error rate of 0%. This was due to a code change that introduced a filter which inadvertently excluded third party agent sessions. We mitigated the incident by reverting the problematic code change and deploying the fix to production. We are working to add automated monitoring for dashboard content visibility and improve integration test coverage for third party agent session listing to reduce our time to detection and mitigation of issues like this one in the future.

investigatingApr 10, 01:08 PM

We are investigating third party Claude and Codex Cloud Agent sessions not being listed in the agents tab dashboard.

investigatingApr 10, 01:07 PM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 9, 04:20 PM — Resolved Apr 9, 08:36 PM

Disruption with some GitHub services

7 updates
resolvedApr 9, 08:36 PM

On April 9, 2026, between 16:05 UTC and 20:36 UTC, the Copilot cloud agent service was degraded, causing new agent sessions to be delayed or fail to start. Users who attempted to start Copilot cloud agent sessions during this period experienced jobs getting stuck in the queue, with wait times peaking at 54 minutes compared to the normal 15–40 seconds. On average, approximately 84% of requests to start agent sessions failed, peaking at 97.5% during the worst period. This was due to an internal service exceeding API rate limits, compounded by a caching bug that persisted the rate-limited state beyond the actual rate limit window, causing recurring outage waves rather than a single recovery. We mitigated the incident by deploying a configuration change to bypass the affected cache and shifting API traffic to an alternative authentication path that reduced rate limit exposure. We have since added automated monitoring and alerting for this failure mode, deployed per-endpoint rate limit controls, and added caching for high-traffic API calls to reduce overall load. We are also working on longer-term improvements to rate limit isolation and traffic management to prevent similar issues in the future. This incident shared the same underlying root causes with an incident declared in the time frame https://www.githubstatus.com/incidents/zn1t56bfxdzg
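The caching failure mode described above, where a rate-limited state outlives the actual rate limit window, can be avoided in principle by storing the window's reset time instead of a fixed cache lifetime. The following Python sketch is a simplified illustration under that assumption; the `RateLimitCache` class and its method names are hypothetical and do not reflect the affected service.

```python
class RateLimitCache:
    """Caches a 'rate limited' flag that never outlives the limit window."""

    def __init__(self) -> None:
        self._limited_until = 0.0  # epoch seconds; 0 means not limited

    def mark_limited(self, reset_at: float) -> None:
        # Store the window's reset time rather than a fixed cache TTL.
        self._limited_until = reset_at

    def is_limited(self, now: float) -> bool:
        # The cached state expires exactly when the window resets.
        return now < self._limited_until


cache = RateLimitCache()
cache.mark_limited(reset_at=1_000.0)
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would likely supply `time.time()`.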

investigatingApr 9, 07:52 PM

We continue to investigate periodic delays in Copilot Cloud Agent job processing

investigatingApr 9, 06:57 PM

We are continuing to investigate Copilot Cloud Agent job delays

investigatingApr 9, 05:48 PM

Copilot Cloud Agent jobs are being processed and we are monitoring recovery

investigatingApr 9, 04:57 PM

We are investigating delays processing Copilot Cloud Agent jobs

investigatingApr 9, 04:20 PM

We are experiencing issues where Copilot coding agent jobs are delayed in starting.

investigatingApr 9, 04:20 PM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 9, 09:50 AM — Resolved Apr 9, 10:15 AM

Disruption with some GitHub services

4 updates
resolvedApr 9, 10:15 AM

On April 9, 2026, between 09:05 UTC and 19:05 UTC, the Copilot coding agent service was degraded and users experienced significant delays starting new agent sessions. Approximately 84% of new agent session requests were delayed across four separate outage waves, with queue wait times peaking at 54 minutes compared to a normal baseline of 15–40 seconds. On average, the error rate was 83.9% and peaked at 97.5% of requests to the service. Approximately 22,700 workflow creations were delayed or failed during the incident. This was due to a bug in our rate limiting logic that incorrectly applied a rate limit globally across all users, rather than scoping it to the individual installation that triggered the limit. A contributing factor was a surge in API traffic from a client update that increased requests to an internal endpoint by 3–4x, which accelerated rate limit exhaustion. We mitigated the incident by disabling the faulty rate limit caching mechanism via feature flag and updating our service to use per-installation credentials for API calls, ensuring rate limits are correctly scoped to individual installations. We have since added automated monitoring and alerting to detect this failure mode proactively, deployed fixes to reduce unnecessary API traffic through caching improvements, and are continuing work to further isolate rate limit scoping across client types to prevent similar issues in the future. This incident shared the same underlying root causes with an incident declared in the time frame https://www.githubstatus.com/incidents/2rqwxl8y7m0j
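The difference between a globally applied limit and one scoped to the installation that triggered it can be shown with a small Python sketch. This is an assumption-laden illustration, not the service's code: `ScopedRateLimiter`, its `limit` parameter, and the installation identifiers are hypothetical, and window resets are omitted for brevity.

```python
from collections import defaultdict


class ScopedRateLimiter:
    """Counts requests per installation so one tenant cannot block others."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._counts = defaultdict(int)  # keyed by installation id

    def allow(self, installation_id: str) -> bool:
        # Only the installation that exhausted its budget is refused.
        if self._counts[installation_id] >= self.limit:
            return False
        self._counts[installation_id] += 1
        return True


limiter = ScopedRateLimiter(limit=2)
results_a = [limiter.allow("installation-a") for _ in range(3)]
result_b = limiter.allow("installation-b")
```

Here `installation-a` is refused on its third request while `installation-b` is still served, which is the behavior a single shared counter would not provide.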

monitoringApr 9, 10:15 AM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 9, 09:57 AM

We are investigating an issue affecting GitHub Copilot coding agent. Users may experience significant delays when starting new agent sessions, with jobs remaining queued longer than expected. Our team has identified increased load as a contributing factor and is actively working to restore normal performance.

investigatingApr 9, 09:50 AM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 9, 04:42 AM — Resolved Apr 9, 04:57 AM

Disruption with GitHub notifications

3 updates
resolvedApr 9, 04:57 AM

On April 9, 2026, between 03:22 UTC and 04:49 UTC, GitHub Notifications experienced degraded availability. During this time, approximately 45% of requests to the notifications service returned errors, with a peak error rate of approximately 54%, preventing affected users from successfully viewing or interacting with their notifications service. The issue was identified and resolved, restoring the service to full availability. We are working to improve our metrics to reduce time to detection and mitigation for similar issues in the future.

monitoringApr 9, 04:57 AM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 9, 04:42 AM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 2, 05:49 PM — Resolved Apr 2, 09:48 PM

Disruption with some GitHub services

7 updates
resolvedApr 2, 09:48 PM

Between 15:20 and 20:18 UTC on Thursday April 2, Copilot Cloud Agent entered a period of reduced performance. Due to an internal feature being developed for Copilot Code Review, the Copilot Cloud Agent infrastructure started to receive an increased number of jobs. This load eventually caused us to hit an internal rate limit, causing all work to suspend for an hour. During this hour, some new jobs would time out, while others would resume once rate limiting ended. Roughly 40% of jobs in this period were affected. Once the cause of this rate limiting was identified, we were able to disable the new CCR feature via a feature flag. Once the jobs that were already in the queue were able to clear, we didn't see additional instances of rate limiting afterwards.

monitoringApr 2, 09:48 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingApr 2, 08:35 PM

Although we are observing recovery once again, we expect continued periods of degradation. Work that is queued during times of degradation does eventually get processed. We continue to investigate and find a mitigation, and will update again within 2 hours.

investigatingApr 2, 07:28 PM

This issue has recurred. Customers will once again experience false job starts when assigning tasks to Copilot Cloud Agent. We are still investigating and trying to understand the pattern of degradation.

investigatingApr 2, 06:25 PM

We are once again seeing recovery with Copilot Cloud Agent job starts. We are keeping this open while we verify this won't recur.

investigatingApr 2, 05:59 PM

When assigning tasks to Copilot Cloud Agent, the task will appear to be working, but may not actually be running. We are investigating.

investigatingApr 2, 05:49 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 2, 04:18 PM — Resolved Apr 2, 04:30 PM

Copilot Coding Agent failing to start some jobs

3 updates
resolvedApr 2, 04:30 PM

Between 15:20 and 20:18 UTC on Thursday April 2, Copilot Cloud Agent entered a period of reduced performance. Due to an internal feature being developed for Copilot Code Review, the Copilot Cloud Agent infrastructure started to receive an increased number of jobs. This load eventually caused us to hit an internal rate limit, causing all work to suspend for an hour. During this hour, some new jobs would time out, while others would resume once rate limiting ended. Roughly 40% of jobs in this period were affected. Once the cause of this rate limiting was identified, we were able to disable the new CCR feature via a feature flag. Once the jobs that were already in the queue were able to clear, we didn't see additional instances of rate limiting afterwards. This was the same incident declared in https://www.githubstatus.com/incidents/d96l71t3h63k

investigatingApr 2, 04:28 PM

When assigning tasks to Copilot Cloud Agent, the task will appear to be working, but may not actually be running. We are investigating.

investigatingApr 2, 04:18 PM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 1, 03:02 PM — Resolved Apr 1, 11:45 PM

Disruption with GitHub's code search

7 updates
resolvedApr 1, 11:45 PM

On April 1st, 2026 between 14:40 and 17:00 UTC the GitHub code search service had an outage which resulted in users being unable to perform searches. The issue was initially caused by an upgrade to the code search Kafka cluster ZooKeeper instances which caused a loss of quorum. This resulted in application-level data inconsistencies which required the index to be reset to a point in time before the loss of quorum occurred. Meanwhile, an accidental deploy resulted in query services losing their shard-to-host mappings, which are typically propagated by Kafka. We remediated the problem by performing rolling restarts in the Kafka cluster, allowing quorum to be reestablished. From there we were able to reset our index to a point in time before the inconsistencies occurred. The team is working on ways to improve our time to respond and mitigate issues relating to Kafka in the future.
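For readers unfamiliar with the term, a ZooKeeper ensemble has quorum only while a strict majority of its members are available. The short Python sketch below shows that arithmetic; the ensemble sizes used are hypothetical examples, since the summary above does not state how many instances were involved.

```python
def has_quorum(ensemble_size: int, healthy: int) -> bool:
    """Return True when a strict majority of the ensemble is healthy."""
    return healthy >= ensemble_size // 2 + 1


# Hypothetical five-member ensemble: one member down at a time keeps
# quorum, while three members down at once loses it.
one_down = has_quorum(ensemble_size=5, healthy=4)
three_down = has_quorum(ensemble_size=5, healthy=2)
```

This is why an upgrade or restart that takes too many members offline at the same time can lose quorum, whereas restarting members one at a time can preserve or reestablish it.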

investigatingApr 1, 11:45 PM

Code search has recovered and is serving production traffic.

investigatingApr 1, 10:00 PM

We have stabilized Code Search infrastructure, and are in the final stages of validation before slowly reintroducing production traffic.

investigatingApr 1, 07:37 PM

We are still working on recovering back to a serviceable state and expect to have a more substantial update within another two hours.

investigatingApr 1, 05:48 PM

We are observing some recovery for Code Search queries, but customers should be aware that the data being served may be stale, especially for changes that took place after 07:00 UTC today (1 April 2026). We are still working on recovering our ingestion pipeline, and synchronizing the indexed data. We will update again within 2 hours.

investigatingApr 1, 04:00 PM

We identified an issue in our ingestion pipeline that degraded the freshness of Code Search results. While fixing the issue with the ingestion pipeline, a deployment caused a loss of dynamic configuration which is causing most requests for Code Search results to fail. We are working to restore the service and to re-ingest the misaligned data.

investigatingApr 1, 03:02 PM

We are investigating reports of impacted performance for some GitHub services.

majorresolvedApr 1, 04:06 PM — Resolved Apr 1, 04:10 PM

GitHub audit logs are unavailable

3 updates
resolvedApr 1, 04:10 PM

On April 1, 2026, between 15:34 UTC and 16:02 UTC, our audit log service lost connectivity to its backing data store due to a failed credential rotation. During this 28-minute window, audit log history was unavailable via both the API and web UI. This resulted in 5xx errors for 4,297 API actors and 127 github.com users. Additionally, events created during this window were delayed by up to 29 minutes in github.com and event streaming. No audit log events were lost; all audit log events were ultimately written and streamed successfully. Customers using GitHub Enterprise Cloud with data residency were not impacted by this incident. We were alerted to the infrastructure failure at 15:40 UTC — six minutes after onset — and resolved the issue by recycling the affected environment, restoring full service by 16:02 UTC. We are conducting a thorough review of our credential rotation process to strengthen its resiliency and prevent recurrence. In parallel, we are strengthening our monitoring capabilities to ensure faster detection and earlier visibility into similar issues going forward.

investigatingApr 1, 04:07 PM

A routine credential rotation has failed for our audit logs service; we have re-deployed our service and are waiting for recovery.

investigatingApr 1, 04:06 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedApr 1, 09:58 AM — Resolved Apr 1, 12:41 PM

Incident with Copilot

9 updates
resolvedApr 1, 12:41 PM

On April 1, 2026, between 07:29 and 12:41 UTC, some customers experienced elevated 5xx errors and increased latency when using GitHub Copilot features that rely on `/agents/sessions` endpoints (including creating or viewing agent sessions). The issue was caused by resource exhaustion in one of the Copilot backend services handling these requests, which in turn caused timeouts and failed requests. We mitigated the incident by increasing the service’s available compute resources and tuning its runtime concurrency settings. Service health returned to normal and the incident was fully resolved by 12:41 UTC.

monitoringApr 1, 12:10 PM

The success rate and latency for creating and viewing agent sessions have stabilized at baseline levels. We are continuing to monitor recovery.

monitoringApr 1, 12:02 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 1, 11:37 AM

The success rate for creating and viewing agent sessions has stabilized, and we're continuing to monitor latency, which is trending toward baseline levels.

monitoringApr 1, 11:24 AM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringApr 1, 10:56 AM

The degradation affecting Copilot has been mitigated. We are monitoring to ensure stability.

investigatingApr 1, 10:31 AM

Users may see increased latency and intermittent errors when viewing or creating agent sessions. We are working on mitigations to return to baseline performance and success rate.

investigatingApr 1, 10:00 AM

We are investigating reports of issues with service(s): Copilot Dotcom Agents. We will continue to keep users updated on progress towards mitigation.

investigatingApr 1, 09:58 AM

We are investigating reports of degraded performance for Copilot

March 2026(7 incidents)

minorresolvedMar 31, 03:05 PM — Resolved Mar 31, 09:23 PM

Incident with Pull Requests: High percentage of 500s

11 updates
resolvedMar 31, 09:23 PM

On Tuesday March 31st, 2026, between 13:53 UTC and 21:23 UTC the Pull Requests service experienced elevated latency and failures. On average, the error rate was 0.15% and peaked at 0.28% of requests to the service. This was due to a change in garbage collection (GC) settings for a Go-based internal service that provides access to Git repository data. The changes caused more frequent GC activity and elevated CPU consumption on a subset of storage nodes, increasing latency and failure rates for some internal API operations. We mitigated the incident by reverting the GC changes. To prevent future incidents and improve time to detection and mitigation, we are instrumenting additional metrics and alerting for GC-related behavior, improving our visibility into other signals that could cause degraded impact of this type, and updating our best practices and standards for garbage collection in Go-based services.

monitoringMar 31, 09:16 PM

The degradation affecting Pull Requests has been mitigated. We are monitoring to ensure stability.

investigatingMar 31, 09:12 PM

We continue to see a small subset of repositories experiencing timeouts and elevated latency in Pull Requests, affecting under 1% of requests.

investigatingMar 31, 07:28 PM

Error rates remain elevated across multiple pull request endpoints. We are pursuing multiple potential mitigations.

investigatingMar 31, 06:42 PM

We continue to experience elevated error rates affecting Pull Requests. An earlier fix resolved one component of the issue, but some users may still encounter intermittent timeouts when viewing or interacting with pull requests. Our teams are actively investigating the remaining causes.

investigatingMar 31, 05:16 PM

We identified an issue causing increased errors when accessing Pull Requests. The mitigation is being applied across our infrastructure and we will continue to provide updates as the mitigation rolls out.

investigatingMar 31, 04:35 PM

We are seeing recovery in latency and timeouts of requests related to pull requests, even though 500s are still elevated. While we are continuing to investigate, we are applying a mitigation and expect further recovery after it is applied.

investigatingMar 31, 04:15 PM

We are continuing to investigate increased 500 errors affecting GitHub services. You may experience intermittent failures when using Pull Requests and other features. We are actively working to identify and resolve the underlying cause.

investigatingMar 31, 03:39 PM

We are investigating increased 500 errors affecting GitHub services. You may experience intermittent failures when using Pull Requests and other features. We are actively working to identify and resolve the underlying cause.

investigatingMar 31, 03:06 PM

We are seeing a higher than average number of 500s due to timeouts across GitHub services. We have a potential mitigation in flight and are continuing to investigate.

investigatingMar 31, 03:05 PM

We are investigating reports of degraded performance for Pull Requests

minorresolvedMar 31, 01:47 PM — Resolved Mar 31, 03:10 PM

Issues with metered billing report generation

7 updates
resolvedMar 31, 03:10 PM

On March 31, 2026, between 06:15 UTC and 15:30 UTC, the GitHub billing usage reports feature was degraded due to reduced server capacity. Customers requesting billing usage reports and loading the top usage by organization and repository on the billing overview and usage pages were impacted. The average error rate for usage report requests was 15%, peaking at 98% over an eight-minute window. For the billing pages, an average of 56% of requests failed to load the top usage cards. The root cause was an increase in billing usage report requests with large datasets, which exhausted the capacity of the nodes responsible for reporting data. There was no impact on billing charges. We mitigated the incident by adjusting our auto-scaling thresholds to better meet our capacity needs. We are working to improve our metrics to reduce time to detection and mitigation for similar issues in the future.

monitoringMar 31, 03:01 PM

The degradation has been mitigated. We are monitoring to ensure stability.

investigatingMar 31, 02:59 PM

We have applied mitigations to a data store related to billing reports, and are seeing partial recovery to billing report generation. We continue to monitor for full recovery.

investigatingMar 31, 02:56 PM

We are seeing a high number of 500s due to timeouts across GitHub services. We are redeploying some of our core services and we expect that this will allow us to recover.

investigatingMar 31, 02:39 PM

We're continuing to see high failure rates on billing report generation, and are working on mitigations for a data store related to billing reports.

investigatingMar 31, 01:56 PM

We're seeing issues related to metered billing reports, intermittently affecting metered usage graphs and reports on the billing page. We have identified an issue with a data store, and are working on mitigations.

investigatingMar 31, 01:47 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedMar 30, 01:02 PM — Resolved Mar 30, 01:25 PM

Elevated delays in Actions workflow runs and Pull Request status updates

4 updates
resolvedMar 30, 01:25 PM

On March 30, 2026, between 10:11 UTC and 13:25 UTC, GitHub Actions experienced degraded performance. During this time, approximately 2.65% of workflow jobs triggered by pull request events experienced start delays exceeding 5 minutes. The issue was caused by replication lag on an internal database cluster used by Actions, which triggered write throttling in our database protection layer and slowed job queue processing. The replication lag originated from planned maintenance to scale the internal database. Newly added database hosts triggered guardrails in the throttling layer, restricting write throughput. The incident was mitigated by excluding the new hosts from replication delay calculations. To prevent recurrence, we have updated our maintenance procedures to ensure new hosts are excluded from throttling assessments during scaling operations. Additionally, we are investing in automation to streamline this type of maintenance activity.
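The mitigation described above, excluding newly added hosts from replication delay calculations, can be sketched as a throttling check that ignores hosts still catching up. The Python below is a minimal illustration under stated assumptions: the function names, the host names, and the five-second threshold are hypothetical and are not taken from the Actions database protection layer.

```python
def effective_lag(lag_by_host: dict, excluded: set) -> float:
    """Return the worst replication lag among hosts that count."""
    considered = [
        lag for host, lag in lag_by_host.items() if host not in excluded
    ]
    return max(considered, default=0.0)


def should_throttle(lag_by_host: dict, excluded: set, threshold_s: float) -> bool:
    # Writes are throttled only when an in-service replica lags too far.
    return effective_lag(lag_by_host, excluded) > threshold_s


# Hypothetical cluster: two established replicas and one newly added host
# that is still catching up after a scaling operation.
lags = {"replica-1": 0.4, "replica-2": 1.1, "replica-new": 900.0}
throttle_all = should_throttle(lags, set(), threshold_s=5.0)
throttle_excl = should_throttle(lags, {"replica-new"}, threshold_s=5.0)
```

With every host counted, the lagging new replica trips the guardrail; with it excluded, the established replicas are within the threshold and writes proceed.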

monitoringMar 30, 01:25 PM

The degradation has been mitigated. We are monitoring to ensure stability.

monitoringMar 30, 01:20 PM

The degradation affecting Actions and Pull Requests has been mitigated. We are monitoring to ensure stability.

investigatingMar 30, 01:02 PM

We are investigating reports of degraded performance for Actions and Pull Requests

noneresolvedMar 27, 05:00 AM — Resolved Mar 27, 05:00 AM

Incident with Copilot

1 update
resolvedMar 27, 06:42 PM

On March 27, 2026, from 02:30 to 04:56 UTC, a misconfiguration in our rate limiting system caused users on Copilot Free, Student, Pro, and Pro+ plans to experience unexpected rate limit errors. The configuration that was incorrectly applied was intended solely for internal staff testing of rate-limiting experiences. Copilot Business and Copilot Enterprise accounts were not affected. During this period, affected users received error messages instructing them to retry after a certain time. Approximately 32% of active Free users, 35% of active Student users, 46% of active Pro users, and 66% of active Pro+ users were affected. After identifying the root cause, we reverted the change and restored the expected rate limits. We are reviewing our deployment and validation processes to help ensure configurations used for internal testing cannot be inadvertently applied to production environments.

minorresolvedMar 24, 08:18 PM — Resolved Mar 24, 08:56 PM

Disruption with some GitHub services

6 updates
resolvedMar 24, 08:56 PM

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

investigatingMar 24, 08:38 PM

We are investigating elevated error rates affecting multiple GitHub services including Actions, Issues, Pull Requests, Webhooks, Codespaces, and login functionality. Some users may have experienced errors when accessing these features. Most services are now showing signs of recovery. We'll post another update by 21:00 UTC.

investigatingMar 24, 08:23 PM

Issues is experiencing degraded performance. We are continuing to investigate.

investigatingMar 24, 08:23 PM

Pull Requests is experiencing degraded performance. We are continuing to investigate.

investigatingMar 24, 08:20 PM

Webhooks is experiencing degraded performance. We are continuing to investigate.

investigatingMar 24, 08:18 PM

We are investigating reports of degraded performance for Actions

majorresolvedMar 24, 04:59 PM — Resolved Mar 24, 07:51 PM

Teams GitHub Notifications App is down

5 updates
resolvedMar 24, 07:51 PM

On March 24, 2026, between 15:57 UTC and 19:51 UTC, the Microsoft Teams Integration and Teams Copilot Integration services were degraded and unable to deliver GitHub event notifications to Microsoft Teams. On average, the error rate was 37.4% and peaked at 90.1% of requests to the service. Approximately 19% of all integration installs failed to receive GitHub-to-Teams notifications in this time period. This was due to an outage at one of our upstream dependencies, which caused HTTP 500 errors and connection resets for our Teams integration. We coordinated with the relevant service teams, and the issue was resolved at 19:51 UTC when the upstream incident was mitigated. We are working to update observability and runbooks to reduce time to mitigation for issues like this in the future.

investigatingMar 24, 06:50 PM

We are experiencing degraded availability from Azure Teams APIs, which is impacting notifications from GitHub to Microsoft Teams. We are awaiting resolution from Azure.

investigatingMar 24, 05:43 PM

We are experiencing degraded availability from Azure APIs, which is impacting notifications from GitHub to Microsoft Teams. We are working with Azure to resolve the issue.

investigatingMar 24, 05:09 PM

We found an issue impacting notifications from GitHub to Microsoft Teams. We are working on mitigation and will keep users updated on progress towards mitigation.

investigatingMar 24, 04:59 PM

We are investigating reports of impacted performance for some GitHub services.

minorresolvedMar 22, 09:08 AM — Resolved Mar 22, 10:02 AM

Disruption with some GitHub services

3 updates
resolvedMar 22, 10:02 AM

On March 22, 2026, between 09:05 UTC and 10:02 UTC, users may have experienced intermittent errors and increased latency when performing Git HTTP read operations. On average, the error rate was 3.84% and peaked at 15.55% of requests to the service. The issue was caused by elevated latency in an internal authentication service within one of our regional clusters. We mitigated the issue by redirecting traffic away from the affected cluster at 09:39 UTC, after which error rates returned to normal. The incident was fully resolved at 10:02 UTC. We are working to scale the authentication service and reduce our time to detection and mitigation of issues like this one in the future.

investigatingMar 22, 09:27 AM

We are investigating intermittently high latency and errors from Git operations.

investigatingMar 22, 09:08 AM

We are investigating reports of impacted performance for some GitHub services.
