Is OpenAI Down? How to Check OpenAI Status & What to Do
API returning 500 errors. ChatGPT stuck on "Thinking..." ChatGPT Plus not responding. DALL-E timing out. When OpenAI has issues, millions of applications break simultaneously. But "OpenAI down" can mean different things: the consumer website, the developer API, specific models, or regional infrastructure. This guide helps you diagnose what's actually broken and how to respond.
Check OpenAI Status Right Now
Check real-time OpenAI API and ChatGPT status on API Status Check →
We monitor OpenAI's API endpoints, ChatGPT web app, and individual model availability every 60 seconds from multiple regions. See exactly which services are affected.
Understanding OpenAI's Service Architecture
OpenAI runs several independent (but related) services. An outage in one doesn't always affect others:
ChatGPT (chat.openai.com)
The consumer web interface where you chat with GPT-4, GPT-4o, etc. via browser.
- Infrastructure: Separate from the API; runs on Cloudflare plus a custom backend
- When down: Website won't load, returns 502/503 errors, or shows "ChatGPT is at capacity"
- Impact: Website users affected, but API developers are often fine
Key distinction: ChatGPT can be down while API works perfectly. They're different systems.
OpenAI API (api.openai.com)
The REST API for developers integrating GPT, DALL-E, Whisper, etc. into applications.
Endpoints:
- /v1/chat/completions - Chat models (GPT-4, GPT-3.5)
- /v1/completions - Legacy completion API
- /v1/images/generations - DALL-E
- /v1/audio/transcriptions - Whisper
- /v1/embeddings - Text embeddings
When down: HTTP 500/503 errors across all endpoints, or specific model failures
Impact: All applications using the OpenAI API break
Individual Models
Models run on dedicated infrastructure. One can be down while others work:
GPT-4o - Latest, fastest GPT-4 variant
GPT-4 - Classic, most capable (slower)
GPT-4 Turbo - Larger context window version
GPT-3.5 Turbo - Cheap, fast baseline
DALL-E 3 - Image generation
Whisper - Speech-to-text
Pattern: New model launches (like GPT-4o) often have 48-hour instability windows from demand spikes.
Regional Infrastructure
OpenAI uses multiple data centers. Outages can be regional:
US-East (primary) - Virginia data center, most traffic
US-West - California, backup
Europe - Dublin, serves EU traffic
During regional outage: Some users affected, others fine. Depends on routing.
Common OpenAI Error Codes
HTTP 429: Rate Limit Exceeded
{
"error": {
"message": "Rate limit exceeded",
"type": "rate_limit_error",
"param": null,
"code": "rate_limit_exceeded"
}
}
NOT an outage - you exceeded your quota. OpenAI enforces:
Free tier:
- 3 requests per minute (RPM)
- 200 requests per day (RPD)
Paid tiers (usage-based; representative numbers - exact limits vary by model, check your dashboard):
- Tier 1 ($5+ spent): 500 RPM, 10K RPD
- Tier 2 ($50+): 5K RPM, 50K RPD
- Tier 3 ($100+): 10K RPM, 100K RPD
- Tier 4 ($500+): 50K RPM, 500K RPD
- Tier 5 ($1000+): 100K RPM, 1M RPD
Check your tier:
- Go to platform.openai.com
- Settings → Organization → Usage Limits
- See current tier and limits
Response headers tell you limits:
x-ratelimit-limit-requests: 10000
x-ratelimit-remaining-requests: 9995
x-ratelimit-reset-requests: 6s
Fix: Wait until the x-ratelimit-reset-requests interval has elapsed, or implement exponential backoff:
import time
from openai import RateLimitError

def call_with_backoff(client, **kwargs):
    max_retries = 5
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            if attempt < max_retries - 1:
                wait = 2 ** attempt  # 1s, 2s, 4s, 8s between the five attempts
                time.sleep(wait)
            else:
                raise
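Alternatively, the x-ratelimit-reset-requests header tells you exactly how long to wait. Its value is a compact duration string (like the 6s shown above); a small parser, assuming that format, converts it to seconds you can sleep on:

```python
import re

def parse_reset(value: str) -> float:
    """Parse a duration like '6s', '1m30s', or '250ms' into seconds.

    Format assumed from OpenAI's x-ratelimit-reset-* header examples.
    """
    total = 0.0
    # 'ms' must appear before 's' in the alternation so it matches first
    for amount, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value):
        total += float(amount) * {"ms": 0.001, "s": 1, "m": 60, "h": 3600}[unit]
    return total

print(parse_reset("6s"))     # 6.0
print(parse_reset("1m30s"))  # 90.0
```

Sleeping for parse_reset(header_value) resumes your client right as the rate-limit window resets, instead of guessing with fixed backoff steps.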
HTTP 500: Internal Server Error
{
"error": {
"message": "The server had an error processing your request. Sorry about that!",
"type": "server_error",
"param": null,
"code": null
}
}
This IS an outage. OpenAI's infrastructure is failing. During major incidents (like the November 2025 GPT-4 outage), 500 errors hit 30%+ of requests.
What to do:
- Retry with exponential backoff (3 attempts, 2s → 4s → 8s delays)
- Check status page - if widespread, wait for recovery
- Fall back to alternative model - try GPT-3.5 if GPT-4 failing
- Switch providers - use Anthropic/DeepSeek as backup
Don't spam retries - you'll make the outage worse and might get rate limited.
HTTP 503: Service Unavailable / "ChatGPT is at capacity"
{
"error": {
"message": "The engine is currently overloaded, please try again later",
"type": "server_error",
"param": null,
"code": "service_unavailable"
}
}
Overloaded infrastructure. OpenAI's servers are getting hammered. Common triggers:
- Viral moments - Tweet/article goes viral, everyone tests GPT
- Breaking news - Major events drive traffic spikes
- Product launches - GPT-5 announcements, new features
- Peak hours - US work hours (9 AM - 5 PM PT)
ChatGPT web version: Shows "ChatGPT is at capacity right now" message
API version: Returns 503 errors
Response: Wait 5-10 minutes. Unlike hard failures, capacity issues resolve as load decreases. Retry with longer delays (30s+).
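That advice can be sketched as a retry wrapper with a much longer base delay than the 429 backoff; here call_fn stands in for your API call, and the 30-second base is simply taken from the guidance above:

```python
import random
import time

def retry_when_overloaded(call_fn, max_attempts=4, base_delay=30, sleep=time.sleep):
    """Retry with long, jittered delays suited to 503 capacity errors.

    `sleep` is injectable so the loop can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return call_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # 30s, 60s, 120s... plus jitter so clients don't retry in lockstep
            delay = base_delay * (2 ** attempt) + random.uniform(0, 5)
            sleep(delay)
```

The jitter matters: if thousands of clients all retry at exactly t+30s, the capacity spike simply repeats.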
HTTP 401: Unauthorized / Authentication Error
{
"error": {
"message": "Incorrect API key provided: sk-proj-********************",
"type": "invalid_request_error",
"param": null,
"code": "invalid_api_key"
}
}
Your API key is wrong or expired. Not an OpenAI outage.
Common mistakes:
- Using legacy key format (old keys start with sk-, new ones with sk-proj-)
- Whitespace before/after the key in code
- Environment variable not loaded ($OPENAI_API_KEY is blank)
- Key revoked due to a billing issue
Verify key works:
curl https://api.openai.com/v1/models \
-H "Authorization: Bearer $OPENAI_API_KEY"
Should return list of available models. If 401, key is invalid.
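Two of the common mistakes above (stray whitespace, unset variable) can be caught locally before any request is made; a stdlib sanity check (the sk- prefix test is a format heuristic only, not validation):

```python
import os

def check_api_key(env_var="OPENAI_API_KEY"):
    """Return (ok, message) for common local API-key mistakes."""
    key = os.environ.get(env_var)
    if not key:
        return False, f"{env_var} is not set or empty"
    if key != key.strip():
        return False, "key has leading/trailing whitespace"
    if not key.startswith("sk-"):
        return False, "key doesn't start with 'sk-' (wrong value pasted?)"
    return True, "key looks plausible (format only; a 401 still means invalid)"

ok, msg = check_api_key()
print(ok, msg)
```

This won't tell you a key is valid (only the API can), but it eliminates the local misconfiguration causes before you blame OpenAI.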
Regenerate:
- Go to platform.openai.com/api-keys
- Click "+ Create new secret key"
- Copy immediately (won't be shown again)
- Update your environment variables
HTTP 400: Invalid Request
{
"error": {
"message": "'messages' is a required property",
"type": "invalid_request_error",
"param": "messages",
"code": null
}
}
Your request is malformed. Code bug, not OpenAI issue.
Common mistakes:
1. Missing required fields:
# ❌ Wrong - missing 'messages'
response = client.chat.completions.create(
model="gpt-4"
)
# ✅ Correct
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Hello"}]
)
2. Invalid model name:
# ❌ Wrong
model="gpt4" # No hyphen
# ✅ Correct
model="gpt-4"
model="gpt-4o"
model="gpt-3.5-turbo"
3. Token limit exceeded:
{
"error": {
"message": "This model's maximum context length is 8192 tokens...",
"type": "invalid_request_error",
"code": "context_length_exceeded"
}
}
Use tiktoken to count tokens before sending:
import tiktoken
encoding = tiktoken.encoding_for_model("gpt-4")
tokens = encoding.encode("Your text here")
print(f"Token count: {len(tokens)}")
# GPT-4: 8K tokens
# GPT-4-32k: 32K tokens
# GPT-4-turbo: 128K tokens
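If tiktoken isn't installed, a rough character-based heuristic can still act as a guard; the ~4 characters per token ratio is an approximation for English text, not an OpenAI guarantee:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.

    Use tiktoken for exact counts; this is only a coarse guard.
    """
    return max(1, len(text) // 4)

def fits_context(text: str, limit: int = 8192, reserve: int = 1024) -> bool:
    """Leave `reserve` tokens of headroom for the model's reply."""
    return estimate_tokens(text) <= limit - reserve

print(fits_context("hello" * 10))  # True
```

The reserve parameter reflects that the context window covers both your prompt and the completion; filling it entirely with input leaves no room for output.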
Timeout Errors (No HTTP Response)
Request hangs for 60+ seconds, then times out.
Possible causes:
1. Model overload (common with GPT-4 during peak hours). GPT-4 requests sometimes take 30-60 seconds when infrastructure is stressed.
Not an error - just slow. Set an explicit timeout that matches your tolerance:
from openai import OpenAI

client = OpenAI(
    timeout=120.0  # give up after 2 minutes (recent SDK versions default to 10 minutes)
)
2. Network issues (rare). Your connection to OpenAI's servers is failing.
Test:
curl -m 5 https://api.openai.com/v1/models
If this times out, it's a network problem. If it responds quickly, the delays are model processing time.
3. OpenAI regional routing failure. Requests routed to US-East time out while US-West works (you can't choose a region manually, but it explains partial outages).
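The curl test above can also be scripted as a pure TCP reachability probe, which separates "the network is broken" from "the model is just slow":

```python
import socket

def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A True result with slow API responses points at model processing time;
    False points at DNS, firewall, or routing problems on your side.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refusals, and DNS failures
        return False

# can_reach("api.openai.com")  -> True when DNS and TCP to the API host work
```

Note this only proves the transport path works; an HTTP-level check (like the curl above) is still needed to confirm the API itself is serving.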
How to Check OpenAI Status
1. Official Status Page
Shows:
- API status (operational, degraded, outage)
- ChatGPT web status
- DALL-E, Whisper, other services
- Incident history
- Scheduled maintenance
Caveat: Updates lag 5-15 minutes behind actual issues. During the December 2025 ChatGPT outage, users reported problems for 12 minutes before status page updated.
2. OpenAI Twitter/X
@OpenAI and @OpenAIDevs
Engineers tweet about incidents:
@OpenAI: "We're investigating reports of elevated error rates on the ChatGPT API"
Often faster than status page updates.
3. API Status Check (Our Monitoring)
https://apistatuscheck.com/openai
We test OpenAI every 60 seconds:
- GPT-4, GPT-4o, GPT-3.5 availability
- Response time by model
- Error rate trends
- ChatGPT web accessibility
Shows granular view: "GPT-4 degraded but GPT-3.5 working fine"
4. Developer Community
OpenAI Community Forum: https://community.openai.com
Search for "api down" or "500 error" to see real-time reports.
Reddit: r/OpenAI - developers post issues immediately
5. Manual API Test
Quick health check:
curl https://api.openai.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "test"}],
"max_tokens": 5
}'
Use GPT-3.5 for testing (cheapest). If this fails, the API is down. If it succeeds but your app fails, the problem is in your code.
Troubleshooting OpenAI Issues
Step 1: Isolate API vs ChatGPT
Test ChatGPT website: Go to chat.openai.com in browser. Can you send a message?
Test API: Run the curl command above.
Possible outcomes:
- Both work → Your code/network has issues
- ChatGPT down, API works → Web infrastructure issue only
- API down, ChatGPT works → API-specific outage (rare)
- Both down → Major OpenAI infrastructure failure
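The four outcomes above map directly to a tiny diagnostic helper (names here are illustrative):

```python
def diagnose(chatgpt_ok: bool, api_ok: bool) -> str:
    """Map the two manual checks above to a likely diagnosis."""
    if chatgpt_ok and api_ok:
        return "Your code or network has issues"
    if not chatgpt_ok and api_ok:
        return "Web infrastructure issue only"
    if chatgpt_ok and not api_ok:
        return "API-specific outage (rare)"
    return "Major OpenAI infrastructure failure"

print(diagnose(chatgpt_ok=True, api_ok=False))  # API-specific outage (rare)
```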
Step 2: Test Different Models
OpenAI's models run on different infrastructure:
models_to_test = [
"gpt-3.5-turbo", # Baseline, most reliable
"gpt-4o", # Newest, sometimes unstable
"gpt-4-turbo", # Large context version
"gpt-4", # Classic, usually stable
]
for model in models_to_test:
try:
response = client.chat.completions.create(
model=model,
messages=[{"role": "user", "content": "Hi"}],
max_tokens=5
)
print(f"{model}: ✓ Working")
except Exception as e:
print(f"{model}: ✗ {str(e)}")
If GPT-3.5 works but GPT-4 fails, it's a model-specific issue.
Step 3: Check Billing and Limits
Verify billing:
- platform.openai.com → Billing
- Check "Payment methods" has valid card
- Look for failed payment notifications
Check usage limits:
- Settings → Limits
- See "Hard limit" (max monthly spend)
- See "Soft limit" (notification threshold)
If you hit hard limit, API returns 429 errors until you raise it or month resets.
Step 4: Test from Different Network
Corporate network issues:
Some firewalls block api.openai.com or specific ports.
Test:
# From corporate network
curl -I https://api.openai.com
# From phone hotspot (bypass firewall)
curl -I https://api.openai.com
If hotspot works but corporate fails, ask IT to whitelist:
- *.openai.com
- *.openai.azure.com (if using Azure OpenAI)
Step 5: Verify SDK Version
Old OpenAI Python/Node SDKs have bugs that manifest as "API down."
Python:
pip show openai
# Should be 1.0.0+
pip install --upgrade openai
Node.js:
npm list openai
# Should be 4.0.0+
npm install openai@latest
Breaking changes: OpenAI v1.0+ completely changed API. If you have old code with v0.x SDK, upgrade both code and library.
What to Do During an OpenAI Outage
1. Implement Multi-Provider Fallback
Don't rely solely on OpenAI:
def get_ai_response(prompt, max_attempts=3):
providers = [
("OpenAI GPT-4", lambda: call_openai(prompt, "gpt-4o")),
("OpenAI GPT-3.5", lambda: call_openai(prompt, "gpt-3.5-turbo")),
("Anthropic Claude", lambda: call_anthropic(prompt)),
("DeepSeek", lambda: call_deepseek(prompt)),
]
for name, provider_fn in providers:
try:
return provider_fn()
except Exception as e:
print(f"{name} failed: {e}")
continue
raise Exception("All AI providers failed")
Cost impact: Budget for occasional Claude usage (modestly more expensive per token than GPT-4o; see the comparison table later in this guide).
2. Cache Responses Aggressively
For similar requests:
import hashlib
import json
response_cache = {}
def get_or_generate(messages):
# Cache key from message content
key = hashlib.md5(
json.dumps(messages, sort_keys=True).encode()
).hexdigest()
if key in response_cache:
print("Cache hit!")
return response_cache[key]
try:
response = client.chat.completions.create(
model="gpt-4",
messages=messages
)
response_cache[key] = response
return response
    except Exception:
# During outages, return stale cache if available
if key in response_cache:
print("Returning stale cache")
return response_cache[key]
raise
Production: Use Redis/Memcached instead of in-memory dict.
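Short of Redis, the in-memory dict can at least gain a TTL and a size bound; a minimal stdlib sketch (the allow_stale flag mirrors the stale-cache-during-outages behavior above, and `now` is injectable only to make it testable):

```python
import time
from collections import OrderedDict

class TTLCache:
    """LRU cache with per-entry expiry."""

    def __init__(self, max_entries=1000, ttl_seconds=3600, now=time.monotonic):
        self.data = OrderedDict()  # key -> (expires_at, value)
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.now = now

    def get(self, key, allow_stale=False):
        if key not in self.data:
            return None
        expires_at, value = self.data[key]
        if self.now() > expires_at and not allow_stale:
            return None  # expired; allow_stale=True returns it during outages
        self.data.move_to_end(key)  # mark as recently used
        return value

    def set(self, key, value):
        self.data[key] = (self.now() + self.ttl, value)
        self.data.move_to_end(key)
        while len(self.data) > self.max_entries:
            self.data.popitem(last=False)  # evict least recently used
```

During an outage, call get(key, allow_stale=True) in the except branch so expired answers still beat no answer.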
3. Queue Non-Critical Requests
Use background job queue for async work:
from celery import Celery
app = Celery('tasks')
@app.task(bind=True, max_retries=5)
def generate_summary(self, text):
try:
return client.chat.completions.create(
model="gpt-4",
messages=[{
"role": "user",
"content": f"Summarize: {text}"
}]
)
except Exception as e:
# Retry in 5 minutes
raise self.retry(exc=e, countdown=300)
During outages, jobs pile up and process when API recovers.
4. Graceful Degradation
Show users you're aware:
// Frontend notification
if (openaiStatus === 'degraded') {
showBanner(
'⚠️ Our AI provider is experiencing high demand. ' +
'Responses may be slower than usual.'
);
}
// Disable non-critical AI features
if (openaiStatus === 'down') {
disableFeature('ai-autocomplete');
disableFeature('smart-suggestions');
// Keep core features working without AI
}
5. Pre-generate Common Responses
For FAQ-style applications:
# Pre-generate during normal operation
common_questions = {
"What are your hours?": "We're open 9-5 Monday-Friday...",
"How do I reset my password?": "Click 'Forgot Password'...",
# etc.
}
def get_answer(question):
# Check cache first
if question in common_questions:
return common_questions[question]
# Fall back to AI
try:
return ask_gpt(question)
    except Exception:
return "Our AI is temporarily unavailable. Email support@..."
OpenAI Outage History & Patterns
Notable Incidents
November 8, 2025 - 3.5-hour ChatGPT outage
- Cause: Database failover during infrastructure migration
- Impact: ChatGPT website completely down, API unaffected
- Recovery: Rolled back migration, restored from backup
- Lesson: Web and API have separate infrastructure
September 2025 - GPT-4 degradation (12 hours)
- Cause: Model serving infrastructure couldn't handle load after GPT-4 price drop
- Impact: GPT-4 requests took 30-90 seconds or timed out, GPT-3.5 fine
- Recovery: Scaled up GPU capacity
- Lesson: Price changes drive usage spikes
July 2025 - API authentication failure (45 minutes)
- Cause: Certificate renewal bug broke API key validation
- Impact: All API requests returned 401 errors
- Recovery: Emergency certificate rollback
- Lesson: Have test API calls in your monitoring
March 2025 - Regional outage (US-East, 2 hours)
- Cause: AWS us-east-1 networking issue cascaded to OpenAI
- Impact: ~40% of users affected (those routed to US-East)
- Recovery: Traffic rerouted to US-West
- Lesson: Multi-region architecture limits blast radius
Patterns Observed
Time of day:
- Most issues: US work hours (9 AM - 6 PM PT)
- Least issues: US late night / Asia business hours
Day of week:
- Mondays: More deployment-related issues
- Fridays: Fewer issues (change freeze)
Launch correlation:
- New model releases = 48-72 hour instability window
- Price changes = immediate traffic spike and potential overload
ChatGPT vs API:
- ChatGPT outages 2x more frequent than API outages
- API outages tend to be shorter (30-90 min vs 2-4 hours)
Uptime:
- ChatGPT: ~99.0% (87 hours down/year)
- API: ~99.5% (43 hours down/year)
- Individual models vary: GPT-3.5 most reliable, GPT-4 slightly less
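The downtime figures follow directly from the uptime percentages (8,760 hours in a year):

```python
def downtime_hours_per_year(uptime_pct: float) -> float:
    """Hours of downtime implied by an annual uptime percentage."""
    return (1 - uptime_pct / 100) * 8760

print(round(downtime_hours_per_year(99.5), 1))  # 43.8 -> "~43 hours"
print(round(downtime_hours_per_year(99.0), 1))  # 87.6 -> "~87 hours"
```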
OpenAI vs Alternatives
When OpenAI is Down, Where to Fallback?
Quick comparison:
| Provider | Model | Cost (1M tokens) | Speed | Reliability |
|---|---|---|---|---|
| OpenAI | GPT-4o | $2.50 in, $10 out | ⭐⭐⭐⭐ | 99.5% |
| Anthropic | Claude 3.5 | $3.00 in, $15 out | ⭐⭐⭐⭐ | 99.5% |
| Google | Gemini 1.5 Pro | $1.25 in, $5 out | ⭐⭐⭐ | 99.3% |
| DeepSeek | V2 | $0.14 in, $0.28 out | ⭐⭐⭐ | 97% |
| Groq | Llama 3.1 | Free tier | ⭐⭐⭐⭐⭐ | 97% |
Best fallback:
- Quality match: Anthropic Claude 3.5 Sonnet (equal or better quality)
- Cost match: DeepSeek V2 (20x cheaper, 90% quality)
- Speed priority: Groq (2x faster, but rate limited free tier)
Migration Example
OpenAI → Anthropic is easiest:
# OpenAI code
from openai import OpenAI
client = OpenAI(api_key="sk-...")
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Hello"}]
)
# Switch to Anthropic
from anthropic import Anthropic
client = Anthropic(api_key="sk-ant-...")
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024, # Required in Anthropic
messages=[{"role": "user", "content": "Hello"}]
)
# Main difference: max_tokens required
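The response shapes differ too: OpenAI returns text at response.choices[0].message.content, Anthropic at response.content[0].text. A small adapter (a sketch against those attribute paths) keeps calling code provider-agnostic:

```python
def extract_text(response) -> str:
    """Pull the reply text out of either an OpenAI or Anthropic response."""
    if hasattr(response, "choices"):  # OpenAI chat completion shape
        return response.choices[0].message.content
    if hasattr(response, "content"):  # Anthropic messages shape
        return response.content[0].text
    raise TypeError("Unrecognized response object")
```

With this in place, the multi-provider fallback shown earlier can return extract_text(response) and the rest of your application never needs to know which provider answered.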
Get Alerts Before Your Users Notice
Our Alert Pro plan ($9/month) tracks:
- OpenAI API (all models) + ChatGPT web
- Anthropic Claude (your fallback)
- DeepSeek, Groq (alternatives)
- 60-second checks from multiple regions
- Instant alerts via email, Slack, Discord, webhook
Know about OpenAI degradation before error logs fill up.
FAQ
Q: Is OpenAI down right now?
A: Check apistatuscheck.com/openai for real-time API and ChatGPT status.
Q: Why does ChatGPT work but my API fails?
A: They're separate systems. ChatGPT web app runs on different infrastructure than the developer API. One can be down while the other works.
Q: How long do OpenAI outages last?
A: Minor issues: 15-45 minutes. Major outages: 1-4 hours. Longest in 2025: 3.5 hours (ChatGPT database failover).
Q: Does OpenAI have an SLA?
A: Not for standard API usage. Enterprise customers can negotiate SLAs. No SLA for ChatGPT Plus/Team.
Q: Can I get a refund for downtime?
A: OpenAI charges per token used, not uptime. Failed requests don't consume tokens, so no refund needed. Enterprise SLAs may include credits.
Q: Should I retry 500 errors immediately?
A: No. Use exponential backoff (1s, 2s, 4s, 8s delays). Immediate retries during outages make the problem worse and waste your money.
Q: What's the difference between 500 and 503 errors?
A: 500 = internal server error (bug/crash). 503 = service unavailable (overloaded). Both mean OpenAI has issues, but 503 is more likely to be temporary capacity constraints.
Q: Why does GPT-4 fail but GPT-3.5 works?
A: Different infrastructure. GPT-4 requires more GPU resources and can be overloaded when GPT-3.5 is fine. Always test multiple models during incidents.
Q: Can I check OpenAI status via API?
A: No official status API. You can parse status.openai.com or use our API: apistatuscheck.com/api/v1/status/openai
Q: Does OpenAI prioritize Enterprise customers during outages?
A: Officially no. Recovery affects all tiers simultaneously. Enterprise gets dedicated support and better communication, not faster recovery.
Q: What's OpenAI's actual uptime?
A: Based on our 2025 monitoring: API 99.5%, ChatGPT 99.0%. That's ~43 hours of API downtime and ~87 hours of ChatGPT downtime per year.
Last updated: February 6, 2026
API Status Check
Stop checking API status pages manually
Get instant email alerts when OpenAI, Stripe, AWS, and 100+ APIs go down. Know before your users do.
Free dashboard available · 14-day trial on paid plans · Cancel anytime
Browse Free Dashboard →