# Building a Multi-Provider AI Fallback System (OpenAI, Anthropic, Google)
When OpenAI went down on December 11, 2025, thousands of AI applications stopped working. Chatbots froze. Content generators failed. Customer support systems crashed. If your entire business depends on a single AI provider, you're one outage away from disaster.
But it doesn't have to be that way.
In this guide, you'll learn how to build a production-ready multi-provider AI system that automatically fails over between OpenAI, Anthropic, and Google when one goes down—with complete code examples.
## Why You Need Multi-Provider AI

### The Outage Reality
AI APIs go down more often than you think:
- OpenAI (December 2025): 4-hour outage affecting ChatGPT and API
- Anthropic (November 2025): Degraded performance for 6+ hours
- Google Gemini (October 2025): Complete API outage for 2 hours
- OpenAI (March 2024): 3+ hour outage during business hours
- Anthropic (June 2024): Rate limit issues affecting production apps
If you're only using one provider, your availability is capped at their uptime. With three providers and automatic failover, you can achieve 99.99%+ uptime even when individual providers fail.
### The Business Case
- **Cost savings**: Route to cheaper providers first, and keep premium models as fallback
- **Reliability**: Three independent providers at 99.9% uptime each gives roughly 99.9999999% combined uptime in theory (less in practice, since outages can correlate)
- **Performance**: Route to the fastest provider based on real-time latency
- **Compliance**: Some regions require data to stay local; multi-provider enables geo-routing
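The arithmetic behind that reliability claim is easy to check. This sketch assumes provider failures are statistically independent, which real outages only approximate:

```typescript
// Combined availability of independent fallbacks: the system is down only
// when every provider is down, so downtime probabilities multiply.
function combinedUptime(uptimes: number[]): number {
  const downtime = uptimes.reduce((product, u) => product * (1 - u), 1);
  return 1 - downtime;
}

// Three providers at 99.9% each:
console.log(combinedUptime([0.999, 0.999, 0.999])); // ≈ 0.999999999 (nine nines)
```

In practice the gain is smaller, because providers share infrastructure (cloud regions, DNS, upstream networks) and outages cluster, but even correlated failover puts you well above any single provider's uptime.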
## Architecture: Provider Abstraction Layer
The key to multi-provider AI is a unified interface that abstracts away provider differences:
```
      Your Application
             ↓
AI Client (Unified Interface)
             ↓
       Provider Router
      ↙      ↓       ↘
 OpenAI  Anthropic  Google
```
### Core Components
- Unified Interface: Same method signatures regardless of provider
- Provider Adapters: Translate between your interface and each provider's API
- Router: Decides which provider to use and handles fallback
- Health Monitor: Tracks which providers are healthy
- Prompt Translator: Adapts prompts for provider-specific quirks
## Building the System: Complete Code
Let's build a production-ready multi-provider AI client in TypeScript.
### 1. Define the Unified Interface
```typescript
// types.ts
export interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface CompletionRequest {
  messages: Message[];
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

export interface CompletionResponse {
  content: string;
  provider: string;
  model: string;
  usage: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

export interface AIProvider {
  name: string;
  complete(request: CompletionRequest): Promise<CompletionResponse>;
  isHealthy(): Promise<boolean>;
}
```
### 2. Implement Provider Adapters

#### OpenAI Adapter
```typescript
// providers/openai.ts
import OpenAI from 'openai';
import { AIProvider, CompletionRequest, CompletionResponse } from '../types';

export class OpenAIProvider implements AIProvider {
  name = 'openai';
  private client: OpenAI;
  private model: string;

  constructor(apiKey: string, model = 'gpt-4-turbo') {
    this.client = new OpenAI({ apiKey });
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    const response = await this.client.chat.completions.create({
      model: this.model,
      messages: request.messages,
      max_tokens: request.maxTokens,
      temperature: request.temperature,
      stream: false,
    });

    return {
      content: response.choices[0].message.content || '',
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usage?.prompt_tokens || 0,
        completionTokens: response.usage?.completion_tokens || 0,
        totalTokens: response.usage?.total_tokens || 0,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      // Lightweight health check
      await this.client.models.retrieve('gpt-3.5-turbo');
      return true;
    } catch (error) {
      console.error('OpenAI health check failed:', error);
      return false;
    }
  }
}
```
#### Anthropic Adapter
```typescript
// providers/anthropic.ts
import Anthropic from '@anthropic-ai/sdk';
import { AIProvider, CompletionRequest, CompletionResponse, Message } from '../types';

export class AnthropicProvider implements AIProvider {
  name = 'anthropic';
  private client: Anthropic;
  private model: string;

  constructor(apiKey: string, model = 'claude-3-5-sonnet-20241022') {
    this.client = new Anthropic({ apiKey });
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    // Anthropic requires system messages separately
    const systemMessage = request.messages.find(m => m.role === 'system');
    const messages = request.messages
      .filter(m => m.role !== 'system')
      .map(m => ({
        role: m.role as 'user' | 'assistant',
        content: m.content,
      }));

    const response = await this.client.messages.create({
      model: this.model,
      max_tokens: request.maxTokens || 4096,
      temperature: request.temperature,
      system: systemMessage?.content,
      messages,
    });

    return {
      content: response.content[0].type === 'text'
        ? response.content[0].text
        : '',
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usage.input_tokens,
        completionTokens: response.usage.output_tokens,
        totalTokens: response.usage.input_tokens + response.usage.output_tokens,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      // Minimal completion as health check
      await this.client.messages.create({
        model: this.model,
        max_tokens: 1,
        messages: [{ role: 'user', content: 'Hi' }],
      });
      return true;
    } catch (error) {
      console.error('Anthropic health check failed:', error);
      return false;
    }
  }
}
```
#### Google Gemini Adapter
```typescript
// providers/google.ts
import { GoogleGenerativeAI } from '@google/generative-ai';
import { AIProvider, CompletionRequest, CompletionResponse } from '../types';

export class GoogleProvider implements AIProvider {
  name = 'google';
  private client: GoogleGenerativeAI;
  private model: string;

  constructor(apiKey: string, model = 'gemini-1.5-pro') {
    this.client = new GoogleGenerativeAI(apiKey);
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    // Convert messages to Gemini format; the system instruction is passed
    // at model creation time rather than per chat turn
    const systemMessage = request.messages.find(m => m.role === 'system');
    const model = this.client.getGenerativeModel({
      model: this.model,
      systemInstruction: systemMessage?.content,
    });

    const history = request.messages
      .filter(m => m.role !== 'system')
      .map(m => ({
        role: m.role === 'assistant' ? 'model' : 'user',
        parts: [{ text: m.content }],
      }));

    const chat = model.startChat({
      history: history.slice(0, -1),
      generationConfig: {
        maxOutputTokens: request.maxTokens,
        temperature: request.temperature,
      },
    });

    const lastMessage = history[history.length - 1];
    const result = await chat.sendMessage(lastMessage.parts[0].text);
    const response = result.response;

    return {
      content: response.text(),
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usageMetadata?.promptTokenCount || 0,
        completionTokens: response.usageMetadata?.candidatesTokenCount || 0,
        totalTokens: response.usageMetadata?.totalTokenCount || 0,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      const model = this.client.getGenerativeModel({ model: this.model });
      await model.generateContent('Test');
      return true;
    } catch (error) {
      console.error('Google health check failed:', error);
      return false;
    }
  }
}
```
### 3. Build the Router with Automatic Fallback
```typescript
// router.ts
import { AIProvider, CompletionRequest, CompletionResponse } from './types';

interface RouterConfig {
  providers: AIProvider[];
  maxRetries?: number;
  healthCheckInterval?: number;
}

export class AIRouter {
  private providers: AIProvider[];
  private maxRetries: number;
  private healthStatus: Map<string, boolean> = new Map();

  constructor(config: RouterConfig) {
    this.providers = config.providers;
    this.maxRetries = config.maxRetries || 3;
    // Initialize health checks
    this.startHealthMonitoring(config.healthCheckInterval || 60000);
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    let lastError: Error | null = null;

    // Try each provider in order until one succeeds
    for (const provider of this.providers) {
      // Skip unhealthy providers
      if (this.healthStatus.get(provider.name) === false) {
        console.log(`Skipping unhealthy provider: ${provider.name}`);
        continue;
      }

      // Retry logic for transient failures
      for (let attempt = 0; attempt < this.maxRetries; attempt++) {
        try {
          console.log(`Attempting ${provider.name} (attempt ${attempt + 1}/${this.maxRetries})`);
          const response = await this.withTimeout(
            provider.complete(request),
            30000 // 30s timeout
          );
          console.log(`✓ Success with ${provider.name}`);
          return response;
        } catch (error: any) {
          lastError = error;
          console.error(`✗ ${provider.name} failed (attempt ${attempt + 1}):`, error.message);

          // Don't retry on client errors (4xx), except 429 rate limits,
          // which are transient and worth retrying
          if (
            error.status &&
            error.status >= 400 &&
            error.status < 500 &&
            error.status !== 429
          ) {
            console.log(`Client error (${error.status}), skipping retries`);
            break;
          }

          // Exponential backoff for retries
          if (attempt < this.maxRetries - 1) {
            const backoff = Math.pow(2, attempt) * 1000;
            console.log(`Waiting ${backoff}ms before retry...`);
            await new Promise(resolve => setTimeout(resolve, backoff));
          }
        }
      }

      // Mark provider as unhealthy after all retries fail
      this.healthStatus.set(provider.name, false);
    }

    throw new Error(
      `All providers failed. Last error: ${lastError?.message}`
    );
  }

  private async withTimeout<T>(
    promise: Promise<T>,
    timeoutMs: number
  ): Promise<T> {
    const timeout = new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error('Request timeout')), timeoutMs)
    );
    return Promise.race([promise, timeout]);
  }

  private startHealthMonitoring(intervalMs: number) {
    // Initial health check
    this.checkAllHealth();
    // Periodic health checks
    setInterval(() => this.checkAllHealth(), intervalMs);
  }

  private async checkAllHealth() {
    for (const provider of this.providers) {
      try {
        const healthy = await provider.isHealthy();
        this.healthStatus.set(provider.name, healthy);
        console.log(`Health check ${provider.name}: ${healthy ? '✓' : '✗'}`);
      } catch (error) {
        this.healthStatus.set(provider.name, false);
        console.error(`Health check failed for ${provider.name}:`, error);
      }
    }
  }

  getHealthStatus(): Record<string, boolean> {
    return Object.fromEntries(this.healthStatus);
  }
}
```
### 4. Putting It All Together
```typescript
// index.ts
import { AIRouter } from './router';
import { OpenAIProvider } from './providers/openai';
import { AnthropicProvider } from './providers/anthropic';
import { GoogleProvider } from './providers/google';

// Initialize providers
const openai = new OpenAIProvider(process.env.OPENAI_API_KEY!);
const anthropic = new AnthropicProvider(process.env.ANTHROPIC_API_KEY!);
const google = new GoogleProvider(process.env.GOOGLE_API_KEY!);

// Create router with fallback order: Google → Anthropic → OpenAI
// (Google is cheapest, OpenAI is most expensive)
const ai = new AIRouter({
  providers: [google, anthropic, openai],
  maxRetries: 2,
  healthCheckInterval: 60000, // Check health every minute
});

// Use it!
async function main() {
  try {
    const response = await ai.complete({
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Explain quantum computing in simple terms.' },
      ],
      maxTokens: 500,
      temperature: 0.7,
    });

    console.log(`Response from ${response.provider}:`);
    console.log(response.content);
    console.log(`\nTokens used: ${response.usage.totalTokens}`);
  } catch (error) {
    console.error('All AI providers failed:', error);
  }
}

main();
```
## Handling Provider Differences

### Prompt Compatibility Layer
Different providers have different strengths and behaviors. Here's how to handle that:
```typescript
// prompt-translator.ts
import { Message } from './types';

export class PromptTranslator {
  static adaptForProvider(
    messages: Message[],
    provider: string
  ): Message[] {
    switch (provider) {
      case 'anthropic':
        // Claude prefers detailed, structured prompts
        return this.makeMoreVerbose(messages);
      case 'google':
        // Gemini handles casual language better
        return this.makeMoreCasual(messages);
      case 'openai':
      default:
        return messages;
    }
  }

  private static makeMoreVerbose(messages: Message[]): Message[] {
    // Add more structure for Claude
    return messages.map(msg => {
      if (msg.role === 'user') {
        return {
          ...msg,
          content: msg.content + '\n\nPlease provide a detailed response.',
        };
      }
      return msg;
    });
  }

  private static makeMoreCasual(messages: Message[]): Message[] {
    // Simplify for Gemini
    return messages.map(msg => ({
      ...msg,
      content: msg.content.replace(/\n\n/g, ' '),
    }));
  }
}
```
### Capability Detection
Not all providers support the same features:
```typescript
interface ProviderCapabilities {
  vision: boolean;
  toolUse: boolean;
  streaming: boolean;
  maxContextTokens: number;
}

const capabilities: Record<string, ProviderCapabilities> = {
  openai: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 128000, // GPT-4 Turbo
  },
  anthropic: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 200000, // Claude 3.5 Sonnet
  },
  google: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 1000000, // Gemini 1.5 Pro
  },
};

// Use capability detection inside the router's provider loop:
if (requiresVision && !capabilities[provider.name].vision) {
  console.log(`Skipping ${provider.name}: vision not supported`);
  continue;
}
```
## Cost Optimization
Route to the cheapest provider first, premium as fallback:
```typescript
interface ProviderPricing {
  inputPer1M: number;  // USD per 1M input tokens
  outputPer1M: number; // USD per 1M output tokens
}

const pricing: Record<string, ProviderPricing> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.00 },     // Gemini 1.5 Pro
  anthropic: { inputPer1M: 3.00, outputPer1M: 15.00 }, // Claude 3.5 Sonnet
  openai: { inputPer1M: 10.00, outputPer1M: 30.00 },   // GPT-4 Turbo
};

// Sort providers by cost
const providers = [google, anthropic, openai].sort((a, b) => {
  return pricing[a.name].inputPer1M - pricing[b.name].inputPer1M;
});

const ai = new AIRouter({ providers });
```
Result: You use Google (cheapest) by default, fail over to Anthropic if Google is down, and only use OpenAI (most expensive) as a last resort.
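The e-commerce example later in this article calls a `calculateCost` helper without defining it. A minimal sketch based on the pricing table above might look like this (the prices are illustrative and change often, so treat them as placeholders):

```typescript
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

// Illustrative per-1M-token prices in USD; always check current pricing pages.
const pricePer1M: Record<string, { input: number; output: number }> = {
  google: { input: 1.25, output: 5.0 },
  anthropic: { input: 3.0, output: 15.0 },
  openai: { input: 10.0, output: 30.0 },
};

// Estimate the dollar cost of one completion from its token usage.
function calculateCost(provider: string, usage: TokenUsage): number {
  const p = pricePer1M[provider];
  return (
    (usage.promptTokens / 1_000_000) * p.input +
    (usage.completionTokens / 1_000_000) * p.output
  );
}

// 1,000 input + 500 output tokens on Gemini:
console.log(calculateCost('google', { promptTokens: 1000, completionTokens: 500, totalTokens: 1500 }));
// 0.00125 + 0.0025 = 0.00375 USD
```

Logging this per request lets you verify the cost-routing claim against real traffic rather than estimates.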
## Monitoring with API Status Check
Instead of waiting for health checks to detect outages, proactively monitor provider status:
```typescript
// status-monitor.ts
async function checkProviderStatus(provider: string): Promise<boolean> {
  try {
    const response = await fetch(
      `https://apistatuscheck.com/api/status/${provider}`
    );
    const data = await response.json();
    return data.status === 'operational';
  } catch {
    return true; // Assume operational if check fails
  }
}

// In your router's complete() method:
async complete(request: CompletionRequest): Promise<CompletionResponse> {
  for (const provider of this.providers) {
    // Check API Status Check first
    const isUp = await checkProviderStatus(provider.name);
    if (!isUp) {
      console.log(`Skipping ${provider.name}: reported down on API Status Check`);
      continue;
    }
    // Proceed with request...
  }
}
```
Why this matters: API Status Check aggregates reports from thousands of developers. You'll know about outages seconds after they start, before your own health checks fail.
## Provider Comparison Table
| Provider | Best For | Strengths | Weaknesses | Cost (1M tokens) |
|---|---|---|---|---|
| OpenAI GPT-4 | Complex reasoning, coding | Highest quality, best tool use | Most expensive, frequent outages | $10 input / $30 output |
| Anthropic Claude | Long context, analysis | 200K context, reliable | Slower responses | $3 input / $15 output |
| Google Gemini | Cost efficiency, speed | 1M context, cheapest | Less capable reasoning | $1.25 input / $5 output |
### When to Use Each
Default to Google Gemini when:
- Cost is a concern
- You need massive context windows
- Task is straightforward (summarization, classification)
Upgrade to Anthropic when:
- You need high-quality analysis
- Working with long documents
- Gemini fails or is down
Fallback to OpenAI when:
- Task requires complex reasoning
- Using function calling/tools heavily
- Both Gemini and Claude are down
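One way to encode these rules is a simple lookup from task type to preferred fallback order. The provider names match the adapters above; the task categories are our own illustration, not a standard taxonomy:

```typescript
type TaskKind = 'summarize' | 'analyze' | 'reason';

// Preferred fallback order per task, following the guidance above.
const orderByTask: Record<TaskKind, string[]> = {
  summarize: ['google', 'anthropic', 'openai'], // straightforward tasks: cheapest first
  analyze:   ['anthropic', 'google', 'openai'], // long documents: Claude first
  reason:    ['openai', 'anthropic', 'google'], // complex reasoning: GPT-4 first
};

function providerOrder(task: TaskKind): string[] {
  return orderByTask[task];
}

console.log(providerOrder('reason')); // [ 'openai', 'anthropic', 'google' ]
```

You could feed this ordering into the router's constructor per request class, so each workload gets its own fallback chain.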
## Production Checklist
- Implement exponential backoff with jitter
- Add circuit breakers to fail fast on persistent outages
- Log all provider switches for debugging
- Track cost per provider in your analytics
- Set up alerts when all providers fail
- Monitor API Status Check for early outage detection
- Cache responses where possible to reduce API calls
- Implement rate limiting per provider
- Store provider preferences per user/session
- Test failover regularly (chaos engineering)
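Two items on that list, circuit breakers and jittered backoff, can be sketched as follows. This is a minimal illustration, not the `AIRouter` from earlier; thresholds and timings are placeholder values:

```typescript
// Minimal circuit breaker: after `threshold` consecutive failures the breaker
// opens and calls fail fast until `cooldownMs` has elapsed, then allows a
// trial request ("half-open").
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private threshold = 5, private cooldownMs = 30000) {}

  canRequest(now = Date.now()): boolean {
    if (this.failures < this.threshold) return true;
    return now - this.openedAt >= this.cooldownMs; // half-open after cooldown
  }

  recordSuccess() {
    this.failures = 0; // close the breaker again
  }

  recordFailure(now = Date.now()) {
    this.failures++;
    if (this.failures === this.threshold) this.openedAt = now;
  }
}

// Exponential backoff with "full jitter": a random delay in [0, base * 2^attempt),
// which spreads out retries so clients don't hammer a recovering provider in sync.
function backoffWithJitter(attempt: number, baseMs = 1000): number {
  return Math.random() * baseMs * Math.pow(2, attempt);
}
```

A breaker per provider lets the router skip a failing provider immediately instead of burning a full retry cycle on every request.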
## Advanced: Streaming Support
For real-time responses, implement streaming with fallback:
```typescript
// Note: this assumes the AIProvider interface is extended with a matching
// completeStream() method on each adapter.
async *completeStream(
  request: CompletionRequest
): AsyncGenerator<string, void, unknown> {
  for (const provider of this.providers) {
    try {
      // Try streaming with current provider
      const stream = await provider.completeStream(request);
      for await (const chunk of stream) {
        yield chunk;
      }
      return; // Success, stop trying other providers
    } catch (error) {
      console.error(`Streaming failed with ${provider.name}:`, error);
      // Try next provider
    }
  }
  throw new Error('Streaming failed with all providers');
}
```
## FAQ
**Q: Won't this increase my costs?** Not necessarily. By routing to cheaper providers first (Google → Anthropic → OpenAI), you actually save money. Most requests will use the cheapest provider.
**Q: How do I handle different response formats?** The adapter pattern handles this. Each provider adapter translates its native response format into your unified interface.
**Q: What about API key security?** Store keys in environment variables or a secrets manager (AWS Secrets Manager, HashiCorp Vault). Never commit them to git.
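A minimal pattern for the environment-variable route is to resolve every key at startup and fail fast if one is missing. The `requireEnv` helper here is our own illustration, not part of any SDK:

```typescript
// Fail at startup if a required secret is missing, rather than mid-request.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

// Example: set a key, then read it back.
process.env.EXAMPLE_API_KEY = 'sk-demo';
console.log(requireEnv('EXAMPLE_API_KEY')); // sk-demo
```

Failing at boot turns a misconfigured deployment into an obvious crash instead of a confusing runtime fallback.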
**Q: Can I mix streaming and non-streaming?** Yes, but handle it per-request. Some providers support streaming, others don't. Your router should detect this and route appropriately.
**Q: How do I test this without burning through API credits?** Use mock providers in testing:
```typescript
class MockProvider implements AIProvider {
  name = 'mock'; // required by the AIProvider interface

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    return {
      content: 'Mock response',
      provider: 'mock',
      model: 'mock-model',
      usage: { promptTokens: 10, completionTokens: 20, totalTokens: 30 },
    };
  }

  async isHealthy(): Promise<boolean> {
    return true;
  }
}
```
**Q: What if a provider returns an incomplete response?** Add validation in your adapters:
```typescript
if (!response.content || response.content.length < 10) {
  throw new Error('Incomplete response, triggering fallback');
}
```
**Q: Should I use this for production?** Yes. Multi-provider fallback is a well-established pattern in production systems. The key is thorough testing and monitoring of the failover paths themselves.
## Real-World Example: E-commerce Chatbot
Here's how a real e-commerce company uses multi-provider AI:
```typescript
const ai = new AIRouter({
  providers: [
    new GoogleProvider(process.env.GOOGLE_KEY!, 'gemini-1.5-flash'), // Fast, cheap
    new AnthropicProvider(process.env.ANTHROPIC_KEY!), // Reliable
    new OpenAIProvider(process.env.OPENAI_KEY!), // Last resort
  ],
});

// Customer support chatbot
app.post('/api/chat', async (req, res) => {
  const { message, history } = req.body;

  try {
    const response = await ai.complete({
      messages: [
        {
          role: 'system',
          content: 'You are a helpful e-commerce support assistant. Be concise and friendly.',
        },
        ...history,
        { role: 'user', content: message },
      ],
      maxTokens: 300,
      temperature: 0.7,
    });

    // Track which provider was used
    await analytics.track('ai_request', {
      provider: response.provider,
      tokens: response.usage.totalTokens,
      cost: calculateCost(response),
    });

    res.json({ reply: response.content });
  } catch (error) {
    // All providers failed - use fallback
    res.json({
      reply: "I'm having trouble right now. Please try again in a moment or contact support@example.com",
    });
  }
});
```
Result: 99.95% uptime, average cost reduced by 60% (by routing to Gemini first), and zero customer-facing outages in 6 months.
## Get Alerted Before Your Users Notice
The best failover system is one that triggers before you even need it. API Status Check monitors OpenAI, Anthropic, Google, and 200+ other APIs in real-time.
Get notified the second an outage begins. Set up intelligent alerts at apistatuscheck.com and never be caught off guard again.
Ready to build bulletproof AI applications? Implement this pattern today, and you'll sleep better knowing one API outage won't take down your entire business.