Building a Multi-Provider AI Fallback System (OpenAI, Anthropic, Google)

by API Status Check

When OpenAI went down on December 11, 2025, thousands of AI applications stopped working. Chatbots froze. Content generators failed. Customer support systems crashed. If your entire business depends on a single AI provider, you're one outage away from disaster.

But it doesn't have to be that way.

In this guide, you'll learn how to build a production-ready multi-provider AI system that automatically fails over between OpenAI, Anthropic, and Google when one goes down—with complete code examples.

Why You Need Multi-Provider AI

The Outage Reality

AI APIs go down more often than you think:

  • OpenAI (December 2025): 4-hour outage affecting ChatGPT and API
  • Anthropic (November 2025): Degraded performance for 6+ hours
  • Google Gemini (October 2025): Complete API outage for 2 hours
  • OpenAI (March 2024): 3+ hour outage during business hours
  • Anthropic (June 2024): Rate limit issues affecting production apps

If you're only using one provider, your availability is capped at their uptime. With three providers and automatic failover, you can achieve 99.99%+ uptime even when individual providers fail.

The Business Case

  • Cost savings: Route to cheaper providers first and keep premium models as the fallback
  • Reliability: Three independent providers at 99.9% uptime each give roughly 99.9999999% combined in theory; real-world failures are correlated, so treat that as an upper bound
  • Performance: Route to the fastest provider based on real-time latency
  • Compliance: Some regions require data to stay local; multiple providers enable geo-routing
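The reliability math is simple probability; here's a quick sketch (assuming provider failures are independent, which real outages only approximate):

```typescript
// Combined availability of independent providers: the system is down only
// when every provider is down at once, so multiply the downtime probabilities.
function combinedUptime(uptimes: number[]): number {
  const allDown = uptimes.reduce((p, u) => p * (1 - u), 1);
  return 1 - allDown;
}

// Three providers at 99.9% each: roughly nine nines in theory
console.log(combinedUptime([0.999, 0.999, 0.999])); // ≈ 0.999999999
```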

Architecture: Provider Abstraction Layer

The key to multi-provider AI is a unified interface that abstracts away provider differences:

Your Application
       ↓
AI Client (Unified Interface)
       ↓
Provider Router
    ↙  ↓  ↘
OpenAI  Anthropic  Google

Core Components

  1. Unified Interface: Same method signatures regardless of provider
  2. Provider Adapters: Translate between your interface and each provider's API
  3. Router: Decides which provider to use and handles fallback
  4. Health Monitor: Tracks which providers are healthy
  5. Prompt Translator: Adapts prompts for provider-specific quirks

Building the System: Complete Code

Let's build a production-ready multi-provider AI client in TypeScript.

1. Define the Unified Interface

// types.ts
export interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface CompletionRequest {
  messages: Message[];
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

export interface CompletionResponse {
  content: string;
  provider: string;
  model: string;
  usage: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

export interface AIProvider {
  name: string;
  complete(request: CompletionRequest): Promise<CompletionResponse>;
  isHealthy(): Promise<boolean>;
}

2. Implement Provider Adapters

OpenAI Adapter

// providers/openai.ts
import OpenAI from 'openai';
import { AIProvider, CompletionRequest, CompletionResponse } from '../types';

export class OpenAIProvider implements AIProvider {
  name = 'openai';
  private client: OpenAI;
  private model: string;

  constructor(apiKey: string, model = 'gpt-4-turbo') {
    this.client = new OpenAI({ apiKey });
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    const response = await this.client.chat.completions.create({
      model: this.model,
      messages: request.messages,
      max_tokens: request.maxTokens,
      temperature: request.temperature,
      stream: false,
    });

    return {
      content: response.choices[0].message.content || '',
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usage?.prompt_tokens || 0,
        completionTokens: response.usage?.completion_tokens || 0,
        totalTokens: response.usage?.total_tokens || 0,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      // Lightweight health check
      await this.client.models.retrieve('gpt-3.5-turbo');
      return true;
    } catch (error) {
      console.error('OpenAI health check failed:', error);
      return false;
    }
  }
}

Anthropic Adapter

// providers/anthropic.ts
import Anthropic from '@anthropic-ai/sdk';
import { AIProvider, CompletionRequest, CompletionResponse, Message } from '../types';

export class AnthropicProvider implements AIProvider {
  name = 'anthropic';
  private client: Anthropic;
  private model: string;

  constructor(apiKey: string, model = 'claude-3-5-sonnet-20241022') {
    this.client = new Anthropic({ apiKey });
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    // Anthropic requires system messages separately
    const systemMessage = request.messages.find(m => m.role === 'system');
    const messages = request.messages
      .filter(m => m.role !== 'system')
      .map(m => ({
        role: m.role as 'user' | 'assistant',
        content: m.content,
      }));

    const response = await this.client.messages.create({
      model: this.model,
      max_tokens: request.maxTokens || 4096,
      temperature: request.temperature,
      system: systemMessage?.content,
      messages,
    });

    return {
      content: response.content[0].type === 'text' 
        ? response.content[0].text 
        : '',
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usage.input_tokens,
        completionTokens: response.usage.output_tokens,
        totalTokens: response.usage.input_tokens + response.usage.output_tokens,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      // Minimal completion as health check
      await this.client.messages.create({
        model: this.model,
        max_tokens: 1,
        messages: [{ role: 'user', content: 'Hi' }],
      });
      return true;
    } catch (error) {
      console.error('Anthropic health check failed:', error);
      return false;
    }
  }
}

Google Gemini Adapter

// providers/google.ts
import { GoogleGenerativeAI } from '@google/generative-ai';
import { AIProvider, CompletionRequest, CompletionResponse } from '../types';

export class GoogleProvider implements AIProvider {
  name = 'google';
  private client: GoogleGenerativeAI;
  private model: string;

  constructor(apiKey: string, model = 'gemini-1.5-pro') {
    this.client = new GoogleGenerativeAI(apiKey);
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    // systemInstruction belongs on the model, not on startChat
    // (supported in recent versions of @google/generative-ai)
    const systemMessage = request.messages.find(m => m.role === 'system');
    const model = this.client.getGenerativeModel({
      model: this.model,
      systemInstruction: systemMessage?.content,
    });

    // Convert messages to Gemini format ('assistant' becomes 'model')
    const history = request.messages
      .filter(m => m.role !== 'system')
      .map(m => ({
        role: m.role === 'assistant' ? 'model' : 'user',
        parts: [{ text: m.content }],
      }));

    if (history.length === 0) {
      throw new Error('Gemini requires at least one user or assistant message');
    }

    const chat = model.startChat({
      history: history.slice(0, -1),
      generationConfig: {
        maxOutputTokens: request.maxTokens,
        temperature: request.temperature,
      },
    });

    const lastMessage = history[history.length - 1];
    const result = await chat.sendMessage(lastMessage.parts[0].text);
    const response = result.response;

    return {
      content: response.text(),
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usageMetadata?.promptTokenCount || 0,
        completionTokens: response.usageMetadata?.candidatesTokenCount || 0,
        totalTokens: response.usageMetadata?.totalTokenCount || 0,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      const model = this.client.getGenerativeModel({ model: this.model });
      await model.generateContent('Test');
      return true;
    } catch (error) {
      console.error('Google health check failed:', error);
      return false;
    }
  }
}

3. Build the Router with Automatic Fallback

// router.ts
import { AIProvider, CompletionRequest, CompletionResponse } from './types';

interface RouterConfig {
  providers: AIProvider[];
  maxRetries?: number;
  healthCheckInterval?: number;
}

export class AIRouter {
  private providers: AIProvider[];
  private maxRetries: number;
  private healthStatus: Map<string, boolean> = new Map();

  constructor(config: RouterConfig) {
    this.providers = config.providers;
    this.maxRetries = config.maxRetries || 3;

    // Initialize health checks
    this.startHealthMonitoring(config.healthCheckInterval || 60000);
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    let lastError: Error | null = null;

    // Try each provider in order until one succeeds
    for (const provider of this.providers) {
      // Skip unhealthy providers
      if (this.healthStatus.get(provider.name) === false) {
        console.log(`Skipping unhealthy provider: ${provider.name}`);
        continue;
      }

      // Retry logic for transient failures
      for (let attempt = 0; attempt < this.maxRetries; attempt++) {
        try {
          console.log(`Attempting ${provider.name} (attempt ${attempt + 1}/${this.maxRetries})`);
          
          const response = await this.withTimeout(
            provider.complete(request),
            30000 // 30s timeout
          );

          console.log(`✓ Success with ${provider.name}`);
          return response;

        } catch (error: any) {
          lastError = error;
          console.error(`✗ ${provider.name} failed (attempt ${attempt + 1}):`, error.message);

          // Don't retry on client errors (4xx)
          if (error.status && error.status >= 400 && error.status < 500) {
            console.log(`Client error (${error.status}), skipping retries`);
            break;
          }

          // Exponential backoff for retries
          if (attempt < this.maxRetries - 1) {
            const backoff = Math.pow(2, attempt) * 1000;
            console.log(`Waiting ${backoff}ms before retry...`);
            await new Promise(resolve => setTimeout(resolve, backoff));
          }
        }
      }

      // Mark provider as unhealthy after all retries fail -- but not on client
      // errors (4xx), which indicate a bad request rather than a provider outage
      const status = (lastError as any)?.status;
      if (!(status && status >= 400 && status < 500)) {
        this.healthStatus.set(provider.name, false);
      }
    }

    throw new Error(
      `All providers failed. Last error: ${lastError?.message}`
    );
  }

  private async withTimeout<T>(
    promise: Promise<T>,
    timeoutMs: number
  ): Promise<T> {
    const timeout = new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error('Request timeout')), timeoutMs)
    );
    return Promise.race([promise, timeout]);
  }

  private startHealthMonitoring(intervalMs: number) {
    // Initial health check
    this.checkAllHealth();

    // Periodic health checks
    setInterval(() => this.checkAllHealth(), intervalMs);
  }

  private async checkAllHealth() {
    for (const provider of this.providers) {
      try {
        const healthy = await provider.isHealthy();
        this.healthStatus.set(provider.name, healthy);
        console.log(`Health check ${provider.name}: ${healthy ? '✓' : '✗'}`);
      } catch (error) {
        this.healthStatus.set(provider.name, false);
        console.error(`Health check failed for ${provider.name}:`, error);
      }
    }
  }

  getHealthStatus(): Record<string, boolean> {
    return Object.fromEntries(this.healthStatus);
  }
}

4. Putting It All Together

// index.ts
import { AIRouter } from './router';
import { OpenAIProvider } from './providers/openai';
import { AnthropicProvider } from './providers/anthropic';
import { GoogleProvider } from './providers/google';

// Initialize providers
const openai = new OpenAIProvider(process.env.OPENAI_API_KEY!);
const anthropic = new AnthropicProvider(process.env.ANTHROPIC_API_KEY!);
const google = new GoogleProvider(process.env.GOOGLE_API_KEY!);

// Create router with fallback order: Google → Anthropic → OpenAI
// (Google is cheapest, OpenAI is most expensive)
const ai = new AIRouter({
  providers: [google, anthropic, openai],
  maxRetries: 2,
  healthCheckInterval: 60000, // Check health every minute
});

// Use it!
async function main() {
  try {
    const response = await ai.complete({
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Explain quantum computing in simple terms.' },
      ],
      maxTokens: 500,
      temperature: 0.7,
    });

    console.log(`Response from ${response.provider}:`);
    console.log(response.content);
    console.log(`\nTokens used: ${response.usage.totalTokens}`);
  } catch (error) {
    console.error('All AI providers failed:', error);
  }
}

main();

Handling Provider Differences

Prompt Compatibility Layer

Different providers have different strengths and behaviors. Here's how to handle that:

// prompt-translator.ts
import { Message } from './types';

export class PromptTranslator {
  static adaptForProvider(
    messages: Message[],
    provider: string
  ): Message[] {
    switch (provider) {
      case 'anthropic':
        // Claude prefers detailed, structured prompts
        return this.makeMoreVerbose(messages);
      
      case 'google':
        // Gemini handles casual language better
        return this.makeMoreCasual(messages);
      
      case 'openai':
      default:
        return messages;
    }
  }

  private static makeMoreVerbose(messages: Message[]): Message[] {
    // Add more structure for Claude
    return messages.map(msg => {
      if (msg.role === 'user') {
        return {
          ...msg,
          content: msg.content + '\n\nPlease provide a detailed response.',
        };
      }
      return msg;
    });
  }

  private static makeMoreCasual(messages: Message[]): Message[] {
    // Simplify for Gemini
    return messages.map(msg => ({
      ...msg,
      content: msg.content.replace(/\n\n/g, ' '),
    }));
  }
}

Capability Detection

Not all providers support the same features:

interface ProviderCapabilities {
  vision: boolean;
  toolUse: boolean;
  streaming: boolean;
  maxContextTokens: number;
}

const capabilities: Record<string, ProviderCapabilities> = {
  openai: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 128000, // GPT-4 Turbo
  },
  anthropic: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 200000, // Claude 3.5 Sonnet
  },
  google: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 1000000, // Gemini 1.5 Pro
  },
};

// Use capability detection in router
if (requiresVision && !capabilities[provider.name].vision) {
  console.log(`Skipping ${provider.name}: vision not supported`);
  continue;
}

Cost Optimization

Route to the cheapest provider first, premium as fallback:

interface ProviderPricing {
  inputPer1M: number;  // USD per 1M input tokens
  outputPer1M: number; // USD per 1M output tokens
}

const pricing: Record<string, ProviderPricing> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.00 },      // Gemini 1.5 Pro
  anthropic: { inputPer1M: 3.00, outputPer1M: 15.00 },  // Claude 3.5 Sonnet
  openai: { inputPer1M: 10.00, outputPer1M: 30.00 },    // GPT-4 Turbo
};

// Sort providers by cost
const providers = [google, anthropic, openai].sort((a, b) => {
  return pricing[a.name].inputPer1M - pricing[b.name].inputPer1M;
});

const ai = new AIRouter({ providers });

Result: You use Google (cheapest) by default, fail over to Anthropic if Google is down, and only use OpenAI (most expensive) as a last resort.

Monitoring with API Status Check

Instead of waiting for health checks to detect outages, proactively monitor provider status:

// status-monitor.ts
async function checkProviderStatus(provider: string): Promise<boolean> {
  try {
    const response = await fetch(
      `https://apistatuscheck.com/api/status/${provider}`
    );
    const data = await response.json();
    return data.status === 'operational';
  } catch {
    return true; // Assume operational if check fails
  }
}

// In your router
async complete(request: CompletionRequest): Promise<CompletionResponse> {
  for (const provider of this.providers) {
    // Check API Status Check first
    const isUp = await checkProviderStatus(provider.name);
    if (!isUp) {
      console.log(`Skipping ${provider.name}: reported down on API Status Check`);
      continue;
    }

    // Proceed with request...
  }
}

Why this matters: API Status Check aggregates reports from thousands of developers. You'll know about outages seconds after they start, before your own health checks fail.

Provider Comparison Table

Provider         | Best For                  | Strengths                      | Weaknesses                       | Cost (1M tokens)
OpenAI GPT-4     | Complex reasoning, coding | Highest quality, best tool use | Most expensive, frequent outages | $10 input / $30 output
Anthropic Claude | Long context, analysis    | 200K context, reliable         | Slower responses                 | $3 input / $15 output
Google Gemini    | Cost efficiency, speed    | 1M context, cheapest           | Less capable reasoning           | $1.25 input / $5 output

When to Use Each

Default to Google Gemini when:

  • Cost is a concern
  • You need massive context windows
  • Task is straightforward (summarization, classification)

Upgrade to Anthropic when:

  • You need high-quality analysis
  • Working with long documents
  • Gemini fails or is down

Fallback to OpenAI when:

  • Task requires complex reasoning
  • Using function calling/tools heavily
  • Both Gemini and Claude are down

Production Checklist

  • Implement exponential backoff with jitter
  • Add circuit breakers to fail fast on persistent outages
  • Log all provider switches for debugging
  • Track cost per provider in your analytics
  • Set up alerts when all providers fail
  • Monitor API Status Check for early outage detection
  • Cache responses where possible to reduce API calls
  • Implement rate limiting per provider
  • Store provider preferences per user/session
  • Test failover regularly (chaos engineering)
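On the first checklist item: here's a minimal full-jitter backoff helper (the strategy popularized by the AWS Architecture Blog) that could replace the fixed exponential backoff in AIRouter:

```typescript
// Full-jitter exponential backoff: sleep a random duration between 0 and
// min(cap, base * 2^attempt), so concurrent clients don't retry in lockstep.
function backoffWithJitter(attempt: number, baseMs = 1000, capMs = 30000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * ceiling);
}

// Example: attempt 0 waits up to 1s, attempt 3 up to 8s, attempt 10 is capped at 30s.
```

Swapping this in for `Math.pow(2, attempt) * 1000` in the router's retry loop avoids the thundering-herd effect when many requests fail at the same moment.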

Advanced: Streaming Support

For real-time responses, implement streaming with fallback:

// Assumes each provider also implements completeStream(), an extension
// of the AIProvider interface defined earlier
async *completeStream(
  request: CompletionRequest
): AsyncGenerator<string, void, unknown> {
  for (const provider of this.providers) {
    try {
      // Try streaming with current provider
      const stream = await provider.completeStream(request);
      
      for await (const chunk of stream) {
        yield chunk;
      }
      
      return; // Success, stop trying other providers
    } catch (error) {
      console.error(`Streaming failed with ${provider.name}:`, error);
      // Try next provider
    }
  }
  
  throw new Error('Streaming failed with all providers');
}
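One caveat: if a stream fails after it has already yielded chunks, naive fallback will replay the response from the start on the next provider. Here's a self-contained sketch of the fallback logic using mock providers (the names and generators below are illustrative, not a real SDK):

```typescript
// A provider exposing only the streaming method we need for this demo
type StreamProvider = {
  name: string;
  completeStream(prompt: string): AsyncGenerator<string>;
};

// Mock providers: one that fails immediately, one that streams two chunks
async function* failingStream(): AsyncGenerator<string> {
  throw new Error('provider down');
}
async function* workingStream(): AsyncGenerator<string> {
  yield 'Hello, ';
  yield 'world!';
}

// Try each provider in order; errors thrown while iterating trigger fallback
async function* streamWithFallback(
  providers: StreamProvider[],
  prompt: string
): AsyncGenerator<string> {
  for (const provider of providers) {
    try {
      for await (const chunk of provider.completeStream(prompt)) {
        yield chunk;
      }
      return; // Success, stop trying other providers
    } catch {
      console.error(`Streaming failed with ${provider.name}, trying next`);
    }
  }
  throw new Error('Streaming failed with all providers');
}
```

In production you would also track how many chunks were already delivered, and either buffer until completion or accept that a mid-stream failure restarts the response.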

FAQ

Q: Won't this increase my costs? Not necessarily. By routing to cheaper providers first (Google → Anthropic → OpenAI), you actually save money. Most requests will use the cheapest provider.

Q: How do I handle different response formats? The adapter pattern handles this. Each provider adapter translates its native response format into your unified interface.

Q: What about API key security? Store keys in environment variables or a secrets manager (AWS Secrets Manager, HashiCorp Vault). Never commit them to git.

Q: Can I mix streaming and non-streaming? Yes, but handle it per-request. Some providers support streaming, others don't. Your router should detect this and route appropriately.

Q: How do I test this without burning through API credits? Use mock providers in testing:

class MockProvider implements AIProvider {
  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    return {
      content: 'Mock response',
      provider: 'mock',
      model: 'mock-model',
      usage: { promptTokens: 10, completionTokens: 20, totalTokens: 30 },
    };
  }
  
  async isHealthy(): Promise<boolean> {
    return true;
  }
}

Q: What if a provider returns an incomplete response? Add validation in your adapters:

if (!response.content || response.content.length < 10) {
  throw new Error('Incomplete response, triggering fallback');
}

Q: Should I use this for production? Yes. Multi-provider failover is a well-established resilience pattern in production AI systems. The key is thorough testing and monitoring.

Real-World Example: E-commerce Chatbot

Here's how a real e-commerce company uses multi-provider AI:

const ai = new AIRouter({
  providers: [
    new GoogleProvider(process.env.GOOGLE_API_KEY!, 'gemini-1.5-flash'), // Fast, cheap
    new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), // Reliable
    new OpenAIProvider(process.env.OPENAI_API_KEY!), // Last resort
  ],
});

// Customer support chatbot
app.post('/api/chat', async (req, res) => {
  const { message, history } = req.body;

  try {
    const response = await ai.complete({
      messages: [
        {
          role: 'system',
          content: 'You are a helpful e-commerce support assistant. Be concise and friendly.',
        },
        ...history,
        { role: 'user', content: message },
      ],
      maxTokens: 300,
      temperature: 0.7,
    });

    // Track which provider was used
    await analytics.track('ai_request', {
      provider: response.provider,
      tokens: response.usage.totalTokens,
      cost: calculateCost(response),
    });

    res.json({ reply: response.content });
  } catch (error) {
    // All providers failed - use fallback
    res.json({ 
      reply: "I'm having trouble right now. Please try again in a moment or contact support@example.com",
    });
  }
});
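The `calculateCost` helper used above is left undefined; here's a minimal sketch based on the pricing table from earlier (verify the numbers against current provider price lists before relying on them):

```typescript
// Per-provider pricing in USD per 1M tokens (illustrative figures from the
// comparison table above -- check current price lists before production use)
const pricing: Record<string, { inputPer1M: number; outputPer1M: number }> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.0 },
  anthropic: { inputPer1M: 3.0, outputPer1M: 15.0 },
  openai: { inputPer1M: 10.0, outputPer1M: 30.0 },
};

// Estimate the cost of a single completion from its token usage
function calculateCost(response: {
  provider: string;
  usage: { promptTokens: number; completionTokens: number };
}): number {
  const p = pricing[response.provider];
  if (!p) return 0; // Unknown provider (e.g. a mock): report zero cost
  return (
    (response.usage.promptTokens / 1_000_000) * p.inputPer1M +
    (response.usage.completionTokens / 1_000_000) * p.outputPer1M
  );
}
```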

Result: 99.95% uptime, average cost reduced by 60% (by routing to Gemini first), and zero customer-facing outages in 6 months.

Get Alerted Before Your Users Notice

The best failover system is one that triggers before you even need it. API Status Check monitors OpenAI, Anthropic, Google, and 200+ other APIs in real-time.

Get notified the second an outage begins. Set up intelligent alerts at apistatuscheck.com and never be caught off guard again.


Ready to build bulletproof AI applications? Implement this pattern today, and you'll sleep better knowing one API outage won't take down your entire business.
