Building a Multi-Provider AI Fallback System (OpenAI, Anthropic, Google)

by API Status Check

When OpenAI went down on December 11, 2025, thousands of AI applications stopped working. Chatbots froze. Content generators failed. Customer support systems crashed. If your entire business depends on a single AI provider, you're one outage away from disaster.

But it doesn't have to be that way.

In this guide, you'll learn how to build a production-ready multi-provider AI system that automatically fails over between OpenAI, Anthropic, and Google when one goes down—with complete code examples.

Why You Need Multi-Provider AI

The Outage Reality

AI APIs go down more often than you think:

  • OpenAI (December 2025): 4-hour outage affecting ChatGPT and API
  • Anthropic (November 2025): Degraded performance for 6+ hours
  • Google Gemini (October 2025): Complete API outage for 2 hours
  • OpenAI (March 2024): 3+ hour outage during business hours
  • Anthropic (June 2024): Rate limit issues affecting production apps

If you're only using one provider, your availability is capped at their uptime. With three providers and automatic failover, you can achieve 99.99%+ uptime even when individual providers fail.

The Business Case

  • Cost savings: Route to cheaper providers first and keep premium models as the fallback
  • Reliability: Three independent providers at 99.9% uptime each give roughly 99.9999999% combined in theory; real-world failures are correlated, so treat that as an upper bound
  • Performance: Route to the fastest provider based on real-time latency
  • Compliance: Some regions require data to stay local; multiple providers enable geo-routing
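The reliability math is simple probability; here's a quick sketch (assuming provider failures are independent, which real outages only approximate):

```typescript
// Combined availability of independent providers: the system is down only
// when every provider is down at once, so multiply the downtime probabilities.
function combinedUptime(uptimes: number[]): number {
  const allDown = uptimes.reduce((p, u) => p * (1 - u), 1);
  return 1 - allDown;
}

// Three providers at 99.9% each: roughly nine nines in theory
console.log(combinedUptime([0.999, 0.999, 0.999])); // ≈ 0.999999999
```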

Architecture: Provider Abstraction Layer

The key to multi-provider AI is a unified interface that abstracts away provider differences:

Your Application
       ↓
AI Client (Unified Interface)
       ↓
Provider Router
    ↙  ↓  ↘
OpenAI  Anthropic  Google

Core Components

  1. Unified Interface: Same method signatures regardless of provider
  2. Provider Adapters: Translate between your interface and each provider's API
  3. Router: Decides which provider to use and handles fallback
  4. Health Monitor: Tracks which providers are healthy
  5. Prompt Translator: Adapts prompts for provider-specific quirks

Building the System: Complete Code

Let's build a production-ready multi-provider AI client in TypeScript.

1. Define the Unified Interface

// types.ts
export interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface CompletionRequest {
  messages: Message[];
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

export interface CompletionResponse {
  content: string;
  provider: string;
  model: string;
  usage: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

export interface AIProvider {
  name: string;
  complete(request: CompletionRequest): Promise<CompletionResponse>;
  isHealthy(): Promise<boolean>;
}

2. Implement Provider Adapters

OpenAI Adapter

// providers/openai.ts
import OpenAI from 'openai';
import { AIProvider, CompletionRequest, CompletionResponse } from '../types';

export class OpenAIProvider implements AIProvider {
  name = 'openai';
  private client: OpenAI;
  private model: string;

  constructor(apiKey: string, model = 'gpt-4-turbo') {
    this.client = new OpenAI({ apiKey });
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    const response = await this.client.chat.completions.create({
      model: this.model,
      messages: request.messages,
      max_tokens: request.maxTokens,
      temperature: request.temperature,
      stream: false,
    });

    return {
      content: response.choices[0].message.content || '',
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usage?.prompt_tokens || 0,
        completionTokens: response.usage?.completion_tokens || 0,
        totalTokens: response.usage?.total_tokens || 0,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      // Lightweight health check
      await this.client.models.retrieve('gpt-3.5-turbo');
      return true;
    } catch (error) {
      console.error('OpenAI health check failed:', error);
      return false;
    }
  }
}

Anthropic Adapter

// providers/anthropic.ts
import Anthropic from '@anthropic-ai/sdk';
import { AIProvider, CompletionRequest, CompletionResponse, Message } from '../types';

export class AnthropicProvider implements AIProvider {
  name = 'anthropic';
  private client: Anthropic;
  private model: string;

  constructor(apiKey: string, model = 'claude-3-5-sonnet-20241022') {
    this.client = new Anthropic({ apiKey });
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    // Anthropic requires system messages separately
    const systemMessage = request.messages.find(m => m.role === 'system');
    const messages = request.messages
      .filter(m => m.role !== 'system')
      .map(m => ({
        role: m.role as 'user' | 'assistant',
        content: m.content,
      }));

    const response = await this.client.messages.create({
      model: this.model,
      max_tokens: request.maxTokens || 4096,
      temperature: request.temperature,
      system: systemMessage?.content,
      messages,
    });

    return {
      content: response.content[0].type === 'text' 
        ? response.content[0].text 
        : '',
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usage.input_tokens,
        completionTokens: response.usage.output_tokens,
        totalTokens: response.usage.input_tokens + response.usage.output_tokens,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      // Minimal completion as health check
      await this.client.messages.create({
        model: this.model,
        max_tokens: 1,
        messages: [{ role: 'user', content: 'Hi' }],
      });
      return true;
    } catch (error) {
      console.error('Anthropic health check failed:', error);
      return false;
    }
  }
}

Google Gemini Adapter

// providers/google.ts
import { GoogleGenerativeAI } from '@google/generative-ai';
import { AIProvider, CompletionRequest, CompletionResponse } from '../types';

export class GoogleProvider implements AIProvider {
  name = 'google';
  private client: GoogleGenerativeAI;
  private model: string;

  constructor(apiKey: string, model = 'gemini-1.5-pro') {
    this.client = new GoogleGenerativeAI(apiKey);
    this.model = model;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    // systemInstruction belongs on the model, not on startChat
    // (supported in recent versions of @google/generative-ai)
    const systemMessage = request.messages.find(m => m.role === 'system');
    const model = this.client.getGenerativeModel({
      model: this.model,
      systemInstruction: systemMessage?.content,
    });

    // Convert messages to Gemini format ('assistant' becomes 'model')
    const history = request.messages
      .filter(m => m.role !== 'system')
      .map(m => ({
        role: m.role === 'assistant' ? 'model' : 'user',
        parts: [{ text: m.content }],
      }));

    if (history.length === 0) {
      throw new Error('Gemini requires at least one user or assistant message');
    }

    const chat = model.startChat({
      history: history.slice(0, -1),
      generationConfig: {
        maxOutputTokens: request.maxTokens,
        temperature: request.temperature,
      },
    });

    const lastMessage = history[history.length - 1];
    const result = await chat.sendMessage(lastMessage.parts[0].text);
    const response = result.response;

    return {
      content: response.text(),
      provider: this.name,
      model: this.model,
      usage: {
        promptTokens: response.usageMetadata?.promptTokenCount || 0,
        completionTokens: response.usageMetadata?.candidatesTokenCount || 0,
        totalTokens: response.usageMetadata?.totalTokenCount || 0,
      },
    };
  }

  async isHealthy(): Promise<boolean> {
    try {
      const model = this.client.getGenerativeModel({ model: this.model });
      await model.generateContent('Test');
      return true;
    } catch (error) {
      console.error('Google health check failed:', error);
      return false;
    }
  }
}

3. Build the Router with Automatic Fallback

// router.ts
import { AIProvider, CompletionRequest, CompletionResponse } from './types';

interface RouterConfig {
  providers: AIProvider[];
  maxRetries?: number;
  healthCheckInterval?: number;
}

export class AIRouter {
  private providers: AIProvider[];
  private maxRetries: number;
  private healthStatus: Map<string, boolean> = new Map();

  constructor(config: RouterConfig) {
    this.providers = config.providers;
    this.maxRetries = config.maxRetries || 3;

    // Initialize health checks
    this.startHealthMonitoring(config.healthCheckInterval || 60000);
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    let lastError: Error | null = null;

    // Try each provider in order until one succeeds
    for (const provider of this.providers) {
      // Skip unhealthy providers
      if (this.healthStatus.get(provider.name) === false) {
        console.log(`Skipping unhealthy provider: ${provider.name}`);
        continue;
      }

      // Retry logic for transient failures
      for (let attempt = 0; attempt < this.maxRetries; attempt++) {
        try {
          console.log(`Attempting ${provider.name} (attempt ${attempt + 1}/${this.maxRetries})`);
          
          const response = await this.withTimeout(
            provider.complete(request),
            30000 // 30s timeout
          );

          console.log(`✓ Success with ${provider.name}`);
          return response;

        } catch (error: any) {
          lastError = error;
          console.error(`✗ ${provider.name} failed (attempt ${attempt + 1}):`, error.message);

          // Don't retry on client errors (4xx)
          if (error.status && error.status >= 400 && error.status < 500) {
            console.log(`Client error (${error.status}), skipping retries`);
            break;
          }

          // Exponential backoff for retries
          if (attempt < this.maxRetries - 1) {
            const backoff = Math.pow(2, attempt) * 1000;
            console.log(`Waiting ${backoff}ms before retry...`);
            await new Promise(resolve => setTimeout(resolve, backoff));
          }
        }
      }

      // Mark provider as unhealthy after all retries fail -- but not on client
      // errors (4xx), which indicate a bad request rather than a provider outage
      const status = (lastError as any)?.status;
      if (!(status && status >= 400 && status < 500)) {
        this.healthStatus.set(provider.name, false);
      }
    }

    throw new Error(
      `All providers failed. Last error: ${lastError?.message}`
    );
  }

  private async withTimeout<T>(
    promise: Promise<T>,
    timeoutMs: number
  ): Promise<T> {
    const timeout = new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error('Request timeout')), timeoutMs)
    );
    return Promise.race([promise, timeout]);
  }

  private startHealthMonitoring(intervalMs: number) {
    // Initial health check
    this.checkAllHealth();

    // Periodic health checks
    setInterval(() => this.checkAllHealth(), intervalMs);
  }

  private async checkAllHealth() {
    for (const provider of this.providers) {
      try {
        const healthy = await provider.isHealthy();
        this.healthStatus.set(provider.name, healthy);
        console.log(`Health check ${provider.name}: ${healthy ? '✓' : '✗'}`);
      } catch (error) {
        this.healthStatus.set(provider.name, false);
        console.error(`Health check failed for ${provider.name}:`, error);
      }
    }
  }

  getHealthStatus(): Record<string, boolean> {
    return Object.fromEntries(this.healthStatus);
  }
}

4. Putting It All Together

// index.ts
import { AIRouter } from './router';
import { OpenAIProvider } from './providers/openai';
import { AnthropicProvider } from './providers/anthropic';
import { GoogleProvider } from './providers/google';

// Initialize providers
const openai = new OpenAIProvider(process.env.OPENAI_API_KEY!);
const anthropic = new AnthropicProvider(process.env.ANTHROPIC_API_KEY!);
const google = new GoogleProvider(process.env.GOOGLE_API_KEY!);

// Create router with fallback order: Google → Anthropic → OpenAI
// (Google is cheapest, OpenAI is most expensive)
const ai = new AIRouter({
  providers: [google, anthropic, openai],
  maxRetries: 2,
  healthCheckInterval: 60000, // Check health every minute
});

// Use it!
async function main() {
  try {
    const response = await ai.complete({
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Explain quantum computing in simple terms.' },
      ],
      maxTokens: 500,
      temperature: 0.7,
    });

    console.log(`Response from ${response.provider}:`);
    console.log(response.content);
    console.log(`\nTokens used: ${response.usage.totalTokens}`);
  } catch (error) {
    console.error('All AI providers failed:', error);
  }
}

main();

Handling Provider Differences

Prompt Compatibility Layer

Different providers have different strengths and behaviors. Here's how to handle that:

// prompt-translator.ts
import { Message } from './types';

export class PromptTranslator {
  static adaptForProvider(
    messages: Message[],
    provider: string
  ): Message[] {
    switch (provider) {
      case 'anthropic':
        // Claude prefers detailed, structured prompts
        return this.makeMoreVerbose(messages);
      
      case 'google':
        // Gemini handles casual language better
        return this.makeMoreCasual(messages);
      
      case 'openai':
      default:
        return messages;
    }
  }

  private static makeMoreVerbose(messages: Message[]): Message[] {
    // Add more structure for Claude
    return messages.map(msg => {
      if (msg.role === 'user') {
        return {
          ...msg,
          content: msg.content + '\n\nPlease provide a detailed response.',
        };
      }
      return msg;
    });
  }

  private static makeMoreCasual(messages: Message[]): Message[] {
    // Simplify for Gemini
    return messages.map(msg => ({
      ...msg,
      content: msg.content.replace(/\n\n/g, ' '),
    }));
  }
}

Capability Detection

Not all providers support the same features:

interface ProviderCapabilities {
  vision: boolean;
  toolUse: boolean;
  streaming: boolean;
  maxContextTokens: number;
}

const capabilities: Record<string, ProviderCapabilities> = {
  openai: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 128000, // GPT-4 Turbo
  },
  anthropic: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 200000, // Claude 3.5 Sonnet
  },
  google: {
    vision: true,
    toolUse: true,
    streaming: true,
    maxContextTokens: 1000000, // Gemini 1.5 Pro
  },
};

// Use capability detection in router
if (requiresVision && !capabilities[provider.name].vision) {
  console.log(`Skipping ${provider.name}: vision not supported`);
  continue;
}

Cost Optimization

Route to the cheapest provider first, premium as fallback:

interface ProviderPricing {
  inputPer1M: number;  // USD per 1M input tokens
  outputPer1M: number; // USD per 1M output tokens
}

const pricing: Record<string, ProviderPricing> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.00 },      // Gemini 1.5 Pro
  anthropic: { inputPer1M: 3.00, outputPer1M: 15.00 },  // Claude 3.5 Sonnet
  openai: { inputPer1M: 10.00, outputPer1M: 30.00 },    // GPT-4 Turbo
};

// Sort providers by cost
const providers = [google, anthropic, openai].sort((a, b) => {
  return pricing[a.name].inputPer1M - pricing[b.name].inputPer1M;
});

const ai = new AIRouter({ providers });

Result: You use Google (cheapest) by default, fail over to Anthropic if Google is down, and only use OpenAI (most expensive) as a last resort.

Monitoring with API Status Check

Instead of waiting for health checks to detect outages, proactively monitor provider status:

// status-monitor.ts
async function checkProviderStatus(provider: string): Promise<boolean> {
  try {
    const response = await fetch(
      `https://apistatuscheck.com/api/status/${provider}`
    );
    const data = await response.json();
    return data.status === 'operational';
  } catch {
    return true; // Assume operational if check fails
  }
}

// In your router
async complete(request: CompletionRequest): Promise<CompletionResponse> {
  for (const provider of this.providers) {
    // Check API Status Check first
    const isUp = await checkProviderStatus(provider.name);
    if (!isUp) {
      console.log(`Skipping ${provider.name}: reported down on API Status Check`);
      continue;
    }

    // Proceed with request...
  }
}

Why this matters: API Status Check aggregates reports from thousands of developers. You'll know about outages seconds after they start, before your own health checks fail.

Provider Comparison Table

Provider         | Best For                  | Strengths                      | Weaknesses                       | Cost (1M tokens)
OpenAI GPT-4     | Complex reasoning, coding | Highest quality, best tool use | Most expensive, frequent outages | $10 input / $30 output
Anthropic Claude | Long context, analysis    | 200K context, reliable         | Slower responses                 | $3 input / $15 output
Google Gemini    | Cost efficiency, speed    | 1M context, cheapest           | Less capable reasoning           | $1.25 input / $5 output

When to Use Each

Default to Google Gemini when:

  • Cost is a concern
  • You need massive context windows
  • Task is straightforward (summarization, classification)

Upgrade to Anthropic when:

  • You need high-quality analysis
  • Working with long documents
  • Gemini fails or is down

Fallback to OpenAI when:

  • Task requires complex reasoning
  • Using function calling/tools heavily
  • Both Gemini and Claude are down

Production Checklist

  • Implement exponential backoff with jitter
  • Add circuit breakers to fail fast on persistent outages
  • Log all provider switches for debugging
  • Track cost per provider in your analytics
  • Set up alerts when all providers fail
  • Monitor API Status Check for early outage detection
  • Cache responses where possible to reduce API calls
  • Implement rate limiting per provider
  • Store provider preferences per user/session
  • Test failover regularly (chaos engineering)
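On the first checklist item: here's a minimal full-jitter backoff helper (the strategy popularized by the AWS Architecture Blog) that could replace the fixed exponential backoff in AIRouter:

```typescript
// Full-jitter exponential backoff: sleep a random duration between 0 and
// min(cap, base * 2^attempt), so concurrent clients don't retry in lockstep.
function backoffWithJitter(attempt: number, baseMs = 1000, capMs = 30000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * ceiling);
}

// Example: attempt 0 waits up to 1s, attempt 3 up to 8s, attempt 10 is capped at 30s.
```

Swapping this in for `Math.pow(2, attempt) * 1000` in the router's retry loop avoids the thundering-herd effect when many requests fail at the same moment.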

Advanced: Streaming Support

For real-time responses, implement streaming with fallback:

// Assumes each provider also implements completeStream(), an extension
// of the AIProvider interface defined earlier
async *completeStream(
  request: CompletionRequest
): AsyncGenerator<string, void, unknown> {
  for (const provider of this.providers) {
    try {
      // Try streaming with current provider
      const stream = await provider.completeStream(request);
      
      for await (const chunk of stream) {
        yield chunk;
      }
      
      return; // Success, stop trying other providers
    } catch (error) {
      console.error(`Streaming failed with ${provider.name}:`, error);
      // Try next provider
    }
  }
  
  throw new Error('Streaming failed with all providers');
}
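One caveat: if a stream fails after it has already yielded chunks, naive fallback will replay the response from the start on the next provider. Here's a self-contained sketch of the fallback logic using mock providers (the names and generators below are illustrative, not a real SDK):

```typescript
// A provider exposing only the streaming method we need for this demo
type StreamProvider = {
  name: string;
  completeStream(prompt: string): AsyncGenerator<string>;
};

// Mock providers: one that fails immediately, one that streams two chunks
async function* failingStream(): AsyncGenerator<string> {
  throw new Error('provider down');
}
async function* workingStream(): AsyncGenerator<string> {
  yield 'Hello, ';
  yield 'world!';
}

// Try each provider in order; errors thrown while iterating trigger fallback
async function* streamWithFallback(
  providers: StreamProvider[],
  prompt: string
): AsyncGenerator<string> {
  for (const provider of providers) {
    try {
      for await (const chunk of provider.completeStream(prompt)) {
        yield chunk;
      }
      return; // Success, stop trying other providers
    } catch {
      console.error(`Streaming failed with ${provider.name}, trying next`);
    }
  }
  throw new Error('Streaming failed with all providers');
}
```

In production you would also track how many chunks were already delivered, and either buffer until completion or accept that a mid-stream failure restarts the response.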

FAQ

Q: Won't this increase my costs? Not necessarily. By routing to cheaper providers first (Google → Anthropic → OpenAI), you actually save money. Most requests will use the cheapest provider.

Q: How do I handle different response formats? The adapter pattern handles this. Each provider adapter translates its native response format into your unified interface.

Q: What about API key security? Store keys in environment variables or a secrets manager (AWS Secrets Manager, HashiCorp Vault). Never commit them to git.

Q: Can I mix streaming and non-streaming? Yes, but handle it per-request. Some providers support streaming, others don't. Your router should detect this and route appropriately.

Q: How do I test this without burning through API credits? Use mock providers in testing:

class MockProvider implements AIProvider {
  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    return {
      content: 'Mock response',
      provider: 'mock',
      model: 'mock-model',
      usage: { promptTokens: 10, completionTokens: 20, totalTokens: 30 },
    };
  }
  
  async isHealthy(): Promise<boolean> {
    return true;
  }
}

Q: What if a provider returns an incomplete response? Add validation in your adapters:

if (!response.content || response.content.length < 10) {
  throw new Error('Incomplete response, triggering fallback');
}

Q: Should I use this for production? Yes. Multi-provider failover is a well-established resilience pattern in production AI systems. The key is thorough testing and monitoring.

Real-World Example: E-commerce Chatbot

Here's how a real e-commerce company uses multi-provider AI:

const ai = new AIRouter({
  providers: [
    new GoogleProvider(process.env.GOOGLE_API_KEY!, 'gemini-1.5-flash'), // Fast, cheap
    new AnthropicProvider(process.env.ANTHROPIC_API_KEY!), // Reliable
    new OpenAIProvider(process.env.OPENAI_API_KEY!), // Last resort
  ],
});

// Customer support chatbot
app.post('/api/chat', async (req, res) => {
  const { message, history } = req.body;

  try {
    const response = await ai.complete({
      messages: [
        {
          role: 'system',
          content: 'You are a helpful e-commerce support assistant. Be concise and friendly.',
        },
        ...history,
        { role: 'user', content: message },
      ],
      maxTokens: 300,
      temperature: 0.7,
    });

    // Track which provider was used
    await analytics.track('ai_request', {
      provider: response.provider,
      tokens: response.usage.totalTokens,
      cost: calculateCost(response),
    });

    res.json({ reply: response.content });
  } catch (error) {
    // All providers failed - use fallback
    res.json({ 
      reply: "I'm having trouble right now. Please try again in a moment or contact support@example.com",
    });
  }
});
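The `calculateCost` helper used above is left undefined; here's a minimal sketch based on the pricing table from earlier (verify the numbers against current provider price lists before relying on them):

```typescript
// Per-provider pricing in USD per 1M tokens (illustrative figures from the
// comparison table above -- check current price lists before production use)
const pricing: Record<string, { inputPer1M: number; outputPer1M: number }> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.0 },
  anthropic: { inputPer1M: 3.0, outputPer1M: 15.0 },
  openai: { inputPer1M: 10.0, outputPer1M: 30.0 },
};

// Estimate the cost of a single completion from its token usage
function calculateCost(response: {
  provider: string;
  usage: { promptTokens: number; completionTokens: number };
}): number {
  const p = pricing[response.provider];
  if (!p) return 0; // Unknown provider (e.g. a mock): report zero cost
  return (
    (response.usage.promptTokens / 1_000_000) * p.inputPer1M +
    (response.usage.completionTokens / 1_000_000) * p.outputPer1M
  );
}
```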

Result: 99.95% uptime, average cost reduced by 60% (by routing to Gemini first), and zero customer-facing outages in 6 months.

Get Alerted Before Your Users Notice

The best failover system is one that triggers before you even need it. API Status Check monitors OpenAI, Anthropic, Google, and 200+ other APIs in real-time.

Get notified the second an outage begins. Set up intelligent alerts at apistatuscheck.com and never be caught off guard again.


Ready to build bulletproof AI applications? Implement this pattern today, and you'll sleep better knowing one API outage won't take down your entire business.
