OpenRouter is an API gateway that provides unified access to multiple AI models through a single endpoint. Instead of managing separate API keys and integrations for Claude, GPT-4, Llama, Mistral, and dozens of other models, you use one API that routes requests to your chosen provider.
This guide covers when OpenRouter makes sense, what to watch out for, and how to integrate it effectively.
What OpenRouter Actually Does
OpenRouter sits between your application and AI providers. You send requests to OpenRouter's API, and it forwards them to the model you specify.
The key benefits:
- Single API, multiple models: Switch between Claude, GPT-4, Llama, and others by changing a single model string
- Unified billing: One account instead of managing multiple provider relationships
- Automatic failover: If your primary model is down, requests can route to a backup
- Pass-through pricing: Per-token rates generally match providers' list prices, though OpenRouter adds a small fee when you purchase credits
When OpenRouter Makes Sense
Good use cases
- Experimentation: Testing different models without setting up multiple accounts
- Flexibility: Switching models based on task type (fast model for simple queries, powerful model for complex ones)
- Redundancy: Automatic failover when a provider has issues
- Simplified billing: One invoice instead of many
When to go direct
- High-volume production: Direct API calls have slightly lower latency (no proxy hop)
- Enterprise contracts: Negotiating volume discounts directly with a provider may beat pass-through rates
- Compliance requirements: Some organizations require direct provider relationships
- Single model usage: If you only use Claude, direct Anthropic API is simpler
Gotchas and What to Watch For
1. Model availability varies
Not all models are available at all times. Provider outages, rate limits, and capacity constraints can affect availability. OpenRouter's status page shows current availability, but you should handle model unavailability gracefully in your code.
// Always have a fallback
const models = [
  "anthropic/claude-sonnet-4-20250514",
  "anthropic/claude-3-haiku-20240307",
  "openai/gpt-4o-mini",
];

async function callWithFallback(prompt: string) {
  for (const model of models) {
    try {
      const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
      });
      if (response.ok) return response.json();
      // Non-ok response (e.g. model down, rate-limited): fall through to the next model
    } catch (e) {
      continue; // Network error: try next model
    }
  }
  throw new Error("All models unavailable");
}
2. Warm-up times for less popular models
Models that aren't frequently used may have cold start delays. Popular models like Claude and GPT-4 are typically warm, but niche or open-source models may take a few seconds to spin up.
Tip: If using less common models, implement retry logic with a short delay.
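Here's a minimal retry sketch along those lines. The attempt count and delay (3 attempts, 2 seconds) are illustrative assumptions, not OpenRouter-recommended values:

// Retry a model call a few times with a fixed delay between attempts,
// giving a cold model time to spin up. Values here are assumptions.
async function callWithRetry(prompt: string, model: string, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    });
    if (response.ok) return response.json();
    // Wait before the next attempt
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
  throw new Error(`Model ${model} did not respond after ${attempts} attempts`);
}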
3. Error responses differ from direct APIs
OpenRouter wraps provider errors in its own format. Your error handling needs to account for both OpenRouter-level errors (authentication, rate limits) and provider-level errors (model errors, content policy).
// Check both error sources
if (!response.ok) {
  const error = await response.json();
  // OpenRouter error
  if (error.error?.code) {
    console.error("OpenRouter error:", error.error.message);
  }
  // Provider error (wrapped)
  if (error.error?.metadata?.provider_error) {
    console.error("Provider error:", error.error.metadata.provider_error);
  }
}
4. Rate limits are per-model and per-provider
Your OpenRouter account has overall limits, but individual providers may impose their own limits. Hitting Claude's rate limit through OpenRouter is the same as hitting it directly.
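A practical consequence: retrying the same model after a 429 rarely helps, because the provider's limit is still in effect. Switching to a model from a different provider does. A sketch of that idea, reusing the models list from the fallback example above:

// On a 429, move on to the next model in the list (ideally one from a
// different provider) instead of hammering the one that's throttled.
async function callAvoidingRateLimits(prompt: string) {
  for (const model of models) {
    const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    });
    if (response.ok) return response.json();
    if (response.status !== 429) {
      // Non-rate-limit failure: surface it rather than masking it
      throw new Error(`${model} failed with status ${response.status}`);
    }
    // 429: this provider is throttled; fall through to the next model
  }
  throw new Error("All models rate-limited");
}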
5. Pricing transparency
OpenRouter shows per-model pricing on their website. Verify current prices before committing to a model for production use. Prices can change as providers update their pricing.
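Responses include a usage object with token counts (mirroring the OpenAI format), so you can estimate per-request cost yourself. The rates below are placeholders, assumed for illustration; look up the real numbers on openrouter.ai:

// Placeholder prices in USD per 1M tokens; verify current rates on
// openrouter.ai before relying on these numbers.
const PRICES: Record<string, { input: number; output: number }> = {
  "anthropic/claude-sonnet-4-20250514": { input: 3.0, output: 15.0 }, // assumed
};

function estimateCost(
  model: string,
  usage: { prompt_tokens: number; completion_tokens: number }
) {
  const p = PRICES[model];
  if (!p) return null; // unknown model: no estimate
  return (usage.prompt_tokens * p.input + usage.completion_tokens * p.output) / 1_000_000;
}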
6. Response format consistency
Different models return slightly different response structures. If you're switching between models, normalize the response format in your code rather than assuming consistency.
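A minimal normalizer, assuming the OpenAI-style chat completion shape that OpenRouter returns for most models. Anything your app reads should come from one place like this, so a model switch only ever touches one function:

interface NormalizedResponse {
  text: string;
  model: string;
  finishReason: string | null;
}

// Extract only the fields the app actually uses from the raw response.
function normalize(raw: any): NormalizedResponse {
  const choice = raw.choices?.[0] ?? {};
  return {
    text: choice.message?.content ?? "",
    model: raw.model ?? "unknown",
    finishReason: choice.finish_reason ?? null,
  };
}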
Setting Up OpenRouter
Basic integration
const OPENROUTER_API_KEY = process.env.OPENROUTER_API_KEY;
async function chat(messages: Array<{ role: string; content: string }>) {
  const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
      "HTTP-Referer": "https://your-site.com", // Optional: identifies your app for OpenRouter's rankings
      "X-Title": "Your App Name", // Optional, helps with attribution and debugging
    },
    body: JSON.stringify({
      model: "anthropic/claude-sonnet-4-20250514",
      messages,
      max_tokens: 1000,
    }),
  });
  return response.json();
}
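Calling it looks like this (inside an async context); the reply text lives at choices[0].message.content in the OpenAI-style response:

const data = await chat([{ role: "user", content: "Summarize this in one sentence: ..." }]);
console.log(data.choices[0].message.content);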
Environment setup
For Cloudflare Workers:
❯ npx wrangler secret put OPENROUTER_API_KEY
For VPS or local development:
❯ echo "OPENROUTER_API_KEY=your-key-here" >> .env
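In a Cloudflare Worker, secrets set via wrangler arrive on the env binding rather than process.env. A minimal handler sketch:

// Cloudflare Worker: secrets from `wrangler secret put` appear on `env`.
export default {
  async fetch(request: Request, env: { OPENROUTER_API_KEY: string }) {
    const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${env.OPENROUTER_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "anthropic/claude-sonnet-4-20250514",
        messages: [{ role: "user", content: "Hello" }],
      }),
    });
    // Pass the upstream response through unchanged
    return new Response(await response.text(), { status: response.status });
  },
};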
Cost Optimization Tips
1. Use the right model for the task
| Task Type | Recommended Model | Why |
|-----------|------------------|-----|
| Simple classification | Claude Haiku or GPT-4o-mini | Fast, cheap |
| Code generation | Claude Sonnet | Good balance |
| Complex reasoning | Claude Opus or GPT-4 | Worth the cost |
| High volume, simple | Llama or Mistral | Very cheap |
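One way to encode that table is a simple task-to-model map. The categories and model slugs below just mirror the table above and are assumptions; adjust them to the models you actually use:

type TaskType = "classification" | "codegen" | "reasoning" | "bulk";

// Mirrors the table above; slugs are illustrative.
const MODEL_FOR_TASK: Record<TaskType, string> = {
  classification: "anthropic/claude-3-haiku-20240307",
  codegen: "anthropic/claude-sonnet-4-20250514",
  reasoning: "openai/gpt-4",
  bulk: "meta-llama/llama-3.1-8b-instruct",
};

function pickModel(task: TaskType): string {
  return MODEL_FOR_TASK[task];
}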
2. Monitor usage
OpenRouter provides usage dashboards. Check regularly to catch unexpected spikes.
3. Set spending limits
Configure budget alerts in your OpenRouter dashboard to avoid surprise bills.
OpenRouter vs Direct API Comparison
| Factor | OpenRouter | Direct API |
|--------|-----------|------------|
| Latency | Slightly higher (+10-50ms) | Lowest |
| Flexibility | Multiple models, one key | One provider per key |
| Failover | Automatic | Manual implementation |
| Billing | Unified | Per-provider |
| Support | OpenRouter support | Provider support |
| Volume pricing | Pass-through | Negotiable |
Integration with OpenClaw
OpenClaw supports OpenRouter out of the box. Configure it in your environment:
❯ OPENROUTER_API_KEY=your-key
❯ DEFAULT_MODEL=anthropic/claude-sonnet-4-20250514
❯ FALLBACK_MODEL=anthropic/claude-3-haiku-20240307
OpenClaw will automatically use your fallback model if the primary is unavailable.
Summary
OpenRouter is a solid choice for:
- Multi-model applications
- Experimentation and prototyping
- Redundancy and failover
- Simplified vendor management
Go direct to providers for:
- Single-model, high-volume production
- Enterprise contracts with volume discounts
- Latency-critical applications
Start with OpenRouter for flexibility, then optimize specific high-volume paths to direct APIs if needed.