LiteLLM (Unified Gateway)
Open-source LLM gateway providing a unified API to 100+ model providers. Offers centralized cost tracking, logging, virtual keys with spend limits, and automatic failover.
Tags: gateway, self-hosted, open-source, cost-tracking, unified-api
Use Cases
- Self-hosted multi-provider gateway for OpenClaw with centralized cost tracking
- Automatic failover between providers (Anthropic → OpenAI → Together) for high availability
- Virtual API key management with per-key spend limits
- Load balancing across multiple model deployments or regions
- Unified logging and analytics for AI spend across providers
- Enterprise environments requiring self-hosted gateway for compliance
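The load-balancing use case above can be sketched in a LiteLLM config: listing multiple deployments under the same `model_name` makes the proxy distribute requests across them. The deployment names, endpoints, and keys below are illustrative placeholders, not working values.

```yaml
# litellm_config.yaml — sketch of load balancing across two regions
# (model names, api_base URLs, and keys are placeholders)
model_list:
  - model_name: gpt-4o            # both deployments share one public name
    litellm_params:
      model: azure/gpt-4o-eastus
      api_base: https://eastus.example.openai.azure.com
      api_key: azure-key-eastus
  - model_name: gpt-4o
    litellm_params:
      model: azure/gpt-4o-westus
      api_base: https://westus.example.openai.azure.com
      api_key: azure-key-westus
router_settings:
  routing_strategy: simple-shuffle  # random spread; verify strategy names against your LiteLLM version
```

Clients keep calling `gpt-4o`; the proxy picks a deployment per request, which also spreads rate-limit pressure across regions.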
Tips
- Start with the Python SDK for simple multi-provider switching. Graduate to the proxy server when you need centralized logging and key management.
- For OpenClaw, deploy LiteLLM as a Docker container alongside your gateway for the cleanest setup.
- Use virtual API keys to set per-user or per-task spend limits — great for controlling costs across different OpenClaw workflows.
- Configure fallback chains: primary → secondary → tertiary provider. LiteLLM handles automatic failover.
- Disable database logging in production if you don't need the dashboard — significantly improves throughput.
- Use LiteLLM's cost tracking to monitor spend across all providers in one dashboard.
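The fallback-chain tip above can be expressed in the proxy config: `router_settings.fallbacks` maps a primary model name to an ordered list of backups. This is a hedged sketch with placeholder names and keys; check the exact fallback syntax against the LiteLLM version you deploy.

```yaml
# litellm_config.yaml — sketch of an explicit fallback chain (names/keys are placeholders)
model_list:
  - model_name: claude-primary
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: sk-ant-xxxxx
  - model_name: gpt-secondary
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: sk-proj-xxxxx
router_settings:
  fallbacks:
    - claude-primary: ["gpt-secondary"]  # if claude-primary fails, retry here
```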
Known Issues & Gotchas
- At high request volumes (100K+/day), the Python-based proxy can become a performance bottleneck. Database logging on the request path is a known issue.
- The proxy requires external dependencies: PostgreSQL/SQLite for logging, Redis for caching. These add operational complexity.
- LiteLLM has two distinct components (SDK vs Proxy) — make sure you're setting up the right one. OpenClaw needs the proxy server.
- Enterprise features (SSO, audit logs) require a paid license. The MIT-licensed version is powerful but may lack enterprise governance.
- Performance can degrade over time — users report latency creep after hours of operation. Periodic restarts may be needed.
- Cache hits can still take 10+ seconds because the response path blocks on other work (such as logging), not just the cache lookup itself.
- Configuration can be complex. The config.yaml for model routing, fallbacks, and load balancing has a learning curve.
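One mitigation for the logging bottleneck described above is to keep spend logs off the request path via the proxy's general settings. A minimal sketch, assuming the `disable_spend_logs` flag present in recent LiteLLM releases (verify against your version; the master key is a placeholder):

```yaml
# litellm_config.yaml — skip writing a spend-log row per request
general_settings:
  master_key: sk-litellm-xxxxx   # placeholder admin key
  disable_spend_logs: true       # trades the spend dashboard for throughput
```

This disables the per-request database write, so per-key spend reporting in the dashboard will no longer accumulate.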
Alternatives
- OpenRouter
- Cloudflare AI Gateway
- Vercel AI Gateway
- Direct Provider APIs
Community Feedback
We have been using LiteLLM proxy in production for a while. Overall, solid project. Easy multi provider abstraction, good defaults, very active community. For prototyping and early stage setups, it works well.
— Reddit r/aiagents
Once traffic increases, some issues start showing up. Logging becoming a bottleneck — once the database crosses 1M+ logs, API requests start slowing down because logging is on the request path.
— Reddit r/aiagents
LiteLLM has become the default open-source standard for teams attempting to normalize the fragmented landscape of LLM APIs. For individual developers and early-stage startups, it is an excellent tool.
— TrueFoundry Review
Configuration Examples
LiteLLM as OpenClaw provider
providers:
  litellm:
    apiKey: sk-litellm-xxxxx # Your LiteLLM virtual key
    baseUrl: http://localhost:4000
    model: litellm/claude-sonnet-4-5 # Model name from LiteLLM config
LiteLLM Docker deployment
# docker-compose.yml for LiteLLM
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000"
    volumes:
      - ./litellm_config.yaml:/app/config.yaml
    command: --config /app/config.yaml
LiteLLM config with fallbacks
# litellm_config.yaml
model_list:
  - model_name: default
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: sk-ant-xxxxx
  - model_name: default
    litellm_params:
      model: openai/gpt-5.4-mini
      api_key: sk-proj-xxxxx
# Both deployments share the name "default", so LiteLLM load-balances and
# retries across them — automatic failover: Anthropic → OpenAI