Cloudflare AI Gateway
Universal AI Gateway providing caching, rate limiting, analytics, and cost control for any AI API with one line of code.
Overview
Cloudflare AI Gateway is a unified API gateway that sits between your application and AI providers, offering caching, rate limiting, analytics, and cost controls for supported LLM and AI APIs. With a single line of code changed, developers gain visibility into usage patterns, can configure request retries and model fallbacks, and route traffic through Cloudflare's global edge network, where cached responses are served with lower latency. The core service is free with unlimited requests on all plans.
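The "single line" is usually the base URL your client already points at a provider. A minimal sketch, assuming the URL pattern `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}`; the helper name `gateway_base_url` and the identifiers below are hypothetical placeholders, so confirm the exact endpoint shown in your own Cloudflare dashboard:

```python
from urllib.parse import quote

# Assumed URL pattern; confirm it against the endpoint shown in your dashboard.
GATEWAY_ROOT = "https://gateway.ai.cloudflare.com/v1"


def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the base URL that replaces the provider's default API endpoint."""
    # Percent-encode each path segment so stray characters cannot alter the path.
    parts = [quote(p, safe="") for p in (account_id, gateway_id, provider)]
    return "/".join([GATEWAY_ROOT, *parts])


# Hypothetical identifiers, for illustration only.
base_url = gateway_base_url("my-account-id", "my-gateway", "openai")
print(base_url)
```

With the OpenAI Python SDK, for example, this value would be passed as `base_url` when constructing the client, and the rest of the application code would stay unchanged.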
The Verdict
Who Should Use Cloudflare AI Gateway?
Best For
- Teams needing observability across multiple AI providers
- Applications requiring intelligent caching to reduce costs
- Organizations implementing rate limiting and cost controls
- Developers wanting model fallbacks and automatic retries
Not Ideal For
- Use cases requiring on-premise deployment
- Teams needing advanced prompt engineering features
What's Great
- Free unlimited requests on all plans
- One-line code change to integrate
- Intelligent caching can reduce API costs by 50-90% on workloads with many repeated requests
- Real-time analytics and logging dashboard
- Model fallbacks and automatic retry logic
Watch Out For
- Limited to providers supported by Cloudflare
- Advanced features require enterprise plan
- Adds an extra network hop, so uncached requests may see a small latency overhead
Team Budget & Governance
In June 2026 Cloudflare added spend limits to AI Gateway, making it a genuine budget-enforcement layer rather than just an observability proxy. It is the lowest-friction option here: there is nothing to self-host, and spend limits are available on any paid Cloudflare account.
- Daily, weekly, or monthly windows — fixed (calendar reset) or rolling (trailing N days); daily is a first-class option, not an afterthought
- Metadata-scoped limits — cap spend by user ID, team, or application, up to 20 rules per gateway
- Block or downgrade — return HTTP 429 at the cap, or route to a cheaper fallback model instead of failing
- Caveat: lacks the deep per-team RBAC and virtual-key hierarchies that LiteLLM and Portkey offer
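To make the block-or-downgrade behaviour concrete, here is a small local model of one metadata-scoped rule with a rolling window. It is an illustration of the decision logic described above, not Cloudflare's implementation or configuration format; `SpendRule`, `decide`, and the model names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass
class SpendRule:
    """Hypothetical local model of one metadata-scoped spend rule."""
    scope_key: str                 # metadata field to match, e.g. "team"
    scope_value: str               # value that puts a request in scope
    limit_usd: float               # cap for the window
    window: timedelta              # rolling window (trailing N days)
    fallback_model: Optional[str] = None   # None means block with HTTP 429
    events: List[Tuple[datetime, float]] = field(default_factory=list)

    def spent(self, now: datetime) -> float:
        """Sum the cost of events that fall inside the trailing window."""
        cutoff = now - self.window
        return sum(cost for ts, cost in self.events if ts > cutoff)


def decide(rule: SpendRule, metadata: Dict[str, str], model: str,
           now: datetime) -> Tuple[int, Optional[str]]:
    """Return (status, model): 200 with the model to use, or 429 with None."""
    if metadata.get(rule.scope_key) != rule.scope_value:
        return 200, model                  # rule does not apply to this request
    if rule.spent(now) < rule.limit_usd:
        return 200, model                  # still under the cap
    if rule.fallback_model is not None:
        return 200, rule.fallback_model    # downgrade instead of failing
    return 429, None                       # block at the cap


now = datetime(2026, 6, 15, 12, 0)
rule = SpendRule("team", "search", limit_usd=10.0,
                 window=timedelta(days=1), fallback_model="small-model")
rule.events.append((now - timedelta(hours=2), 6.0))
rule.events.append((now - timedelta(hours=1), 5.0))
rule.events.append((now - timedelta(days=2), 50.0))  # outside the window

print(decide(rule, {"team": "search"}, "large-model", now))
print(decide(rule, {"team": "ads"}, "large-model", now))
```

A fixed (calendar-reset) window would differ only in how `cutoff` is computed: it would snap to the start of the current day, week, or month instead of trailing behind `now`.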
Pricing
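Free with unlimited requests on all plans. Spend limits require a paid Cloudflare account, and some advanced features require an enterprise plan.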
Key Features
- Universal gateway for OpenAI, Anthropic, Azure, and more
- Intelligent caching with configurable TTL
- Rate limiting and request quotas
- Real-time analytics and logging
- Model fallbacks and retry logic
- Cost tracking and budget alerts
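Several of the features above can be tuned per request through HTTP headers. The sketch below assumes the `cf-aig-cache-ttl`, `cf-aig-skip-cache`, and `cf-aig-metadata` header names described in Cloudflare's documentation; verify them against the current docs before relying on them. The helper `gateway_headers` is hypothetical and only builds a dictionary to merge into an existing request.

```python
import json
from typing import Dict, Optional


def gateway_headers(cache_ttl_seconds: Optional[int] = None,
                    skip_cache: bool = False,
                    metadata: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build per-request gateway headers (header names are assumptions)."""
    headers: Dict[str, str] = {}
    if skip_cache:
        # Bypass the cache for this request; a TTL would be meaningless here.
        headers["cf-aig-skip-cache"] = "true"
    elif cache_ttl_seconds is not None:
        if cache_ttl_seconds <= 0:
            raise ValueError("cache_ttl_seconds must be positive")
        headers["cf-aig-cache-ttl"] = str(int(cache_ttl_seconds))
    if metadata:
        # Metadata tags the request (user, team, app) for analytics and limits.
        headers["cf-aig-metadata"] = json.dumps(
            metadata, separators=(",", ":"), sort_keys=True
        )
    return headers


# Hypothetical values: cache for one hour and tag the request with a team.
print(gateway_headers(cache_ttl_seconds=3600, metadata={"team": "search"}))
```

These headers would be sent alongside the provider's usual authentication header on each request that passes through the gateway.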
Platforms
- REST API
- Cloudflare Workers
- Global edge network
- Any HTTP client
How It Compares
| Feature | Cloudflare AI Gateway | Portkey | LiteLLM Proxy |
|---|---|---|---|
| Pricing | Free unlimited | Usage-based | Free OSS |
| Caching | Built-in | Built-in | Basic |
| Analytics | Full dashboard | Advanced | Basic |
| Best For | Simplicity & scale | Enterprise | Self-hosted |