Cloudflare AI Gateway

commercial Freemium

Universal AI Gateway providing caching, rate limiting, analytics, and cost control for any AI API with one line of code.

Unlimited Requests

4.6/5 Rating

2023 Founded

Overview

Cloudflare AI Gateway is a unified API gateway that sits between your application and AI providers, offering caching, rate limiting, analytics, and cost controls for any LLM or AI API. With a one-line code change (pointing the client at the gateway instead of the provider), developers gain visibility into usage patterns, can configure request retries and model fallbacks, and route traffic through Cloudflare's global edge network. The core service is free with unlimited requests on all plans.
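To make the "one-line change" concrete, here is a minimal sketch of building the gateway base URL that replaces a provider's own API host. The URL pattern, the helper name gateway_base_url, and the IDs are assumptions for illustration; check the official docs for the current format before relying on it.

```python
from urllib.parse import quote

# Assumed URL pattern for the gateway; verify against the official docs.
GATEWAY_HOST = "https://gateway.ai.cloudflare.com/v1"


def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the base URL used in place of the provider's own API host.

    gateway_base_url is a hypothetical helper, not part of any SDK.
    """
    # Percent-encode each segment so stray slashes cannot change the path.
    parts = [quote(p, safe="") for p in (account_id, gateway_id, provider)]
    return GATEWAY_HOST + "/" + "/".join(parts)


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(gateway_base_url("my-account-id", "my-gateway", "openai"))
```

In a client that accepts a configurable base URL (the OpenAI Python SDK takes a base_url argument, for example), passing this value is typically the only change needed; the API key and request bodies stay the same.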

The Verdict

Who Should Use Cloudflare AI Gateway?

Best For

Teams needing observability across multiple AI providers
Applications requiring intelligent caching to reduce costs
Organizations implementing rate limiting and cost controls
Developers wanting model fallbacks and automatic retries

Not Ideal For

Use cases requiring on-premise deployment
Teams needing advanced prompt engineering features

What's Great

Free unlimited requests on all plans
One-line code change to integrate
Intelligent caching can reduce API costs by 50-90% on workloads with many repeated requests
Real-time analytics and logging dashboard
Model fallbacks and automatic retry logic
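The fallback and retry behavior listed above can be sketched generically as follows. This is an illustration of the pattern, not Cloudflare's implementation; call_with_fallback, the send callback, and the model names are hypothetical.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


def call_with_fallback(
    models: Sequence[str],
    send: Callable[[str], str],
    retries_per_model: int = 2,
    backoff_seconds: float = 0.0,
) -> Tuple[str, str]:
    """Try each model in order, retrying failures before moving to the next.

    Returns (model_used, response). Raises RuntimeError if every model fails.
    """
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(retries_per_model + 1):
            try:
                return model, send(model)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                if attempt < retries_per_model:
                    # Exponential backoff between retries of the same model.
                    time.sleep(backoff_seconds * (2 ** attempt))
    raise RuntimeError("all models failed") from last_error
```

With a gateway, this logic runs in the proxy rather than in application code, which is the main appeal: the application sends one request and the gateway handles retries and the switch to a backup model.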

Watch Out For

Limited to providers supported by Cloudflare
Advanced features require an Enterprise plan
May add minimal latency overhead

Team Budget & Governance

In June 2026, Cloudflare added spend limits to AI Gateway, making it a genuine budget-enforcement layer rather than just an observability proxy. It is the lowest-friction option among the gateways compared here: no self-hosting is required, and spend limits are available on any paid Cloudflare account.

Daily, weekly, or monthly windows: fixed (calendar reset) or rolling (trailing N days); daily is a first-class option, not an afterthought
Metadata-scoped limits: cap spend by user ID, team, or application, with up to 20 rules per gateway
Block or downgrade: return HTTP 429 at the cap, or route to a cheaper fallback model instead of failing
Caveat: lacks the deep per-team RBAC and virtual-key hierarchies that LiteLLM and Portkey offer
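The window and enforcement options above can be modeled with a small sketch. It only illustrates the semantics described in this section (fixed vs. rolling windows, metadata scoping, block vs. downgrade); SpendRule, decide, the ledger format, and the rolling-window lengths are all assumptions, not Cloudflare's configuration schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

# Assumed trailing-window lengths for rolling limits.
PERIOD_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}

# One ledger entry: (timestamp, request metadata, cost in USD).
LedgerEntry = Tuple[datetime, Dict[str, str], float]


@dataclass
class SpendRule:
    scope_key: str                        # metadata field, e.g. "team"
    scope_value: str                      # value to match, e.g. "search"
    limit_usd: float
    period: str                           # "daily", "weekly", or "monthly"
    rolling: bool                         # True: trailing N days; False: calendar reset
    fallback_model: Optional[str] = None  # None means block at the cap


def window_start(period: str, rolling: bool, now: datetime) -> datetime:
    if rolling:
        return now - timedelta(days=PERIOD_DAYS[period])
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight
    if period == "weekly":
        return midnight - timedelta(days=midnight.weekday())  # back to Monday
    return midnight.replace(day=1)  # first day of the month


def decide(rule: SpendRule, ledger: List[LedgerEntry], metadata: Dict[str, str],
           model: str, now: datetime) -> Tuple[int, Optional[str]]:
    """Return (http_status, model_to_use) for an incoming request."""
    if metadata.get(rule.scope_key) != rule.scope_value:
        return 200, model  # rule does not apply to this request
    start = window_start(rule.period, rule.rolling, now)
    spent = sum(cost for ts, meta, cost in ledger
                if ts >= start and meta.get(rule.scope_key) == rule.scope_value)
    if spent < rule.limit_usd:
        return 200, model
    if rule.fallback_model is not None:
        return 200, rule.fallback_model  # downgrade instead of failing
    return 429, None  # block at the cap
```

The difference between fixed and rolling windows matters most near a reset boundary: spend from late yesterday no longer counts against a fixed daily limit after midnight, but still counts against a rolling 24-hour limit.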

Spend Limits docs

Pricing

Free

Unlimited requests, caching, analytics, rate limiting

Enterprise

Custom

Advanced features, SLA, dedicated support

Key Features

Universal gateway for OpenAI, Anthropic, Azure, and more
Intelligent caching with configurable TTL
Rate limiting and request quotas
Real-time analytics and logging
Model fallbacks and retry logic
Cost tracking and budget alerts
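As an illustration of what "caching with configurable TTL" means in practice, here is a minimal in-memory sketch. It assumes the cache is keyed on the exact request (provider plus body), which is a simplification; TTLCache and its methods are hypothetical names, and the gateway's real cache runs at the edge, not in your process.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Serve identical requests from memory until the entry is older than the TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def key_for(provider: str, body: dict) -> str:
        # Canonical JSON so key order in the body does not change the cache key.
        canonical = json.dumps({"provider": provider, "body": body}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, provider: str, body: dict,
                    upstream: Callable[[dict], Any]) -> Tuple[Any, bool]:
        """Return (response, cache_hit)."""
        key = self.key_for(provider, body)
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1], True
        response = upstream(body)  # cache miss or expired: pay for a real call
        self._store[key] = (now, response)
        return response, False
```

Every cache hit is a provider call that is never billed, so savings scale with how often the same request repeats within the TTL; workloads with mostly unique prompts will see little benefit.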

Platforms

REST API
Cloudflare Workers
Global edge network
Any HTTP client

How It Compares

Feature	Cloudflare AI Gateway	Portkey	LiteLLM Proxy
Pricing	Free unlimited	Usage-based	Free OSS
Caching	Built-in	Built-in	Basic
Analytics	Full dashboard	Advanced	Basic
Best For	Simplicity & scale	Enterprise	Self-hosted