Vercel's AI Gateway abstracts access to hundreds of AI models through a single interface, now powered by Fluid compute with Active CPU pricing. Because AI Gateway spends 92% of its runtime waiting on provider responses, Fluid's cost model charges only for actual CPU time, cutting billed compute from 100% to 8% of runtime. The architecture demonstrates how modern serverless can evolve to serve network-bound AI workloads efficiently.
Infrastructure
How AI Gateway runs on Fluid compute
Vercel's AI Gateway cuts compute costs from 100% to 8% of runtime by switching to Fluid's Active CPU pricing, which bills only for actual CPU execution rather than for idle time spent waiting on external AI provider responses.
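The billing difference can be sketched with a simple cost comparison. This is an illustrative model only: the rate, runtime, and function names are hypothetical and do not reflect Vercel's actual pricing or APIs; it assumes the 92%-idle / 8%-CPU split described above.

```typescript
// Sketch: wall-clock billing vs. Active CPU billing for a
// network-bound AI Gateway request. All numbers are illustrative.

function wallClockCost(runtimeMs: number, ratePerMs: number): number {
  // Traditional serverless: billed for the entire runtime,
  // including idle time spent waiting on upstream AI providers.
  return runtimeMs * ratePerMs;
}

function activeCpuCost(
  runtimeMs: number,
  cpuFraction: number,
  ratePerMs: number
): number {
  // Active CPU pricing: billed only for the CPU-busy slice of runtime.
  return runtimeMs * cpuFraction * ratePerMs;
}

const runtimeMs = 10_000;   // hypothetical total request runtime
const cpuFraction = 0.08;   // 8% CPU-busy, 92% waiting on providers
const ratePerMs = 0.000001; // hypothetical $/ms rate

const before = wallClockCost(runtimeMs, ratePerMs);
const after = activeCpuCost(runtimeMs, cpuFraction, ratePerMs);
console.log(after / before); // → 0.08: billing drops to 8% of runtime
```

With a workload that is mostly idle, the billed amount scales with `cpuFraction` rather than total runtime, which is why the reduction tracks the 8% CPU-busy figure directly.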
Monday, April 6, 2026, 12:00 PM UTC · 2 min read · Source: Vercel Blog
Tags
infrastructure