Question About AI Gateway Production Rate Limits

sibitrung4-1680 · May 14, 2026, 5:07am

Hi Vercel team,

I’m currently using Vercel AI Gateway on the pay-as-you-go plan, and I’d like to better understand the production rate limits.

Could you clarify the following?

1. What are the current limits for pay-as-you-go accounts regarding:
   - Requests per minute (RPM)
   - Tokens per minute (TPM)
2. Are these limits applied globally at the account level, or separately per provider/model (OpenAI, Anthropic, Google, etc.)?
3. Do rate limits automatically increase over time based on account usage or spending, similar to OpenAI usage tiers?
4. If an application requires consistently high throughput (for example, serving hundreds of concurrent users), what is the best way to avoid or minimize rate limit issues in production?

Thanks!

nduc48138-1881 · May 14, 2026, 7:13am

Also curious about this. Would love to know how the limits work in real production usage.
