Vercel API Gateway rate limiting: how are limits defined and scaled?

Hi everyone,

I’m trying to better understand the rate limiting behavior of the Vercel API Gateway.

I couldn’t find clear details in the documentation about limits such as:

  • Tokens per minute (TPM)
  • Requests per minute (RPM)

On platforms like OpenAI, rate limits are typically structured by usage tier (higher tiers get higher limits).

So I have a few questions:

  1. Does Vercel API Gateway enforce rate limits (RPM/TPM or similar)?
  2. If yes, how are these limits determined? Are they based on plan, usage, or purchased credits?
  3. Is there any automatic scaling of limits as usage increases?
  4. Are there best practices to avoid hitting limits when building production applications?
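
For question 4, the only mitigation I have in mind so far is client-side retry with exponential backoff when a request is rejected for rate limiting. Below is a minimal Python sketch of that idea. The names `RateLimitError`, `call_with_backoff`, and `send` are hypothetical and not part of any Vercel SDK, and I'm assuming (not confirming) that throttling would show up as an HTTP 429, possibly with a Retry-After header:

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error my own client code would raise on an HTTP 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        # Seconds to wait, taken from a Retry-After header if one was sent (assumption).
        self.retry_after = retry_after


def call_with_backoff(send, max_retries=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call send() and retry on RateLimitError using exponential backoff with full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            if err.retry_after is not None:
                # Use the server's hint when present, capped at max_delay.
                delay = min(max_delay, err.retry_after)
            else:
                # Otherwise wait a random time up to base_delay * 2^attempt, capped.
                cap = min(max_delay, base_delay * 2 ** attempt)
                delay = random.uniform(0, cap)
            sleep(delay)
```

Here `send` would be whatever function actually issues the request and raises `RateLimitError` when it is throttled. The jitter is there so that many clients retrying at once don't all hit the gateway at the same moment. I'd like to know whether this is enough, or whether the gateway expects something different.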

For context, I plan to use the API Gateway with paid usage (buying credits), and I want to understand how it performs under high traffic in production.

Thanks in advance for any clarification!