On the AI Gateway, the pricing for Groq caching seems to be missing. Does caching still apply? Another thing: the pricing tables seem to be broken. I can't scroll horizontally, so I have to drag-select instead.
Hi @kop7er - thanks for flagging this! Caching will still apply, we just added the pricing so it should be visible now in the UI. Horizontal scroll coming as well - stay tuned.
Thank you @jerilynzheng. I have a few more Groq problems for you, except these ones are actually causing problems in production. I tried opening a ticket, but I don't have a paid account yet since I'm testing the platform first.
All of these issues occurred when trying GPT-OSS-120B and GPT-OSS-20B.
A good chunk of the time I get a 503 Service Unavailable, and only with Groq. It seems to be fixed when I add my own key, which defeats the whole purpose.
The reasoning token count is always 0. Their API returns the correct count, so it's not being extracted correctly by the AI Gateway.
The reasoning_effort option isn't being passed down to Groq's API properly. I noticed that I always got a ton of reasoning when setting it to low. I tried the exact same request directly against their API and got the expected amount. Then I tried once more with no effort configured at all, letting it default, and got a 1:1 match with what I get through the AI Gateway.
Sorry @jerilynzheng for keeping on posting new issues here, but all of these are currently happening, and this is the most direct way I have.
Because Groq kept failing, I added GPT-4.1-Mini as a fallback. I keep sending the same prompts, yet nothing caches at all on OpenAI, while it does on Groq.
Hi! Groq is an overloaded provider, so if you want to use the AI Gateway, it's better not to restrict to any specific provider in case they have capacity issues for a specific model. We have fallbacks and multiple providers for this exact reason. What are you using to try out these models - the AI SDK or something else? I can help more on the reasoning side if we get more details there.
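For reference, the "don't pin a provider" idea can be sketched like this - a minimal, hypothetical helper where the `callModel` signature and model ids are illustrative, not a real Gateway API:

```typescript
// Hypothetical sketch: try gateway model ids in order and fall back
// when a provider errors (e.g. a 503 from an overloaded provider).
// callModel stands in for whatever client you use (AI SDK, fetch, ...).
type CallModel = (modelId: string) => Promise<string>;

async function completeWithFallback(
  models: string[],
  callModel: CallModel,
): Promise<string> {
  let lastError: unknown = new Error('no models given');
  for (const modelId of models) {
    try {
      return await callModel(modelId); // first success wins
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  throw lastError;
}
```

With the Gateway itself you'd get similar behavior by simply not restricting providers, since it can route around capacity issues on its own.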
On the caching side, some providers require a minimum prompt size to trigger caching - I believe this is the case with OpenAI.
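If it helps to check: OpenAI's automatic prompt caching is documented to kick in only for prompts of roughly 1024+ tokens, and cache hits show up in the usage object of the response. A small sketch of reading that field, assuming the OpenAI-style chat-completions usage shape:

```typescript
// Sketch: read cached prompt tokens from an OpenAI-style usage object.
// Short prompts won't meet the caching minimum, so this stays 0 for them.
interface Usage {
  prompt_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

function cachedPromptTokens(usage: Usage): number {
  return usage.prompt_tokens_details?.cached_tokens ?? 0;
}
```

If this stays 0 for a long, repeated prompt, that would be worth digging into further.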
Hey, about reasoning_effort: when you take the exact same body and request a completion, you get two very different amounts of reasoning. Groq's API returns the expected amount when it's set to low, about 15 tokens. The AI Gateway, with the exact same body, returns something like 500 tokens, and the reasoning text is the same as if you hadn't set an effort at all.
With the AI Gateway, at least on Groq with the GPT OSS models, setting reasoning_effort doesn't change anything at all.
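To make the comparison concrete, this is roughly the body I'm sending to both endpoints. The model id and prompt are just examples; reasoning_effort is the standard OpenAI-compatible chat-completions field:

```typescript
// Sketch of the A/B test: build one OpenAI-compatible body and send it
// both directly to Groq and through the AI Gateway, then compare
// reasoning token counts in the two responses.
interface ChatBody {
  model: string;
  messages: { role: 'user' | 'system' | 'assistant'; content: string }[];
  reasoning_effort?: 'low' | 'medium' | 'high';
}

function buildBody(
  model: string,
  prompt: string,
  effort?: ChatBody['reasoning_effort'],
): ChatBody {
  const body: ChatBody = {
    model,
    messages: [{ role: 'user', content: prompt }],
  };
  if (effort) body.reasoning_effort = effort; // should pass through to the provider unchanged
  return body;
}

// Hypothetical usage - the same body, two endpoints:
// await fetch('https://api.groq.com/openai/v1/chat/completions', {
//   method: 'POST',
//   headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
//   body: JSON.stringify(buildBody('openai/gpt-oss-120b', 'Hello', 'low')),
// });
```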