According to the AI Gateway pricing page, the total cost for these requests (input + output tokens) should be roughly 7x lower than what I'm actually being charged.
Input ~30k tokens + output ~200 tokens should come to about $0.01325, according to the pricing list.
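For reference, the expected cost above can be sanity-checked from per-million-token rates. This is a minimal sketch; the rates used below are placeholder values, not actual z.ai/GLM pricing, so substitute the numbers from the pricing list:

```typescript
// Sanity-check an expected request cost from per-million-token rates.
// NOTE: the example rates here are hypothetical placeholders, not real pricing.
function requestCost(
  inputTokens: number,
  outputTokens: number,
  inputRatePerM: number, // USD per 1M input tokens
  outputRatePerM: number, // USD per 1M output tokens
): number {
  return (inputTokens * inputRatePerM + outputTokens * outputRatePerM) / 1e6;
}

// e.g. 30k input + 200 output at a hypothetical $0.50/M in, $2.00/M out:
const expected = requestCost(30_000, 200, 0.5, 2.0); // ≈ $0.0154
```

Comparing this number against the credits actually decremented is a quick way to see how far off the billed provider's rates are.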
Credits are being decremented in line with the reported usage, so it's not a UI error. It also doesn't appear to be due to reasoning tokens: the AI SDK logs them as 0, and the response speed supports that. I suspect it's Cerebras, as their pricing is substantially higher than z.ai's. Is there any way to force the provider to be z.ai to confirm?
Hi - we’ve added new columns (Reasoning, Cache Read, Cache Write) to help match requests with costs. You can also set a provider order (https://vercel.com/docs/ai-gateway/models-and-providers/provider-options#configure-the-provider-order-in-your-request) to choose your preferred providers for a model if you want to optimize for cost. Pricing can vary by provider for a single model; the exact per-provider prices we have are listed here: GLM 4.7 on Vercel AI Gateway. The main pricing page shows the lowest cost per model, so the per-provider view is more accurate.
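To make the provider-order suggestion concrete, here is a minimal sketch of how the request options could be built. The `gateway.order` option shape and the `'zai'` provider slug are assumptions taken from the linked docs page, so verify them there before relying on this:

```typescript
// Build providerOptions that pin AI Gateway routing to a preferred provider list.
// ASSUMPTION: the `gateway.order` option and the 'zai' slug follow the linked
// provider-options docs; confirm the exact slugs for your model there.
function gatewayOrder(providers: string[]) {
  return { gateway: { order: providers } };
}

// Usage with the AI SDK (requires the `ai` package and a Gateway API key):
// const result = await generateText({
//   model: 'zai/glm-4.7',
//   prompt: '...',
//   providerOptions: gatewayOrder(['zai']), // route to z.ai first
// });
```

Pinning the order to `['zai']` and re-running the same request should confirm whether the cost difference comes from Cerebras routing.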
Hey there, @employersai! Just checking in on your pricing issue for zai/glm-4.7 in AI Gateway. Have you found a solution, or do you still need help with it? Let me know!