Problem
For gemini-3-flash-preview, there is support for a non-thinking mode through thinking_level (or the OpenAI-compatible reasoning_effort) when a minimal value is provided. However, I'm unable to enable this mode when going through AI Gateway at the https://ai-gateway.vercel.sh/v1/chat/completions endpoint.
The documentation says that the unified reasoning options are mapped to the matching config options for each provider, but I can't get this to work with the minimal setting.
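My reading of that documented mapping, expressed as a sketch (the translation below is my assumption of what the gateway should do for Google models, not confirmed behavior; the thinking_level values are guesses):

```python
def map_reasoning_to_google(reasoning: dict) -> dict:
    """Sketch of how I expect the gateway to translate the unified
    reasoning options into Google's thinking config (assumed mapping)."""
    disabled = not reasoning.get("enabled", True)
    if disabled or reasoning.get("effort") in ("none", "minimal"):
        # Expectation: these should become thinking_level="minimal"
        # (or reasoning_effort="minimal") on the provider request.
        return {"thinking_level": "minimal"}
    # Otherwise pass the requested effort through.
    return {"thinking_level": reasoning.get("effort", "high")}

print(map_reasoning_to_google({"enabled": False}))
print(map_reasoning_to_google({"effort": "minimal"}))
```

That is the behavior I expected from the request below, but it doesn't appear to happen.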
Current Behavior
None of the following combinations seem to work:
- reasoning.enabled set to false
- reasoning.effort set to none or minimal
Although no reasoning output is present in the response when reasoning.enabled is false, I consistently see reasoning_tokens reported in the usage. The response duration also suggests that reasoning was still performed, compared with the same request sent directly to the provider.
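To make that comparison concrete, this is the check I run on each response body (a minimal sketch assuming the OpenAI-compatible usage shape, where reasoning tokens are reported under completion_tokens_details.reasoning_tokens; the sample numbers are illustrative, not real measurements):

```python
def reasoning_tokens(response: dict) -> int:
    """Extract the reasoning token count from an OpenAI-compatible
    chat completion response, defaulting to 0 when absent."""
    usage = response.get("usage", {})
    details = usage.get("completion_tokens_details") or {}
    return details.get("reasoning_tokens", 0)

# Illustrative response shapes: via the gateway vs. direct-to-provider.
gateway_response = {
    "usage": {
        "completion_tokens": 120,
        "completion_tokens_details": {"reasoning_tokens": 85},
    }
}
direct_response = {
    "usage": {
        "completion_tokens": 35,
        "completion_tokens_details": {"reasoning_tokens": 0},
    }
}

print(reasoning_tokens(gateway_response))  # non-zero via the gateway
print(reasoning_tokens(direct_response))   # zero when sent directly
```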
I’ve also tried using providerOptions to set thinking and reasoning levels, but to no avail.
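For reference, the providerOptions shapes I tried looked roughly like the following (field names here are hypothetical, modeled on the AI SDK's google provider options; I don't know whether the gateway's HTTP endpoint forwards any of them):

```python
base = {
    "model": "google/gemini-3-flash",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Variant 1: thinkingConfig with a zero thinking budget (shape borrowed
# from the AI SDK's google provider options; hypothetical at the HTTP layer).
variant_thinking_budget = {
    **base,
    "providerOptions": {"google": {"thinkingConfig": {"thinkingBudget": 0}}},
}

# Variant 2: a minimal thinking level (also hypothetical).
variant_thinking_level = {
    **base,
    "providerOptions": {"google": {"thinkingConfig": {"thinkingLevel": "minimal"}}},
}
```

Neither variant changed the reported reasoning_tokens for me.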
Request Example
curl -X POST "https://ai-gateway.vercel.sh/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_AI_GATEWAY_API_KEY" \
-d '{
"model": "google/gemini-3-flash",
"messages": [{"role": "user", "content": "Hello"}],
"stream": false,
"reasoning": {"enabled": false, "effort": "minimal"}
}'
I understand that minimal !== "no reasoning"; but for the same request sent directly to the provider, I can consistently confirm that no reasoning is actually executed.
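For completeness, the direct-to-provider request I compare against looks roughly like this (a sketch; the endpoint URL comes from Google's OpenAI-compatibility layer, and the accepted reasoning_effort values are my assumption):

```python
import json

# Google's OpenAI-compatible endpoint (assumed from the Gemini
# OpenAI-compatibility docs; authenticate with your Gemini API key).
GEMINI_OPENAI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"
)

def build_direct_payload(effort: str = "none") -> dict:
    """Build the request body I POST straight to the provider.

    Sent directly with a minimal/none effort, the response reports
    zero reasoning tokens; via the gateway it does not.
    """
    return {
        "model": "gemini-3-flash-preview",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,
        "reasoning_effort": effort,  # assumed values: "none", "minimal", ...
    }

payload = build_direct_payload("none")
print(json.dumps(payload, indent=2))
```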
Did someone succeed with a similar task, or have a suggestion on how to achieve it?