Can't disable reasoning for zai/glm-5.1

hugopod-8255 · June 10, 2026, 12:23pm

I’m experiencing two related issues with zai/glm-5.1 through AI Gateway:

Reasoning cannot be disabled. The documented controls (reasoning: { "effort": "none" }, reasoning: { "enabled": false }) are accepted but not honored by 5 of the 6 providers serving this model. Only baseten actually disables thinking.

Context: we’re benchmarking GLM-5.1 for a latency-critical voice agent, so ~100–250 hidden reasoning tokens before the first content token (2.5–10s of added TTFT/TTFAT) makes the model unusable.

Issue 1: reasoning disable not forwarded/honored

provider	reasoning option	total time	out_tokens	reasoning_tokens
baseten	`{"enabled": false}`	449 ms	1	0
baseten	`{"effort": "none"}`	423 ms	1	0
togetherai	`{"enabled": false}`	3995 ms	176	158
togetherai	`{"effort": "none"}`	5711 ms	102	85
fireworks	`{"enabled": false}`	4444 ms	182	180
fireworks	`{"effort": "none"}`	2841 ms	189	178
deepinfra	`{"enabled": false}`	3863 ms	274	265
deepinfra	`{"effort": "none"}`	2552 ms	124	108
novita	`{"enabled": false}`	6290 ms	156	153
novita	`{"effort": "none"}`	9876 ms	188	185
zai	`{"enabled": false}`	4951 ms	159	156
zai	`{"effort": "none"}`	4731 ms	164	161

Notes:

Per the advanced configuration docs, effort: "none" “disables reasoning”. It doesn’t for this model on 5/6 providers.
{"enabled": false} is worse than a no-op: it hides the reasoning text from the stream but the tokens are still generated and billed (reasoning_chars=0 while reasoning_tokens=150+). That’s easy to misread as “thinking disabled” while you still pay the full latency and cost.
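
To make the second note concrete, here is a minimal sketch of how a benchmark harness could tell "thinking disabled" apart from "thinking hidden but still billed". The `RunMetrics` shape and the `classifyReasoning` helper are hypothetical names for illustration; only `reasoning_tokens` and `reasoning_chars` come from the measurements above, and this is not an AI SDK or Gateway type.

```typescript
// Hypothetical shape: the fields mirror the metrics quoted in this post,
// not any specific SDK or Gateway response type.
interface RunMetrics {
  provider: string;
  reasoningTokens: number; // reasoning_tokens reported in usage
  reasoningChars: number; // characters of reasoning text seen in the stream
}

type ReasoningState = "disabled" | "hidden-but-billed" | "visible";

// Trust the token count, not the absence of reasoning text in the stream.
function classifyReasoning(run: RunMetrics): ReasoningState {
  if (run.reasoningTokens === 0) return "disabled";
  return run.reasoningChars === 0 ? "hidden-but-billed" : "visible";
}

// Values taken from the table above (both rows use {"enabled": false}).
const basetenRun: RunMetrics = { provider: "baseten", reasoningTokens: 0, reasoningChars: 0 };
const zaiRun: RunMetrics = { provider: "zai", reasoningTokens: 156, reasoningChars: 0 };

console.log(classifyReasoning(basetenRun), classifyReasoning(zaiRun));
```

The point of the check is that an empty reasoning stream alone proves nothing; only `reasoning_tokens: 0` does.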

seething-tabby · June 11, 2026, 7:29am

Hi,
While this isn't specific to GLM, I find that setting it through providerOptions works better for most of the models I've used.
You can find the reasoning config for each model in its official documentation (for GLM: "Thinking Mode" in the Z.AI developer documentation), and you may need to check whether different providers expect a different config (e.g. Bedrock).

"providerOptions": {
    "gateway": {
        "only": ["zai", "novita"],
        "sort": "ttft"
    },
    "zai": {"thinking": {"type": "disabled"}},
    "novita": {"thinking": {"type": "disabled"}}
}

Though that does mean you need to set them for each provider that you use.
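
If repeating the block per provider gets tedious, a small helper can build it. This is only a sketch: `buildProviderOptions` is a hypothetical name, and it assumes every listed provider accepts the same `thinking: { type: "disabled" }` key shown above, which should be verified per provider before relying on it.

```typescript
// Hypothetical helper: builds the providerOptions object shown above for
// any list of providers. Assumes (unverified) that each provider accepts
// the same `thinking: { type: "disabled" }` option.
function buildProviderOptions(providers: string[]): Record<string, unknown> {
  const options: Record<string, unknown> = {
    // Restrict Gateway routing to the listed providers, sorted by TTFT.
    gateway: { only: [...providers], sort: "ttft" },
  };
  for (const provider of providers) {
    options[provider] = { thinking: { type: "disabled" } };
  }
  return options;
}

const providerOptions = buildProviderOptions(["zai", "novita"]);
console.log(JSON.stringify(providerOptions, null, 2));
```

For `["zai", "novita"]` this produces the same object as the snippet above.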

ryux1 · June 12, 2026, 7:35pm

Hi Hugopod,

Your table makes this look less like a client-side syntax issue and more like provider-specific behavior behind the Gateway.

One thing I’d do for the benchmark is pin the provider instead of letting Gateway route across all providers, since you already found that Baseten is the only one honoring the disable option:

providerOptions: {
  gateway: {
    only: ["baseten"]
  },
  baseten: {
    thinking: { type: "disabled" }
  }
}

Then run the same prompt a few times and compare:

- first token latency
- total output tokens
- reasoning_tokens
- provider metadata / selected provider

If Baseten consistently returns reasoning_tokens: 0 and the others do not, then I’d avoid fallback routing for this latency-critical path unless each fallback provider has its own verified native “thinking disabled” option.
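
A rough sketch of that comparison, assuming you collect one record per run. The `BenchRun` shape and `summarize` function are hypothetical names for illustration, and the sample rows are copied from the table in the original post (they mix both disable options per provider).

```typescript
// Hypothetical record for one benchmark run; fields mirror the table columns.
interface BenchRun {
  provider: string;
  totalMs: number;
  outTokens: number;
  reasoningTokens: number;
}

interface ProviderSummary {
  runs: number;
  meanTotalMs: number;
  maxReasoningTokens: number;
  honorsDisable: boolean; // true only if every run reported 0 reasoning tokens
}

function summarize(runs: BenchRun[]): Record<string, ProviderSummary> {
  // Group runs by provider.
  const grouped: Record<string, BenchRun[]> = {};
  for (const run of runs) {
    (grouped[run.provider] = grouped[run.provider] || []).push(run);
  }
  // Reduce each group to the numbers worth comparing.
  const summary: Record<string, ProviderSummary> = {};
  for (const provider of Object.keys(grouped)) {
    const list = grouped[provider];
    const maxReasoningTokens = Math.max(...list.map((r) => r.reasoningTokens));
    summary[provider] = {
      runs: list.length,
      meanTotalMs: list.reduce((sum, r) => sum + r.totalMs, 0) / list.length,
      maxReasoningTokens,
      honorsDisable: maxReasoningTokens === 0,
    };
  }
  return summary;
}

// Sample rows taken from the table in the original post.
const sampleRuns: BenchRun[] = [
  { provider: "baseten", totalMs: 449, outTokens: 1, reasoningTokens: 0 },
  { provider: "baseten", totalMs: 423, outTokens: 1, reasoningTokens: 0 },
  { provider: "zai", totalMs: 4951, outTokens: 159, reasoningTokens: 156 },
  { provider: "zai", totalMs: 4731, outTokens: 164, reasoningTokens: 161 },
];

const benchSummary = summarize(sampleRuns);
console.log(benchSummary);
```

Any provider where `honorsDisable` comes back false is one I would keep out of the `only` list for the voice path.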

I’d also be careful with reasoning: { effort: "none" } as a portable assumption here. In the AI SDK docs, none is only supported for specific OpenAI GPT-5.1 models, so for GLM via multiple providers I’d expect provider-specific options to be more reliable than a single normalized field.

For this specific case, the strongest support report would be exactly what you already collected: same model, same prompt, same Gateway request shape, pinned provider, and reasoning_tokens still generated despite a disable option.
