I'm seeing VERY strange behavior where AI streaming works locally but gets "cut off" on Vercel (production)
I have tried both the edge runtime and the nodejs runtime; both have the same issue
I don’t see any errors in the Vercel logs
I remember this happened in the past because I forgot to specify the runtime in my route.ts; back then I solved it by specifying the edge runtime
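In route.ts that's just a route segment config export, e.g. (the maxDuration line is optional and depends on your plan):

```ts
// app/api/chat/agentic/route.ts
export const runtime = "edge"; // or "nodejs"
export const maxDuration = 300; // optional; shows up as the 5m timeout in Vercel logs
```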
But all of a sudden the issue reappeared out of the blue, even though I haven't changed any code or deployed anything since 2 July 2025
I get the sense that something changed on the Vercel platform since that date and caused this issue. Here's my AgentExecutor setup:
```ts
import { AgentExecutor } from "langchain/agents";

// ...

return new AgentExecutor({
  agent,
  tools,
  returnIntermediateSteps: true,
  verbose: false,
  maxIterations: 15, // Reduced from 500 to prevent timeout issues
  handleParsingErrors: true,
  earlyStoppingMethod: "generate", // Stop when the agent generates a final answer
});
```
Has anyone experienced a similar issue? Any advice on how to solve it? I'm super frustrated; I already asked Vercel Support and they were useless (I'm on Vercel Pro)
I can confirm your functions are not timing out, and we aren't seeing any errors from the AI APIs you call (including LangChain)
If you check your Logs or Observability tabs, you can filter to the /api/chat/agentic route to see requests; each one shows both the runtime and the timeout (e.g. 3.5s / 5m), which helps confirm your settings are taking effect
I'm checking internally to see if there are any changes on our end that could have affected this for you. If you can narrow down a smaller window for when the issue started happening, it would help me dig deeper in the right place
Unfortunately, from Vercel's end this currently looks like a simple request that ends successfully. Since LangChain is the last step before the stream is returned, I'd double-check that LangChain is returning the full token count, and add some logging between what LangChain returns and what your function returns. If you can prove your API is returning data that gets truncated before it hits the browser, that would strongly point toward an issue on our end; right now it looks more like the AgentExecutor is producing the wrong output
From a technical point of view, LangChain's AgentExecutor returns a generator that can be consumed as a ReadableStream. Your API route returns this stream, so the error is in one of two places:
1. Your API route produces the whole stream (as it does locally), but in prod Vercel cuts it off after two lines; or
2. The AgentExecutor works correctly locally, but in prod it only generates two lines, which are then correctly returned to the browser.
Since there are no timeouts or errors, I suspect option 2 is the issue.
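For context, that plumbing usually looks something like the sketch below (illustrative only; `toByteStream` and the chunk serialization are placeholders, not your actual code). LangChain JS runnables expose `.stream()`, which yields an async iterable you can adapt into a web `ReadableStream`:

```ts
import { AgentExecutor } from "langchain/agents";

// Adapt the AgentExecutor's async-iterable stream into a byte stream
// suitable for a Response body.
export function toByteStream(
  executor: AgentExecutor,
  input: Record<string, unknown>,
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async start(controller) {
      // Each chunk is an intermediate step or a piece of the final output.
      for await (const chunk of await executor.stream(input)) {
        controller.enqueue(encoder.encode(JSON.stringify(chunk) + "\n"));
      }
      controller.close();
    },
  });
}
```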
You should be able to log the output server-side like this, and verify it in your prod environment:
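(A minimal sketch; `buildAgentByteStream` stands in for however your route currently builds the stream.)

```ts
// Tee the stream: one branch goes to the browser, the other is logged
// on the server so you can compare the two in the Vercel logs.
export async function POST(req: Request) {
  const { input } = await req.json();
  const stream = buildAgentByteStream(input); // placeholder for your existing logic
  const [toBrowser, toLogs] = stream.tee();

  // Drain the logging branch without blocking the response.
  (async () => {
    const reader = toLogs.getReader();
    const decoder = new TextDecoder();
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      console.log("[agent chunk]", decoder.decode(value, { stream: true }));
    }
    console.log("[agent stream] complete");
  })();

  return new Response(toBrowser, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```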
If you see the whole stream appearing in the logs but only two lines make it to the browser, then that would strongly point to an issue on our end
If you only see the two lines here, then that points to a LangChain issue. In that case I’d check their dashboard for any logs/hints, and add error handler callbacks to your AgentExecutor instance to see what’s truncating the output
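For the callbacks, another minimal sketch (the handler names come from LangChain JS's callback interface; `agent` and `tools` are your existing ones):

```ts
import { AgentExecutor } from "langchain/agents";

// Plain-object callback handlers are accepted in the callbacks array;
// any error that truncates the run should surface here.
const executor = new AgentExecutor({
  agent, // your existing agent
  tools, // your existing tools
  callbacks: [
    {
      handleChainError(err: Error) {
        console.error("[chain error]", err);
      },
      handleLLMError(err: Error) {
        console.error("[llm error]", err);
      },
      handleToolError(err: Error) {
        console.error("[tool error]", err);
      },
    },
  ],
});
```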
If you don’t want to push those logs to prod, you can make a new branch and try it in a Preview environment. If for some reason it works in Preview on Vercel but NOT prod, then it’s usually an issue with either environment variables or devDependencies, but you can debug further in that direction once it’s narrowed down
The stream chunks don't even seem to show up in the Vercel logs. I do see them in my localhost logs.
I noticed that in LangSmith it does appear to be returning the stream (see below)