Parsing custom XML tool calls from LongCat Flash models in Vercel AI SDK

I have been using LongCat Flash models quite extensively. In long-running tasks that involve many tool calls, they occasionally emit a tool call as custom XML, like this:

<longcat_tool_call>
  readFile 
  <longcat_arg_key>path</longcat_arg_key> 
  <longcat_arg_value>./app/pages/index.vue</longcat_arg_value> 
</longcat_tool_call>
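
For reference, the parsing step on its own can be sketched in a few lines of TypeScript. This is a minimal sketch and not a port of the vLLM parser: `parseLongcatToolCalls` and `ParsedToolCall` are hypothetical names, it assumes the whole text is already buffered (no partial tags from streaming), and it keeps every argument value as a raw string because I do not know how the model encodes non-string values.

```typescript
// Hypothetical result shape, not an AI SDK type.
interface ParsedToolCall {
  toolName: string;
  args: Record<string, string>;
}

// Extracts every <longcat_tool_call> block from fully buffered text.
function parseLongcatToolCalls(text: string): ParsedToolCall[] {
  const calls: ParsedToolCall[] = [];
  const blockRe = /<longcat_tool_call>([\s\S]*?)<\/longcat_tool_call>/g;

  let block: RegExpExecArray | null;
  while ((block = blockRe.exec(text)) !== null) {
    const body = block[1];

    // The tool name is whatever precedes the first argument key tag.
    const firstKey = body.indexOf("<longcat_arg_key>");
    const toolName = (firstKey === -1 ? body : body.slice(0, firstKey)).trim();
    if (toolName === "") continue; // skip malformed blocks without a name

    // Collect key/value pairs in the order they appear.
    const pairRe =
      /<longcat_arg_key>([\s\S]*?)<\/longcat_arg_key>\s*<longcat_arg_value>([\s\S]*?)<\/longcat_arg_value>/g;
    const args: Record<string, string> = {};
    let pair: RegExpExecArray | null;
    while ((pair = pairRe.exec(body)) !== null) {
      args[pair[1].trim()] = pair[2].trim();
    }

    calls.push({ toolName, args });
  }
  return calls;
}
```

In a streaming setup the text would have to be buffered until the closing `</longcat_tool_call>` tag arrives before calling a function like this.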

vLLM has a parser for this tool call format in vllm/vllm/tool_parsers/longcat_tool_parser.py, but I don’t know how to implement the equivalent with the Vercel AI SDK, specifically with the OpenAI-compatible adapter.

Current Issues

I initially wanted to use transform streams, but encountered several issues:

  1. Errors like "id must be a string".
  2. When I was able to enqueue a tool-call message, generation stopped anyway, because the finish-step event was still sent with "finish-reason": "stop".
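
Taken together, the two symptoms above suggest that an injected tool call needs a string id, and that the finish reason has to be rewritten when a tool call was recovered from the text. Below is a minimal sketch of just that per-part logic, which could sit inside a transform stream or a middleware. The part shapes are local stand-ins: the field names (`toolCallId`, `toolName`, `input`, `finishReason`) and the "tool-calls" value are assumptions about the SDK's stream parts and may differ between versions, and `makeToolCallPart` and `patchFinish` are hypothetical helpers.

```typescript
// Local stand-ins for the SDK's stream part shapes (assumed field names).
interface ToolCallPart {
  type: "tool-call";
  toolCallId: string;
  toolName: string;
  input: string; // JSON-encoded arguments
}

interface FinishPart {
  type: "finish";
  finishReason: string;
}

// Hypothetical helper: builds a tool-call part whose id is always a string.
function makeToolCallPart(
  toolName: string,
  args: Record<string, string>,
  index: number,
): ToolCallPart {
  return {
    type: "tool-call",
    toolCallId: `longcat-call-${index}`,
    toolName,
    input: JSON.stringify(args),
  };
}

// Hypothetical helper: if tool calls were recovered from the text, a "stop"
// finish reason is rewritten to "tool-calls"; otherwise the part is unchanged.
function patchFinish(part: FinishPart, recoveredCalls: number): FinishPart {
  if (recoveredCalls > 0 && part.finishReason === "stop") {
    return { ...part, finishReason: "tool-calls" };
  }
  return part;
}
```

The index-based id is only unique within one response; across several steps a random or step-prefixed id would probably be needed. Whether rewriting the finish reason is enough to keep the loop going is also an assumption, since that may depend on the multi-step settings in use.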

Question

I see some people suggesting that I use middleware for this. Is that the recommended way to handle custom tool call formats, or is there a better one? I suspect transform streams are not the right approach, but I am honestly not sure.