Need help stabilizing AI orchestration in V0/Vercel app

Hi everyone,
I’ve been building an AI-powered party planning app using:
V0
Next.js
Vercel
OpenAI APIs
The app generates complete themed party plans with:
“Muse” personalities/styles
menus
decor
playlists
vibe descriptions
timelines
effort levels
variations
Originally the prototype worked surprisingly well.
However, after expanding the metadata/intelligence layer for ~10 muses and more detailed theme matching, I’m now running into major issues:
Current Problems
outputs becoming inconsistent
muses mismatching menus/vibes/themes
variation logic failing
repeated content
generation times increasing to 60+ seconds
app feels unstable after metadata expansion
Example:

An Asian tea party used to pair with the right scene and loaded without much delay. Now it loads with the wrong muse and the wrong content (see screenshots).
A Bridgerton-themed party might get paired with incorrect menu styles or generic outputs that don’t match the intended vibe.
My Suspicion
I think I may have:
too much logic inside prompts
schema drift
overly large context windows
weak deterministic mappings
orchestration problems between AI-generated sections
Looking For
Would love advice on:
architecture cleanup
prompt orchestration
structured metadata approaches
caching/performance optimization
separating deterministic vs generative logic
improving generation speed
Also open to paid freelance/consulting help from someone experienced with:
Vercel
AI apps
Next.js
prompt orchestration
Thanks! Adeline

https://v0-party-plan-premium.vercel.app/index.html

Hi Adeline,

Your suspicion sounds pretty reasonable. When an AI app grows from “one prompt generates everything” into multiple muses, menus, timelines, variations, etc., the fragile part is usually that too much decision-making is left inside the prompt.

I’d separate this into a more deterministic pipeline:

1. User selects / describes the party
2. App deterministically chooses the muse/theme metadata from a fixed JSON or database table
3. AI generates only the sections that actually need language generation
4. Output is validated against a schema
5. Failed or mismatched sections are retried individually, not by regenerating the whole plan

The biggest fix I’d make is: don’t ask the model to both choose the correct muse and generate the entire party plan in one large prompt. Pick the muse in code first, then pass only that muse’s allowed values into the generation step.

For example:

const selectedMuse = getMuseForTheme(theme)

const prompt = `
Generate a party plan using ONLY this muse:

${JSON.stringify(selectedMuse)}

Do not invent a different muse.
Return menu, decor, playlist, timeline, and vibe using the provided muse rules.
`
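
In case it helps, here is a rough sketch of what a deterministic getMuseForTheme could look like. The Muse fields, the sample muse ids, the theme keywords, and the fallback muse are all made up for illustration; your real metadata will have its own shape.

```typescript
// Hypothetical muse record; field names are illustrative, not from the original app.
interface Muse {
  id: string;
  allowedThemes: string[]; // lowercase keywords this muse is allowed to serve
  menuStyle: string;
  tone: string;
}

// Sample data only. In a real app this would come from a JSON file or a database table.
const MUSES: Muse[] = [
  {
    id: "regency-hostess",
    allowedThemes: ["bridgerton", "regency", "garden party"],
    menuStyle: "afternoon tea, petits fours",
    tone: "elegant",
  },
  {
    id: "tea-master",
    allowedThemes: ["asian tea party", "tea ceremony"],
    menuStyle: "loose-leaf teas, dim sum",
    tone: "calm",
  },
];

// Used when no keyword matches, so the model never has to guess a muse.
const FALLBACK_MUSE: Muse = {
  id: "classic-host",
  allowedThemes: [],
  menuStyle: "crowd-pleasing buffet",
  tone: "warm",
};

// Normalise free text so "  Bridgerton " and "bridgerton" compare equal.
function normalizeTheme(theme: string): string {
  return theme.trim().toLowerCase().replace(/\s+/g, " ");
}

// Pure lookup: the same theme always returns the same muse, with no LLM call.
function getMuseForTheme(theme: string): Muse {
  const key = normalizeTheme(theme);
  const match = MUSES.find((muse) =>
    muse.allowedThemes.some((keyword) => key.includes(keyword))
  );
  return match ?? FALLBACK_MUSE;
}
```

Because the lookup is plain code, a Bridgerton theme cannot end up with the tea muse unless the table itself says so, and a mismatch becomes a data bug you can fix in one place.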

For the repeated / drifting output, I’d also move to structured output instead of free-form text. With the AI SDK, that usually means defining a Zod schema for the exact shape you expect, so the model has to return fields like:

{
  museId: string,
  theme: string,
  menu: string[],
  decor: string[],
  playlist: string[],
  timeline: string[],
  effortLevel: "easy" | "medium" | "high"
}
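
To make the shape check concrete, here is a dependency-free stand-in for that schema. It is not Zod and not the AI SDK; it is a hand-written validator over the same fields, and validatePlan and parsePlan are names I made up. With Zod the same rules would live in the schema definition instead.

```typescript
type EffortLevel = "easy" | "medium" | "high";

interface PartyPlan {
  museId: string;
  theme: string;
  menu: string[];
  decor: string[];
  playlist: string[];
  timeline: string[];
  effortLevel: EffortLevel;
}

const EFFORT_LEVELS: string[] = ["easy", "medium", "high"];

// True only for a non-empty array whose entries are all non-blank strings.
function isNonEmptyStringArray(value: unknown): value is string[] {
  return (
    Array.isArray(value) &&
    value.length > 0 &&
    value.every((item) => typeof item === "string" && item.trim() !== "")
  );
}

// Returns a list of problems; an empty list means the value matches the shape.
function validatePlan(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["plan is not an object"];
  }
  const plan = value as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of ["museId", "theme"]) {
    if (typeof plan[key] !== "string" || plan[key] === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }
  for (const key of ["menu", "decor", "playlist", "timeline"]) {
    if (!isNonEmptyStringArray(plan[key])) {
      errors.push(`${key} must be a non-empty string array`);
    }
  }
  if (EFFORT_LEVELS.indexOf(plan.effortLevel as string) === -1) {
    errors.push("effortLevel must be easy, medium, or high");
  }
  return errors;
}

// Parses raw model output and reports why it was rejected, if it was.
function parsePlan(json: string): { plan?: PartyPlan; errors: string[] } {
  let value: unknown;
  try {
    value = JSON.parse(json);
  } catch {
    return { errors: ["response is not valid JSON"] };
  }
  const errors = validatePlan(value);
  return errors.length === 0 ? { plan: value as PartyPlan, errors } : { errors };
}
```

The error list is the useful part: it tells you which section drifted, so you can log it or retry just that section.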

Then you can add a simple check after generation:

if (result.museId !== selectedMuse.id) {
  // retry only this generation with a stricter prompt
}

That gives you a way to catch “Bridgerton theme got the wrong menu/muse” instead of silently showing bad content.
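
A minimal sketch of that retry step, assuming each section is produced by some async function you already have. SectionResult, Generate, and generateWithRetry are hypothetical names, and the generate callback stands in for whatever actually calls the model.

```typescript
// Hypothetical result for one section (menu, decor, ...).
interface SectionResult {
  museId: string;
  items: string[];
}

// The attempt number is passed in so the caller can tighten the prompt on retries.
type Generate = (attempt: number) => Promise<SectionResult>;

// Retries a single section until it matches the muse chosen in code,
// instead of regenerating the whole plan.
async function generateWithRetry(
  generate: Generate,
  expectedMuseId: string,
  maxAttempts = 3
): Promise<SectionResult> {
  let lastProblem = "no attempts made";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await generate(attempt);
    if (result.museId !== expectedMuseId) {
      lastProblem = `expected muse ${expectedMuseId}, got ${result.museId}`;
      continue;
    }
    if (result.items.length === 0) {
      lastProblem = "section came back empty";
      continue;
    }
    return result;
  }
  // Surface the failure so the UI can show a fallback rather than bad content.
  throw new Error(`Section failed after ${maxAttempts} attempts: ${lastProblem}`);
}
```

Keeping maxAttempts small matters here, since every retry adds another model call to the total wait.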

For the 60+ second generation time, I’d avoid generating every section in one blocking request. Either cache stable muse/theme metadata, stream the response, or split generation into smaller steps such as plan outline → menu → decor → timeline. The AI SDK structured output docs are useful for this pattern:
https://ai-sdk.dev/docs/ai-sdk-core/generating-structured-data
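
As one way to combine the caching and splitting ideas, the sketch below runs the sections in parallel and reuses earlier results for the same muse and theme. The names are hypothetical and the generate callback is a placeholder for your model call. An in-memory Map like this only lives as long as one server instance, so on Vercel you would likely want a shared store instead.

```typescript
type SectionName = "menu" | "decor" | "playlist" | "timeline";

// Placeholder for whatever function calls the model for one section.
type SectionGenerator = (
  section: SectionName,
  museId: string,
  theme: string
) => Promise<string[]>;

// Caches the promise itself, so concurrent requests share one in-flight call.
const sectionCache = new Map<string, Promise<string[]>>();

function getSection(
  section: SectionName,
  museId: string,
  theme: string,
  generate: SectionGenerator
): Promise<string[]> {
  const key = `${museId}|${theme}|${section}`;
  let pending = sectionCache.get(key);
  if (!pending) {
    pending = generate(section, museId, theme);
    // Do not keep failures in the cache; the next request should try again.
    pending.catch(() => sectionCache.delete(key));
    sectionCache.set(key, pending);
  }
  return pending;
}

// Runs the four sections concurrently instead of in one long blocking request.
async function generatePlanSections(
  museId: string,
  theme: string,
  generate: SectionGenerator
): Promise<Record<SectionName, string[]>> {
  const [menu, decor, playlist, timeline] = await Promise.all([
    getSection("menu", museId, theme, generate),
    getSection("decor", museId, theme, generate),
    getSection("playlist", museId, theme, generate),
    getSection("timeline", museId, theme, generate),
  ]);
  return { menu, decor, playlist, timeline };
}
```

With this shape the total wait is roughly the slowest section rather than the sum of all of them, and a repeated theme skips the model entirely.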

A useful v0 prompt might be:

Refactor my party plan generator so muse selection is deterministic and not decided by the LLM. Store muses as structured JSON with ids, allowed themes, menu rules, decor rules, tone, and constraints. Select the muse in code before calling the model. Then generate the party plan using structured output with schema validation. Add a post-generation validation step that rejects or retries the result if the returned museId does not match the selected muse.

Are your muse definitions currently stored as code/JSON, or are they mostly embedded inside one large prompt?