Proposal for outcome-based ranking and telemetry for Vercel agent skills

Hi Vercel team,

I’ve been following the new skills ecosystem, and I think find-skills already solves an important part of the problem well: discovery, installation, and popularity/reputation-based recommendation.

What I’m more interested in is the next layer after discovery: a standard way for agents to report:

  • The task they are trying to solve
  • Which skills were considered as candidates
  • Which skill was actually selected
  • Whether it succeeded, failed, or required a fallback

In other words, not just skill discovery, but a runtime feedback layer for skill selection.
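To make this concrete, here is a minimal sketch of what such a feedback event could look like. All of the type and field names below are hypothetical illustrations, not an existing Vercel API:

```typescript
// Hypothetical shape for a skill-selection feedback event.
// None of these names come from a real Vercel API; they just
// mirror the four items listed above.
type SkillOutcome = "success" | "failure" | "fallback";

interface SkillSelectionEvent {
  task: string;              // natural-language description of the task
  candidateSkills: string[]; // skills considered during selection
  selectedSkill: string;     // the skill actually invoked
  outcome: SkillOutcome;     // how the invocation ended
  timestamp: number;         // Unix epoch millis
}

// Example event an agent might emit after running a skill:
const event: SkillSelectionEvent = {
  task: "deploy a Next.js preview environment",
  candidateSkills: ["deploy-skill", "ci-skill", "infra-skill"],
  selectedSkill: "deploy-skill",
  outcome: "success",
  timestamp: Date.now(),
};
```

Even a schema this small would be enough to start aggregating selection and outcome statistics per skill.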

Intuition

My intuition is:

  • LLMs can do the initial semantic filtering.
  • Over time, real execution data should help ranking move from popularity-based to outcome-based.
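As one purely illustrative sketch of what "outcome-based" could mean, a smoothed success rate per skill (a simple Beta-style prior standing in for a real ranking model; the function name and prior values are my own, not anything from find-skills):

```typescript
// Illustrative outcome-based score: success rate smoothed by a prior.
// The prior (alpha, beta) keeps newly published skills from being
// ranked on a handful of early outcomes; the values are arbitrary.
function outcomeScore(
  successes: number,
  failures: number,
  alpha = 1,
  beta = 1
): number {
  return (successes + alpha) / (successes + failures + alpha + beta);
}

// A skill with strong real-world outcomes can outrank a more
// "popular" one that underperforms in practice:
const popularButFlaky = outcomeScore(40, 60); // many uses, poor outcomes
const nicheButReliable = outcomeScore(18, 2); // fewer uses, good outcomes
```

The point is not this particular formula, but that any signal like it only becomes available once execution outcomes are reported in a standard way.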

This could make it possible to learn:

  • Which skills work best for which task clusters
  • Which skills look good in metadata but underperform in practice
  • How to reduce conflicts when many skills are available

Contribution

By background, I’m a recommendation/ranking engineer, so this problem feels very familiar from a retrieval, ranking, feedback-loop, and evaluation perspective.

If this direction resonates with your roadmap, I’d be very interested in contributing — whether that means discussing a minimal spec, prototyping telemetry/events, or helping design the recommendation and evaluation layer around skill selection.

Thanks — I’d be excited to help if this is useful.