Hi Vercel team,
I’ve been following the new skills ecosystem and I think find-skills already solves an important part of the problem really well: discovery, installation, and popularity/reputation-based recommendation.
What I’m more interested in is the next layer after discovery: a standard way for agents to report:
- The task they are trying to solve
- Which skills were considered as candidates
- Which skill was actually selected
- Whether it succeeded, failed, or required fallback
In other words, not just skill discovery, but a runtime feedback layer for skill selection.
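To make this concrete, here is a minimal sketch of what such a report could look like. All names below are hypothetical, chosen only to illustrate the shape of the event, not an actual find-skills API:

```ts
// Hypothetical event shape; none of these names come from find-skills itself.
interface SkillSelectionEvent {
  /** Stable id for the task attempt, so retries and fallbacks can be correlated. */
  attemptId: string;
  /** Description of the task the agent is trying to solve. */
  task: string;
  /** Skills considered as candidates, e.g. the top-k results from discovery. */
  candidates: string[];
  /** The skill that was actually selected and invoked. */
  selected: string;
  /** How the invocation ended. */
  outcome: "success" | "failure" | "fallback";
  /** If the outcome was "fallback", which skill the agent fell back to. */
  fallbackTo?: string;
  /** ISO 8601 timestamp of the attempt. */
  timestamp: string;
}

// Example report an agent might emit after a run:
const report: SkillSelectionEvent = {
  attemptId: "a1b2c3",
  task: "convert a Figma frame into a responsive Next.js component",
  candidates: ["figma-to-react", "design-tokens", "tailwind-helper"],
  selected: "figma-to-react",
  outcome: "success",
  timestamp: new Date().toISOString(),
};
```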
Intuition
My intuition is:
- LLMs can do the initial semantic filtering.
- Over time, real execution data should help ranking move from popularity-based to outcome-based (a rough scoring sketch follows this list).
This could make it possible to learn:
- Which skills work best for which task clusters
- Which skills look good in metadata but underperform in practice
- How to reduce conflicts when many skills are available
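As a sketch of what outcome-based ranking could mean in practice (purely illustrative, assuming per-skill success/failure counts aggregated from reports like the one above, with made-up constants):

```ts
// Illustrative scoring sketch; counts would come from aggregated
// SkillSelectionEvent reports, and the constants are hypothetical starting points.
interface SkillStats {
  successes: number;
  failures: number;
  popularity: number; // existing popularity/reputation signal, normalized to [0, 1]
}

// Posterior mean under a Beta(1, 1) prior: skills with little execution data
// stay close to a neutral 0.5 instead of swinging on one or two outcomes.
function smoothedSuccessRate(s: SkillStats): number {
  return (s.successes + 1) / (s.successes + s.failures + 2);
}

// Blend: lean on popularity while evidence is thin, on outcomes as it accumulates.
function score(s: SkillStats): number {
  const n = s.successes + s.failures;
  const evidenceWeight = n / (n + 20); // 20 is a hypothetical pseudo-count prior
  return evidenceWeight * smoothedSuccessRate(s) + (1 - evidenceWeight) * s.popularity;
}
```

The same stats could be kept per task cluster rather than globally, which is what would surface skills that look good in metadata but underperform on specific task types.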
Contribution
I’m a recommendation/ranking engineer by background, so this problem feels very familiar from a retrieval, ranking, feedback-loop, and evaluation perspective.
If this direction resonates with your roadmap, I’d be very interested in contributing, whether that means discussing a minimal spec, prototyping telemetry/events, or helping design the recommendation and evaluation layer around skill selection.
Thanks, and I’d be excited to help if this is useful.