sjmaplesec

2 days ago

This resonates with my experience: we have dozens of internal “playbooks” and prompt snippets floating around, and nobody knows which ones still work after model changes. If you can make “skill quality” visible over time (regressions, drift), that’s valuable. Do you have a CI integration where you can pin a skill version and fail builds if eval scores drop?

shree_e

2 days ago

At some point I thought skills were only Markdown files.

Show HN: A package manager for agent skills with built-in evals

2 Comments
