EvoSkill — An open-source framework that automatically discovers and synthesizes reusable agent skills from failed trajectories to improve coding agent performance on long-horizon tasks.
We just released our newest piece of research, which identifies gaps in a coding agent's capabilities and then creates the skills needed to fill those gaps. Our preliminary tests show that this evolutionary loop leads to substantial improvement on the hardest benchmarks. We also see skills as a powerful abstraction for generalization: the skills created in this loop also improve results on benchmarks other than the one they were evolved on.
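For readers who want a concrete picture of the loop described above, here is a minimal toy sketch in Python. It is not EvoSkill's actual code or API: every name in it (`Skill`, `Trajectory`, `evolve_skills`, `toy_agent`, the task fields) is hypothetical, the "agent" is a stub that succeeds only when the skill library already covers the capability a task needs, and skill synthesis is reduced to creating a named placeholder. It only illustrates the shape of the idea: run tasks, collect failed trajectories, find the most common gap, add a candidate skill, and keep it only if the pass count improves.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Skill:
    """Hypothetical stand-in for a synthesized, reusable skill."""
    name: str
    capability: str


@dataclass
class Trajectory:
    """Hypothetical record of one agent run on one task."""
    task_id: str
    succeeded: bool
    missing_capability: Optional[str] = None


Agent = Callable[[dict, Dict[str, Skill]], Trajectory]


def pass_count(tasks: List[dict], run_agent: Agent, skills: Dict[str, Skill]) -> int:
    """Number of tasks the agent solves with the given skill library."""
    return sum(run_agent(task, skills).succeeded for task in tasks)


def evolve_skills(tasks: List[dict], run_agent: Agent, rounds: int = 5) -> Dict[str, Skill]:
    """Toy evolutionary loop: diagnose failures, propose a skill, keep it if it helps."""
    skills: Dict[str, Skill] = {}
    for _ in range(rounds):
        trajectories = [run_agent(task, skills) for task in tasks]
        # Count which capability is missing most often among the failed runs.
        gaps = Counter(
            t.missing_capability
            for t in trajectories
            if not t.succeeded and t.missing_capability
        )
        if not gaps:
            break  # nothing failed, or no diagnosable gap
        capability, _ = gaps.most_common(1)[0]
        # "Synthesis" is a placeholder here; a real system would generate the skill.
        candidate = dict(skills)
        candidate[capability] = Skill(name=f"skill_{capability}", capability=capability)
        # Selection step: keep the candidate library only if it solves more tasks.
        if pass_count(tasks, run_agent, candidate) > pass_count(tasks, run_agent, skills):
            skills = candidate
        else:
            break
    return skills


def toy_agent(task: dict, skills: Dict[str, Skill]) -> Trajectory:
    """Stub agent (assumption): succeeds only if the needed skill is in the library."""
    needed = task["needs"]
    ok = needed in skills
    return Trajectory(task["id"], ok, None if ok else needed)


TASKS = [
    {"id": "t1", "needs": "git_bisect"},
    {"id": "t2", "needs": "git_bisect"},
    {"id": "t3", "needs": "flaky_test_triage"},
]

library = evolve_skills(TASKS, toy_agent)
print(sorted(library))
```

In this toy setup the loop first adds a skill for the capability that blocks the most tasks, then the next most common one, and stops once no failures remain. Swapping `toy_agent` for a real agent harness and the placeholder synthesis for an actual skill generator is where all of the real work would be.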