youraimarketer
I've been building multi-agent systems for the past year and kept running into the same problems: context windows filling up with tool outputs, agents losing track of information buried in the middle of long conversations, supervisors becoming bottlenecks as they accumulated state from all workers.
The solutions to these problems are scattered across research papers, framework docs, and production war stories. I collected and synthesized them into a set of "Agent Skills" - structured instructions that agents can load on demand when working on relevant tasks.
7 skills covering context engineering fundamentals:
- `context-fundamentals`: What context actually is (system prompts, tool definitions, retrieved docs, message history, tool outputs) and why context quality matters more than context length
- `context-degradation`: The failure modes - lost-in-middle (10-40% accuracy drop for middle content), context poisoning (hallucinations that compound), context distraction (irrelevant info consuming attention budget)
- `multi-agent-patterns`: Supervisor vs swarm vs hierarchical architectures, when to use each, and the "telephone game" problem where supervisors paraphrase sub-agent responses incorrectly
- `memory-systems`: Why vector stores lose relationship information, when to use knowledge graphs, and how temporal validity prevents outdated facts from conflicting with new ones
- `tool-design`: The consolidation principle (if a human can't say which tool to use, an agent can't either), error messages that enable recovery, response format options for token efficiency
- `context-optimization`: Compaction triggers, observation masking (tool outputs can be 80%+ of token usage - rough sketch below the list), KV-cache optimization
- `evaluation`: Multi-dimensional rubrics instead of single metrics, LLM-as-judge for scale, human review for edge cases
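To make the observation-masking idea concrete, here's a minimal sketch (mine, not taken from the skills) of one way to do it: once a rough token budget is exceeded, older tool outputs get swapped for short placeholders while the most recent ones stay verbatim. The message shape, the ~4-chars-per-token estimate, the budget, and `keep_recent` are all assumptions.

```python
# Minimal sketch of observation masking (assumed message shape: dicts with
# "role" and "content"; budget and keep_recent values are illustrative).

def rough_token_count(text: str) -> int:
    # Crude approximation: ~4 characters per token.
    return len(text) // 4

def mask_old_tool_outputs(messages: list[dict],
                          budget_tokens: int = 50_000,
                          keep_recent: int = 3) -> list[dict]:
    """Replace all but the most recent tool outputs with placeholders
    once the total context exceeds the budget."""
    total = sum(rough_token_count(m.get("content", "")) for m in messages)
    if total <= budget_tokens:
        return messages

    tool_indices = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
    to_mask = set(tool_indices[:-keep_recent] if keep_recent else tool_indices)

    masked = []
    for i, m in enumerate(messages):
        if i in to_mask:
            size = rough_token_count(m["content"])
            masked.append({
                **m,
                "content": f"[tool output elided ({size} tokens); re-run the tool if needed]",
            })
        else:
            masked.append(m)
    return masked
```

The design choice that matters is keeping the masked entries in place (so the agent knows a call happened and can re-run it) rather than deleting them outright.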
It uses Anthropic's open Agent Skills format. Each skill is a folder with a SKILL.md file containing the instructions. Progressive disclosure means agents load only skill names and descriptions at startup; the full content is loaded only when a skill is activated for a relevant task.
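For anyone who hasn't used the format, here's a rough sketch of what that loading pattern can look like on the harness side - assuming skills live at skills/&lt;name&gt;/SKILL.md with name/description frontmatter; paths and parsing are simplified:

```python
# Minimal sketch of progressive disclosure, assuming skills live at
# ./skills/<name>/SKILL.md with frontmatter containing name and description.
from pathlib import Path

def read_frontmatter(skill_md: Path) -> dict:
    """Parse only the key: value pairs between the opening and closing ---."""
    meta = {}
    lines = skill_md.read_text().splitlines()
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":
                break
            if ":" in line:
                key, value = line.split(":", 1)
                meta[key.strip()] = value.strip()
    return meta

def load_skill_index(skills_dir: str = "skills") -> dict[str, dict]:
    """At startup: only names/descriptions go into context, not full instructions."""
    return {
        path.parent.name: read_frontmatter(path)
        for path in Path(skills_dir).glob("*/SKILL.md")
    }

def activate_skill(skills_dir: str, name: str) -> str:
    """When a task matches a skill: load the full SKILL.md body into context."""
    return (Path(skills_dir) / name / "SKILL.md").read_text()
```

Only the index sits in context at startup; a skill's full instructions cost tokens only when a task actually needs them.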
Works with Claude Code, Cursor, or any agent that supports skills/custom instructions.
Would appreciate feedback, especially from anyone running multi-agent systems in production. What patterns are you seeing that aren't captured here?