Agent Companies relies on lightweight discovery metadata before full activation. That makes description fields important across COMPANY.md, TEAM.md, AGENTS.md, PROJECT.md, TASK.md, and especially SKILL.md. An under-specified description means the right package or skill is missed. An over-broad one causes false activations and wasted context.

How discovery works

A compatible runtime should first load only lightweight metadata:
  • package name
  • slug
  • description
  • basic kind information
That metadata helps the runtime decide which company subtree, role, or skill to activate for the current task.
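The metadata-first pass can be sketched as a catalog scan over lightweight records. This is a minimal illustration, not the runtime's actual implementation: `PackageMeta`, `shortlist`, and the term-overlap scoring are all hypothetical stand-ins for however a real runtime ranks candidates.

```python
from dataclasses import dataclass

@dataclass
class PackageMeta:
    """Lightweight discovery record: loaded before any full activation."""
    name: str
    slug: str
    description: str
    kind: str  # e.g. "company", "team", "agent", "skill"

def shortlist(catalog, query_terms):
    """Rank packages by naive term overlap between query and description."""
    def score(meta):
        text = meta.description.lower()
        return sum(term.lower() in text for term in query_terms)
    ranked = sorted(catalog, key=score, reverse=True)
    return [m for m in ranked if score(m) > 0]

catalog = [
    PackageMeta(
        "Eng Team", "eng-team",
        "Use this team package when work needs product and platform "
        "engineering leadership, role delegation, and code review support.",
        "team"),
    PackageMeta(
        "Design Skill", "design-review",
        "Reusable skill for UI design review.", "skill"),
]

print([m.slug for m in shortlist(catalog, ["code", "review", "leadership"])])
```

Notice that only the description text drives the shortlist here, which is why an under-specified or over-broad description directly skews what gets activated.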

Write descriptions around intent

Good descriptions explain when the package matters, not just what file it is. Prefer:
description: Use this team package when work needs product and platform engineering leadership, role delegation, and code review support.
Over:
description: Engineering team package.
Useful patterns:
  • describe the work context
  • mention the kind of decisions the package supports
  • include adjacent signals such as team function, workflow type, or project phase
  • keep it concise enough to stay readable in a catalog
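Putting those patterns together, an intent-focused description in a SKILL.md header might look like the following sketch. The field names other than `description` are illustrative, not a confirmed schema:

```yaml
# Hypothetical SKILL.md frontmatter; only `description` is discussed above,
# the other fields are illustrative.
name: release-management
description: >
  Use this skill when a task involves cutting releases, tagging versions,
  or coordinating a release checklist across engineering roles.
```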

Design trigger evals

Test descriptions with realistic prompts and planning situations. Label each one should_trigger or should_not_trigger. Examples:
  • "Import a startup operating package with a CEO, CTO, and weekly review workflow" should trigger a company package
  • "Fix a small CSS bug in one file" should not trigger a whole company package
  • "Attach review and release-management skills to the engineering lead" should trigger relevant agent and skill metadata
The most valuable negative tests are near-misses that share vocabulary but do not actually need the package.
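A trigger eval set like the one above can be kept as labeled data and replayed against any activation function. This is a sketch under assumptions: `activate(prompt, kind) -> bool` is a hypothetical hook for whatever the runtime actually does.

```python
# Hypothetical eval cases: (prompt, package kind, expected label).
CASES = [
    ("Import a startup operating package with a CEO, CTO, and weekly review workflow",
     "company", "should_trigger"),
    ("Fix a small CSS bug in one file",
     "company", "should_not_trigger"),
    ("Attach review and release-management skills to the engineering lead",
     "skill", "should_trigger"),
]

def run_eval(activate, cases):
    """activate(prompt, kind) -> bool; returns (prompt, passed) per case."""
    results = []
    for prompt, kind, label in cases:
        fired = activate(prompt, kind)
        expected = (label == "should_trigger")
        results.append((prompt, fired == expected))
    return results
```

Near-miss negatives belong in the same list; a case that shares vocabulary with a package but is labeled `should_not_trigger` is exactly what exposes an over-broad description.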

Measure false positives and misses

For each query, check whether the runtime:
  • surfaced the right package or skill
  • avoided loading unrelated packages
  • used skill shortnames consistently
Run each case multiple times if the underlying model behavior is nondeterministic.
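Aggregating repeated runs can be as simple as counting outcomes per case. The outcome labels (`"hit"`, `"miss"`, `"false_positive"`) are an assumed bookkeeping convention, not something the runtime emits:

```python
from collections import Counter

def score_runs(run_results):
    """run_results: list of dicts mapping case_id -> "hit" | "miss" | "false_positive".
    Each dict is one full pass over the eval set."""
    counts = Counter()
    for run in run_results:
        counts.update(run.values())
    total = sum(counts.values())
    return {
        "hit_rate": counts["hit"] / total,
        "false_positive_rate": counts["false_positive"] / total,
        "miss_rate": counts["miss"] / total,
    }

runs = [
    {"c1": "hit", "c2": "hit", "c3": "false_positive"},
    {"c1": "hit", "c2": "miss", "c3": "hit"},
]
print(score_runs(runs))
```

Tracking false positives and misses separately matters: tightening a description usually trades one for the other, and a single accuracy number hides that trade-off.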

Iterate without overfitting

Use a train and validation split:
  1. revise descriptions based on train-set failures
  2. keep the validation set untouched
  3. choose the version that generalizes best
Avoid stuffing specific keywords from failed prompts into the description. Fix the broader concept instead.
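The split-and-select loop can be sketched as below. `evaluate(description, cases) -> pass_rate` is a hypothetical scoring hook; the point is that version selection reads only the untouched validation set.

```python
import random

def split_cases(cases, val_fraction=0.3, seed=7):
    """Deterministic train/validation split; the validation set stays untouched
    while descriptions are revised against train-set failures."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]  # (train, validation)

def pick_best(description_versions, evaluate, val_set):
    """Choose the description version with the best validation pass rate."""
    return max(description_versions, key=lambda d: evaluate(d, val_set))
```

Selecting on the held-out set is what catches keyword stuffing: a description overfit to train-set prompts scores well there but fails to generalize to the validation cases.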

Common failure modes

  • descriptions that name a team but not the work it handles
  • descriptions that are too generic to distinguish company, team, and skill scopes
  • descriptions that blur role behavior and reusable capability
  • descriptions that omit the context needed for shortname-based skill activation
When the activation surface includes both company packages and skill packages, precision matters more than keyword density.