Had Forge audit itself today. Forge is the agent I use to build MCP servers (part of the larger system). Designs the architecture, writes the code, hardens the container, ships to the registry. It has been running for months.
Asked it to grade its own playbook against best practices. Came back with seven specific gaps. No anti-hallucination rule for external claims. No token budget enforcement. No multi-client smoke test, only Claude Code. No FastMCP version pinning policy. Reflection check was one line. Lessons file path undocumented. No quarterly re-audit cadence on public repos.
Forge proposed a v2 with each gap closed as a discrete edit, marked with explicit ADD or REPLACE blocks so the diffs apply cleanly. I approved. It applied them to its own definition file. Playbook went from 258 to 304 lines.
The interesting part: every gap was something I had been manually fixing in spawn prompts every time I called the agent. The audit just made the patches permanent so I stop typing them.
Agents that audit themselves and apply the fix are the real move. Tools that build tools.