Show HN: 127 PRs to Prod this wknd with 18 AI agents: metaswarm. MIT licensed
dsifry Tuesday, February 03, 2026A few weeks ago I posted about GoodToGo https://news.ycombinator.com/item?id=46656759 - a tool that gives AI agents a deterministic answer to "is this PR ready to merge?" Several people asked about the larger orchestration system I mentioned. This is that system.
I got tired of being a project manager for Claude Code. It writes code fine, but shipping production code is seven or eight jobs — research, planning, design review, implementation, code review, security audit, PR creation, CI babysitting. I was doing all the coordination myself. The agent typed fast. I was still the bottleneck. What I really needed was an orchestrator of orchestrators - swarms of swarms of agents with deterministic quality checks.
So I built metaswarm. It breaks work into phases and assigns each to a specialist swarm orchestrator. It manages handoffs and uses BEADS for deterministic gates that persist across /compact, /clear, and even across sessions. Point it at a GitHub issue or brainstorm with it (it uses Superpowers to ask clarifying questions) and it creates epics, tasks, and dependencies, then runs the full pipeline to a merged PR - including outside code review like CodeRabbit, Greptile, and Bugbot.
The thing that surprised me most was the design review gate. Five agents — PM, Architect, Designer, Security, CTO — review every plan in parallel before a line of code gets written. All five must approve. Three rounds max, then it escalates to a human. I expected a rubber stamp. It catches real design problems, dependency issues, security gaps.
This weekend I pointed it at my backlog. 127 PRs merged. Every one hit 100% test coverage. No human wrote code, reviewed code, or clicked merge. OK, I guided it a bit, mostly helping with plans for some of the epics.
A few learnings:
Agent checklists are theater. Agents skipped coverage checks, misread thresholds, or decided they didn't apply. Prompts alone weren't enough. The fix was deterministic gates — BEADS, pre-push hooks, CI jobs all on top of the agent completion check. The gates block bad code whether or not the agent cooperates.
The agents are just markdown files. No custom runtime, no server, and while I built it on TypeScript, the agents are language-agnostic. You can read all of them, edit them, add your own.
It self-reflects too. After every merged PR, the system extracts patterns, gotchas, and decisions into a JSONL knowledge base. Agents only load entries relevant to the files they're touching. The more it ships, the fewer mistakes it makes. It learns as it goes.
metaswarm stands on two projects: https://github.com/steveyegge/beads by Steve Yegge (git-native task tracking and knowledge priming) and https://github.com/obra/superpowers by Jesse Vincent (disciplined agentic workflows — TDD, brainstorming, systematic debugging). Both were essential.
Background: I founded Technorati, Linuxcare, and Warmstart; tech exec at Lyft and Reddit. I built metaswarm because I needed autonomous agents that could ship to a production codebase with the same standards I'd hold a human team to.
$ cd my-project-name
$ npx metaswarm init
MIT licensed. IANAL. YMMV. Issues/PRs welcome!