case study · 02

Smithy

A reimagined OpenAI Symphony, with the harness opinions cranked up.

Smithy is a fork of OpenAI's Symphony harness that adds five things: dual runtimes (Codex + Claude Code), cross-model adversarial pre-PR review, Linear OAuth identity, label-gated autonomous merge, and model-summarized run logs. The Symphony shape is preserved; the production discipline is added. v2 spec in progress.


~ what shipped ~

Three decisions that matter.

metric · 01

Dual runtime

Workers run Claude Code (claude -p) or Codex (codex exec) per ticket state. Builder runs Claude for Todo / In Progress; reviewer runs Codex for In Review. Cross-model is the load-bearing differentiator: a model marking its own work passes too easily.
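
The state-to-runtime split above can be sketched as a lookup table. This is a minimal sketch, not Smithy's actual code: the names `RUNTIME_BY_STATE` and `worker_command` are hypothetical, and only the ticket states and the bare `claude -p` / `codex exec` invocations come from the text. Any additional flags Smithy passes are not shown.

```python
from typing import Dict, List, Tuple

# Ticket state -> (role, CLI prefix). States and commands are taken from the
# text above; the table and function names are illustrative assumptions.
RUNTIME_BY_STATE: Dict[str, Tuple[str, List[str]]] = {
    "Todo": ("builder", ["claude", "-p"]),
    "In Progress": ("builder", ["claude", "-p"]),
    "In Review": ("reviewer", ["codex", "exec"]),
}


def worker_command(state: str, prompt: str) -> List[str]:
    """Return the argv for the worker that handles a ticket in `state`."""
    if state not in RUNTIME_BY_STATE:
        # States with no entry (for example a finished ticket) get no worker.
        raise ValueError(f"no worker runs for state {state!r}")
    _role, prefix = RUNTIME_BY_STATE[state]
    return [*prefix, prompt]
```

Keeping the mapping in one table means the builder and the reviewer can never silently end up on the same model, which is the point of the cross-model setup.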

metric · 02

Linear OAuth

Smithy authenticates as itself in Linear via OAuth, not via a borrowed PAT. Worker-level MCP can read but cannot write; the orchestrator owns every state transition. One rule, audit trail for free.

metric · 03

Label-gated merge

PRs that clear the adversarial cross-model review and carry the merge label auto-merge to main. Anything else stops at draft PR for human review. The label is the discriminator, not the model.
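
The merge gate described above reduces to a two-condition check. The sketch below is an assumption about shape, not the shipped implementation: `MERGE_LABEL`, `ReviewOutcome`, and `next_step` are hypothetical names, and the source does not say what the merge label is actually called.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical label name; the source only says "the merge label".
MERGE_LABEL = "auto-merge"


@dataclass(frozen=True)
class ReviewOutcome:
    passed: bool            # did the cross-model review pass?
    labels: FrozenSet[str]  # labels currently on the ticket


def next_step(outcome: ReviewOutcome) -> str:
    """Merge only when the review passed AND a human applied the label."""
    if outcome.passed and MERGE_LABEL in outcome.labels:
        return "merge"
    # Everything else stops at a draft PR for human review.
    return "draft_pr"
```

A passing review without the label still returns `"draft_pr"`, so the model's verdict alone never triggers a merge.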

What Symphony does

OpenAI shipped Symphony as an open-source reference harness for agentic coding: a tracker-agnostic poll loop that picks up issues, spawns a worker per issue in an isolated worktree, runs the work to completion, and ends in a draft PR for human review. It has hooks, no auto-merge, and brief worker prompts. Solid, principled, deliberately minimal.

What Smithy adds

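Smithy keeps the Symphony shape and adds five things the reference doesn't ship: dual runtimes (Codex + Claude Code), cross-model adversarial pre-PR review, Linear OAuth identity, label-gated autonomous merge, and model-summarized run logs.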

The chain of command

Linear (source of truth: what work)
  ↓ poll every 30s
Smithy (in-memory: what's running, ~3 workers max)
  ↓ spawn per-issue worker (Linear write blocked at MCP)
Worker = claude -p (builder) or codex exec (reviewer)
  ↓ runs in per-issue worktree
  ↓ writes RESULT.md or REVIEW.md and exits
Smithy parses the handoff, drives the Linear state move.
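
One tick of the poll loop in the diagram can be sketched as a capacity check over the in-memory set of running workers. This is a hedged illustration: `pick_to_spawn` and its parameters are hypothetical names, the cap of 3 and the 30-second interval are read off the diagram, and the real selection order is not stated in the source.

```python
from typing import Iterable, List, Set

MAX_WORKERS = 3    # the diagram says "~3 workers max"
POLL_SECONDS = 30  # the diagram says "poll every 30s"


def pick_to_spawn(
    candidates: Iterable[str],
    running: Set[str],
    max_workers: int = MAX_WORKERS,
) -> List[str]:
    """Choose which issue ids get a new worker on this tick.

    `candidates` are issue ids returned by the tracker poll, in tracker
    order; `running` is the orchestrator's in-memory set of active issues.
    """
    free_slots = max(0, max_workers - len(running))
    picked: List[str] = []
    for issue_id in candidates:
        if len(picked) >= free_slots:
            break
        # Never spawn a second worker for an issue that already has one.
        if issue_id not in running and issue_id not in picked:
            picked.append(issue_id)
    return picked
```

The function only decides what to spawn; an outer loop would call it once per `POLL_SECONDS` and start one worker per returned id.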

The non-obvious decision

The orchestrator stays dumb. Workers are smart. The harness can't analyze your code, can't decide what to build, can't reason about quality. It just spawns the right worker, watches for handoff files, and moves Linear states. Every time the orchestrator gets clever, the system gets fragile. Symphony got this right; Smithy preserves it.
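
To make "dumb" concrete: the handoff check can be as small as looking for a file name. The sketch below is an assumption, not Smithy's code. `handoff_action` and the action strings are hypothetical; only the file names `RESULT.md` and `REVIEW.md` come from the source, and the function deliberately never opens the worker's code.

```python
from typing import Iterable

# Handoff file -> what the orchestrator does next. File names are from the
# text; the action names are illustrative assumptions.
ACTION_BY_HANDOFF = {
    "RESULT.md": "move_to_in_review",  # builder finished; hand to reviewer
    "REVIEW.md": "evaluate_review",    # reviewer finished; apply merge gate
}


def handoff_action(worktree_files: Iterable[str]) -> str:
    """Map the files left in a worktree to the orchestrator's next action.

    A caller would pass something like os.listdir(worktree) after the
    worker process exits.
    """
    present = set(worktree_files)
    for name, action in ACTION_BY_HANDOFF.items():
        if name in present:
            return action
    # Worker exited without a handoff file: leave the state untouched.
    return "no_handoff"
```

All judgment about the code stays inside the worker; the orchestrator's only inputs are which file exists and what it says.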

↳ Smithy is open source as of May 2026. v2 spec is in v2/SPEC.md in the repo. Reach out if you want a walkthrough of how the additions land.

~ on the workbench ~

The tooling.

~ counterfactual ~

What would have been worse.

Without it: every ticket is a manual session. Branch creation, PR drafting, worker prompt assembly, state transitions, all hand-stitched. The harness IS the multiplier; without it, you're back to one-tab-at-a-time work even though the LLM cost has dropped 90% in two years.

~ got something like this on the bench? ~

Pull the cord.

Start the conversation

/ smithy / built by hand / shipped to a working URL /