Agents

Configure which AI agents handle implementation, testing, and review in your workbench pipeline.

Supported Agents

Workbench ships with built-in adapters for four agent CLIs, plus support for any custom CLI:

Agent	Command	Provider
Claude Code	`claude`	Anthropic
Gemini CLI	`gemini`	Google
OpenAI Codex	`codex`	OpenAI
Cursor CLI	`cursor`	Cursor

Each agent goes through the same pipeline stages: implement → test → review → fix. Workbench treats all agents as interchangeable, so any agent can fill any role.

Use --agent to select the agent for a run:

wb run plan.md --agent claude     # default
wb run plan.md --agent gemini
wb run plan.md --agent codex
wb run plan.md --agent cursor

Setting Up Agents

Use wb setup to install agent skill files for your platform:

wb setup                           # auto-detect agent, install skills locally
wb setup --agent claude            # install to <repo>/.claude/skills/ + .agents/skills/
wb setup --agent gemini            # install to <repo>/.agents/skills/
wb setup --agent cursor            # install to <repo>/.cursor/skills/
wb setup --agent manual            # print paths for manual setup
wb setup --global                  # install skills to user-level paths only
wb setup --global --agent claude   # install to ~/.claude/skills/
wb setup --symlink                 # symlink instead of copy (stays in sync with updates)
wb setup --update                  # force-update skills to the latest version
wb setup --profile                 # also create a profile.yaml with the detected agent

If the skill file already exists and is unchanged, it's skipped. If it differs, you'll be prompted before overwriting.
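
The skip-or-prompt rule above can be sketched as a small decision function. This is only an illustration: the name install_action is hypothetical, and comparing raw file bytes is an assumption, since Workbench may compare hashes or version markers instead.

```python
from pathlib import Path


def install_action(source: Path, target: Path) -> str:
    """Decide what to do with one skill file (hypothetical helper).

    Returns "install" when the target is missing, "skip" when it matches
    the source, and "prompt" when it exists but differs.
    """
    if not target.exists():
        return "install"
    # Assumption: "unchanged" means byte-for-byte identical content.
    if target.read_bytes() == source.read_bytes():
        return "skip"
    return "prompt"
```

A caller would copy the file on "install", do nothing on "skip", and ask the user for confirmation on "prompt".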

Managing Agents

Use the wb agents commands to view, add, and configure agent adapters:

wb agents init                    # create agents.yaml with all built-in adapter configs
wb agents list                    # show built-in and custom agents
wb agents show my-agent           # show full config for an agent
wb agents add my-agent --command my-cli --args "--headless,{prompt}"   # add a custom agent
wb agents add my-agent --command new-cli   # update an existing agent
wb agents remove my-agent         # remove a custom agent

wb agents init creates .workbench/agents.yaml pre-populated with the configs for all built-in adapters (Claude, Gemini, Codex, Cursor). Use this file as a starting point to customize command flags or output parsing, or to add your own agents.

Custom Agents

Define adapters for any CLI tool with the wb agents add command or by editing .workbench/agents.yaml directly:

wb agents add my-agent --command my-cli --args "--headless,{prompt}" --output-format json
wb run plan.md --agent my-agent

This creates an entry in .workbench/agents.yaml:

agents:
  my-agent:
    command: my-cli
    args: ["--headless", "{prompt}"]
    output_format: json
    json_result_key: result
    json_cost_key: cost_usd

The {prompt} placeholder in args is replaced with the agent's prompt at runtime.
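
As a rough illustration of the substitution, the sketch below builds the argument list for one invocation. The helper name build_argv is hypothetical, and whether Workbench replaces the placeholder inside larger strings (as this code does) or only as a whole argument is an assumption.

```python
from typing import List

PLACEHOLDER = "{prompt}"


def build_argv(command: str, args: List[str], prompt: str) -> List[str]:
    """Return the argv list for one agent invocation (hypothetical helper)."""
    # str.replace is used instead of str.format so that braces inside the
    # prompt text itself are never interpreted as format fields.
    return [command] + [arg.replace(PLACEHOLDER, prompt) for arg in args]


argv = build_argv("my-cli", ["--headless", "{prompt}"], "Add a health-check endpoint")
print(argv)  # ['my-cli', '--headless', 'Add a health-check endpoint']
```

Keeping the result as a list, rather than joining it into one shell string, means the prompt can contain spaces or quotes without extra escaping.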

agents.yaml Fields

Field	Description
`command`	The CLI command to execute
`args`	Arguments passed to the command. Use `{prompt}` as a placeholder for the task prompt
`output_format`	How to parse the agent's output (`text` or `json`)
`json_result_key`	When `output_format` is `json`, the key containing the agent's response
`json_cost_key`	When `output_format` is `json`, the key containing the cost (optional)
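
To make the output fields concrete, here is a minimal sketch of how an adapter might turn raw agent output into a response and an optional cost. The function parse_agent_output is hypothetical, and the error handling shown is an assumption rather than Workbench's documented behavior.

```python
import json
from typing import Optional, Tuple


def parse_agent_output(
    raw: str,
    output_format: str = "text",
    json_result_key: str = "result",
    json_cost_key: str = "cost_usd",
) -> Tuple[str, Optional[float]]:
    """Return (response, cost) parsed from an agent's stdout (hypothetical helper)."""
    if output_format == "text":
        # Plain text carries no cost information.
        return raw.strip(), None
    if output_format != "json":
        raise ValueError(f"unsupported output_format: {output_format!r}")
    data = json.loads(raw)
    response = str(data[json_result_key])
    # The cost key is optional, so a missing key yields None.
    cost = data.get(json_cost_key)
    return response, (float(cost) if cost is not None else None)
```

With the default keys, the input {"result": "done", "cost_usd": 0.12} yields the response "done" and a cost of 0.12.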

wb agents add Flags

Flag	Description
`--command CMD`	CLI command to invoke (required)
`--args TEMPLATE`	Argument template, comma-separated (default: `{prompt}`)
`--output-format FMT`	`text` or `json` (default: `text`)
`--json-result-key KEY`	JSON key for result (default: `result`)
`--json-cost-key KEY`	JSON key for cost (default: `cost_usd`)
`--repo PATH`	Repository path (auto-detected if omitted)
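
The comma-separated --args template corresponds to the args list stored in agents.yaml. A minimal sketch of that conversion follows; split_args_template is a hypothetical name, and the code assumes a plain split on commas with no escaping or whitespace trimming.

```python
from typing import List


def split_args_template(template: str = "{prompt}") -> List[str]:
    """Split a comma-separated --args template into a list (hypothetical helper)."""
    # Assumption: commas always separate arguments; empty pieces are dropped.
    return [part for part in template.split(",") if part]


print(split_args_template("--headless,{prompt}"))  # ['--headless', '{prompt}']
```

Under this assumption an argument cannot itself contain a comma, so such cases would need to be written directly in agents.yaml.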

Multi-Provider Setup

Use profiles to assign different agents to different pipeline roles:

wb profile init --set implementor.agent=gemini --set reviewer.agent=claude

Or edit .workbench/profile.yaml directly:

roles:
  implementor:
    agent: gemini
  tester:
    agent: claude
  reviewer:
    agent: claude
  fixer:
    agent: claude

You can also assign custom agents to roles:

wb profile set implementor.agent my-agent
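
The role-to-agent mapping can be pictured as a simple lookup over the parsed profile. In the sketch below the profile is a plain dictionary mirroring profile.yaml, agent_for_role is a hypothetical helper, and falling back to claude for roles the profile does not mention is an assumption based on it being the documented --agent default.

```python
from typing import Any, Dict

# Assumption: roles without an explicit agent use the documented --agent default.
DEFAULT_AGENT = "claude"


def agent_for_role(profile: Dict[str, Any], role: str, default: str = DEFAULT_AGENT) -> str:
    """Look up the agent assigned to a role (hypothetical helper)."""
    role_cfg = profile.get("roles", {}).get(role) or {}
    return role_cfg.get("agent", default)


profile = {
    "roles": {
        "implementor": {"agent": "gemini"},
        "reviewer": {"agent": "claude"},
    }
}
print(agent_for_role(profile, "implementor"))  # gemini
print(agent_for_role(profile, "tester"))       # claude
```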

Directive Overrides

Override agent instructions from the CLI without modifying your profile or plan:

wb run plan.md --reviewer-directive "Focus on security vulnerabilities and input validation"
wb run plan.md --tester-directive "Use pytest -x --tb=short for faster feedback"

Available flags: --implementor-directive, --tester-directive, --reviewer-directive, --fixer-directive.

Directives are appended to the default role prompts. You can combine several in one run:

wb run plan.md \
  --reviewer-directive "Enforce strict type safety, no any types" \
  --tester-directive "Include edge case tests for empty inputs and null values"

For persistent customization, use profiles with directive or directive_extend fields.
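
Since CLI directives are appended to the default role prompt, the final prompt can be thought of as a simple concatenation. The sketch below is illustrative only: compose_prompt is a hypothetical name, the sample prompt text is invented, and the blank-line separator is an assumption.

```python
def compose_prompt(default_prompt: str, directive: str = "") -> str:
    """Append a CLI directive to a role's default prompt (hypothetical helper)."""
    directive = directive.strip()
    if not directive:
        # No directive given: the default prompt is used unchanged.
        return default_prompt
    # Assumption: the directive is separated from the prompt by a blank line.
    return f"{default_prompt.rstrip()}\n\n{directive}"


print(compose_prompt("Review the diff.", "Focus on security vulnerabilities."))
```

The default prompt always stays in place, which is why a directive adds to a role's instructions rather than replacing them.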

Agent Roles

Workbench assigns agents to five pipeline roles. Each role has a specific job in the task lifecycle:

Role	Description
`implementor`	Writes code to fulfill the task
`tester`	Runs and writes tests; reports PASS/FAIL
`reviewer`	Reviews the diff for correctness and quality
`fixer`	Addresses feedback from failed tests or reviews
`merger`	Resolves merge conflicts between parallel branches

Implementor

Receives the task description, context, and conventions. Writes the code to fulfill the task. This is the primary coding agent.

Tester

Runs the project's test suite against the implementation. If tests are missing, the tester writes them. In TDD mode, the tester runs first and writes failing tests before implementation begins.

Reviewer

Reviews the git diff produced by the implementor. Checks for correctness, style adherence, and potential issues. Reports pass/fail with actionable feedback.

Fixer

Activated when the tester or reviewer reports failures. Receives the failure output and attempts to fix the issues. Runs up to --max-retries times (default: 2).
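
The retry limit can be sketched as a bounded loop around a check step and a fix step. This is an assumption-laden illustration: run_with_fixes, check, and fix are hypothetical names, and whether Workbench re-runs both tester and reviewer after every fix is not stated here.

```python
from typing import Callable


def run_with_fixes(check: Callable[[], bool], fix: Callable[[], None], max_retries: int = 2) -> bool:
    """Return True once check() passes, calling fix() at most max_retries times."""
    if check():
        return True
    for _ in range(max_retries):
        fix()  # the fixer attempts to address the reported failure
        if check():
            return True
    # Retries are exhausted and the check still fails.
    return False
```

With the default of 2, a task gets one initial check and at most two fix attempts before it is reported as failed.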

Merger

Handles merge conflicts when parallel task branches are combined. The merger receives the conflict markers and both sides of the diff, then produces a clean resolution.
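
To show what "conflict markers and both sides" looks like as data, the sketch below extracts each conflicting region from a file that uses Git's default marker style. The helper split_conflicts is hypothetical and is not Workbench's implementation; it also does not handle the diff3 style, which adds a ||||||| base section.

```python
from typing import List, Optional, Tuple


def split_conflicts(text: str) -> List[Tuple[str, str]]:
    """Return (ours, theirs) text pairs, one per conflict block (hypothetical helper)."""
    conflicts: List[Tuple[str, str]] = []
    ours: List[str] = []
    theirs: List[str] = []
    state: Optional[str] = None  # None = outside a conflict, else "ours" or "theirs"
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            state, ours, theirs = "ours", [], []
        elif line.startswith("=======") and state == "ours":
            state = "theirs"
        elif line.startswith(">>>>>>>") and state == "theirs":
            conflicts.append(("\n".join(ours), "\n".join(theirs)))
            state = None
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
    return conflicts


sample = "a\n<<<<<<< HEAD\nx = 1\n=======\nx = 2\n>>>>>>> feature\nb\n"
print(split_conflicts(sample))  # [('x = 1', 'x = 2')]
```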