AI coding agents
review
local-first

One Agent, One Task: The Discipline That Keeps Runs Reviewable

A simple rule for keeping AI agent runs reviewable: one run, one outcome, one branch, and one stop point.

Junction Team · Junction Panel · 5 min read

The easiest way to make an AI coding agent hard to review is to let it do too much at once.

That failure mode shows up everywhere. The task starts as a small bug fix, then picks up cleanup, then folds in a refactor, then ends with a few unrelated edits because "the agent was already in there." By the time you review the diff, you are no longer looking at one decision. You are looking at a bundle of them.

One agent, one task is the discipline that keeps that from happening.

Why multiplexing is so expensive

When an agent carries unrelated work in the same run, the human reviewer loses the ability to answer a basic question: what was the run supposed to accomplish?

If the answer is "several things," review gets slower and less reliable.

You start asking:

  • Which edits are essential?
  • Which ones are cleanup?
  • Which files changed because of the task and which changed because the agent wandered?
  • What should be kept, and what should be rolled back?

That is review debt. It is easy to create and annoying to pay down.

The problem is not unique to Claude Code or Codex. Any agent can drift if the scope is vague. Junction is useful because it keeps the live run, the diff, and the approval surface visible while the task is still active. That makes scope creep easier to catch before it becomes a mess.

The rule

The rule is simple:

One run should have one outcome, one primary branch, and one stopping point.

That does not mean the agent can only touch one file. It means every edit should belong to the same outcome.

Good examples:

  • fix one failing test and the code that caused it
  • add one migration and the schema changes that support it
  • update one doc page and the examples that appear in that page
  • prepare one PR from one isolated worktree

Bad examples:

  • fix a bug and also clean up unrelated formatting
  • add a feature and also rewrite the release notes
  • patch a test and also rename a bunch of files for convenience
  • "while you're there" work that was never part of the prompt

If the diff does not fit the task description, the task description was too broad or the agent stayed active too long.
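One concrete way to apply that check: if the run landed on its own branch, a file-level diff against the base makes drift visible before you read a single hunk. A minimal sketch (the repository, branch, and file names are invented for illustration):

```shell
set -e

# Stand up a throwaway repo so the sketch is self-contained.
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m "init"

# Simulate an agent run on its own branch.
git switch -qc agent/fix-flaky-test
echo "retry with backoff" > service_test.txt
git add service_test.txt
git -c user.email=a@b -c user.name=demo commit -q -m "fix flaky test"

# File-level summary of everything the run touched since it branched off.
# If files outside the task's area show up here, the run drifted.
git diff --stat main...HEAD
```

The three-dot form (`main...HEAD`) compares against the merge base, so only the run's own changes appear, not anything that landed on `main` in the meantime.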

How to keep runs narrow from the start

The easiest fix is to write a tighter brief.

Tell the agent what done looks like before it starts. Say what it should not touch. Say whether the output should stay limited to one branch, one worktree, or one change set.
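A brief that follows that shape might look like the following (the task, test name, and constraints are invented for illustration):

```text
Task: fix the flaky TestRetryBackoff in the service package.
Done means: the test passes 20 consecutive local runs.
Do not touch: formatting, dependencies, or files outside service/.
Output: one commit on one branch, ready for a single PR.
```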

That helps with both Claude Code and Codex because the run begins with a smaller target area. The agent is less likely to interpret "helpful" as "expand the scope."

In Junction, that is easier to enforce because the control surface keeps the output stream, the branch context, and the diff close together. You do not need to reverse engineer the task from a terminal window.

What to do when the task grows

Sometimes the scope really does expand.

Maybe the bug fix exposes a separate issue. Maybe the task turns into a refactor. Maybe the agent discovers a migration you did not expect.

That is the point where you split the work.

Do not ask one run to absorb the new problem just because it is already in motion. Stop the run, review what you have, and decide whether the new work deserves its own task and its own diff.

That is not overhead. It is how you keep the reviewable unit from becoming too large.
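In git terms, the split can be as simple as committing the in-scope change and parking the discovered work on a fresh branch for its own run. A sketch with invented names (not Junction commands):

```shell
set -e

# Throwaway repo so the sketch is self-contained.
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m "init"

# The run produced the in-scope fix plus an unexpected discovery.
git switch -qc agent/fix-bug
echo "bug fix" > fix.txt
echo "unexpected migration" > migration.txt

# Keep the original task's diff narrow: commit only the in-scope change...
git add fix.txt
git -c user.email=a@b -c user.name=demo commit -q -m "fix: narrow bug fix"

# ...and park the discovered work on its own branch for its own run.
git stash push -q -u       # -u picks up the untracked migration file
git switch -qc agent/migration main
git stash pop -q
git status --short         # the migration now lives on its own branch
```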

A practical example

Suppose you ask an agent to fix a flaky test in a service package.

The right output is a narrow diff:

  • the test fix
  • the minimal code change
  • the supporting assertion update

The wrong output starts to drift:

  • the test fix
  • a redesign of the helper layer
  • a formatting sweep
  • an unrelated dependency bump

Even if the broader diff is technically fine, it is harder to trust. You cannot tell whether the agent solved the original problem or wandered into other work while it was there.

Review becomes faster when scope stays tight

People sometimes treat this discipline as restrictive. In practice it makes the workflow faster.

Why?

Because a reviewer can understand the intent of a small run much faster than the intent of a mixed run. That means fewer questions, fewer rework loops, and less time spent deciding whether a strange change is accidental or deliberate.

That matters when you are reviewing from a phone as well as when you are at a desk. Junction's diff view and approval controls are much more useful when the change set is cleanly bounded.

Useful guardrails

These habits help keep runs honest:

Give the run a single objective

Write the prompt as one sentence if you can. If you need three separate goals, they may be three separate runs.

Keep a single review target

If the run will land in one PR, keep that PR narrow. A good PR can be explained without a long apology for unrelated changes.

Stop when the task changes shape

If the work starts to look different from the prompt, that is the cue to pause and reassess.

Prefer isolated worktrees

Isolation does not solve scope creep by itself, but it makes it easier to contain.

Junction's workflow is designed for that. Each session can stay local, and Switchboard runs use isolated git worktrees so one run does not trample another.
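The underlying git mechanism is easy to reproduce by hand. This sketch (paths and branch names invented, not Junction's actual layout) gives one run its own worktree, so the agent's edits never land in your main checkout:

```shell
set -e

# Throwaway repo so the sketch is self-contained.
repo=$(mktemp -d); cd "$repo"
git init -q
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m "init"

# One worktree per run: a separate checkout on its own branch.
# The agent works in "$repo-run"; the original checkout stays untouched.
git worktree add -q -b agent/one-task "$repo-run"
git worktree list
```

Two concurrent runs get two worktrees on two branches, so neither can trample the other's files.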

The payoff

One agent, one task is not a slogan. It is a way to keep your review cost predictable.

When the unit of work stays small, you get cleaner diffs, simpler approvals, and fewer moments where a good run becomes an afternoon of cleanup.

That is true whether you are supervising Claude Code or Codex, and it is especially true when you are away from the terminal and relying on the browser to keep you oriented.

Start with the Junction setup guide if you want the control surface in place. If you are deciding whether the richer workflow is worth it, check pricing and then read Use Branch Suggestions to Keep Agent Runs Reviewable.