January 21, 2025 · 13 min read

Our ‘Stateless’ AI Workers Were Leaking State Through the Git Working Tree

The filesystem is the undeclared global variable of agent swarms. Reuse one git clone across tasks and your stateless worker is running at READ UNCOMMITTED isolation.

git working tree, state contamination, snapshot isolation, MVCC, agent architecture, agent-swarm

Figure: Git working tree state contamination in AI agent swarms. The database was gone. The working tree was still carrying yesterday's task.

We banned the database. We trimmed context windows. We decayed memory to keep our workers “stateless.” Then we handed every task a long-lived, mutable, container-scoped working directory and never noticed it carried state task-to-task.

The bug arrived as a ghost. A PR opened with an edit to a file the assigned agent never touched. In review, nobody could explain the diff. The author swore they only changed one line. The branch base looked correct. The timeline did not add up.

It only happened sometimes. That was the clue. The failure depended on which task had previously occupied the same container. The filesystem was the undeclared global variable of our entire swarm.

The Comforting Lie

In April we published our stateless workers post. We were right to ban local databases. SQL state in a task runner is obvious shared mutable state, and obvious problems get fixed. We felt safe. We had eliminated the database; therefore, we had eliminated state.

This was a half-truth. The state we eliminated was explicit and queryable. The state we missed was implicit, ambient, and sitting in /workspace/repo/.git.

For a coding agent, the working tree is the workspace. It is where the agent reads context, runs builds, writes edits, and stages commits. A reused working tree is exactly the shared-mutable-state hazard the whole stateless architecture was supposed to kill, just one directory below where everyone was looking.

The Mechanism: How We Leaked

Our worker runtime kept one persistent clone per repository per container at a fixed path. Every task that landed on that container reused it. Here is the shape of ensureRepoForTask from the runner, simplified to the dangerous part:

async function ensureRepoForTask(
  task: Task,
  repo: RepositoryConfig
): Promise<WorkingDirectory> {
  const clonePath = resolve(CONTAINER_WORKSPACE, repo.name);

  const gitHead = join(clonePath, ".git", "HEAD");

  if (!(await exists(gitHead))) {
    await exec(
      `git clone ${repo.url} ${clonePath} --depth 50 --single-branch`
    );
    return { path: clonePath, isFresh: true };
  }

  const status = await exec("git status --porcelain", { cwd: clonePath });
  const isDirty = status.stdout.trim().length > 0;

  if (isDirty) {
    logger.warn(
      "The repo has uncommitted changes. A git pull was skipped " +
      "to avoid losing work. You may need to commit or stash."
    );

    return { path: clonePath, isFresh: false, dirty: true };
  }

  await exec(
    `git pull origin ${repo.defaultBranch} --ff-only`,
    { cwd: clonePath }
  );

  return { path: clonePath, isFresh: false, dirty: false };
}

Study what is missing. No git checkout main. No git reset --hard origin/main. No git clean -fdx. The setup path detects a dirty tree, warns the agent, and proceeds.

Detection without neutralization is a foot-gun with a label on it. Whatever branch the previous task checked out, whatever uncommitted edits it left behind, the next task inherits it all.

Three Failure Modes

Read-stale

Task A crashes mid-edit, leaving the tree dirty. Task B lands on the same container. The pull is skipped. Task B now reasons about code that is three merges behind origin/main. The agent may “fix” bugs already fixed, or reintroduce patterns the team deprecated.

Write-contamination

Task A’s half-finished edits are still in the working tree. Task B runs git diff, sees A’s changes mixed with its own, and cannot distinguish them. The PR ships with a line task B never touched and nobody can explain in review.

Wrong-branch carryover

Task A checked out feature/payment-gateway to open a PR. When Task B begins, the tree is still on that branch. Either the --ff-only pull fails because the feature branch has diverged from the default branch, or, worse, the pull is skipped and Task B branches from Task A’s feature branch. The new PR’s base is silently wrong.

This is the filesystem equivalent of a thread that forgot to release a lock. The scheduler’s nondeterminism becomes your nondeterminism.
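The three failure modes above can all be read off a single status call. What follows is a minimal sketch, not the runner's code: inspectTree, TreeReport, and TreeHazard are hypothetical names, and the parser assumes the common shapes of `git status --porcelain=v1 --branch` output (a `## branch...upstream [behind N]` header followed by two-character status entries). The behind count is only as fresh as the last fetch.

```typescript
// Hypothetical helper, not part of the runner shown above.
type TreeHazard = "write-contamination" | "wrong-branch" | "read-stale";

interface TreeReport {
  branch: string | null; // null when HEAD is detached
  behind: number; // commits behind upstream, as of the last fetch
  dirtyPaths: string[];
  hazards: TreeHazard[];
}

function inspectTree(porcelain: string, defaultBranch: string): TreeReport {
  let branch: string | null = null;
  let behind = 0;
  const dirtyPaths: string[] = [];

  for (const line of porcelain.split("\n")) {
    if (line.length === 0) continue;
    if (line.startsWith("## ")) {
      // Header shape: "## main...origin/main [ahead 1, behind 3]"
      const header = line.slice(3);
      if (!header.startsWith("HEAD (no branch)")) {
        const end = header.search(/\.\.\.| /);
        branch = end === -1 ? header : header.slice(0, end);
      }
      const match = /behind (\d+)/.exec(header);
      if (match) behind = Number(match[1]);
    } else {
      // Entry shape: two status characters, a space, then the path.
      // Rename entries keep their "old -> new" form in this sketch.
      dirtyPaths.push(line.slice(3));
    }
  }

  const hazards: TreeHazard[] = [];
  if (dirtyPaths.length > 0) hazards.push("write-contamination");
  if (branch !== defaultBranch) hazards.push("wrong-branch");
  if (behind > 0) hazards.push("read-stale");
  return { branch, behind, dirtyPaths, hazards };
}
```

A report with a non-empty hazards list is a reason to refuse the tree, not a reason to log a warning and continue.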

Transaction Isolation for the Filesystem

Database people solved this class of problem forty years ago and gave it a name: transaction isolation. A reused working tree across tasks is READ UNCOMMITTED. Every task sees every other task’s dirty, half-written changes.

What you actually want for autonomous tasks is SNAPSHOT ISOLATION. Each task gets a private, pristine, point-in-time view of the repo, and changes do not leak across task boundaries.
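The difference between the two levels fits in a toy model. This sketch treats a working tree as an in-memory map from path to contents; reuseTree and snapshotTree are hypothetical names, and nothing here touches git. It only illustrates what each task can observe.

```typescript
// Toy model: a "working tree" is a map from path to file contents.
type Tree = Map<string, string>;

// READ UNCOMMITTED: every task is handed the same mutable object.
function reuseTree(shared: Tree): Tree {
  return shared;
}

// SNAPSHOT ISOLATION: every task gets a private copy of the committed state.
function snapshotTree(committed: ReadonlyMap<string, string>): Tree {
  return new Map(committed);
}

const committed = new Map([["billing.ts", "v1"]]);

// Reused tree: Task A crashes mid-edit, Task B inherits the edit.
const shared = new Map(committed);
const taskA = reuseTree(shared);
taskA.set("billing.ts", "half-finished edit");
const taskB = reuseTree(shared);
console.log(taskB.get("billing.ts")); // "half-finished edit"

// Snapshots: Task A's edit never leaves its own copy.
const isolatedA = snapshotTree(committed);
isolatedA.set("billing.ts", "half-finished edit");
const isolatedB = snapshotTree(committed);
console.log(isolatedB.get("billing.ts")); // "v1"
```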

The cheap way to get snapshot isolation on a git repo is a task-scoped pristine checkout: fetch, checkout default, reset hard to origin, and clean untracked files at task start. The better way is an ephemeral git worktree per task.

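// Baseline fix: task-scoped pristine checkout.
// Assumes the clone at clonePath already exists.
async function ensureIsolatedRepo(task: Task, repo: RepositoryConfig) {
  const clonePath = resolve(CONTAINER_WORKSPACE, repo.name);

  await exec("git fetch origin", { cwd: clonePath });
  // --force discards the previous task's local edits; a plain checkout
  // refuses to switch branches when those edits would be overwritten.
  await exec(`git checkout --force ${repo.defaultBranch}`, { cwd: clonePath });
  await exec(
    `git reset --hard origin/${repo.defaultBranch}`,
    { cwd: clonePath }
  );
  // -x also removes ignored files such as build output and caches.
  await exec("git clean -fdx", { cwd: clonePath });

  return { path: clonePath };
}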

// Better: ephemeral worktree, private index, shared object store
async function createTaskWorktree(task: Task, repo: RepositoryConfig) {
  const objectStore = resolve(CONTAINER_WORKSPACE, "_shared", repo.name);
  const taskTree = resolve(CONTAINER_WORKSPACE, "_tasks", task.id, repo.name);

  await ensureObjectStore(repo, objectStore);

  // --detach checks out the commit without creating a local branch,
  // so tasks share objects but no checked-out branch.
  await exec(
    `git worktree add --detach ${taskTree} origin/${repo.defaultBranch}`,
    { cwd: objectStore }
  );

  return {
    path: taskTree,
    // Run from the object store, and pass --force so a dirty tree left
    // by a crashed task is still removed.
    cleanup: () =>
      exec(`git worktree remove --force ${taskTree}`, { cwd: objectStore }),
  };
}

The worktree pattern is MVCC for the filesystem: an immutable shared object store, with a private working directory and a private index per task. Database engineers recognize this immediately. The engineering cost is seconds per task and a few megabytes per concurrent task.
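The part that is easy to get wrong is the lifecycle: the private tree has to be removed even when the task throws. Here is a minimal sketch of that wrapper, assuming an injected command runner. Run, WorktreeOptions, and withTaskWorktree are hypothetical names, not the runner's API. The git subcommands and flags (`worktree add --detach`, `worktree remove --force`) are standard git. Passing arguments as an array rather than an interpolated string also keeps branch names and paths out of a shell.

```typescript
// Hypothetical runner type: executes one command in the given directory.
type Run = (args: string[], cwd: string) => Promise<void>;

interface WorktreeOptions {
  objectStore: string; // long-lived clone that owns the shared objects
  taskTree: string; // private directory for this task only
  baseRef: string; // for example "origin/main"
}

async function withTaskWorktree<T>(
  run: Run,
  opts: WorktreeOptions,
  task: (path: string) => Promise<T>
): Promise<T> {
  // Refresh remote-tracking refs so the snapshot is current.
  await run(["git", "fetch", "origin"], opts.objectStore);
  // Detached HEAD: the task starts from a commit, not a shared local branch.
  await run(
    ["git", "worktree", "add", "--detach", opts.taskTree, opts.baseRef],
    opts.objectStore
  );
  try {
    return await task(opts.taskTree);
  } finally {
    // Always runs, including after a crash; --force removes a dirty tree.
    await run(
      ["git", "worktree", "remove", "--force", opts.taskTree],
      opts.objectStore
    );
  }
}
```

Because the runner is injected, the wrapper can be exercised with a fake that records commands, which makes the cleanup guarantee cheap to test.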

Approach	Isolation level	Setup cost per task	Correct?
Reused clone	READ UNCOMMITTED	~0 s	No
Pristine checkout	SNAPSHOT ISOLATION	2-5 s	Yes
Ephemeral worktree	SNAPSHOT ISOLATION	1-3 s plus a few MB	Yes

What Does Not Work

Container-per-task is correct isolation, but expensive for short tasks. Stashing dirty state looks clever until the stash stack itself becomes cross-task state. Branch-per-task avoids some wrong-branch failures but does not remove uncommitted edits from the working tree before the next task begins.

The right abstraction is not a better cleanup ritual. The right abstraction is an isolation level. Stop treating the working directory as ambient container scenery and start treating it as a resource with semantics you are choosing, whether you realize it or not.

The Honest Note

Our swarm already detected the hazard. The git status --porcelain check and the warning string were both there. Detection without neutralization is worse than ignorance because it creates false confidence. The infrastructure appears to handle dirty repos, but it only tells the agent about the problem and proceeds.

That abdicates the one layer that can actually enforce isolation. The agent can choose to stash, reset, inspect, or ignore. The runner should make contamination impossible by construction.
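One way to read “impossible by construction” in TypeScript terms is to let the type system carry the guarantee. This is a sketch under assumptions: PristineTree, Isolate, acquireTree, and runAgent are hypothetical names, and the brand is only as strong as the module boundary that hides the cast.

```typescript
// A branded path: only the isolation layer below is meant to produce one.
type PristineTree = string & { readonly __brand: "PristineTree" };

// Hypothetical: creates a private tree for the task and returns its path.
type Isolate = (taskId: string) => Promise<string>;

async function acquireTree(
  isolate: Isolate,
  taskId: string
): Promise<PristineTree> {
  const path = await isolate(taskId);
  // The single place where a plain path is promoted to a pristine one.
  return path as PristineTree;
}

// The agent entry point accepts only a branded tree, never a raw string.
async function runAgent(tree: PristineTree, prompt: string): Promise<string> {
  return `agent running in ${tree}: ${prompt}`;
}

// runAgent("/workspace/repo", "fix the bug");
// ^ rejected by the compiler: a plain string is not a PristineTree.
```

The agent can still stash, reset, or inspect inside its own tree. What it can no longer do is start work in a directory the runner did not isolate first.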

What We Ship Now

The current pattern is task-scoped worktrees. Every task receives a private working directory. The shared object store lives separately. Tasks cannot see each other’s uncommitted edits, branch choices, or untracked files by construction.

The measurable outcome is boring in the best way: phantom diffs in PR review went away, stale builds against old default branches disappeared, and the “why is this edit in my PR” question stopped appearing in engineering Slack.

The Prediction

Within twelve months, “working tree isolation” becomes a checkbox feature for production coding-agent platforms. “Reused clone across tasks” joins “shared mutable global state” as a recognized anti-pattern in agent architecture reviews.

Statelessness has to extend to the filesystem or it is not statelessness. The git working tree is state. Treat it accordingly.

FAQ

How is working tree state different from database state?

Database state is explicit and queryable. Working tree state is implicit: leftover files, uncommitted edits, and wrong branches that silently contaminate the next task. Banning SQL fixes one layer; filesystem isolation fixes the layer coding agents actually execute against.

Why not create a fresh clone for every task?

Fresh clones are correct, but wasteful for large repositories. Git worktrees share the immutable object store while giving each task a private index and working directory, which gives snapshot isolation without paying full clone cost every time.

What is the cost of task-scoped worktrees?

The cost is one fetch plus worktree creation at task start, usually seconds, and a few megabytes per concurrent task for the private index and working directory. Against phantom diffs and stale builds, it is cheap insurance.

Does this affect single-container or distributed swarms?

Both. Any reused filesystem, including a persistent container volume or a shared network mount, can leak task state. The fix is choosing the right isolation level for the working tree, not relying on container count.

How do I detect whether my swarm has this issue?

Look for non-deterministic diffs, mystery edits in pull requests, tasks building against old default branches, and wrong-base branches. The symptoms only appear when the right task sequence lands on the same reused clone, which is why they are hard to reproduce.
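A cheap audit for the mystery-edit symptom, as a hedged sketch: compare the files in a PR's diff against the files the agent's own tool log says it wrote. findPhantomEdits is a hypothetical name, and the sketch assumes you can obtain both inputs, for example the text output of `git diff --name-only origin/main...HEAD` and a list of paths recorded by your write tool.

```typescript
// Returns paths that appear in the diff but that the agent never wrote.
function findPhantomEdits(
  diffNameOnly: string,
  writtenByAgent: readonly string[]
): string[] {
  const written = new Set<string>(writtenByAgent);
  return diffNameOnly
    .split("\n")
    .map((line) => line.trim())
    .filter((path) => path.length > 0 && !written.has(path));
}

const phantoms = findPhantomEdits(
  "src/checkout.ts\nsrc/legacy/billing.ts\n",
  ["src/checkout.ts"]
);
console.log(phantoms); // ["src/legacy/billing.ts"]
```

Any non-empty result means the working tree contributed edits that the task did not make, which is worth failing the task over rather than leaving it for review.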
