Agent Orchestration Hardening
Resilient uses GitHub Issues and pull requests as the source of truth for agent work. The local scripts provide scheduling and guardrails, but they must not create a second canonical board.
Control Plane
The orchestration loop has five phases:
- Pick an agent-ready issue that has no open PR claim.
- Dispatch a fresh branch and worktree from origin/main.
- Claim expected files before the executor starts editing.
- Verify with verify-scope.sh and CI before a PR leaves draft.
- Sync through agents/integration before auto-merge.
agent-scripts/agent-status.sh --json is the dashboard feed. A Kanban UI
may be built on top of that JSON, but the UI must remain a projection of
issues, PRs, CI, worktrees, and file claims.
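
To illustrate what "projection" means here, the sketch below derives board columns from status records on every call and stores nothing. The record fields (issue, pr, draft, merged) and the column names are assumptions made for illustration; the actual schema emitted by agent-status.sh --json is not described in this document.

```python
from collections import defaultdict


def project_board(records):
    """Group status records into board columns.

    Pure function: columns are recomputed from the records each time,
    so the board can never drift from the underlying issues and PRs.
    Field names are hypothetical, not the real agent-status.sh schema.
    """
    columns = defaultdict(list)
    for rec in records:
        if rec.get("merged"):
            column = "done"
        elif rec.get("pr") is None:
            column = "ready-to-dispatch"
        elif rec.get("draft", True):
            column = "in-progress"
        else:
            column = "in-review"
        columns[column].append(rec["issue"])
    return dict(columns)
```

Because the function keeps no state, dragging a card in such a UI would have to be implemented as an action on the issue or PR itself, after which the next projection reflects the change.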
Resumability
Agents should assume model context can disappear at any point. The durable
handoff is a PR comment emitted by agent-handoff.sh at dispatch,
executor start, guardrail failure, guardrail success, and orchestrator
finish. A replacement agent resumes from:
- the linked issue body,
- the PR body and latest handoff comments,
- branch commits and changed files,
- CI check state,
- the latest guardrail report.
Conflict Policy
File claims are leases, not permanent ownership. Dispatch refuses known
overlaps before work begins, verify-scope.sh checks overlap before a PR
is marked ready, and merge releases claims through
release-file-claims.yml.
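
The lease idea can be sketched as a mapping from live PRs to the paths they claim, with an overlap check before work begins and a release on merge. This is a minimal illustration only: the function names and the in-memory mapping are hypothetical, and the real checks live in the dispatch scripts, verify-scope.sh, and release-file-claims.yml.

```python
def find_overlaps(claims, expected_files):
    """Return {pr_id: sorted overlapping paths} for live claims that collide.

    claims maps a PR identifier to the paths it currently leases
    (a hypothetical in-memory stand-in for the real claim store).
    """
    wanted = set(expected_files)
    overlaps = {}
    for pr_id, files in claims.items():
        shared = wanted & set(files)
        if shared:
            overlaps[pr_id] = sorted(shared)
    return overlaps


def release(claims, pr_id):
    """Drop a PR's lease, as would happen on merge; returns a new mapping."""
    return {k: v for k, v in claims.items() if k != pr_id}
```

An empty result from find_overlaps means dispatch could proceed; once a PR's lease is released, the same paths become claimable again.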
Hard conflicts require human resolution. Append-only extension conflicts
may be handled by sync-integration.sh and auto-resolve-extensions.sh.
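
One plausible reading of "append-only" is that both sides only added lines after the shared base and neither edited nor removed an existing line. Under that assumption, the distinction between the two conflict classes can be sketched as below. The function is hypothetical and is not taken from sync-integration.sh or auto-resolve-extensions.sh, whose actual rules are not described here.

```python
def merge_append_only(base, ours, theirs):
    """Merge two line lists that both extend base, or return None.

    None signals a hard conflict: at least one side changed or removed
    a line that existed in base, so a human should resolve it.
    """
    n = len(base)
    if ours[:n] != base or theirs[:n] != base:
        return None
    merged = list(ours)
    ours_tail = ours[n:]
    for line in theirs[n:]:
        # Skip lines both sides appended identically.
        if line not in ours_tail:
            merged.append(line)
    return merged
```

This sketch places our appended lines before theirs; a real resolver might need a different order or a sort, depending on the file being extended.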
Formal Model
The TLA+ model in docs/agent-orchestration.tla captures the safety
invariants this workflow is meant to preserve:
- no two live PRs own the same claimed file,
- a ready PR has passed guardrails,
- an auto-mergeable PR has passed guardrails and integration sync,
- merged PRs have released their file claims.
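
As a rough companion to the list above, the four invariants can be restated as a check over a snapshot of PR state. This is a Python paraphrase for illustration, not a translation of docs/agent-orchestration.tla; the PR record and its field names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PR:
    # Hypothetical snapshot of one PR; field names are illustrative.
    name: str
    state: str  # "draft", "ready", or "merged"
    claims: set = field(default_factory=set)
    guardrails_passed: bool = False
    integration_synced: bool = False
    auto_merge: bool = False


def violations(prs):
    """Return a list of human-readable invariant violations (empty if none)."""
    problems = []
    owners = {}  # path -> name of the live PR that claims it
    for pr in prs:
        if pr.state == "merged":
            if pr.claims:
                problems.append(f"{pr.name}: merged but still holds claims")
            continue
        for path in sorted(pr.claims):
            if path in owners:
                problems.append(f"{path}: claimed by {owners[path]} and {pr.name}")
            else:
                owners[path] = pr.name
        if pr.state == "ready" and not pr.guardrails_passed:
            problems.append(f"{pr.name}: ready without passing guardrails")
        if pr.auto_merge and not (pr.guardrails_passed and pr.integration_synced):
            problems.append(f"{pr.name}: auto-merge without guardrails and sync")
    return problems
```

A check like this only inspects one snapshot at a time; the TLA+ model is what reasons about every reachable state of the workflow.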