Why GitHub Copilot can't block your merge (and how a real AI merge gate works)

July 8, 2026 · Postil team

A common assumption about AI code review is that turning it on adds a safety interlock: if the reviewer dislikes a change, the change can't merge. For most of the tools in this category that is not how it works, and the reason is not a missing feature or a pricing tier. It is the GitHub merge mechanics. A reviewer that posts a comment, or completes a check with a neutral result, is structurally incapable of stopping a merge no matter how confident it is. This is a mechanics explainer: what GitHub actually blocks on, why a comment-only or neutral reviewer slips past it, and how a real gate is built. The Postil-specific parts are backed by the open behavior in our CLI source rather than by marketing.

Three different things GitHub calls "review"

Branch protection on GitHub has two independent levers that people conflate, plus a third state that quietly defeats both.

Required pull request reviews. A protected branch can require N approving reviews. A review is an event with one of three states: APPROVE, REQUEST_CHANGES, or COMMENT. Only APPROVE counts toward the required count, and only REQUEST_CHANGES actively blocks the merge until it is dismissed or the reviewer approves. A COMMENT review is inert: it neither approves nor blocks.
Required status checks. A protected branch can require named checks to be green before merge. Each check run reports a conclusion: success, failure, neutral, cancelled, and so on. Branch protection blocks the merge only while a required check is missing or its conclusion is one of the failing values. This is the lever that actually enforces.
The neutral conclusion. A check that completes with neutral renders as a grey square, not a red X. Crucially, branch protection treats neutral as not failing. A required check that always concludes neutral will never block anything; it satisfies the "check has reported" requirement and then reports a non-failure.

So there is exactly one way for an automated reviewer to gate a merge on GitHub: it must publish a status check, that check must be marked required in branch protection, and it must be willing to conclude failure. A reviewer that only leaves comments has opted out of the enforcement lever entirely. A reviewer that publishes a check but only ever concludes neutral has wired up the lever and then disconnected it.
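The rule above can be written down as a small decision function. This is a simplified sketch of the mechanic as described in this post, not GitHub's implementation: the `Conclusion` enum and `blocks_merge` are names made up for illustration, and treating `success`, `neutral`, and `skipped` as the non-failing values is an assumption of the sketch.

```rust
/// Conclusions a check run can report (the subset used in this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Conclusion {
    Success,
    Failure,
    Neutral,
    Skipped,
    Cancelled,
    TimedOut,
    ActionRequired,
}

/// Simplified model of how branch protection reads a single check.
/// `conclusion` is `None` while the check is missing or still running.
pub fn blocks_merge(required: bool, conclusion: Option<Conclusion>) -> bool {
    if !required {
        // A check that is not marked required never blocks, whatever it reports.
        return false;
    }
    match conclusion {
        // Missing or in progress: the requirement is not yet met.
        None => true,
        // Assumption: these values are read as "not failing".
        Some(Conclusion::Success | Conclusion::Neutral | Conclusion::Skipped) => false,
        // Everything else in this sketch is treated as a failing value.
        Some(_) => true,
    }
}
```

Under this model `blocks_merge(true, Some(Conclusion::Neutral))` is `false`, which is the disconnected lever: the check is required and has reported, yet nothing is held back.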

Where the surveyed tools land

With that mechanic in hand, the behavior of the common tools is unsurprising.

GitHub Copilot code review posts a Comment. According to GitHub's own documentation, Copilot's review is left as a Comment and does not count toward required approvals. It is on the first lever (reviews) but in the one state that neither approves nor requests changes. There is no setting that promotes it to a required, blocking check, which is why enterprise requests for enforcement remain an open community thread. This is not Copilot being weak; it is Copilot choosing the inert review state by design.

Claude Code review concludes neutral; Macroscope defaults neutral. Both publish status checks rather than comments, which looks like the enforcement path. But Claude Code review's documentation states its check completes with a neutral conclusion. Macroscope Check Run Agents default to a neutral ceiling, while Macroscope Approvability documents how to configure a failure conclusion as a required status check. So Macroscope is configurable: left at its neutral default, the check exists and the gate does not; configured to fail, it can block a merge.

Among the surveyed tools, Cursor Bugbot offers a real required check. Per its documentation, Bugbot publishes a CI check with success and failure conclusions that branch protection can require. That is the architecture that actually enforces. We mention it precisely because it shows the difference is a design decision about conclusions and required checks, not a limit of what AI reviewers can technically do on GitHub.

Why this matters more as agents write more code

The gap between advisory and enforcing is the gap between feedback and a control. Integration guidance has been blunt about it: "verification that is recommended but not enforced in CI gets bypassed under pressure." A reviewer that can only comment is a recommendation. When most of the diffs arriving in a repository are machine-generated and the human in the loop is approving at volume, "a bot left a comment" is not a control surface. The recommended architecture in neutral guides is to keep the chatty, explanatory feedback as advisory and put a separate, severity-thresholded required status check in front of the merge so that the AI finds and explains while the check enforces.

How Postil's two-check model works

Postil splits the two roles into two distinct GitHub check runs, created on every reviewed commit. The backing for what follows is the postil-cli source: the checks are created in start_checks and their conclusions are written in complete_checks.

The CLI opens both checks in_progress at the start of a review:

for name in ["postil/review", "postil/gate"] {
    // POST /check-runs  { name, head_sha, status: "in_progress" }
}

postil/review is the advisory check. It carries the summary and the inline finding annotations: the explanatory feedback, the part you read. postil/gate is the enforcing check. It carries the merge decision and nothing else. You mark postil/gate as required in branch protection; you leave postil/review advisory. The split exists so that the enforcement signal is never diluted by the volume of advisory commentary.

The conclusions are not interchangeable, and the design rule is written directly into the source. The doc comment on the CheckState enum in src/forge/mod.rs states the contract:

/// Check conclusions, mapped per-forge. Postil semantics:
/// - advisory check (`postil/review`): success unless the run itself failed.
/// - gate check (`postil/gate`): failure iff gate-level findings exist (or the
///   run failed, so it fails closed). Never `neutral` for the gate: a grey square
///   that reads as "didn't fail" is the GitHub Copilot mistake.

That last sentence is the whole article in one line, and it is in the shipping code, not a slide. The gate check is only ever success or failure. It is structurally not allowed to take the neutral conclusion that makes Claude Code review and default-neutral checks non-blocking. The conclusion is computed straight from the gate outcome:

let gate_state = if envelope.gate.failing {
    CheckState::Failure
} else {
    CheckState::Success
};

Whether the gate is failing is itself a policy decision you control: findings at or above your configured failOn severity flip gate.failing to true. When the review itself hits an operational error, such as a provider outage or unusable model output, the review check fails so that the run cannot look skipped or clean. The merge gate applies its configured blocking policy separately.
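As an illustration of the threshold rule, here is a minimal sketch of how a failOn severity could map to a failing flag. The `Severity` scale, the `Finding` struct, and `gate_failing` are hypothetical names chosen for the example; Postil's actual severity levels and types may differ.

```rust
/// Hypothetical severity scale. Deriving `Ord` ranks variants in
/// declaration order, so `Info < Low < Medium < High < Critical`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Info,
    Low,
    Medium,
    High,
    Critical,
}

/// Hypothetical shape of one review finding.
pub struct Finding {
    pub severity: Severity,
    pub title: String,
}

/// The gate is failing when at least one finding is at or above `fail_on`.
pub fn gate_failing(findings: &[Finding], fail_on: Severity) -> bool {
    findings.iter().any(|f| f.severity >= fail_on)
}
```

With a single `Medium` finding, `gate_failing` returns `true` for a threshold of `Medium` and `false` for `High`. Raising the threshold only ever makes the gate more permissive, which makes it easy to tune in one direction at a time.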

The gate fails closed on error

The dangerous failure mode for a merge gate is the silent pass: the reviewer errors out, reports nothing alarming, and the merge proceeds as if it had been checked. Postil's default is the opposite. When a run errors, the gate concludes failure rather than standing aside:

let gate_state = match cfg.gate_on_error {
    OnError::Block => CheckState::Failure,   // default: fail closed
    OnError::Advisory => CheckState::Success,
};

By default gate.onError is block: an errored run blocks the merge. A repository can opt into onError: advisory so a provider blip does not freeze every merge, in which case the gate stands aside but the advisory check still goes neutral to show the error. The source comment names the constraint that keeps this honest: unusable model output never bypasses the gate, because a malicious diff could otherwise induce that error via prompt injection to slip past review. Either way the gate is binary on the merge decision and never neutral. That is the difference between "the check didn't fail" and "the check passed," and it is the difference branch protection actually reads.
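One way to read the paragraph above as a single function is sketched below. It is an interpretation, not the postil-cli source: `GateState`, `RunError`, and `gate_conclusion` are hypothetical names, and splitting errors into "provider unavailable" and "unusable model output" is an assumption made to express the stated constraint that unusable output never bypasses the gate.

```rust
/// Two-valued on purpose: a neutral gate cannot be expressed in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateState {
    Success,
    Failure,
}

/// Mirrors the `gate.onError` setting described in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OnError {
    Block,
    Advisory,
}

/// Hypothetical error classes for a review run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunError {
    ProviderUnavailable,
    UnusableModelOutput,
}

/// `run` is `Ok(failing)` when the review completed, `Err(_)` when it did not.
pub fn gate_conclusion(run: Result<bool, RunError>, on_error: OnError) -> GateState {
    match run {
        Ok(true) => GateState::Failure,
        Ok(false) => GateState::Success,
        // Assumption: this error class blocks regardless of policy, because a
        // crafted diff could otherwise provoke it to slip past review.
        Err(RunError::UnusableModelOutput) => GateState::Failure,
        Err(RunError::ProviderUnavailable) => match on_error {
            OnError::Block => GateState::Failure,    // fail closed
            OnError::Advisory => GateState::Success, // stand aside
        },
    }
}
```

Every arm returns success or failure, so whichever policy is chosen, branch protection always reads a definite answer from the gate.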

Adopt the gate without surprises: postil plan

The honest objection to any required check is that you can't see what it will block until it blocks something, and a gate that fails your merge queue on day one is its own kind of noise. Postil's answer is postil plan, a config dry-run. It replays your stored past reviews under a candidate configuration and reports what would change, without posting anything or blocking anything.

The command reports before and after finding counts for each stored envelope, which findings the candidate config would suppress, and every gate outcome that would change. You see the gate flips before you arm the gate. The recommended adoption path is the one the integration guides describe: run postil/gate advisory for a couple of weeks, use postil plan to tune failOn until the gate flips only on changes you would genuinely hold, and only then mark the check required in branch protection. The dry-run is what lets that be a measured decision instead of a leap.
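The idea behind the dry-run can be sketched in a few lines: re-evaluate each stored review under the current and the candidate threshold and report only the outcomes that differ. This is a conceptual sketch rather than the postil plan implementation; `StoredReview`, `GateFlip`, and `dry_run_fail_on` are hypothetical names, and a real envelope holds far more than a list of severities.

```rust
/// Hypothetical severity scale, ranked in declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
}

/// Hypothetical stand-in for one stored review envelope.
pub struct StoredReview {
    pub id: String,
    pub severities: Vec<Severity>,
}

/// A gate outcome that differs between the two configurations.
#[derive(Debug, PartialEq, Eq)]
pub struct GateFlip {
    pub id: String,
    pub was_failing: bool,
    pub now_failing: bool,
}

fn failing(review: &StoredReview, fail_on: Severity) -> bool {
    review.severities.iter().any(|s| *s >= fail_on)
}

/// Replays stored reviews under both thresholds and returns only the flips.
/// Nothing is posted and nothing is blocked: this only reads data.
pub fn dry_run_fail_on(
    reviews: &[StoredReview],
    current: Severity,
    candidate: Severity,
) -> Vec<GateFlip> {
    reviews
        .iter()
        .filter_map(|r| {
            let was = failing(r, current);
            let now = failing(r, candidate);
            (was != now).then(|| GateFlip {
                id: r.id.clone(),
                was_failing: was,
                now_failing: now,
            })
        })
        .collect()
}
```

Lowering the threshold from `High` to `Medium` over a history where one review has only a `Medium` finding yields exactly one flip, from passing to failing. That list of flips is what you would inspect before marking the check required.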

The short version

GitHub blocks merges on required status checks that conclude failure, not on review comments and not on neutral checks. A reviewer that posts a Comment (Copilot) or concludes neutral (Claude Code review, or any default-neutral check) has chosen, by design, a signal that branch protection will not enforce. A real gate is one check run that is willing to conclude failure, marked required, kept separate from the advisory chatter, and failing closed on error. You can read how the gate is configured, dry-run it with postil plan before arming it, or see a review run end to end first.
