fix(ci): E2E gate must verify work actually ran, not just top-level success #926

Merged: pimlock merged 1 commit into main from fix/e2e-gate-skip-detection on Apr 23, 2026

@pimlock commented Apr 22, 2026

Summary

Closes a gap in the E2E Gate discovered while testing #922 in PR #924: labeling a PR after Branch E2E Checks had already run (with the label absent) kept the gate green, because the gate only checked the workflow's top-level conclusion.

When pr_metadata returns should_run=false, every downstream job (build-gateway, build-cluster, the E2E matrix) is skipped — so the workflow top-level concludes success even though nothing was actually tested. The gate was treating that as a pass.

Change

In .github/workflows/e2e-gate-check.yml, after verifying top-level success, also query the run's jobs and require at least one job other than Resolve PR metadata to have concluded success. If only the gate job succeeded, the gate fails with an actionable message.

The check uses "at least one non-gate success" rather than "zero skipped jobs" so that legitimate conditional skips in future jobs don't break the gate.
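
The difference between the two rules can be sketched as a pair of predicates over a run's jobs. This is only an illustration in Python, not the workflow's actual implementation (which this description does not show). `Job`, `has_real_success`, `no_skips`, and the `optional-job` name are hypothetical; only the job names `Resolve PR metadata`, `build-gateway`, and `build-cluster` come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Gate job name as given in the PR description; the rest is illustrative.
GATE_JOB_NAME = "Resolve PR metadata"


@dataclass(frozen=True)
class Job:
    name: str
    conclusion: Optional[str]  # e.g. "success", "skipped", "failure"; None while running


def has_real_success(jobs: Iterable[Job]) -> bool:
    """Rule adopted here: at least one non-gate job concluded success."""
    return any(j.name != GATE_JOB_NAME and j.conclusion == "success" for j in jobs)


def no_skips(jobs: Iterable[Job]) -> bool:
    """Rejected alternative: pass only if no job at all was skipped."""
    return all(j.conclusion != "skipped" for j in jobs)


# Label absent: only the metadata job ran, everything downstream was skipped.
unlabeled = [
    Job("Resolve PR metadata", "success"),
    Job("build-gateway", "skipped"),
    Job("build-cluster", "skipped"),
]

# Label present, with one hypothetical future job legitimately skipped.
labeled = [
    Job("Resolve PR metadata", "success"),
    Job("build-gateway", "success"),
    Job("optional-job", "skipped"),
]
```

Under these assumptions, `has_real_success` rejects the `unlabeled` run and accepts the `labeled` one, while `no_skips` would reject both, which is the breakage the stricter rule risks.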

Testing

Checklist

  • Conventional Commits
  • DCO sign-off

…uccess

The gate passed whenever `Branch E2E Checks` (or `GPU Test`) concluded
`success` for the head SHA. But when the required label wasn't set at
the time of the run, `pr_metadata` gates downstream jobs out, every
non-gate job is `skipped`, and the workflow still reports `success`.
Result: labeling a PR after the workflow already ran left the gate
green even though E2E never executed.

Now also query the run's jobs and require at least one non-gate
(`name != "Resolve PR metadata"`) job to have concluded `success`. If
only the gate job itself succeeded, fail with an actionable message
asking the maintainer to re-run the workflow so the gate re-evaluates
with the label in place.
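
For readers who want to reproduce the check outside the workflow, here is a rough Python sketch of the same idea. It assumes the jobs are read from the GitHub REST endpoint `GET /repos/{owner}/{repo}/actions/runs/{run_id}/jobs`, whose response carries a `jobs` array with `name` and `conclusion` fields. The function names and the wording of the error message are hypothetical; the real step in `e2e-gate-check.yml` may query and report differently.

```python
import json
import urllib.request
from typing import Optional

GATE_JOB_NAME = "Resolve PR metadata"  # job name taken from the commit message


def fetch_run_jobs(owner: str, repo: str, run_id: int, token: str) -> list:
    """Fetch the first page (up to 100) of jobs for one workflow run."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/runs/{run_id}/jobs?per_page=100"
    )
    req = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["jobs"]


def gate_error(workflow_name: str, jobs: list) -> Optional[str]:
    """Return None if the gate may pass, otherwise a message for the maintainer."""
    succeeded = [
        j["name"] for j in jobs
        if j["name"] != GATE_JOB_NAME and j.get("conclusion") == "success"
    ]
    if succeeded:
        return None
    # Hypothetical wording; the actual message is not shown in this PR page.
    return (
        f"{workflow_name} concluded success, but no job other than "
        f"'{GATE_JOB_NAME}' succeeded. Re-run the workflow so the gate "
        f"re-evaluates with the label in place."
    )
```

Keeping the decision in `gate_error`, separate from the network call, lets the rule be exercised with hand-written job lists before it is pointed at a real run.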

Signed-off-by: Piotr Mlocek <pmlocek@nvidia.com>
@pimlock requested a review from a team as a code owner April 22, 2026 22:21
@pimlock self-assigned this Apr 22, 2026
@pimlock merged commit 89dd10b into main Apr 23, 2026
17 checks passed
@pimlock deleted the fix/e2e-gate-skip-detection branch April 23, 2026 15:28
maxamillion added a commit to maxamillion/OpenShell that referenced this pull request Apr 23, 2026
…dman

# By Piotr Mlocek (2) and Drew Newberry (1)
# Via GitHub
* upstream/main:
  fix(ci): e2e gate must verify work actually ran, not just top-level success (NVIDIA#926)
  fix(driver-vm): preflight supervisor cross-compile toolchain in start.sh (NVIDIA#931)
  feat(server,driver-vm,e2e): gateway-owned readiness + VM compute driver e2e (NVIDIA#901)

# Conflicts:
#	crates/openshell-server/src/compute/mod.rs

2 participants