[Chore] Unskip VS Code e2e replay for subtasks#94
Conversation
|
No actionable issues found. See task
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
9f0ddd1 to
c31a59a
Compare
|
I see some merge conflicts here. Working on them now... |
879e0e8 to
2a54c07
Compare
|
I see some merge conflicts here. Working on them now... |
77052fc to
fa293b2
Compare
|
I see some merge conflicts here. Working on them now... |
fa293b2 to
c946c38
Compare
|
I see some merge conflicts here. Working on them now... |
|
I see some merge conflicts here. Working on them now... |
895516b to
ca06a86
Compare
ca06a86 to
7de61e6
Compare
f62a72d to
9aedf31
Compare
|
Addressed the three open review threads and pushed the follow-up in d5d5ae3. The shipped diff now keeps the subtasks probe on the same cancelled child task, captures that resumed child's final Validation on the pushed head is clean: |
9f2e99a to
da52b4e
Compare
8def5a4 to
7cb1196
Compare
7cb1196 to
620fe7c
Compare
|
Reminder: This PR is better served by the parallelizable tasks refactor After a deeper look at the subtask architecture, the whack-a-mole edge cases here are symptoms of an architectural constraint rather than gaps in the test coverage. The core issue: What to do with this PR: The real fix — parallelizable tasks refactor (incremental, no feature flag needed for most of it):
With proper task isolation, the state transitions become deterministic and the e2e tests become straightforward. |
After review, I'm going to implement / re-enable these e2e tests differantly.
e65249b to
be02bf6
Compare
be02bf6 to
1305342
Compare
edelauna
left a comment
There was a problem hiding this comment.
Prod changes reviewed as correctness changes revealed by e2e tests, not new features
Related GitHub Issue
#355 — Parallelizable Tasks implementation roadmap — broader architectural follow-up. The immediate bugs in this PR were discovered while unskipping existing e2e coverage.
Description
Unskips the
subtasksVS Code e2e replay suite and fixes three delegation lifecycle bugs it exposed:Cancelled child could reopen parent on completion (
AttemptCompletionTool): A cancelled-then-resumed delegated child would unconditionally delegate back to the parent viaattempt_completion, even if the parent had already detached. Added an idempotency guard that checksparentHistory.awaitingChildId === task.taskIdbefore delegating; otherwise falls through to the normal completion-ask flow.Parent not detached on child cancellation (
ClineProvider.cancelTask): Cancelling a delegated child left the parent indelegatedstatus withawaitingChildIdstill pointing at the cancelled child. NowcancelTasktransitions the parent back toactiveand clearsawaitingChildIdbefore rehydrating the child, while preserving the persistedparentTaskId/rootTaskIdlineage for history grouping.Child started despite parent metadata failure (
ClineProvider.delegateParentAndOpenChild): IfupdateTaskHistoryfailed when persisting the parent's delegation metadata, the child was started anyway with corrupted state. NowcreateTaskacceptsstartTask: falseso the child is created but not started until metadata is persisted. On failure, the paused child is removed from the stack, its history entry is deleted, and the parent is restored from history before the error is surfaced.The e2e test was rewritten to exercise the actual cancellation-resumption flow: it cancels the delegated child mid-followup, resumes the same child (rather than starting a fresh task), validates the follow-up tool result content, and asserts clean completion without parent revival.
Why other VS Code e2e suites changed
This PR distinguishes two cleanup operations:
cancelCurrentTask()simulates a user cancellation and may rehydrate the current task so it can be resumed.clearCurrentTask()removes the current task without entering the cancellation-resumption flow.The existing tool e2e suites used
cancelCurrentTask()as teardown. Once the subtasks cancellation lifecycle was exercised correctly, that teardown behavior could leave rehydrated tasks active between tests. The affected suites now useclearCurrentTask()for cleanup because they are not testing cancellation behavior.Only
subtasks.test.tsintentionally usescancelCurrentTask().Interim mitigation
These lifecycle fixes are intentionally bounded mitigations for the current single-task execution model. They preserve recoverability and add regression coverage for races exposed by the unskipped e2e replay.
The broader architectural fix is tracked in #355. That work introduces explicit task lifecycle ownership, atomic history transitions, startup reconciliation, and
TaskScheduleratmaxConcurrency = 1before enabling parallel execution.The abort path also avoids rejecting from fire-and-forget assistant-message presentation. The awaited task loop now exits its
pWaitForwhen aborted or abandoned and throws from that owned control-flow boundary.Test Procedure
TEST_FILE=subtasks.test pnpm --dir apps/vscode-e2e test:ci:mock— e2e replay covering the full cancel → resume → complete → parent-stays-dormant flowpnpm --dir src test core/tools/__tests__/attemptCompletionTool.spec.ts— unit tests for the parent-awaiting-child guard (delegates when matched, falls through when detached)pnpm --dir src test core/webview/__tests__/ClineProvider.flicker-free-cancel.spec.ts— unit tests for runtime parent-link detachment on cancel and graceful fallback when detach failspnpm --dir src test src/__tests__/provider-delegation.spec.ts— unit test for child cleanup when parent metadata persistence failsPre-Submission Checklist
cancelCurrentTask()toclearCurrentTask().Screenshots / Videos
N/A — no UI changes. This is delegated-task lifecycle and e2e replay coverage work.
Documentation Updates
Additional Notes
Related PRs