Tech Brief: Parallelizable Tasks
Problem
The extension's execution model is strictly serial — one task at a time, one tool at a time. The delegation flow (delegateParentAndOpenChild / reopenParentFromDelegation in ClineProvider.ts) was built on top of this sequential model, resulting in three architectural race conditions patched with careful step ordering rather than proper concurrency primitives. The e2e tests in PR #94 are failing non-deterministically because these races cannot be fixed without structural changes.
Additionally, LLM responses frequently contain multiple independent tool calls (e.g., reading several files). These execute sequentially today, adding unnecessary latency.
Proposed Solution
An incremental migration from serial to concurrent execution, delivered in 5 epics and 18 stories. The approach has two key safety properties:
- TaskScheduler ships at maxConcurrency = 1, which is structurally identical to current behavior. This validates the architecture before any user-visible change.
- Parallel tool execution is gated behind an experiment flag (PARALLEL_TOOL_EXECUTION, default false). Only promoted to default-on after file-level write serialization is in place.
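To illustrate why shipping at maxConcurrency = 1 should preserve today's serial behavior, here is a minimal sketch of a semaphore-gated scheduler. The names `SketchSemaphore`, `SketchScheduler`, and `run` are hypothetical, and the hand-rolled semaphore is only a stand-in for the TaskSemaphore described in this brief (which wraps async-mutex).

```typescript
// Hypothetical stand-in for TaskSemaphore; the brief's version wraps async-mutex.
class SketchSemaphore {
  private available: number;
  private queue: Array<() => void> = [];

  constructor(permits: number) {
    this.available = permits;
  }

  // Number of callers currently blocked in acquire().
  get waiting(): number {
    return this.queue.length;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    await new Promise<void>((resolve) => this.queue.push(resolve));
  }

  release(): void {
    const next = this.queue.shift();
    if (next) {
      next(); // hand the permit directly to the oldest waiter
    } else {
      this.available++;
    }
  }
}

// Hypothetical scheduler: with maxConcurrency = 1, tasks run strictly one at a time.
class SketchScheduler {
  private semaphore: SketchSemaphore;

  constructor(maxConcurrency = 1) {
    this.semaphore = new SketchSemaphore(maxConcurrency);
  }

  get waiting(): number {
    return this.semaphore.waiting;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.semaphore.acquire();
    try {
      return await task();
    } finally {
      this.semaphore.release(); // always release, even if the task throws
    }
  }
}
```

Raising the constructor argument is the only change needed to allow overlap in this sketch, which is the property the incremental rollout relies on.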
Architecture
Before
ClineProvider
└─ clineStack: Task[] ← LIFO array, 18 usage sites
   └─ Task
      └─ presentAssistantMessage() ← sequential switch, boolean re-entrancy guard
         └─ tool_use → execute → tool_result → next tool_use → ...
Shared mutable state: Task.lastGlobalApiRequestTime (static), terminalProcess (single slot), didRejectTool (global), presentAssistantMessageLocked (boolean).
After
ClineProvider
└─ TaskRegistry ← Map + stack adapter (Expand-Contract migration)
└─ TaskScheduler ← semaphore-based concurrency control
   └─ Task
      └─ ToolExecutionContext ← per-tool-call scoped state
      └─ DispatchState ← IDLE | SERIAL | PARALLEL
      └─ presentAssistantMessageParallel()
         ├─ parallelizable tools → Promise.allSettled + pLimit(8)
         ├─ file-write tools → per-path SequencerByKey serialization
         └─ sequential tools → one at a time after parallel batch
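The dispatch shape in the diagram above could look roughly like the following sketch. It is not the extension's implementation: `ToolCall`, `ToolResult`, `createLimiter`, and `dispatch` are hypothetical names, `createLimiter` is a small stand-in for pLimit, and file-write serialization is left out for brevity.

```typescript
// Hypothetical shapes; the real tool_use / tool_result types live in the extension.
interface ToolCall { id: string; name: string; }
interface ToolResult { tool_use_id: string; content: string; is_error: boolean; }

// Minimal concurrency limiter standing in for pLimit(max).
function createLimiter(max: number) {
  let active = 0;
  const pending: Array<() => void> = [];
  return async <T>(fn: () => Promise<T>): Promise<T> => {
    if (active >= max) {
      // Wait for a finishing call to hand its slot over.
      await new Promise<void>((resolve) => pending.push(resolve));
    } else {
      active++;
    }
    try {
      return await fn();
    } finally {
      const waiter = pending.shift();
      if (waiter) waiter(); // slot transfers to the waiter, active is unchanged
      else active--;
    }
  };
}

const PARALLELIZABLE = new Set(["read_file", "list_files", "search_files", "list_code_definition_names"]);

async function dispatch(
  calls: ToolCall[],
  execute: (call: ToolCall) => Promise<string>,
): Promise<ToolResult[]> {
  const limit = createLimiter(8);
  const parallel = calls.filter((c) => PARALLELIZABLE.has(c.name));
  const sequential = calls.filter((c) => !PARALLELIZABLE.has(c.name));

  // allSettled never short-circuits, so every tool_use gets a tool_result.
  const settled = await Promise.allSettled(parallel.map((c) => limit(() => execute(c))));
  const results: ToolResult[] = settled.map((outcome, i) =>
    outcome.status === "fulfilled"
      ? { tool_use_id: parallel[i].id, content: outcome.value, is_error: false }
      : { tool_use_id: parallel[i].id, content: String(outcome.reason), is_error: true },
  );

  // Sequential tools run one at a time after the parallel batch.
  for (const call of sequential) {
    try {
      results.push({ tool_use_id: call.id, content: await execute(call), is_error: false });
    } catch (err) {
      results.push({ tool_use_id: call.id, content: String(err), is_error: true });
    }
  }
  return results;
}
```

A failing read in the parallel batch becomes an `is_error` result rather than aborting the batch, which is the reason for preferring `Promise.allSettled` over `Promise.all` here.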
Tool Classification
| Category | Tools | Rationale |
| --- | --- | --- |
| Parallelizable | read_file, list_files, search_files, list_code_definition_names | Read-only, no side effects |
| File-serialized | write_to_file, apply_diff, insert_code_block | Parallel across files, serialized per-path |
| Sequential | execute_command, new_task, attempt_completion, ask_followup_question, switch_mode | Side effects, safety margin |
| Sequential (default) | MCP tools, browser_action, unclassified | Unknown side effects |
This follows OpenAI Codex's approach (supports_parallel_tool_calls() trait, default false) rather than Gemini CLI's model-driven per-call classification.
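A static, default-deny classification like the one in the table might be sketched as a simple lookup. The names `ToolCategory`, `TOOL_CATEGORIES`, and `classifyTool` are hypothetical; only the tool names and categories come from the table.

```typescript
type ToolCategory = "parallelizable" | "file-serialized" | "sequential";

// Explicit allow-list; anything not listed here is treated as sequential.
const TOOL_CATEGORIES: ReadonlyMap<string, ToolCategory> = new Map<string, ToolCategory>([
  ["read_file", "parallelizable"],
  ["list_files", "parallelizable"],
  ["search_files", "parallelizable"],
  ["list_code_definition_names", "parallelizable"],
  ["write_to_file", "file-serialized"],
  ["apply_diff", "file-serialized"],
  ["insert_code_block", "file-serialized"],
  ["execute_command", "sequential"],
  ["new_task", "sequential"],
  ["attempt_completion", "sequential"],
  ["ask_followup_question", "sequential"],
  ["switch_mode", "sequential"],
]);

// MCP tools, browser_action, and unknown tools fall through to the safe default.
function classifyTool(name: string): ToolCategory {
  return TOOL_CATEGORIES.get(name) ?? "sequential";
}
```

Because the default is sequential, a newly added tool cannot run in parallel until someone classifies it deliberately.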
Key Design Decisions
| Decision | Choice | Prior Art |
| --- | --- | --- |
| Concurrency primitive | TaskSemaphore wrapping async-mutex (adds waiting count getter) | async-mutex ^0.5.0 already in deps |
| Parallel dispatch | Promise.allSettled + pLimit(8) | Missing tool_result → API 400; allSettled guarantees all results collected |
| File write serialization | Per-path promise chain (SequencerByKey pattern) | VS Code SequencerByKey |
| Re-entrancy control | DispatchState enum (IDLE/SERIAL/PARALLEL) replacing boolean lock | Microsoft: Handling Reentrancy in Async Apps |
| Rejection strategy | Two-tier: in-flight tools finish, queued tools cancelled with is_error result | Gemini CLI batch-level rejection; Codex terminal_outcome_reached |
| userMessageContentReady ownership | Single Writer Principle: only parent's dispatch loop sets this flag | Thompson (2011); Codex AtomicBool::swap |
| clineStack migration | Expand-Contract via TaskRegistry adapter (Phase A+B now, Phase C later) | Fowler: Parallel Change |
| Webview task scoping | Focus-gated posting + per-task clineMessagesSeq + taskId on messageUpdated | VS Code Webview API: single WebviewView per ID, extension must implement multiplexing |
| Race condition fixes | atomicReadAndUpdate() on TaskHistoryStore (keeps withLock private) | Existing withLock pattern in codebase |
| Crash recovery | Transition guard (enum + valid-transition map) + Kubernetes-style startup reconciliation | K8s controller reconciliation loop; intent states rejected as over-engineering (rationale) |
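The per-path promise chain used for file write serialization can be sketched as follows. This is an illustration of the general technique, not VS Code's SequencerByKey source; the class name `KeyedSequencer` and its `queue` method are hypothetical.

```typescript
// Hypothetical per-key sequencer: tasks with the same key run in order,
// tasks with different keys are free to overlap.
class KeyedSequencer<K> {
  private tails = new Map<K, Promise<unknown>>();

  queue<T>(key: K, task: () => Promise<T>): Promise<T> {
    // Stored tails never reject, so chaining with then() is enough.
    const previous = this.tails.get(key) ?? Promise.resolve();
    const run = previous.then(task);
    // Swallow failures on the stored tail so one failed write
    // does not block later writes to the same path.
    const tail = run.catch(() => undefined);
    this.tails.set(key, tail);
    // Drop the entry once no newer task has been queued behind this one.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return run; // the caller still sees the task's own result or error
  }
}
```

In this sketch two `apply_diff` calls against the same path would run back to back, while a write to a different path can proceed alongside them.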
Milestones
Milestone 1: Foundation + Race Fixes (Epics 1 & 2)
6 stories, all parallelizable across engineers.
Extract shared primitives (RateLimitClock, TaskSemaphore, experiment flag), fix the two delegation race conditions with proper lock boundaries, and add task status transition guards with startup reconciliation for crash recovery. No behavior change.
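A transition guard with startup reconciliation might look like the sketch below. Only the "delegated" status is named in this brief; the other status names, the `TaskRecord` shape, and the functions `assertTransition` and `reconcile` are assumptions made for illustration.

```typescript
// Hypothetical status set; only "delegated" appears in the brief.
type TaskStatus = "active" | "delegated" | "completed";

// Valid-transition map: any move not listed here is rejected.
const VALID_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  active: ["delegated", "completed"],
  delegated: ["active"],
  completed: [],
};

function assertTransition(from: TaskStatus, to: TaskStatus): void {
  if (!VALID_TRANSITIONS[from].includes(to)) {
    throw new Error(`Invalid task transition: ${from} -> ${to}`);
  }
}

interface TaskRecord {
  id: string;
  status: TaskStatus;
  childId: string | null;
}

// Startup reconciliation: compare recorded state with what actually exists
// and repair parents left "delegated" by a crash mid-delegation.
function reconcile(records: TaskRecord[]): TaskRecord[] {
  const byId = new Map(records.map((r) => [r.id, r]));
  return records.map((record): TaskRecord => {
    if (record.status !== "delegated") return record;
    const child = record.childId !== null ? byId.get(record.childId) : undefined;
    const childIsLive = child !== undefined && child.status !== "completed";
    if (childIsLive) return record; // delegation is still in progress
    assertTransition(record.status, "active");
    return { ...record, status: "active", childId: null };
  });
}
```

The reconciliation pass is idempotent in this sketch: running it twice yields the same records, which is what makes it safe to run on every startup.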
Milestone 2: TaskScheduler at maxConcurrency=1 (Epic 3)
6 stories; 3.2a must land first, then the rest proceed sequentially.
Introduce TaskRegistry (replaces clineStack), TaskScheduler (semaphore-based), fan-out code path (reachable only at maxConcurrency > 1), webview task-scoping guard split across two stories (3.2c extension-side focus gate; 3.2d webview-side rejection guard + per-task seq migration), and child completion callback with two injection paths (disk-based for suspended parent, in-memory for live parent). At maxConcurrency = 1, behavior is identical to today.
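The Map-plus-stack adapter behind TaskRegistry could be sketched like this. The class name `SketchTaskRegistry`, the `TaskLike` shape, and all method names are hypothetical; the point is that existing LIFO call sites keep working while lookups by ID become possible.

```typescript
// Hypothetical minimal task shape for the sketch.
interface TaskLike {
  taskId: string;
}

class SketchTaskRegistry<T extends TaskLike> {
  private byId = new Map<string, T>();
  private stack: T[] = []; // legacy LIFO view, the part a later Phase C would remove

  // Dual write: every mutation updates both the map and the stack.
  push(task: T): void {
    this.byId.set(task.taskId, task);
    this.stack.push(task);
  }

  pop(): T | undefined {
    const task = this.stack.pop();
    if (task) this.byId.delete(task.taskId);
    return task;
  }

  // Stack-style accessor for existing call sites.
  peek(): T | undefined {
    return this.stack[this.stack.length - 1];
  }

  // Map-style accessor for new, concurrency-aware call sites.
  get(taskId: string): T | undefined {
    return this.byId.get(taskId);
  }

  get size(): number {
    return this.stack.length;
  }

  // The dual-write invariant: both views always describe the same tasks.
  checkInvariant(): boolean {
    return (
      this.stack.length === this.byId.size &&
      this.stack.every((task) => this.byId.get(task.taskId) === task)
    );
  }
}
```

An invariant check like `checkInvariant()` is cheap enough to assert in tests after every mutation during the migration.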
Milestone 3: Parallel Tool Execution (Epic 4)
5 stories, strictly sequential.
Scope per-tool state into ToolExecutionContext, replace boolean re-entrancy guard with DispatchState enum, implement parallel dispatch with Promise.allSettled + pLimit, add per-file write serialization, and serialize approval dialogs. All gated behind PARALLEL_TOOL_EXECUTION flag.
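Replacing the boolean re-entrancy guard with a DispatchState enum might look like the following sketch. The three state names come from this brief; the `DispatchGuard` class, its methods, and the rule that a dispatch may only start from IDLE are assumptions for illustration.

```typescript
enum DispatchState {
  IDLE = "IDLE",
  SERIAL = "SERIAL",
  PARALLEL = "PARALLEL",
}

// Hypothetical guard: unlike a boolean lock, it records which kind of
// dispatch is running, not just that one is.
class DispatchGuard {
  private state: DispatchState = DispatchState.IDLE;

  get current(): DispatchState {
    return this.state;
  }

  // Returns false instead of re-entering when a dispatch is already running.
  tryEnter(mode: DispatchState.SERIAL | DispatchState.PARALLEL): boolean {
    if (this.state !== DispatchState.IDLE) return false;
    this.state = mode;
    return true;
  }

  exit(): void {
    this.state = DispatchState.IDLE;
  }
}
```

A caller would wrap its dispatch loop in `if (guard.tryEnter(...)) { try { ... } finally { guard.exit(); } }` so the state is reset even when a tool throws.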
Milestone 4: Integration Validation (Epic 5)
1 story. Six end-to-end scenarios exercising the full path from parallel dispatch through subtask fan-out.
Delivery Order and Critical Path
            ┌─ 1.1 ─┐
            ├─ 1.2 ─┤
Weeks 1-2   ├─ 1.3  │  (all parallel)
            ├─ 2.1  │
            └─ 2.2 ─┘
                │
Week 3      3.2a (TaskRegistry, must land first)
                │
Week 4      3.1 (TaskScheduler, rebases onto 3.2a)
                │
Weeks 4-5   3.2b → 3.2c → 3.2d (fan-out + webview guard)
                │
Week 5      3.3 (child completion)
                │
Weeks 6-8   4.1 → 4.2a → 4.2b → 4.2c → 4.3 (strictly sequential)
                │
Week 9      5.1 (integration tests)
Critical path: 3.2a → 3.1 → 3.2b → 3.2c → 3.2d → 3.3. Epic 4 can start at 4.1 once 1.2 and 1.3 land.
Risk Summary
| Risk | Mitigation |
| --- | --- |
| Story 3.2a is a blanket 18-site refactor of ClineProvider.ts | Pure indirection with dual-write invariant; all existing tests must pass unchanged |
| Per-provider rate limiting changes cross-provider behavior | Each VS Code window is a separate Node.js process; no regression for multi-window setups |
| pLimit has no native cancellation for queued items | Checked cancelled flag inside each wrapped function before tool body executes |
| Parallel dispatch could produce duplicate tool_result entries | Existing pushToolResultToUserContent duplicate guard (checks tool_use_id) prevents this |
| Global TaskHistoryStore lock becomes a bottleneck at high concurrency | Acceptable at maxConcurrency = 1; per-taskId locking deferred to Future Work |
| Webview is single-task: messageUpdated has no taskId, seq is global | Stories 3.2c + 3.2d add focus gating (ext-side) and cross-task rejection + per-task seq (webview-side) |
| Crash mid-delegation leaves parent stuck as "delegated" | Story 2.3 adds startup reconciliation (Kubernetes controller pattern) + transition guards |
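The mitigation for queued-item cancellation can be sketched as a wrapper that checks a shared flag at the moment the limiter starts the function. `BatchState`, `wrapTool`, and the result shapes are hypothetical names; the sketch also shows how this yields the two-tier rejection behavior, where an in-flight tool finishes and a queued one returns an `is_error` result.

```typescript
// Hypothetical shapes for illustration.
interface ToolCall { id: string; name: string; }
interface ToolResult { tool_use_id: string; content: string; is_error: boolean; }

// Shared by every wrapped function in one batch.
interface BatchState { cancelled: boolean; }

function wrapTool(
  call: ToolCall,
  batch: BatchState,
  body: (call: ToolCall) => Promise<string>,
): () => Promise<ToolResult> {
  return async () => {
    // Evaluated when the limiter starts this function, not when it was queued,
    // so a rejection that happened in the meantime is observed here.
    if (batch.cancelled) {
      return {
        tool_use_id: call.id,
        content: "Cancelled: an earlier tool in this batch was rejected.",
        is_error: true,
      };
    }
    try {
      return { tool_use_id: call.id, content: await body(call), is_error: false };
    } catch (err) {
      return { tool_use_id: call.id, content: String(err), is_error: true };
    }
  };
}
```

Every wrapped function still resolves to a result, so a cancelled batch continues to produce one tool_result per tool_use.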
Scope Boundaries
In scope: VS Code extension only (18 stories across 5 epics).
Out of scope:
- Standalone CLI delegation experience (deferred)
- Rich webview multi-task display (split-pane, tabbed task views — Stories 3.2c + 3.2d cover the safety layer; richer UI deferred)
- Per-taskId lock granularity in TaskHistoryStore
- End-to-end flag-off regression test suite
- TaskRegistry Phase C (remove internal stack)
Story Count
| Epic | Stories | New Files | Modified Files |
| --- | --- | --- | --- |
| 1 — Foundation | 3 | 2 | 3 |
| 2 — Lifecycle Fixes | 3 | 0 | 2 |
| 3 — Scheduler | 6 | 2 | 6 |
| 4 — Parallel Tools | 5 | 0 | 4 |
| 5 — Integration | 1 | 1 | 0 |
| Total | 18 | 5 | ~9 unique |