refactor(dream): remove learning pipeline + per-task skills with dynamic model loading #238

Open
dean0x wants to merge 5 commits into main from refactor/dream-skills-remove-learning
@dean0x dean0x commented Jun 5, 2026

Summary

This PR removes the Devflow self-learning pipeline end-to-end and restructures the Dream subsystem from a single sequential agent into N per-task agents with clean context and a hardcoded task→model map.

Delivered in two commits to allow atomic review of each concern:

  • Phase A (cf853e5) — Learning pipeline removal end-to-end
  • Phase B (98ffa97) — Per-task skills + dynamic model loading

Changes

Phase A: Learning Pipeline Removal

What was removed:

  • scripts/hooks/eval-learning and eval-reinforce — SessionEnd learning batch accumulator and artifact reinforcement
  • src/cli/commands/learn.ts — full devflow learn CLI command (5 subcommands)
  • src/cli/hud/learning-counts.ts + components/learning-counts.ts — HUD learning counts display
  • tests/learn.test.ts, tests/learning/hud-counts.test.ts, tests/learning/capacity-thresholds.test.ts, tests/learning/migration.test.ts — all learning-specific test files

What was updated:

  • DreamConfig interface — now {memory, decisions, knowledge} (no learning field); coerceConfig silently drops legacy learning key
  • scripts/hooks/dream-collect-tasks — learning) case is unconditional rm (delete orphaned markers on sight, not just when disabled)
  • scripts/hooks/session-start-context — removed Section 1.75 (LEARNED BEHAVIORS) and all _SC2_LEARN_EN / _LEARNING_DREAM references
  • src/cli/utils/migrations.ts — added purge-learning-pipeline-v1 (per-project) and purge-learning-global-v1 (global) migrations: clean up eval-learning, eval-reinforce installed hooks, orphaned learning log + config + state files, and the learning key from dream configs
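
To illustrate the DreamConfig change, here is a minimal sketch of how a coercion step might drop a legacy learning key by copying only known fields. The boolean value types, the DEFAULTS object, and the function body are assumptions for illustration; the PR only states the three remaining keys and that coerceConfig drops the legacy key silently.

```typescript
// Hypothetical shape: the PR names the three keys but not their value types.
interface DreamConfig {
  memory: boolean;
  decisions: boolean;
  knowledge: boolean;
}

// Assumed defaults, used when a key is missing or has the wrong type.
const DEFAULTS: DreamConfig = { memory: true, decisions: true, knowledge: true };

// Build a DreamConfig from untrusted JSON by copying only the known keys.
// Anything else (including a legacy `learning` key) is dropped without error.
function coerceConfig(raw: unknown): DreamConfig {
  const input: Record<string, unknown> =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const pick = (key: keyof DreamConfig): boolean => {
    const value = input[key];
    return typeof value === "boolean" ? value : DEFAULTS[key];
  };
  return {
    memory: pick("memory"),
    decisions: pick("decisions"),
    knowledge: pick("knowledge"),
  };
}

// A config written by an older install that still carries `learning`.
const legacy: unknown = JSON.parse('{"memory": false, "learning": true}');
const coerced = coerceConfig(legacy);
console.log(coerced);
```

Copying an allow-list of keys, rather than deleting known-bad ones, means any other stale key is dropped the same way without further code changes.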

Kept intentionally:

  • src/cli/utils/learning-cleanup.ts — cleanSelfLearningArtifacts is called by the per-project migration
  • src/cli/utils/observations.ts + observation-io.ts — decisions pipeline uses shared LearningObservation type and readObservations

Phase B: Per-Task Skills + Dynamic Model Loading

New skills (shared/skills/dream-{memory,decisions,knowledge,curation}/SKILL.md):

  • Each carries the verbatim task procedure lifted from the former monolithic dream.md body — logic is byte-for-byte identical, only hosting changed (applies ADR-008: LLM judgment in skills, plumbing in scripts)
  • allowed-tools: Read, Bash, Write, Edit, Glob, Grep — deliberate exception to the read-only skill default; these skills materialize artifacts (same posture as quality-gates and git)
  • Lock hardening (reliability rule, AC-C6/AC-P4): replaced the give-up-fast mkdir || sleep 1 || exit 1 pattern in both the decisions skill (.reinforce.lock) and curation skill (.decisions.lock) with a bounded retry+backoff loop: 9 attempts, exponential backoff doubling from 1s capped at 8s, ~30s total. On exhaustion the task fails cleanly (leaves .processing for dream-recover) — never silently drops a write. No unbounded loops.
  • Curation skill: acquires .decisions.lock EXACTLY ONCE across its read-modify-write; never calls decisions-append while holding it (avoids the deadlock documented in KNOWLEDGE.md)
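
The bounded retry+backoff described above can be sketched roughly as follows. This is an illustration in TypeScript rather than the shell used by the skills: tryAcquire stands in for the atomic mkdir on the lock directory, and both function names are hypothetical. The sketch assumes one sleep after each failed attempt except the last; the skills' actual schedule is not shown in this description, so the total wait it produces may differ from the ~30s quoted above.

```typescript
// Delay in seconds after the given failed attempt (1-based):
// doubles from `baseSec` and is capped at `capSec`.
function delayAfterAttempt(attempt: number, baseSec = 1, capSec = 8): number {
  return Math.min(baseSec * 2 ** (attempt - 1), capSec);
}

interface AcquireResult {
  acquired: boolean;
  attempts: number;
  waitedSec: number;
}

// Try to take a lock at most `maxAttempts` times, then give up.
// `tryAcquire` is a stand-in for the atomic mkdir; `sleep` is injected so the
// loop can be exercised without real waiting. There is no unbounded path:
// the loop always ends after `maxAttempts` iterations.
function acquireWithRetry(
  tryAcquire: () => boolean,
  sleep: (sec: number) => void,
  maxAttempts = 9,
): AcquireResult {
  let waitedSec = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (tryAcquire()) {
      return { acquired: true, attempts: attempt, waitedSec };
    }
    if (attempt < maxAttempts) {
      const delay = delayAfterAttempt(attempt);
      sleep(delay);
      waitedSec += delay;
    }
  }
  // Exhausted: the caller should fail the task rather than skip the write.
  return { acquired: false, attempts: maxAttempts, waitedSec };
}

// Lock is busy for the first three tries, then becomes free.
let calls = 0;
const result = acquireWithRetry(() => ++calls > 3, () => {});
console.log(result);

// Lock never becomes free: the loop stops after 9 attempts.
const exhausted = acquireWithRetry(() => false, () => {});
console.log(exhausted);
```

In the first case the lock is taken on attempt 4 after sleeping 1 + 2 + 4 seconds. In the second, the caller gets acquired: false and can fail the task cleanly, leaving .processing in place for dream-recover.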

Rewritten Dream agent (shared/agents/dream.md):

  • Body is now plumbing-only: claim markers, heartbeat, multi-marker merge, then load the matching per-task skill via the Skill tool and follow it
  • Supports the combined "decisions then curation" sequential spawn (Opus handles both, run decisions fully first, then curation)
  • Adds four new dream skills to skills: frontmatter alongside existing devflow:apply-decisions and devflow:apply-feature-knowledge

Rewritten SessionStart spawn directive (scripts/hooks/session-start-context Section 2):

  • Replaces single Agent(subagent_type="Dream") with per-task Agent() calls using a hardcoded task→model map: memory=haiku, knowledge=sonnet, decisions=opus, curation=opus
  • decisions + curation co-pending → exactly ONE opus spawn whose prompt instructs running decisions skill then curation skill sequentially (prevents concurrent lock contention on .decisions.lock)
  • Unknown task types are silently skipped (belt-and-suspenders — dream-collect-tasks should never emit them)
  • bash -n clean; set -e intentionally absent (existing no-abort discipline, applies ADR-009/PF-008)
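
As a rough illustration of the spawn rules above, the sketch below plans spawns from a list of pending task types. The real directive lives in a shell hook; planSpawns and the Spawn type are hypothetical names, and only the task→model pairs and the merge rule come from this description.

```typescript
type Model = "haiku" | "sonnet" | "opus";

interface Spawn {
  model: Model;
  tasks: string[];
}

// Hardcoded task→model map. Order matters: decisions is listed before
// curation so that a combined spawn runs decisions first.
const TASK_MODEL: ReadonlyArray<[string, Model]> = [
  ["memory", "haiku"],
  ["knowledge", "sonnet"],
  ["decisions", "opus"],
  ["curation", "opus"],
];

// Turn pending task types into a spawn plan:
// one spawn per known task, except that decisions and curation share a
// single opus spawn. Task types not in the map are skipped silently.
function planSpawns(pending: string[]): Spawn[] {
  const spawns: Spawn[] = [];
  let opus: Spawn | undefined;
  for (const [task, model] of TASK_MODEL) {
    if (pending.indexOf(task) === -1) continue;
    if (model === "opus") {
      if (!opus) {
        opus = { model, tasks: [] };
        spawns.push(opus);
      }
      opus.tasks.push(task);
    } else {
      spawns.push({ model, tasks: [task] });
    }
  }
  return spawns;
}

// "learning" is no longer a known task type and is ignored.
const plan = planSpawns(["curation", "memory", "decisions", "learning"]);
console.log(JSON.stringify(plan));
```

Because the plan is driven by the map's order rather than the order of the pending list, the combined opus spawn always lists decisions before curation, so the two never contend for .decisions.lock.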

Registration:

  • All four skills added to plugins/devflow-core-skills/.claude-plugin/plugin.json and src/cli/plugins.ts (core-skills is the correct home — skills install universally regardless of plugin selection)
  • Bare names added to LEGACY_SKILL_NAMES for future cleanup migrations

Breaking Changes

None for end users beyond the removal of the devflow learn command; users who still call it will get a "command not found" error. The two new migrations clean up all installed artifacts automatically on the next devflow init.


Token Cost Characterization (AC-P1)

The new design spawns N per-task agents (up to 3: haiku for memory, sonnet for knowledge, opus for decisions+curation) instead of one sequential agent handling all tasks in a single context. Per-cycle token cost is expected to be higher, which is justified by:

  • Clean context per task: each agent starts fresh, reducing irrelevant context overhead and improving LLM focus
  • Per-task model fit: memory (cheap haiku), knowledge refresh (capable sonnet), decisions/curation detection (high-quality opus) — the previous design ran everything at sonnet regardless
  • Decisions quality: running decisions on Opus every deep session is expected to improve ADR/PF detection fidelity; user-accepted trade-off
  • Wall-clock time: because the per-task agents run concurrently, a cycle should finish sooner even if the raw token count increases

Reviewer Focus Areas

  • shared/skills/dream-decisions/SKILL.md — bounded retry loop for .reinforce.lock; verify cap is 9 attempts, backoff doubles 1→2→4→8 (capped), no unbounded paths
  • shared/skills/dream-curation/SKILL.md — verify lock acquired once, Edit calls happen between acquire/release, decisions-append not called while holding the lock
  • scripts/hooks/session-start-context lines 183–245 — verify decisions+curation branch emits exactly ONE opus spawn; verify memory/knowledge branches emit their correct models; verify unknown types fall through to the "no known types" log path
  • shared/agents/dream.md — confirm no learning remnants; confirm the "decisions then curation" combined prompt is unambiguous

Dean Sharon and others added 5 commits June 5, 2026 10:50
Removes the Devflow self-learning pipeline (workflow/procedural detection,
learning markers, eval-learning/eval-reinforce hooks, devflow learn CLI,
learningCounts HUD component) while preserving decisions, memory, knowledge,
and curation pipelines intact.

Key changes:
- Migrations: purge-learning-pipeline-v1 (per-project) + purge-learning-global-v1
  (global) clean up existing installs on next devflow init
- Hooks: delete eval-learning + eval-reinforce; remove Section 1.75 (LEARNED BEHAVIORS)
  from session-start-context; convert learning) case in dream-collect-tasks to
  unconditional delete-on-sight (R1: orphaned markers never reach spawn emitter)
- TypeScript: remove DreamConfig.learning, devflow learn command, learningCounts HUD
  component and supporting types; update manifest schema; remove getLearning* path helpers
- Tests: delete learning-specific test files; update all shared tests to remove
  learning field references; add migration tests (TDD)
- Docs: update CLAUDE.md, README.md to reflect removal

Existing installs self-heal: migrations remove .devflow/learning/, dream markers,
config keys, and auto-generated self-learning artifacts. Stale learningCounts in
user HUD configs degrades gracefully (unknown component IDs are silently skipped).

Applies ADR-001 (clean-break philosophy with explicit migrations), ADR-002 (clean house),
PF-004 (migration idempotency), PF-007 (source-first hooks).

Phase B: lift the four Dream task procedures into dedicated per-task skills
(dream-memory/decisions/knowledge/curation) and rewrite both the Dream agent
body and the SessionStart spawn directive to load skills dynamically with a
per-task model map (memory=haiku, knowledge=sonnet, decisions/curation=opus).

Key changes:
- Add shared/skills/dream-{memory,decisions,knowledge,curation}/SKILL.md: each
  carries the verbatim task procedure lifted from the former dream.md body.
  allowed-tools includes Bash/Write/Edit (deliberate exception — these skills
  materialize artifacts). Includes bounded retry+backoff for .reinforce.lock
  (decisions) and .decisions.lock (curation): 9 attempts, ~30s total, explicit
  cap — no unbounded waits, no silent write loss (AC-C6/AC-P4).
- Rewrite shared/agents/dream.md: body is now plumbing-only (claim, heartbeat,
  multi-marker merge, error discipline). Per-task procedure loaded via Skill
  tool: "load devflow:dream-<task> and follow it". Supports "decisions then
  curation" sequential combined spawn.
- Rewrite session-start-context Section 2 directive: per-task Agent() calls with
  hardcoded model map; decisions+curation co-pending → exactly ONE opus spawn;
  unknown task types silently skipped (AC-F4/AC-C4).
- Register all four new skills in plugin.json and plugins.ts (core-skills).
  Add bare names to LEGACY_SKILL_NAMES for future cleanup migrations.

No learning remnants in any file. All 1464 tests green, zero warnings.
CO-AUTHORED-BY: Claude <noreply@anthropic.com>

…ts.toThrow() assertion (was auto-awaited at teardown, masking the check — fails under Vitest 3)
- init.ts: add eval-learning/eval-reinforce to LEGACY_HOOK_FILES so upgrading users get the now-dead learning eval modules swept (copyDirectory is additive merge and never removes orphaned installed hooks; applies ADR-002 clean house)
- includes Simplifier polish: drop redundant heartbeat note from skill preambles (explicit touch step retained), comment fixes

…kill restructure

- hooks KB: per-task spawn model, agent=plumbing+skill-load, 4 dream skills,
  lock hardening, learning pipeline removed
- cli-rules KB: devflow learn removed, dream skills registered in core-skills
- index.json: bump lastUpdated + referencedFiles (clears staleness)