feat(cmind): add consolidated /cmind.plan workflow#57

Open
QingtaoLi1 wants to merge 70 commits into main from dev/rpgkit-consolidate-decoder
@QingtaoLi1 QingtaoLi1 commented May 25, 2026

  • Add /cmind.plan as the recommended Phase 2 entry point, consolidating build_skeleton, build_data_flow, design_base_classes, design_interfaces, and plan_tasks into one resumable workflow.
  • Add scripts/plan.py orchestration with progress probing, automatic resume/restart behavior, downstream rebuild cascading, per-stage verification, and forwarded tuning flags.
  • Update command docs and multilingual README guidance to replace the multi-command Phase 2 sequence with /cmind.plan.
  • Add unit coverage for the planning orchestrator's deterministic logic, including resume decisions, JSON probe parsing, stage registry checks, and forwarded CLI arguments.
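
The resume and cascade behaviour described in the bullets above could look roughly like the following. This is a minimal sketch, assuming the five stages run strictly in the order listed in the first bullet; `STAGES`, `stages_to_run`, and the `completed` set are hypothetical names, not the actual `scripts/plan.py` API.

```python
# Hypothetical sketch of resume + downstream cascade; not the real plan.py.
STAGES = [
    "build_skeleton",
    "build_data_flow",
    "design_base_classes",
    "design_interfaces",
    "plan_tasks",
]


def stages_to_run(completed: set[str]) -> list[str]:
    """Return the stages to (re)run, in pipeline order.

    The first stage missing from ``completed`` is the resume point. Every
    stage after it is rebuilt too, even if it finished earlier, because its
    inputs may have changed.
    """
    for index, stage in enumerate(STAGES):
        if stage not in completed:
            return STAGES[index:]
    return []  # every stage is done, nothing to resume
```

With this shape, a workspace where only `build_skeleton` and `design_interfaces` are marked complete would resume at `build_data_flow` and rebuild everything downstream of it.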

HuYaSen added 30 commits May 18, 2026 21:46
Ship scripts/ and templates/commands/ as packaged assets inside the wheel
(rpgkit_cli/core_pack/) via hatch force-include, so 'rpgkit init' works
offline (air-gapped / corporate-proxy / enterprise environments).

Replace the build-time <AI_CLI_CMD> placeholder substitution with a
runtime resolver in scripts/common/llm_client.py that reads .rpgkit/config.toml
per workspace. This decouples scripts from the chosen AI agent and is
forward-compatible with running scripts from a shared installation in
the future.

Key changes:
* pyproject.toml: bump to 0.1.3 + force-include scripts/ and templates/commands/
* src/rpgkit_cli/_assets.py (new): importlib.resources access to core_pack
* src/rpgkit_cli/__init__.py:
  - _AI_TO_CLI_CMD authoritative map (mirrors release-zip CI)
  - _install_from_bundle() + _materialise_commands_for_agent()
  - _write_workspace_config() materialises .rpgkit/config.toml
  - .rpgkit/.source marker so 'rpgkit update' honours the user's channel
  - download_and_extract_template() dispatches: bundle by default, falls
    back to legacy release zip when --legacy-download is passed (or --pre,
    or when the bundle is unavailable / --script ps is requested)
  - 'rpgkit init': new --legacy-download flag
  - 'rpgkit update': new --legacy-download and --pull flags
  - _detect_install_method() + _upgrade_command() power --pull
* scripts/common/llm_client.py:
  - _load_ai_cli_cmd() with P1-P4 priority chain (constructor arg → env var
    → workspace config.toml → release-zip-baked-in fallback)
  - detect_agent_type() now resolves at call time, not module import
  - LLMClient.__init__ tolerates empty tool; generate() raises lazily with
    an actionable error message when the workspace is unconfigured
* scripts/__init__.py (new, empty): marker for hatch force-include
* .gitignore: un-ignore .rpgkit/config.toml so teams can commit the
  workspace default

Behavioural guarantees (7a route):
* Scripts still land under <workspace>/.rpgkit/scripts/
* No slash-command template was modified
* MCP config generation unchanged
* Hook installation logic unchanged
* All 27 pipeline scripts unchanged
* No new tests; the 11 pre-existing test failures on main remain unchanged

Design notes live in plans/01-package-bundle-and-ai-config.md (gitignored;
local reference only).
…/config.toml

* README: add Updating section explaining the three update flavours
  (default offline, --pull, --legacy-download) + note that bundle mode
  is the new default since 0.1.3.
* docs/cli-reference: list --legacy-download (init + update) and --pull
  (update), plus a Provisioning sources table summarising the two
  channels and the .rpgkit/.source marker.
* docs/configuration: new Workspace Configuration section covering
  .rpgkit/config.toml, the P1-P4 resolution priority chain, and the
  authoritative --ai → ai_cli_cmd mapping.
* docs/project-structure: surface config.toml and .source in the
  workspace tree.
Eight issues surfaced during a second-pass review of the bundle / config
work landed in commit 1c47c1d.  All fixes are local to
src/rpgkit_cli/__init__.py; no scripts, templates, hooks, or MCP logic
were touched.

1. _detect_install_method() reordered editable detection before uv.
   Previously an editable install placed in a uv-managed venv would be
   reported as 'uv' and 'rpgkit update --pull' would run
   'uv tool upgrade' instead of telling the user to git pull.

2. --script ps fallback notice was guarded by
   'if tracker is None and verbose:' which is never True in the actual
   init()/update() call path, so users never saw the message.  Emit it
   through the tracker as a skipped 'ps-fallback' step so it shows up
   in the live status report.

3. --pull self-upgrade is now executed BEFORE constructing rich.Live
   instead of from inside the Live context.  Mixing subprocess output
   (and the subsequent os.execvp) with Live's terminal control left a
   corrupted screen state.

4. --pull no longer continues the update flow with stale in-memory
   code after upgrading the wheel.  On successful upgrade we
   os.execvp() the rpgkit binary (with --pull stripped) so the new
   logic runs against the freshly-installed core_pack.

5. .rpgkit/.source provisioning marker is now written by whichever
   path actually ran (_install_from_bundle for bundle,
   _download_and_extract_release_zip for legacy zip).  The previous
   init/update logic decided based on flag state, which gave wrong
   results in the --script ps fallback case (bundle attempt → ps
   fallback to legacy zip → marker incorrectly left absent).

6. Removed redundant _install_command_templates_from_bundle wrapper.
   _materialise_commands_for_agent now also produces the rpgkit.<name>.md
   filename for unknown agents, matching the supported-agent paths.

7. _install_from_bundle no longer re-adds tracker step keys
   ('download', 'cleanup') that init()/update() had already registered;
   it just transitions them with .skip() to keep the live status report
   coherent.

8. update()'s 'effective_legacy' boolean was the OR of three terms, one
   of which redundantly re-checked another flag, followed by a no-op
   reassignment.  Simplified to a single OR.

No behaviour change for the offline-bundle happy path; smoke tests
continue to show 883 passed / 11 pre-existing failures (no regression).
Three more bugs surfaced in a deeper review of the 0.1.3 plumbing.
All fixes are local; existing test suite still shows the same 11
pre-existing failures from main (no regression).

1. .rpgkit/config.toml was still gitignored in user workspaces.
   The repo's own .gitignore un-ignored it in 1c47c1d, but the gitignore
   content that rpgkit init/update WRITES into user workspaces
   (_GITIGNORE_RPGKIT_COMMON) did not.  As a result, the config a team
   would want to commit was still hidden from git.  Fixed by adding the
   un-ignore line to the embedded block.

2. LLMClient.generate_with_memory() silently swallowed the
   'AI CLI not configured' RuntimeError and returned None.  The caller
   then sees a generic LLM failure with no hint to run rpgkit init.
   Now pre-validates self.tool and re-raises the configuration error;
   genuine LLM-call failures continue to surface as None as before.

3. P4 (release-zip baked-in fallback) was broken by sed.  The new
   resolver introduced two module-level constants:
     _PLACEHOLDER_LITERAL = '<AI_CLI_CMD>'
     _BAKED_IN_VALUE     = _PLACEHOLDER_LITERAL
   The release-zip CI runs 'sed s|<AI_CLI_CMD>|<cmd>|g' on every script
   file, which would have rewritten BOTH constants to the same
   substituted value, making the
       _BAKED_IN_VALUE != _PLACEHOLDER_LITERAL
   guard tautologically false and the P4 fallback effectively dead.
   Now _PLACEHOLDER_LITERAL is built with string concatenation
   ("<" + "AI_CLI_CMD" + ">") so sed leaves it alone and the
   guard does the right thing on legacy-download workspaces.
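
The sed interaction in item 3 can be reproduced in a few lines. This is a simplified sketch, assuming both constants were plain string literals before the fix; `sed_substitute` and `baked_in_after_sed` are hypothetical helpers that only mimic the CI step.

```python
def sed_substitute(source: str, cmd: str) -> str:
    """Mimic `sed 's|<AI_CLI_CMD>|<cmd>|g'` applied to a script's source."""
    return source.replace("<AI_CLI_CMD>", cmd)


# Before the fix: both literals contain the placeholder, so sed rewrites both.
BROKEN_SOURCE = '''
_PLACEHOLDER_LITERAL = "<AI_CLI_CMD>"
_BAKED_IN_VALUE = "<AI_CLI_CMD>"
'''

# After the fix: the placeholder is assembled at runtime, so sed skips it.
FIXED_SOURCE = '''
_PLACEHOLDER_LITERAL = "<" + "AI_CLI_CMD" + ">"
_BAKED_IN_VALUE = "<AI_CLI_CMD>"
'''


def baked_in_after_sed(source: str, cmd: str):
    """Return the P4 fallback value, or None when the guard rejects it."""
    namespace = {}
    exec(sed_substitute(source, cmd), namespace)
    if namespace["_BAKED_IN_VALUE"] != namespace["_PLACEHOLDER_LITERAL"]:
        return namespace["_BAKED_IN_VALUE"]
    return None
```

With the broken source the guard compares the substituted command against itself and never fires; with the concatenated literal the baked-in command is returned.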

Plus:
* Tightened --pre help text in both init() and update() to say
  'Implies --legacy-download', matching the new dispatcher behaviour.
* Updated the two gitignore-related tests in tests/test_hooks_install.py
  to count exact lines rather than substrings — the embedded block now
  contains both '.rpgkit/' and '!.rpgkit/config.toml', so the previous
  substring count would always be 2.  This is the only test edit
  introduced by the 0.1.3 work.
… hints

Previously `rpgkit version` listed both the local CLI version and the
latest GitHub release tag but did NOT tell the user which was which or
whether action was needed.  Combined with `uv tool upgrade rpgkit-cli`
silently printing 'Nothing to upgrade' (with no version context), users
had to mentally diff two version strings to figure out their status.

Now the output adds:

* 'Latest Release' (was 'Template Version' — clearer naming).
* 'Status' row with one of:
    - [green]up to date[/]                    local == remote
    - [yellow]outdated → X.Y.Z[/]             local < remote
    - [cyan]ahead of release (X.Y.Z)[/]       local > remote (dev build)
    - [yellow]offline[/]                      GitHub query failed
* When relevant, a second panel ('Upgrade tip') with actionable advice:
    - outdated → list of upgrade commands (uv / pipx / pip) + reminder
      to follow up with `rpgkit update` in each existing workspace.
    - ahead   → reassure the user nothing is broken (dev build).
    - offline → show the underlying error and suggest retrying online.

Version comparison goes through packaging.version.Version so PEP 440
pre-release / dev / post suffixes (e.g. 0.1.4.dev0, 0.1.3rc1, 0.1.3.post1)
sort correctly.  Falls back to plain comparison when packaging is
unavailable.
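
A minimal sketch of how the 'Status' row could be derived, assuming the four states listed above. `release_status` is a hypothetical helper, and the numeric-tuple fallback is only one possible reading of "plain comparison"; the real fallback may differ.

```python
try:
    # Preferred path: full PEP 440 ordering.
    from packaging.version import Version as _parse
except ImportError:
    def _parse(version):
        # Assumed fallback: compare numeric components only.
        return tuple(int(p) for p in version.split(".") if p.isdigit())


def release_status(local, remote):
    """Classify the local CLI version against the latest release tag."""
    if remote is None:  # the GitHub query failed
        return "offline"
    local_v, remote_v = _parse(local), _parse(remote)
    if local_v == remote_v:
        return "up to date"
    if local_v < remote_v:
        return "outdated"
    return "ahead"
```

When packaging is importable, a dev build such as 0.1.4.dev0 sorts before 0.1.4, which a plain string comparison would get wrong.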

No new dependencies (packaging ships with setuptools / pip).
Test baseline unchanged: 883 passed, 11 pre-existing failures.
Plan 02 Batch A: build dispatcher infrastructure and convert MCP + hooks
to use the globally-installed rpgkit CLI rather than workspace-local
script copies.

- Add 'rpgkit script <relpath> [args...]' dispatcher with --list/--where.
- Add 'rpgkit-mcp' console script (rpgkit_cli.entries:mcp_main).
- _assets: add list_scripts() + dev-mode fallback to repo scripts/.
- mcp_server.py: extract main() function for console-script reuse.
- Rewrite MCP config writer: command='rpgkit-mcp', no absolute paths.
- Rewrite pre-commit / post-merge / post-commit / Claude SessionStart
  hooks to invoke 'rpgkit script update_graphs.py ...', with a PATH
  fallback line for GUI-launched commits.
- PATH self-check at end of 'rpgkit init' (warns when rpgkit-mcp
  missing from PATH).
- Update test_hooks_install.py assertions for the new contract.
- Batch B (templates + drop workspace scripts copy) to follow.

Refs: plans/02-route-scripts-via-cli.md (local)
…-1/2)

Plan 02 Batch B step 1+2: introduce cmd_for() helper and rewrite all
hardcoded 'python3 .rpgkit/scripts/X.py' references inside the
pipeline scripts.

- common/paths.py: re-anchor SCRIPTS_DIR to Path(__file__).parent.parent
  so it points at the actual filesystem location regardless of whether
  scripts are run from a workspace copy (pre-0.1.3) or the packaged
  rpgkit_cli/core_pack/scripts/ dir (post-0.1.3).
- common/paths.py: add cmd_for(relpath) helper returning the canonical
  'rpgkit script <relpath>' invocation.
- Sweep 12 script files: replace 21 hardcoded invocation strings in
  next_action messages, inter-script spawn commands, and error hints
  with cmd_for() calls. Files: init_codebase.py, smoke_test.py,
  check_skeleton.py, check_code_gen.py, run_batch.py, update_graphs.py,
  code_gen/result_builders.py, rpg_encoder/run_update_rpg.py,
  rpg_encoder/run_encode.py, rpg_edit/{validate,review,code}.py.

Workspace .rpgkit/scripts/ copy still happens (templates not yet
converted); next commit removes both.
…/4/5)

Plan 02 Batch B steps 3-5: complete the migration to globally-installed
scripts.

- Rewrite 13 slash-command templates: 'python3 .rpgkit/scripts/X.py'
  → 'rpgkit script X.py' (64 substitutions via sed).
- _install_from_bundle: stop copying scripts to .rpgkit/scripts/; only
  materialise slash-command templates and the .source marker.
- _download_and_extract_release_zip (legacy --legacy-download): strip
  the extracted .rpgkit/scripts/ after extraction; legacy channel now
  delivers commands only (D7).
- ensure_executable_scripts: collapse to a deprecated no-op stub.
- .gitignore: drop the obsolete .rpgkit/scripts/**/__pycache__/ rule
  (covered by the blanket .rpgkit/ ignore anyway).

After init the workspace contains only data/, logs/, config.toml,
.source — no scripts/ dir.  All pipeline scripts run from the wheel
via 'rpgkit script <name>'.  Test baseline unchanged (11 failed /
883 passed, all pre-existing failures).
Plan 02 Batch B step 8: rewrite documentation to match the new
contract.

- docs/cli-reference.md: add full 'rpgkit script' section (synopsis,
  options --list/--where, examples, rpgkit-mcp companion mention).
- docs/commands.md: rewrite 13 inline references from
  '.rpgkit/scripts/X.py' → 'rpgkit script X.py'.
- docs/project-structure.md: drop the obsolete .rpgkit/scripts/ subtree
  from the workspace layout diagram; add a callout explaining the
  packaged-scripts model.

Plan 02 self-repo agent prompt regeneration (Batch B-6) is gitignored
in this repo so changes there are not part of this commit; the next
'rpgkit init/update' on the RPG-Kit dev workspace picks up the new
template content automatically.
Audit pass on Plan 02 surface; address loose ends.

- templates/commands: sed prose backtick refs (5 files, 5 sites) —
  'Run the script `.rpgkit/scripts/X.py`' was missed by the previous
  python3-prefix sed pass and would mislead AI agents into trying a
  filesystem path that no longer exists.
- scripts/**.py: change all 'cmd_for("X.py")' inside f-strings to
  'cmd_for(\'X.py\')'.  PEP 701 nested same-kind quotes work on
  Python 3.12+ but trip many syntax highlighters / linters and are
  fragile.  Single quotes inside the f-string is the safer style.
- scripts/mcp_server.py: refresh module docstring to mention the
  rpgkit-mcp console-script entry instead of the legacy workspace
  path.
- scripts/feature_spec_to_json.py: usage docstring uses 'rpgkit
  script' form.
- src/rpgkit_cli/__init__.py: drop the deprecated
  'ensure_executable_scripts' no-op (no remaining callers); update
  one stale prose reference in _has_python_files docstring.

E2E + full test suite pass at baseline (11 pre-existing failures,
883 passing).
Two follow-ups from running 'rpgkit init .' on a real workspace:

1. Optional initial-encode kickoff (the 'Run the encoder now?' prompt
   at end of 'rpgkit init') still resolved the encoder via
   '$workspace/.rpgkit/scripts/rpg_encoder/run_encode.py'.  After
   plan 02 the workspace no longer has that subtree, so the kickoff
   always printed 'Encoder script not found' and aborted.  Fall back
   to '_assets.scripts_dir()' (the packaged location) when the
   workspace copy is absent.

2. The VS Code 'folderOpen' task in .vscode/tasks.json was still
   invoking sys.executable against the workspace 'update_graphs.py'
   path.  Rewrite to 'command: rpgkit, args: [script, update_graphs.py,
   status]' so it works against the globally-installed CLI.

Updated test_install_copilot_hooks_writes_folder_open_task assertions
for the new task shape.
…strings)

Full audit pass turned up a few more cases that needed updating after
'rpgkit init .' on a real workspace exposed gaps.

Functional bugs:
- scripts/rpg_edit/review.py: the review-stage prompt embedded
  '$WORKSPACE_ROOT/.rpgkit/scripts/tools/{browser,gui}.py' paths
  into the LLM instructions.  Since the workspace no longer hosts
  the scripts dir, the AI would invoke a non-existent file.  Switch
  to 'cmd_for("tools/browser.py")' / 'cmd_for("tools/gui.py")'
  and drop the leading 'python' from each invocation in the prompt
  template (it's already rooted by 'rpgkit script').
- scripts/code_gen/batch_prompts.py: the sub-agent guard-rail rule
  'You MUST NOT run any .rpgkit/scripts/*.py commands' now reads
  'run any rpgkit script ... or rpgkit-mcp commands' so the rule
  still covers what it intended to prohibit.

Docstring / comment cleanup:
- scripts/rpg_edit/__init__.py: module docstring example.
- scripts/update_graphs.py: post-commit lock comment example.
- Removed unused 'WORKSPACE_ROOT' import from rpg_edit/review.py.

Tests baseline preserved (11 pre-existing failures, 883 passing).
Every successful (or failed) 'rpgkit script <X>' invocation now
commits the current state of .rpgkit/ to a dedicated repo at
.rpgkit/.git/, giving users (and the e2e test runner) a free
'git log' / 'git diff' between any two pipeline stages.

New module rpgkit_cli._inner_git holds all the logic:
  - find_workspace_root()    walk up from cwd for .rpgkit/
  - ensure_inner_git()       create .rpgkit/.git + initial commit
  - auto_commit_after_script() snapshot after a 'rpgkit script' call
  - categorise_script()      derive [<category>] prefix
  - should_skip_script()     skip check_* / *_validation* / mcp_server
  - snapshot_count()         for 'rpgkit version' display
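
As an illustration of the skip rule above, a glob-based check might look like this. It is a sketch only: matching on the basename and the exact patterns are assumptions inferred from the one-line description, not the code in rpgkit_cli._inner_git.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Assumed patterns, taken from "skip check_* / *_validation* / mcp_server".
SKIP_PATTERNS = ("check_*", "*_validation*", "mcp_server*")


def should_skip_script(relpath: str) -> bool:
    """Return True when a script should not trigger an inner-git snapshot."""
    name = PurePosixPath(relpath).name  # match on the file name only
    return any(fnmatch(name, pattern) for pattern in SKIP_PATTERNS)
```

Read-only checks and the long-running MCP server would then leave the snapshot history untouched, while stages such as `rpg_encoder/run_encode.py` would still be committed.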

CLI surface:
  - rpgkit init      gains --no-rpgkit-git (default OFF = inner git ON)
  - rpgkit update    gains --no-rpgkit-git; backfills inner git for
                     pre-plan-03 workspaces, leaves pre-existing repos
                     untouched.
  - rpgkit script    auto-commits after the child exits; commit
                     message: '[<category>] <relpath> <args>'  with a
                     ' — FAILED (exit N)' suffix when the child failed.
  - rpgkit version   gains 'Inner git: N snapshots' line when present.

Commit identity uses per-call -c user.email/user.name (rpgkit-snapshot
<rpgkit@local>) — never writes to global git config.  Concurrent locks
(post-commit hook background worker) trigger a 1s retry then silent
skip; the next successful commit folds in any missed changes.

Plan: plans/03-auto-snapshot-inner-git.md (local)
Test baseline preserved (11 pre-existing failures, 883 passing).
The GitHub Copilot CLI (`copilot`) does NOT read workspace-local
`.vscode/mcp.json`; it only reads `~/.copilot/mcp-config.json` (or
accepts inline JSON via `--additional-mcp-config`).

To make `copilot` find `rpg-tools` automatically in any
rpgkit-initialised workspace, we now register the server globally on
`rpgkit init --ai copilot` and `rpgkit update --ai copilot`.  This is
safe because rpgkit-mcp is cwd-aware (walks up for rpg.json) and
stateless across workspaces — one global registration serves every
workspace the user cd-s into.  In workspaces without rpg.json the server
starts in degraded mode and tool calls return a rpg_unavailable hint
instructing the user to run /rpgkit.encode.

Safety rules baked into _register_copilot_cli_global_mcp():

  - No-op when in-sync: if the file already contains exactly our
    desired entry, we don't touch it (no mtime bump, no .bak).
  - Refuse to wipe a malformed config: file exists but isn't valid
    JSON -> abort with a clear error; user fixes it or passes
    --no-copilot-cli-mcp.  Without this a stray comma would let us
    silently drop every non-rpg-tools server.
  - Atomic write: serialise to mcp-config.json.tmp then os.replace()
    into place, so Ctrl-C mid-write can't leave the file half-written.
  - Respect user-customised entries: existing rpg-tools whose
    `command` is not `rpgkit-mcp` is left alone (user has pointed it
    elsewhere, e.g. a dev checkout).
  - One-shot .bak: only created on the first modification we actually
    perform; never on no-op runs, never overwritten.
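
The safety rules above can be sketched as a single function. This is an illustration, not the body of _register_copilot_cli_global_mcp(): the `mcpServers` key, the `args` field, and the return strings are assumptions made for the example.

```python
import json
import os
from pathlib import Path

SERVER = "rpg-tools"
DESIRED = {"command": "rpgkit-mcp", "args": []}  # assumed entry shape


def register_rpg_tools(config_path: Path) -> str:
    """Add or refresh the rpg-tools entry, following the rules above."""
    data = {}
    if config_path.exists():
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            # Refuse to wipe a malformed config.
            raise ValueError(f"{config_path} is not valid JSON") from exc
        if not isinstance(data, dict):
            raise ValueError(f"{config_path} must contain a JSON object")
    servers = data.setdefault("mcpServers", {})  # key name is an assumption
    current = servers.get(SERVER)
    if current == DESIRED:
        return "in-sync"  # no write, no mtime bump, no .bak
    if isinstance(current, dict) and current.get("command") != "rpgkit-mcp":
        return "user-customised"  # leave the user's own entry alone
    servers[SERVER] = dict(DESIRED)
    backup = config_path.with_name(config_path.name + ".bak")
    if config_path.exists() and not backup.exists():
        backup.write_bytes(config_path.read_bytes())  # one-shot backup
    tmp = config_path.with_name(config_path.name + ".tmp")
    tmp.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    os.replace(tmp, config_path)  # atomic swap into place
    return "written"
```

Because the temporary file is moved with os.replace(), an interrupted run leaves either the old file or the new one, never a partial write.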

New flag: --no-copilot-cli-mcp on both `init` and `update` to opt out.
Wired into the StepTracker plan so the tree output shows the step.

Verified against five scenarios (fresh, idempotent no-op, preserve
other-servers + update outdated entry, refuse malformed JSON, respect
user-customised command) — all PASS, no stray .tmp files.
VS Code Copilot's custom-agents schema (the rename of the old
chatmode format) no longer recognises the `mode:` frontmatter field.
Three of our prompt templates still carried `mode: agent` from the
chatmode era, producing this on load:

    Custom Agents — The following agents have warnings:
      • rpgkit.code_gen.agent.md: unknown field ignored: mode
      • rpgkit.design_interfaces.agent.md: unknown field ignored: mode
      • rpgkit.plan_tasks.agent.md: unknown field ignored: mode

Fix: replace `mode: agent` with the canonical `name:` field so
the three files match the frontmatter convention already used by
the other ten prompts (`name:` + `description:`).

Files: code_gen.md, design_interfaces.md, plan_tasks.md.
No behavioural change — `mode:` was already being silently ignored;
this just clears the warnings so the diagnostics view stays clean.

Ref: https://code.visualstudio.com/docs/copilot/customization/custom-agents#_are-custom-agents-different-from-chat-modes
Per-workspace data, logs and the inner-git snapshot repo now live at
~/.rpgkit/workspaces/<hash>/ (hash = sha256(realpath(workspace))[:12])
so the user's repository stays clean. Only the workspace marker
(.rpgkit/config.toml) and small user-facing reports (.rpgkit/reports/)
stay in the workspace; rpg.html lives there too so it can be browsed
next to the code.
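
The hash in the paragraph above could be computed like this. A minimal sketch: the formula comes from the commit message, while the UTF-8 encoding of the path and the function name are assumptions.

```python
import hashlib
import os


def workspace_id(workspace: str) -> str:
    """Return sha256(realpath(workspace))[:12] as a hex string."""
    real = os.path.realpath(workspace)  # resolves symlinks and '..'
    return hashlib.sha256(real.encode("utf-8")).hexdigest()[:12]
```

Resolving the path first means that `.` and the absolute working directory map to the same home-side storage directory.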

Scripts:
- common/paths.py: derive DATA_DIR/LOGS_DIR/RPG_FILE/RPG_HTML_FILE
  from the new home-side helpers.  RPG_HTML_FILE points to REPORTS_DIR.
- feature_spec_to_json.py defaults to FEATURE_SPEC_FILE; the
  encoder, update_graphs.py and rpg_edit/* read and write through the
  new common.paths constants.
- New common/rpg_io.py with atomic_write_rpg and safe_load_rpg.  When
  rpg.json is truncated or invalid, scan the inner-git snapshot repo
  for the most recent valid copy, rewrite atomically and continue.
- New rpg_edit/save_plan.py so prompts can pipe EditPlan JSON into the
  home-side data store via the CLI rather than the Write tool.
- init_codebase.py: drop dead 'scripts = get_scripts_dir()' assignments
  and the now-unused import.

Prompt templates & docs:
- README.md and docs/* describe the new layout and point users at
  'rpgkit view-graph' to open rpg.html without computing the hash.
- templates/commands/*.md drop shell '> .rpgkit/logs/X.log 2>&1'
  redirections (scripts log internally) and remove every reference to
  '~/.rpgkit/workspaces/<hash>/' from agent-facing prompts.  The agent
  only reads stdout, and '<hash>' was often misread as a literal
  placeholder.

Tests:
- New tests/test_storage.py: workspace id, meta read/write, path
  helpers, WorkspaceMetaMismatch guard.
- New tests/test_rpg_io.py: atomic-write, unicode roundtrip, inner-git
  restore on corruption.
- test_e2e.py: assertion no longer hard-codes '.rpgkit/data/'.

Comment-style cleanup:
- Drop internal codename references (Plan B, Plan 03, plan A2/A3,
  plan B3/B4, plan E2/E3, plans/*.md, plan §2.5).
- Drop emphasis adverbs (deliberately, the single most, carve-out,
  Defensive:), version narration (pre-0.1.3, Pre-v4) and caps emphasis
  (MUST NOT, do NOT) from docstrings and inline comments.
src/rpgkit_cli/_storage.py (new):
- workspace_id = sha256(realpath(workspace))[:12]; helpers for home
  dir, data/, logs/, inner-git/, reports/, meta.toml.
- ensure_workspace_storage creates the layout idempotently and raises
  WorkspaceMetaMismatch on a hash collision against a different
  recorded workspace path.
- find_workspace_root_from walks up from cwd, but only accepts a
  candidate when its home-side dir exists AND meta.workspace_path
  matches.  Stale .rpgkit/config.toml markers (workspace deleted,
  moved, or renamed) are skipped instead of being silently adopted.

src/rpgkit_cli/__init__.py:
- Wire init/update through ensure_workspace_storage; logs and the
  inner-git dir are home-side, reports stay workspace-side.
- 'rpgkit version' prints the resolved workspace path, data/logs
  dirs and inner-git snapshot count, tagged '(not created yet)' when
  home-side storage hasn't been bootstrapped.
- New hidden 'rpgkit hook <name>' command.  The on-disk hooks under
  .git/hooks/ are now 3-line stubs that exec into this command, so
  path resolution, logging, locking and the detached background
  worker live in one Python place; upgrading the CLI takes effect
  without reinstalling hooks.
- post-commit: phase-1 'update_graphs.py sync' foreground, phase-2
  'update_graphs.py update-rpg --json' detached via
  Popen(start_new_session=True), with an mkdir-based directory lock
  and 60-min stale-lock recovery.
- pre-commit and post-merge: same dispatcher, sync only.
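
The mkdir-based lock with stale-lock recovery mentioned for post-commit can be sketched as follows. This is an illustration under assumptions: the function names, the use of the directory mtime as the lock age, and the injectable `now` parameter are all hypothetical.

```python
import os
import shutil
import time
from pathlib import Path

STALE_AFTER_SECONDS = 60 * 60  # 60-minute stale-lock recovery


def try_acquire(lock_dir: Path, now=None) -> bool:
    """Take the lock by creating a directory; mkdir is atomic."""
    now = time.time() if now is None else now
    try:
        os.mkdir(lock_dir)
        return True
    except FileExistsError:
        pass
    try:
        age = now - lock_dir.stat().st_mtime
    except FileNotFoundError:
        age = None  # the holder released between our mkdir and stat
    if age is not None and age < STALE_AFTER_SECONDS:
        return False  # a live worker still holds the lock
    # Stale (or just released): clear it and try once more.
    shutil.rmtree(lock_dir, ignore_errors=True)
    try:
        os.mkdir(lock_dir)
        return True
    except FileExistsError:
        return False  # someone else won the race


def release(lock_dir: Path) -> None:
    shutil.rmtree(lock_dir, ignore_errors=True)
```

A directory is used instead of a lock file because creating it either succeeds or fails in one step, so two hooks firing at once cannot both believe they own the lock.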

src/rpgkit_cli/_inner_git.py:
- Set RPGKIT_HOOK / RPGKIT_HOOK_SHA in 'rpgkit hook' so inner-git
  snapshot messages tag automated commits:
    [hook:post-commit @ a1b2c3d] update-rpg --json
    [hook:pre-commit  @ 9f8e7d6] sync --staged-only
    [decoder] feature_build.py --mode step1    (manual)
- _INNER_GIT_IGNORE now only excludes logs/copilot/ (MB-scale LLM
  session traces).  Other per-stage logs are tracked so users can
  'git -C ~/.rpgkit/workspaces/<hash> log -p logs/<stage>.log' to
  inspect how a stage's output evolved across snapshots.
- _ensure_gitignore_current rewrites the inner-git .gitignore before
  each commit if it has drifted from the current policy, so existing
  inner repos upgrade silently on the next snapshot.

src/rpgkit_cli/_assets.py, entries.py:
- Comment-style cleanup (drop emphasis adverbs, internal codenames,
  version narration).
When the outer repo's git commit runs a hook, git exports GIT_INDEX_FILE pointing at the outer .git/index.lock. If we leak that into the inner-git subprocess, the inner-git add -A writes entries (from $HOME/.rpgkit/...) into the outer index.lock, corrupting the outer commit ("error: invalid object ... Error building trees"). Strip GIT_INDEX_FILE / GIT_DIR / GIT_WORK_TREE / GIT_OBJECT_DIRECTORY before invoking the inner-git so the two repos stay isolated.
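
A minimal sketch of the environment scrubbing described above; `inner_git_env` is a hypothetical helper name, while the four variable names are the ones listed in the commit message.

```python
import os

# Variables git exports to hooks that would redirect an inner-git call
# back at the outer repository.
_LEAKY_GIT_VARS = (
    "GIT_INDEX_FILE",
    "GIT_DIR",
    "GIT_WORK_TREE",
    "GIT_OBJECT_DIRECTORY",
)


def inner_git_env(base=None):
    """Return a copy of the environment that is safe for inner-git calls."""
    env = dict(os.environ if base is None else base)
    for name in _LEAKY_GIT_VARS:
        env.pop(name, None)
    return env
```

The result would be passed as `env=inner_git_env()` to the subprocess call that runs the inner-git command, leaving the hook's own environment untouched.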
Several scripts logged absolute paths like '/home/user/proj/data/X.json' on success. When run via 'rpgkit script' under an LLM agent, those paths reveal the workspace layout (and home-side .rpgkit/ paths) to the agent, which then tries to access them and fails. Switch all 'saved to: {path}' messages to use path.name so stdout stays workspace-independent. Files touched: build_data_flow, build_skeleton, design_base_classes, design_interfaces, feature_spec_to_json, plan_tasks, summary_skeleton.
Per-stage script invocations now persist their stdout into the workspace logs directory (resolved via _storage.workspace_logs_dir). If the home-side dir doesn't exist yet (rpgkit init hasn't run), fall back to the previous behaviour (no log, direct passthrough).
Claude writes session traces under $HOME/.claude/projects/<hash>/sessions/ which is outside the rpgkit workspace. The metadata bookkeeping line called Path.relative_to(workspace_root) and raised ValueError, aborting the whole LLM call (and all retries) even after a successful response. Catch ValueError and fall back to the absolute path — session_trace is informational metadata and must never break the LLM call.
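
The fallback described above amounts to a guarded relative_to() call. A small sketch, with `display_path` as a hypothetical helper name:

```python
from pathlib import PurePath


def display_path(trace: PurePath, workspace_root: PurePath) -> str:
    """Prefer a workspace-relative path; fall back to the absolute one."""
    try:
        return str(trace.relative_to(workspace_root))
    except ValueError:
        # The trace lives outside the workspace; this is metadata only,
        # so keep the absolute path instead of failing the LLM call.
        return str(trace)
```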
The pre-commit hook ran 'update_graphs.py sync --staged-only' and was followed about 1 second later by the full post-commit sync, which overwrote its output. Net effect: extra latency on every git commit and noisy [hook:pre-commit] entries in inner-git history, with no observable benefit (its writes never reach the user's commit because all RPG state lives home-side).

Drop the install path; replace with idempotent _uninstall_git_pre_commit_hook so existing workspaces are cleaned on the next 'rpgkit init' / 'rpgkit update'. Keep 'rpgkit hook pre-commit' dispatcher branch as a deliberate no-op for backward compat with old hook files that haven't been stripped yet.

Also update MCP rpg-tools description ("pre-commit hook" -> "post-commit hook").
README.zh-CN.md: switch to 'Coding Agent' / 'workflow' English terminology (avoids translation drift to ja/ko/hi); rewrite '快速开始:已有仓库' ("Quick start: existing repository") to not require copying the repo; rewrite 'rpgkit init 之后会发生什么' ("What happens after rpgkit init") with workspace-side vs home-side split, three-bullet rationale, multi-project parallelism callout; add troubleshooting entries for finding the home-side hash, cleaning home-side space, and full reset; split platform tables (Agent x Surface and OS x Shell) with legend.

docs/project-structure.md: fix self-contradiction (reports stay inside workspace, not outside); fix stale '.rpgkit/data/' path reference; add '.git/hooks/' (post-commit + post-merge only) and '.rpgkit/reports/' to the workspace file tree; replace the Python hashlib one-liner with 'rpgkit version' guidance; add 'Quick reference: where does each file live?' table covering workspace-side vs home-side artefacts.
markdownlint-cli2 only walks up from each linted file; running it from RPG-Kit/ never reached the workspace-root config. Add a sibling config that disables MD013 / MD041 / MD060 (line length, first-line H1, double-width column alignment — all false positives on slash-command templates and CJK docs) and ignores .venv/, plans/, workspace/ as non-published trees.
Remove the view-graph CLI command and update docs to direct users to rpgkit version and the generated reports path instead.

This avoids noisy browser-launch failures in headless and WSL environments while keeping the graph report discoverable.
Use the current zh-CN README as the source structure and sync English, Japanese, Korean, and Hindi variants.

Update installation, runtime storage, update workflow, platform support, and troubleshooting sections consistently across the localized READMEs.
…tall

# Conflicts:
#	RPG-Kit/src/rpgkit_cli/__init__.py
- _storage.find_workspace_root_from: drop home-side dir guard so
  fresh marker-only workspaces are discovered before any state is
  written; rely on .meta.toml workspace_path for stale-marker
  protection (moved/renamed dirs). Add regression test.
- scripts/common/rpg_io: strip GIT_INDEX_FILE / GIT_DIR /
  GIT_WORK_TREE / GIT_OBJECT_DIRECTORY from env before running
  inner-git recovery subprocesses so hook contexts don't bleed the
  outer repo's git state into recovery queries.
- scripts/__init__.py: refresh docstring to describe the
  'rpgkit script <name>' dispatcher path (no longer copied into
  the workspace).
- entries.mcp_main: remove unreachable 'scripts_dir is None' branch
  (_assets.scripts_dir always returns Path).
- templates/commands/feature_build.md: rewrite broken / duplicated
  Step 2 sentence into a single coherent paragraph.
- docs/project-structure.md: drop retired 'pre-commit' from the
  installed-hooks quick-reference row.
- scripts/common/rpg_io.py: add `--follow` to inner-git `git log` recovery
  call so the rename-tracking promise in the surrounding comment is
  actually delivered (works because the call uses a single path)
- src/rpgkit_cli/_inner_git.py: correct `ensure_inner_git` docstring to
  say the dropped `.gitignore` excludes `logs/copilot/` (not `logs/`),
  matching the actual `_INNER_GIT_IGNORE` content and the file's own
  earlier comment
- tests/test_e2e.py: drop `test_cli_encode_helpers` — it was a misnamed
  path-constant shape check (no encode helpers exercised, fixtures
  unused) whose responsibilities are already covered by tests/test_storage.py
Replace the opaque 12-hex-char SHA-256 workspace id with a readable
slug derived from the resolved absolute path (e.g.
'home-hys-projects-myrepo'), matching the convention used by
Claude Code under ~/.claude/projects/.

When the slug would exceed 200 chars (NAME_MAX safety margin) it is
truncated to 193 chars and suffixed with '-' plus a 6-char base36
SHA-256 digest, so deep paths still produce a deterministic, unique
directory name.

Backward compatibility: home_workspace_dir() honours a pre-0.1.4
12-hex-char directory if one already exists on disk, so existing
workspaces keep resolving to their current storage.

Also sweeps stale '<hash>' placeholders out of user-facing docs,
CLI prompts, docstrings, and comments, replacing them with the
documented '<workspace-id>' term.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 29 changed files in this pull request and generated 7 comments.

Comment thread CoderMind/scripts/plan.py
Comment thread CoderMind/scripts/plan.py Outdated
Comment thread CoderMind/scripts/plan.py Outdated
Comment thread CoderMind/scripts/feature_construct.py
Comment thread CoderMind/docs/commands.md Outdated
Comment thread CoderMind/docs/commands.md Outdated
Comment thread CoderMind/scripts/feature/prompts/spec.py
HuYaSen added 2 commits May 29, 2026 18:24
Cross-platform invoker detection:

- feature_construct.py: use Path(invoker[0]).stem to match both 'cmind' (POSIX) and 'cmind.exe' (Windows packaging), aligning with plan.py.
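
The stem comparison can be illustrated as follows. This is a sketch, not the code from feature_construct.py: `is_cmind_invoker` and its `windows` flag are hypothetical, and the pure path classes are used only so the example behaves the same on any host OS.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_cmind_invoker(argv0: str, windows: bool = False) -> bool:
    """True when the executable name, minus any extension, is 'cmind'."""
    path_cls = PureWindowsPath if windows else PurePosixPath
    # .stem drops the directory part and the final suffix, so both
    # 'cmind' and 'cmind.exe' compare equal to 'cmind'.
    return path_cls(argv0).stem == "cmind"
```

Comparing `.name` instead would miss the Windows case, because the packaged executable carries the `.exe` suffix.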

plan.py robustness:

- _extract_last_json_object(): skip stray '}' characters outside any object so a later well-formed object can still be captured (avoids depth going negative).

- Prerequisite error: point users at /cmind.feature_construct (the recommended Phase 1 entry point) instead of the deprecated /cmind.feature_refactor.

- _print_failure_hint(): send the leading blank line to stderr along with the rest of the hint, so output stays consistent when stdout/stderr are captured separately.
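
The stray-brace handling in the first bullet can be sketched with a depth counter. This is an illustrative reconstruction under assumptions, not the body of `_extract_last_json_object()`: the string-tracking logic and the use of `json.loads` to validate candidates are choices made for this example.

```python
import json
from typing import Any, Optional


def extract_last_json_object(text: str) -> Optional[dict[str, Any]]:
    """Return the last well-formed top-level JSON object found in noisy text."""
    last = None
    depth = 0
    start = -1
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if depth > 0 and in_string:
            # Inside a JSON string: braces here must not affect depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}":
            if depth == 0:
                continue  # stray '}' outside any object: ignore, depth stays 0
            depth -= 1
            if depth == 0:
                try:
                    parsed = json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    continue
                if isinstance(parsed, dict):
                    last = parsed
    return last
```

Skipping the stray brace keeps the counter from going negative, so a later well-formed object is still recognised as starting at depth zero.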

feature_spec prompt accuracy:

- prompts/spec.py: correct 'camel-case names are JSON keys' to 'snake_case attribute names are the JSON keys' — the actual schema fields (background_and_overview, line_start, etc.) are snake_case; the previous wording could prompt the LLM to emit the wrong casing and fail Pydantic validation.

docs/commands.md:

- /cmind.feature_construct output description: drop the obsolete '.cmind/data/feature_spec/' markdown directory; only the three JSON artefacts (feature_spec.json, feature_build.json, feature_tree.json) are produced now.

- Data files table: drop the obsolete 'feature_spec/' row for the same reason.
…nds.md

The user-facing commands reference now describes only what the commands *produce*, not what they used to produce and no longer do. The corresponding statement is kept in templates/commands/*.md (agent prompts) and scripts/feature/prompts/spec.py (LLM-facing) because there it remains a behaviour constraint, not a historical note.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 29 changed files in this pull request and generated 3 comments.

Comment thread CoderMind/scripts/plan.py
Comment thread CoderMind/scripts/plan.py Outdated
Comment thread CoderMind/scripts/feature_construct.py Outdated
…te hints)

plan.py main(): add explicit 'return 0' on the success path so the function honours its '-> int' annotation; previously it implicitly returned None, which still exited 0 via SystemExit but broke the contract for programmatic callers and type checkers.

_print_failure_hint() (plan.py + feature_construct.py): build the recovery commands from the active invoker (via _script_argv) instead of hard-coding 'cmind script ...'. In the Python fallback mode (cmind not on PATH) the hint now prints a runnable 'python <abspath>' form; in the normal cmind mode it still shows the short 'cmind script <name>' form because the invoker basename is used for display.
@QingtaoLi1 requested a review from Bonytu on May 29, 2026 12:25
HuYaSen added 6 commits June 2, 2026 10:29
interface_review.py:
- Re-run check_call_graph_connectivity and check_feature_dependency_coverage
  after the final iteration's fixes so feature_orphans and orphan_units
  reflect the true post-fix state (previously stored the pre-fix snapshot,
  causing 'passed' to always be false).
- _apply_fixes returns a structured dict (requested_fixes / applied_fixes /
  applied_edges / unapplied) instead of a single ambiguous int. Fixes the
  misleading 'Applied X/Y fixes' log where X could exceed Y because
  add_dependency may carry multiple callees per fix.
- Collect modify_interface / add_interface / unresolved add_dependency
  requests into final_result['unapplied_fixes']; these now block passed=true
  instead of being silently dropped.
- Treat repeat-recommended edges that already exist as applied (avoids
  false-positive unapplied when the LLM re-issues the same fix).
- Truncate unapplied descriptions to 200 chars to prevent interfaces.json
  bloat across iterations.

design_interfaces.py:
- Surface orphan_units_count, unapplied_fixes_count, and unapplied_fixes
  (with action/file/unit/reason) on global_review so downstream consumers
  and bench validators can report concrete failure causes.
- print_summary now lists the top unapplied fixes for terminal visibility.

rpg_edit/review.py:
- Fix AttributeError: 'SmokeResult' object has no attribute 'get'. run_smoke_test
  returns a SmokeResult dataclass, not a dict; call .to_dict() and synthesise
  a per-layer pass/fail map (SmokeResult has no 'summary' field) so the
  review agent actually sees smoke results instead of an error string. This
  was the root cause of agents falling back to repeated browser/smoke
  retries during rpg_edit (visible as 'FAILED (exit 1)' commits in inner
  git logs).
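
The failure mode in the rpg_edit/review.py item can be reproduced in miniature. The `SmokeResult` below is a hypothetical stand-in: its `layers` field and the `summarise` helper are assumptions made for this sketch, since the real dataclass fields are not shown here.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SmokeResult:
    """Hypothetical stand-in: maps each layer name to its error messages."""
    layers: dict[str, list[str]] = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)


def summarise(result: SmokeResult) -> dict[str, str]:
    """Build a per-layer pass/fail map from the dataclass."""
    # A dataclass has no dict API: result.get("layers") raises AttributeError.
    data = result.to_dict()
    return {
        layer: "fail" if errors else "pass"
        for layer, errors in data["layers"].items()
    }
```

Converting once with `.to_dict()` keeps the dict-style access in one place, which is roughly what the fix describes.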
…mmary

Make the global-review verdict actionable instead of just observable.

interface_review.py
- _apply_add_interface: new handler materialises an LLM-requested unit
  into interfaces.json (units / units_to_features / units_to_code /
  file_code) with stub body (signature + docstring + pass). Idempotent
  on (file_path, unit_name); rejects on missing signature/docstring/
  feature_path or malformed signature; validates feature_path against
  both the skeleton and the current RPG (rejects pruned features).
  Optional incoming_calls_from installs invocation edges so the new
  unit is wired in the same review step. Tags the file entry with
  _handler_added so InterfacesStore can protect the unit from prune.
- _insert_unit_into_file_code: AST-aware insertion helper. Puts the new
  stub before any top-level 'if __name__ == "__main__":' block;
  falls back to safe append on SyntaxError. Never raises.
- _apply_fixes: dispatches add_interface to the new handler;
  modify_interface still goes to unapplied (no auto-handler — risk too
  high to rewrite existing units).
- Previous-iteration feedback: when iteration > 0, the LLM prompt now
  includes a structured 'could NOT auto-apply' block with reasons,
  preventing the LLM from blindly re-issuing identical failed fixes.
- GLOBAL_INTERFACE_REVIEW_PROMPT: extended schema for add_interface
  (signature/docstring/feature_path/incoming_calls_from REQUIRED) and
  governance rules (when to use which action, no repeats).
- Final 'passed' now also blocks on len(unapplied_fixes) > 0.
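
The AST-aware insertion described for `_insert_unit_into_file_code` might look roughly like the sketch below. The helper names, the blank-line spacing, and catching `ValueError` alongside `SyntaxError` are assumptions for illustration.

```python
import ast


def _is_main_guard(test: ast.expr) -> bool:
    """True for the expression `__name__ == "__main__"`."""
    return (
        isinstance(test, ast.Compare)
        and isinstance(test.left, ast.Name)
        and test.left.id == "__name__"
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.Eq)
        and isinstance(test.comparators[0], ast.Constant)
        and test.comparators[0].value == "__main__"
    )


def _append(file_code: str, stub: str) -> str:
    sep = "" if not file_code or file_code.endswith("\n") else "\n"
    return file_code + sep + "\n" + stub


def insert_before_main_guard(file_code: str, stub: str) -> str:
    """Place `stub` before a top-level main guard, else append it."""
    stub = stub.rstrip("\n") + "\n"
    try:
        tree = ast.parse(file_code)
    except (SyntaxError, ValueError):
        return _append(file_code, stub)  # unparsable file: safe append
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.If) and _is_main_guard(node.test):
            lines = file_code.splitlines(keepends=True)
            idx = node.lineno - 1  # ast line numbers are 1-based
            return "".join(lines[:idx]) + stub + "\n\n" + "".join(lines[idx:])
    return _append(file_code, stub)
```

Inserting above the guard keeps the new unit defined before the script entry block runs, and the fallback means a malformed file never aborts the review step.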

interface_agent.py
- GlobalInterfaceRegistry.register_unit: idempotent single-unit
  registration. Used by handler-added units (the bulk
  register_from_subtree_result is per-subtree-once).

interfaces_store.py
- InterfaceUnit gains handler_added: bool field.
- from_legacy_format propagates the _handler_added tag onto units.
- find_orphan_units excludes handler-added units from prune candidates.

design_interfaces.py
- collect_skeleton_features / collect_rpg_feature_paths helpers feed
  review_and_fix with the two validation sets.
- print_summary reorganised into three named stages (Generation /
  Coverage / Global Review) plus a single canonical 'Overall' verdict
  that mirrors global_review.passed. Surfaces handler-added units
  separately from orphans.

templates/commands/design_interfaces.md
- Step 4 rewritten to describe the three-stage output and tell the
  agent what to do on FAIL (list concrete blockers from
  global_review.unapplied_fixes).
…en entry-point prompt

Two follow-ups discovered while validating the add_interface handler
end-to-end against the todo-list-app fixture.

interfaces_store.py
- to_interfaces_json now writes the _handler_added tag back into each
  file entry (was only read in from_legacy_format). Without this the
  tag was lost on the round-trip through InterfacesStore: any
  handler-added unit became indistinguishable from a regular orphan in
  the next iteration / downstream stage, defeating find_orphan_units's
  exclusion logic.

interface_review.py (GLOBAL_INTERFACE_REVIEW_PROMPT)
- 'Framework callbacks' bullet expanded to explicitly enumerate HTTP
  route handlers (Flask/FastAPI/Django), CLI subcommands, event
  subscribers, background workers, scheduled jobs, and pytest hooks
  as entry points. The previous phrasing was too abstract — the LLM
  consistently missed marking Flask routes as entry, which produced
  spurious orphan_unit reports for create_todo/toggle_todo/delete_todo.
- Added 'Lean towards MORE entry points when in doubt' guidance: false
  positives are cheap (just a tag), false negatives create misleading
  WARN signals downstream.
Four small, schema-stable hardening fixes verified against the bench
runs under workspace/codermind-bench/e2e/runs/. All preserve the
existing invocation_edge / inheritance_edge / reference_edge field
shapes, so previously serialised interfaces.json files keep loading.

interface_review.py
- _apply_fixes (add_dependency branch): the duplicate-check now also
  requires callee_file to match. Same callee_name can legitimately
  resolve to different files in different modules (e.g. save() on
  UserRepo vs OrderRepo); without this the second edge was silently
  dropped.

interface_agent.py
- DependencyCollector.post_process_edges: append a set-based dedup
  pass keyed on (caller, callee, caller_file, callee_file). Two
  writers (AST inspection + LLM declarations) can produce identical
  edges; without dedup, incoming counts double and can mask real
  orphans during global review.
- GlobalInterfaceRegistry._register_symbol: new helper that wraps the
  class_to_file / function_to_file assignments. When a name is about
  to overwrite a different file_path it now logs a WARNING. The
  underlying maps remain single-valued so behaviour is unchanged for
  good repos; collisions are merely visible to operators. Full
  multi-value support is deferred.
- InterfaceOrchestrator.design_all_interfaces: wrap each subtree call
  in a 2-attempt loop. If SubtreeInterfaceAgent's internal 10-iteration
  loop exhausts and leaves any file with units=[], rerun the whole
  subtree once. Simple variant — the retry pays for ALL files in the
  subtree, but is only entered when at least one file failed, which is
  rare. The expensive scoped-retry variant (rerun only failed files,
  reuse partial state) is deferred.
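
The collision warning described for `_register_symbol` can be sketched as a thin wrapper over a single-valued map. The logger name, message text, and function signature are assumptions; only the behaviour (warn on a differing file_path, stay silent on same-file re-registration, last write wins) follows the description above.

```python
import logging

logger = logging.getLogger("interface_registry")  # hypothetical logger name


def register_symbol(table: dict[str, str], name: str, file_path: str) -> None:
    """Record name -> file_path, warning when a different path is overwritten."""
    previous = table.get(name)
    if previous is not None and previous != file_path:
        logger.warning(
            "symbol %r re-registered: %s -> %s", name, previous, file_path
        )
    table[name] = file_path  # single-valued map: last write wins
```

The map shape is unchanged, so existing lookups keep working; the only new effect is the log line when two files claim the same name.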

Verification
- pytest 'review or interface' filter: 5/5 pass
- Handwritten unit tests:
  * callee_file dedup: pre-existing edge to file A + new edge to
    file B for the same callee_name are correctly kept as two edges
  * post_process_edges dedup: 3 input edges with 1 duplicate -> 2 out
  * registry collision: warning emitted only on differing file_path;
    idempotent same-file re-registration stays silent; last-write-wins
    preserved

Explicitly out of scope (tracked separately):
- Caller-name normalisation in post_process_edges (would change a
  hot-path field shape).
- Cumulative entry_points union across review iterations (R7).
- Substring-match prune bug in _edge_involves_unit.
- feature_nodes short-name collision fallback.
- Full multi-value GlobalInterfaceRegistry.
- Scoped subtree retry (only failed files).
Adds two layers of edge-duplication protection in DependencyCollector:

1. Write-time dedup in add_invocation / add_inheritance / add_reference:
   each new edge is compared against the existing bucket (matching the
   full key incl. callee_file) and dropped if already present. This is
   the primary defence against the dual-write pattern in
   InterfaceReviewer._apply_fixes / _apply_add_interface, which appends
   to enhanced_data_flow.invocation_edges and then re-feeds the same
   edge via dc.add_invocation — without dedup the second call doubled
   the edge inside dc.

2. Backstop in-place dedup in DependencyCollector.to_dict():
   final pass keyed on the same identity tuple, mutating the lists in
   place so callers that captured the shared reference from an earlier
   to_dict() call (e.g. design_interfaces.py phase 1.5) still observe
   the deduped contents. First-seen wins, which preserves the earliest
   'generator' value (so 'global_review' edges added directly by
   _apply_fixes survive instead of being overwritten by the
   add_invocation-side copy that hardcodes 'design_interfaces').
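
The in-place, first-seen-wins pass in item 2 can be sketched as below. The edge dictionaries and the `dedup_in_place` name are illustrative; the identity tuple uses the (caller, callee, caller_file, callee_file) key named in the earlier commit.

```python
def dedup_in_place(edges: list[dict]) -> int:
    """Drop duplicate edges, keeping the first occurrence; return removed count."""
    seen = set()
    kept = []
    for edge in edges:
        key = (
            edge.get("caller"),
            edge.get("callee"),
            edge.get("caller_file"),
            edge.get("callee_file"),
        )
        if key in seen:
            continue  # a later copy loses, so the earliest 'generator' survives
        seen.add(key)
        kept.append(edge)
    removed = len(edges) - len(kept)
    # Slice assignment mutates the existing list object, so any caller
    # holding a reference to it sees the deduplicated contents.
    edges[:] = kept
    return removed
```

Rebinding with `edges = kept` would not work here, because holders of the original list would still see the duplicates.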

post_process_edges now also logs the removed-duplicate count when its
own dedup pass fires.

Verified end-to-end on the todo-list-app bench fixture:
- v5 (before): 27 invocation_edges, 8 duplicate keys
- v6 (after):  17 invocation_edges, 0 duplicate keys, 11 design +
  6 global_review generators correctly preserved.
… in review_and_fix

InterfaceReviewer.review_and_fix gated the overall 'passed' on
(last_llm_pass AND code_passed). last_llm_pass is the LLM's verdict
captured BEFORE that same iteration's fixes are applied, so it can
report FAIL even when every fix the LLM requested was successfully
applied and the post-fix structural check confirms zero orphans /
zero unapplied fixes.

Observed in todo-list-app bench v7: iter2 LLM said FAIL + 2 fixes;
both fixes applied; Final re-check showed 0 orphan units / 0 orphan
features / 0 unapplied; yet review_result.passed=False because
last_llm_pass=False.

Fix: drop last_llm_pass from the verdict and use the post-fix
structural code_passed check alone. last_llm_pass is still surfaced
as a separate field for visibility.
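
A minimal sketch of the revised verdict, assuming `code_passed` means no orphan units, no orphan features, and no unapplied fixes after the final re-check. The function name and return shape are hypothetical.

```python
def final_verdict(
    orphan_units: list[str],
    orphan_features: list[str],
    unapplied_fixes: list[dict],
    last_llm_pass: bool,
) -> dict[str, bool]:
    """Derive 'passed' from the post-fix structural state only."""
    code_passed = not orphan_units and not orphan_features and not unapplied_fixes
    # last_llm_pass was captured before this iteration's fixes were applied,
    # so it is reported for visibility but no longer gates the verdict.
    return {"passed": code_passed, "last_llm_pass": last_llm_pass}
```

Under this rule the v7 case described above (LLM said FAIL, both fixes applied, clean re-check) resolves to passed=True.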
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 3 comments.

Comment thread CoderMind/docs/commands.md
Comment thread CoderMind/templates/commands/plan.md
Comment thread CoderMind/templates/commands/feature_construct.md Outdated
@HuYaSen changed the title from "feat(rpgkit): add consolidated /rpgkit.plan workflow" to "feat(cmind): add consolidated /cmind.plan workflow" on Jun 2, 2026
… clarify data path

Two small documentation fixes flagged by Copilot review on PR #57:

- templates/commands/feature_construct.md: the --check-only --json
  payload from feature_construct.py uses 'type' (update/skip/run) and
  per-stage 'done' booleans, not 'status'. Update the field list so the
  slash command's instructions match the actual JSON contract and the
  LLM parses the right keys.

- docs/commands.md: add a one-line reminder before the Data Files table
  that '.cmind/data/' is a logical name; the actual files live under
  ~/.cmind/workspaces/<workspace-id>/data/, with a pointer to the
  detailed note at the top of the document.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 3 comments.

Comment thread CoderMind/scripts/func_design/interface_review.py Outdated
Comment thread CoderMind/scripts/plan.py
Comment thread CoderMind/templates/commands/plan.md
HuYaSen added 2 commits June 2, 2026 14:45
…s are rejected

_apply_add_interface used a loose regex (^(def |class )[A-Za-z_]\w*\s*[(:])
that accepted truncated headers like 'def foo(' or 'class Bar' — feeding
syntactically invalid code into file_code and breaking the downstream
Python parse that other validators rely on.

The prompt contract already requires signature to be a single-line
Python def/class header ending with ':'. Enforce that here too:

- Tighten the regex to require a balanced parenthesised parameter
  list (where applicable), an optional '-> return-type', and a
  trailing ':' as the last non-whitespace character.
- Reject embedded newlines explicitly (multi-line signatures).
- Update the error message to point operators at the actual contract
  violation instead of a vague 'does not look like ...' message.
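
One way to enforce the same contract is sketched below. Note the swapped-in technique: instead of a tightened regex with balanced parentheses, this example uses a cheap prefix regex plus `ast.parse` on the header with a `pass` body. The function name is hypothetical.

```python
import ast
import re

_PREFIX = re.compile(r"^(def|class)\s+[A-Za-z_]\w*\s*[(:]")


def is_valid_signature(signature: str) -> bool:
    """Accept only a single-line def/class header that ends with ':'."""
    sig = signature.strip()
    if "\n" in sig or "\r" in sig:
        return False  # multi-line signatures are rejected outright
    if not sig.endswith(":"):
        return False  # 'def foo(' and 'class Bar' fail here
    if not _PREFIX.match(sig):
        return False
    try:
        # A valid header followed by an indented 'pass' must parse.
        ast.parse(sig + "\n    pass\n")
    except SyntaxError:
        return False
    return True
```

Letting the parser decide avoids hand-rolling parenthesis balancing, at the cost of accepting anything Python itself accepts as a header.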

Raised by Copilot review on PR #57.
…probe

decide() previously rebuilt any stage whose check type was not
'update', while the post-build verification step accepted 'warning' as
success. That asymmetry meant a stage which legitimately produced a
warning-state artefact (e.g. tasks.json with auxiliary tasks lacking a
1:1 interface mapping) would be rebuilt on every subsequent plan.py
run, looping forever even though the post-build verifier kept passing.

Align both sides on the same contract: 'warning' means the artefact is
present and usable but a soft inconsistency was flagged. The orchestrator
must treat it as 'done' both during probe (so --check-only reports
completion) and during decide (so subsequent runs do not re-run that
stage or cascade into its dependents).

- plan.probe(): done = type_ in {'update', 'warning'}
- plan.decide(): skip rebuild when type in {'update', 'warning'};
  reason='up-to-date (warning)' makes the soft state visible in the
  status table.
- tests/test_plan_orchestrator.py: replace test_warning_triggers_run
  with test_warning_is_treated_as_done that pins the new contract.
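
The aligned contract can be sketched as follows. The tuple-based return shape, the cascade reason strings, and the function signatures are assumptions for illustration; the done set and the 'up-to-date (warning)' reason come from the bullets above.

```python
DONE_TYPES = {"update", "warning"}


def probe_done(type_: str) -> bool:
    """A stage is done when its artefact is present, even with a soft warning."""
    return type_ in DONE_TYPES


def decide(probes: list[tuple[str, str]]) -> list[tuple[str, str, str]]:
    """Map ordered (stage, check type) pairs to (stage, action, reason)."""
    plan = []
    cascade = False  # once a stage must run, later stages rebuild too
    for stage, type_ in probes:
        if cascade:
            plan.append((stage, "run", "upstream rebuilt"))
        elif type_ == "warning":
            plan.append((stage, "skip", "up-to-date (warning)"))
        elif probe_done(type_):
            plan.append((stage, "skip", "up-to-date"))
        else:
            plan.append((stage, "run", f"check type is {type_!r}"))
            cascade = True
    return plan
```

Because probe and decide share `DONE_TYPES`, a stage that the verifier accepts with a warning is not rebuilt on the next run.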

Raised by Copilot review on PR #57.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 4 comments.

Comment thread CoderMind/templates/commands/plan.md Outdated
Comment thread CoderMind/templates/commands/plan.md Outdated
Comment thread CoderMind/templates/commands/feature_spec.md Outdated
Comment thread CoderMind/docs/commands.md Outdated
Follow-up to fcf1d9e (plan.py treats type='warning' as up-to-date).
Several user/LLM-facing docs still said 'done' meant only type='update',
which would have caused the slash command to keep prompting for
restart on warning-state stages.

- templates/commands/plan.md Step 1: list warning alongside update in
  the done definition and expose stages[*].type for downstream parsing.
- templates/commands/plan.md Step 2 Case C: simplify glyph rules to
  use stages[*].done (done -> tick, first not-done -> caret, rest ->
  dot). Per-stage warning details are surfaced by plan.py's own stdout
  when the stage runs, so the user-facing prompt does not need a
  separate '!' glyph that users can't act on.
- templates/commands/feature_spec.md Step 3: clarify that [SKIP] only
  fires when an existing feature_spec.json is schema-valid (missing or
  invalid files are regenerated), matching feature_spec.py / spec.py.
- docs/commands.md Resume semantics: document that both 'update' and
  'warning' count as 'this stage is done', with a one-line note on
  what 'warning' means.
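
The simplified glyph rule from the Step 2 Case C bullet can be sketched like this. The concrete characters are assumptions (the text only names tick, caret, and dot), as is the choice to render a done stage as a tick wherever it appears.

```python
TICK, CARET, DOT = "✓", "^", "."  # assumed glyph characters


def stage_glyphs(done_flags: list[bool]) -> list[str]:
    """done -> tick, first not-done -> caret, remaining not-done -> dot."""
    glyphs = []
    seen_pending = False
    for done in done_flags:
        if done:
            glyphs.append(TICK)
        elif not seen_pending:
            glyphs.append(CARET)  # marks where the workflow would resume
            seen_pending = True
        else:
            glyphs.append(DOT)
    return glyphs
```

Since warning-state stages already count as done, they render as ticks and no separate '!' glyph is needed.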

Raised by Copilot review on PR #57.
3 participants