Add pr-context-synthesis skill#4374
Conversation
97e8570 to
59935ce
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces a new Claude command for PR synthesis and a corresponding skill definition. The changes establish a structured process for generating concise summaries of pull requests by fetching data from GitHub. Feedback highlights a brittle approach to identifying linked issues, suggesting the use of GitHub's native API fields instead. Additionally, a contradiction was identified between the command's behavior and the skill's rules regarding how to handle drift between the PR description and the actual diff.
| Parse the PR body for `Fixes #N` / `Closes #N` / `Resolves #N` (case-insensitive) and fetch each linked issue: | ||
|
|
||
| ```bash | ||
| gh issue view <N> -R <owner>/<repo> --json title,body,labels,state | ||
| ``` |
There was a problem hiding this comment.
Manually parsing the PR body for linked issues using keywords like Fixes #N is brittle. A more robust approach is to leverage GitHub's built-in issue linking by using the closingIssuesReferences field from the gh pr view API. This ensures you are using the same mechanism that GitHub uses to link PRs and issues.
To implement this, you'll need to add closingIssuesReferences to the --json fields in the gh pr view call on line 32. Then, you can iterate over the structured data to get issue details.
| Parse the PR body for `Fixes #N` / `Closes #N` / `Resolves #N` (case-insensitive) and fetch each linked issue: | |
| ```bash | |
| gh issue view <N> -R <owner>/<repo> --json title,body,labels,state | |
| ``` | |
| gh issue view <issue.number> -R <issue.repository.owner>/<issue.repository.name> --json title,body,labels,state |
| ## Rules | ||
|
|
||
| 1. **Synthesize, don't copy-paste.** If `<pr_text>` is five words, say so plainly: *"description is minimal; intent inferred from diff"*. Don't pad to look thorough. | ||
| 2. **Watch for description-vs-diff drift.** `<pr_text>` must describe `<diff_summary>`'s *current* state, not the author's iteration history. If a claim is no longer true of the diff, raise a finding with `Action: update the PR description to match the current diff`. Do **not** flag the absence of a changelog of removed/superseded behaviour — that belongs in commit history, not the description. |
There was a problem hiding this comment.
There is a contradiction between this rule and the implementation described in .claude/commands/pr-synthesis.md.
This file says to raise a finding with Action: update the PR description....
However, .claude/commands/pr-synthesis.md (line 57) states that in case of a drift, the skill should note it in the synthesis as "description claims X; diff shows Y".
The latter seems more appropriate for a synthesis skill, which should report facts rather than take actions. The consumer of the synthesis can then decide whether to raise a finding. I recommend aligning this rule with the implementation.
| 2. **Watch for description-vs-diff drift.** `<pr_text>` must describe `<diff_summary>`'s *current* state, not the author's iteration history. If a claim is no longer true of the diff, raise a finding with `Action: update the PR description to match the current diff`. Do **not** flag the absence of a changelog of removed/superseded behaviour — that belongs in commit history, not the description. | |
| 2. Watch for description-vs-diff drift. <pr_text> must describe <diff_summary>'s current state, not the author's iteration history. If a claim is no longer true of the diff, note it in the synthesis (e.g., "description claims X; diff shows Y"). Do not flag the absence of a changelog of removed/superseded behaviour — that belongs in commit history, not the description. |
87b5610 to
8af2427
Compare
| @@ -0,0 +1,57 @@ | |||
| --- | |||
| description: Produce a tight 1–3 paragraph what / why / how synthesis of a single PR. Fetches title, body, linked issue, and diff scope; outputs per `docs/skills/pr-context-synthesis.md`. Useful as a standalone summary or as a building block for review / incident-investigation flows. Read-only. | |||
There was a problem hiding this comment.
I don't think this bit is necessary
| description: Produce a tight 1–3 paragraph what / why / how synthesis of a single PR. Fetches title, body, linked issue, and diff scope; outputs per `docs/skills/pr-context-synthesis.md`. Useful as a standalone summary or as a building block for review / incident-investigation flows. Read-only. | |
| description: Produce a tight 1–3 paragraph what / why / how synthesis of a single PR. Fetches title, body, linked issue, and diff scope; outputs per `docs/skills/pr-context-synthesis.md`. Read-only. |
|
|
||
| Accept any of: | ||
|
|
||
| - A PR number: `4267` |
There was a problem hiding this comment.
| - A PR number: `4267` | |
| - A PR number: `4267`/`#4267` |
|
|
||
| ```bash | ||
| gh pr view <N> -R <owner>/<repo> --json title,body,files,labels,baseRefName,headRefName,closingIssuesReferences | ||
| gh pr diff <N> -R <owner>/<repo> |
There was a problem hiding this comment.
This diff can potentially be huge and break the context window.
This should first get a summary of the diff size and then decide whether to chunk it or not.
| For each entry in `closingIssuesReferences` (GitHub's own parsing of `Fixes #N` / `Closes #N` / `Resolves #N`, more reliable than regexing the body), fetch the issue: | ||
|
|
||
| ```bash | ||
| gh issue view <issue.number> -R <issue.repository.owner>/<issue.repository.name> --json title,body,labels,state | ||
| ``` | ||
|
|
||
| If `closingIssuesReferences` is empty, proceed without a linked issue — never invent one. |
There was a problem hiding this comment.
This should honestly be done even before the diff, it should help setup context for the diff and also it would allow Claude to high mismatches
|
|
||
| Two ways: | ||
|
|
||
| - **Procedurally** — follow the rules below when you need a tight 1–3 paragraph synthesis (CONTEXT block of `/review-pr`, per-candidate block in `pr-blame-walk`, ad-hoc *"summarise this PR for me"*). |
There was a problem hiding this comment.
Doesn't this mean that this PR is blocked by the one for pr-blame-walk?
|
|
||
| ## Inputs | ||
|
|
||
| - `<pr_text>` — PR title and body. If there is no PR yet (e.g. local-diff mode), substitute the branch name plus the relevant commit messages. |
There was a problem hiding this comment.
A bit more accurate (?)
| - `<pr_text>` — PR title and body. If there is no PR yet (e.g. local-diff mode), substitute the branch name plus the relevant commit messages. | |
| - `<pr_text>` — PR title and body. If there is no PR yet (e.g. local-diff mode), use the current branch name and the relevant commit messages (i.e. diff against `main`). |
|
|
||
| - `<pr_text>` — PR title and body. If there is no PR yet (e.g. local-diff mode), substitute the branch name plus the relevant commit messages. | ||
| - `<linked_issue>` — title and body of any issue referenced via `Fixes #N` / `Closes #N` / `Resolves #N`. May be empty. | ||
| - `<diff_summary>` — file scope plus a codemap or per-file note of the actual change. The ground truth the synthesis must stay anchored to. |
There was a problem hiding this comment.
If this is the ground truth, should it be first in the list so the follow up items are interpreted with this in context?
What this is
A primitive for producing a tight 1–3 paragraph what / why / how synthesis of a single PR (or PR-shaped change). Stays anchored to the actual diff and refuses vague verbs.
Ships with
/pr-synthesis <N|owner/repo#N|url>, a slash command that fetches PR metadata, linked issue, and diff and prints the synthesis verbatim.Why split out
Extracted from #4351 because the synthesis primitive is reusable beyond a single review report — review and incident-investigation flows both want the same per-PR summary shape. Splitting keeps the contract small.
Files
docs/skills/pr-context-synthesis.md— rules + output shape..claude/commands/pr-synthesis.md— slash command wrapping the procedure.Changes after Jose's review
#4267form alongside4267.<diff_summary>(the ground truth) is first, with<pr_text>and<linked_issue>interpreted relative to it.pr-blame-walkreference from the skill body (forward-looking dep that doesn't belong here).<pr_text>input description to be accurate when there's no PR yet (local-diff mode).gh pr diffon a 376k-line PR will silently corrupt the context window. The procedure now gates the full-diff fetch behindadditions + deletions <= 2000. Above that, it falls back to the per-file scope from the paginated REST endpoint and bucket-summarises lockfiles, generated bindings, and codegen output.While testing this on PR #4217, I caught a separate silent-truncation bug in the procedure I was about to ship:
gh pr view --json filescaps at 100 files, so a 384-file PR would have silently underreported its scope. The fallback path now usesgh api --paginate repos/<o>/<r>/pulls/<N>/filesinstead.Worked examples
Two real PRs, both above the 2k full-diff gate, so both exercise the per-file-bucket fallback. (The original example in the skill doc — PR #4371, ~60 lines — exercises the small-diff branch.)
Example 1 — PR #4243 (4,432 lines, 24 files)
Boundary case: just over the gate. Generated bindings dominate the line count; the substantive change is a 275-line estimator + a 236-line e2e test.
Example 2 — PR #4217 (376,829 lines, 384 files)
Stress case: two orders of magnitude over the gate. Demonstrates that the per-file bucket summary is genuinely more useful than the full diff would be — 96% of the lines are mechanical codegen output, and the substantive change hides in a 116-file
+213/−1,531workspace bucket of import-path edits.Reviewing this in isolation
The skill body is self-contained. The intro mentions consumers (
/review-pr, ad-hoc summaries, future incident walks) but doesn't link to anything that doesn't exist onmain. Rules, Shape, and Example stand on their own.How to try it