perf(producer): hdr benchmark harness with --tags filter, peak heap/RSS tracking, bench:hdr script #382
Open
vanceingalls wants to merge 1 commit into vance/frame-dir-cache-isolation-tests from
…tracking, bench:hdr script

Makes the existing benchmark harness genuinely useful for HDR perf work before landing image-cache and debug-logging optimizations in the rest of Chunk 8.

Three tightly-related changes:

1. **Positive --tags filter** in `benchmark.ts`. Existing harness only had `--exclude-tags` (which defaults to `slow`). Adds `--tags hdr` so HDR runs don't have to wait for unrelated SDR fixtures. Filters compose: a fixture must match `--tags` (if provided) AND must not match `--exclude-tags`.

2. **Peak heap + RSS tracking** in `executeRenderJob`. A 250ms periodic `process.memoryUsage()` sampler runs alongside every render and reports `peakRssMb` / `peakHeapUsedMb` in `RenderPerfSummary`. Wall-clock alone can't catch slow memory regressions like an unbounded image cache; peak RSS does. Sampler is `unref`'d and always cleared in `finally` so it never keeps the event loop alive or leaks across jobs. Both fields are optional on the interface for back-compat with serialized older summaries.

3. **bench:hdr convenience script** plus a perf README at `tests/perf/README.md` documenting the harness, the new flags, and the captured April-2026 HDR baseline (PQ regression: 34.5s / 272 MiB RSS, HLG regression: 11.5s / 227 MiB RSS, both 1080p / 1 worker / 1 run).

The benchmark output table is widened and gains PeakRSS / PeakHeap columns. A new `avgOrNull` helper preserves `null` in the JSON when no run reported memory (avoids silently coercing missing data to 0 in older snapshots).

No behavior change for non-benchmark renders: the sampler runs in every `executeRenderJob` but its overhead is a single `process.memoryUsage()` call every 250ms, well below noise.

Verification:
- `bunx tsc --noEmit -p packages/producer`: clean
- `bunx oxlint` / `bunx oxfmt --check` on changed files: clean
- `bun test src/services/`: 60/60 pass (frameDirCache, orchestrator, etc.)
- `bunx tsx src/benchmark.ts --tags hdr --runs 1`: both HDR fixtures render successfully, summary table prints PeakRSS/PeakHeap columns, per-run output shows new memory line.
- `bunx tsx src/benchmark.ts --tags nonexistent`: exits 1 with a helpful message naming the active filters.

Refs: plans/hdr-followups.md Chunk 8A.
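
The filter composition described in the commit message (match `--tags` if provided, AND do not match `--exclude-tags`) can be sketched as below. This is a minimal illustration, not the code in `benchmark.ts`: the `Fixture` and `TagFilters` shapes, the `selectFixtures` name, and the sample fixture names are all hypothetical, and treating `--tags` as "matches any listed tag" is an assumption the source does not confirm.

```typescript
// Hypothetical shapes for illustration; the real harness types may differ.
interface Fixture {
  name: string;
  tags: string[];
}

interface TagFilters {
  tags?: string[]; // positive filter (--tags); omitted means "no positive filter"
  excludeTags: string[]; // negative filter (--exclude-tags)
}

// Keep a fixture only if it passes the positive filter (when one is given)
// AND carries none of the excluded tags.
function selectFixtures(fixtures: Fixture[], filters: TagFilters): Fixture[] {
  const include = filters.tags ?? [];
  return fixtures.filter((fixture) => {
    // Assumption: a fixture "matches" --tags if it has at least one listed tag.
    const matchesInclude =
      include.length === 0 || fixture.tags.some((tag) => include.includes(tag));
    const matchesExclude = fixture.tags.some((tag) =>
      filters.excludeTags.includes(tag),
    );
    return matchesInclude && !matchesExclude;
  });
}

// Sample data (hypothetical fixture names).
const fixtures: Fixture[] = [
  { name: "hdr-pq", tags: ["hdr"] },
  { name: "hdr-hlg", tags: ["hdr"] },
  { name: "sdr-basic", tags: ["sdr"] },
  { name: "hdr-long", tags: ["hdr", "slow"] },
];

// Rough equivalent of `--tags hdr` with the default `--exclude-tags slow`.
const selected = selectFixtures(fixtures, { tags: ["hdr"], excludeTags: ["slow"] });
console.log(selected.map((fixture) => fixture.name));
```

An empty selection is the case the commit message says exits 1 with a message naming the active filters; a caller of a helper like this would check `selected.length === 0` before starting any render.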
Summary
Make the existing benchmark harness genuinely useful for HDR perf work: positive `--tags` filter, peak heap/RSS sampling, a `bench:hdr` script, and a perf README documenting the captured April-2026 baseline. Lands first in the Chunk 8 sub-stack so subsequent perf PRs can be measured against a known starting point.

Why
Chunk 8A of `plans/hdr-followups.md`. Wall-clock timing alone can't catch slow memory regressions like an unbounded image cache; peak RSS does. And the existing harness only had `--exclude-tags`, so HDR runs had to wait for unrelated SDR fixtures.

What changed
1. Positive `--tags` filter in `benchmark.ts`. Adds `--tags hdr` so HDR runs don't have to wait for unrelated fixtures. Filters compose: a fixture must match `--tags` (if provided) AND must not match `--exclude-tags`.
2. Peak heap + RSS tracking in `executeRenderJob`. A 250 ms periodic `process.memoryUsage()` sampler runs alongside every render and reports `peakRssMb` / `peakHeapUsedMb` in `RenderPerfSummary`. Sampler is `unref`'d and always cleared in `finally` so it never keeps the event loop alive or leaks across jobs. Both fields are optional on the interface for back-compat with serialized older summaries.
3. `bench:hdr` convenience script plus a perf README at `tests/perf/README.md` documenting the harness, the new flags, and the captured April-2026 HDR baseline (PQ regression: 34.5 s / 272 MiB RSS, HLG regression: 11.5 s / 227 MiB RSS, both 1080p / 1 worker / 1 run).

The benchmark output table is widened and gains `PeakRSS` / `PeakHeap` columns. A new `avgOrNull` helper preserves `null` in the JSON when no run reported memory (avoids silently coercing missing data to 0 in older snapshots).

No behavior change for non-benchmark renders: the sampler runs in every `executeRenderJob` but its overhead is a single `process.memoryUsage()` call every 250 ms, well below noise.

Test plan
- `bunx tsc --noEmit -p packages/producer`: clean.
- `bunx oxlint` / `bunx oxfmt --check` on changed files: clean.
- `bun test src/services/`: 60/60 pass (frameDirCache, orchestrator, etc.).
- `bunx tsx src/benchmark.ts --tags hdr --runs 1`: both HDR fixtures render successfully, summary table prints `PeakRSS` / `PeakHeap` columns, per-run output shows new memory line.
- `bunx tsx src/benchmark.ts --tags nonexistent`: exits 1 with a helpful message naming the active filters.

Stack
Chunk 8A of `plans/hdr-followups.md`. First PR in the Chunk 8 perf sub-stack; subsequent PRs (image cache, logger gating) are measured against this baseline.
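
For readers who want a concrete picture of the peak-memory sampler and the `avgOrNull` aggregation described in this PR, here is a minimal sketch. It is not the code in `executeRenderJob`: `startMemorySampler`, `withPeakMemory`, and the `PeakMemory` shape are hypothetical names, the bytes-to-MiB conversion is an assumption about how `peakRssMb` is derived, and averaging only over runs that reported a value is an assumption about `avgOrNull`. The only real APIs used are `process.memoryUsage()`, `setInterval` / `clearInterval`, and the timer's `unref()`.

```typescript
// Hypothetical shape mirroring the optional fields named in the PR.
interface PeakMemory {
  peakRssMb: number;
  peakHeapUsedMb: number;
}

const BYTES_PER_MIB = 1024 * 1024; // assumption: values are reported in MiB

// Starts a periodic sampler and returns a stop function that reports the peaks.
function startMemorySampler(intervalMs = 250): () => PeakMemory {
  let peakRss = 0;
  let peakHeapUsed = 0;

  const sample = (): void => {
    const { rss, heapUsed } = process.memoryUsage();
    peakRss = Math.max(peakRss, rss);
    peakHeapUsed = Math.max(peakHeapUsed, heapUsed);
  };

  sample(); // sample once up front so very short jobs still report a value
  const timer = setInterval(sample, intervalMs);
  timer.unref(); // the sampler alone must not keep the event loop alive

  // Safe to call more than once: clearing a cleared timer is a no-op.
  return (): PeakMemory => {
    clearInterval(timer);
    sample(); // one last sample before reporting
    return {
      peakRssMb: peakRss / BYTES_PER_MIB,
      peakHeapUsedMb: peakHeapUsed / BYTES_PER_MIB,
    };
  };
}

// Wraps a job so the sampler is always cleared, even when the job throws.
async function withPeakMemory<T>(
  job: () => Promise<T>,
): Promise<{ result: T; memory: PeakMemory }> {
  const stop = startMemorySampler();
  try {
    const result = await job();
    return { result, memory: stop() };
  } finally {
    stop(); // idempotent cleanup so the timer never leaks across jobs
  }
}

// Averages only the runs that reported a number; returns null (not 0)
// when no run did, so missing data stays visible in serialized JSON.
function avgOrNull(values: Array<number | null | undefined>): number | null {
  const present = values.filter(
    (v): v is number => typeof v === "number" && Number.isFinite(v),
  );
  if (present.length === 0) return null;
  return present.reduce((sum, v) => sum + v, 0) / present.length;
}

// Older summaries without memory fields aggregate to null rather than 0.
console.log(JSON.stringify({ peakRssMb: avgOrNull([undefined, undefined]) }));
```

Returning `null` instead of `0` keeps "no data" distinguishable from "zero memory" when new results are compared against snapshots captured before the memory fields existed.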