perf(engine): extraction-phase instrumentation by jrusso1020 · Pull Request #444 · heygen-com/hyperframes

jrusso1020 · 2026-04-23T16:15:22Z

What

Adds per-phase timings and counters to extractAllVideoFrames and surfaces them on the producer's RenderPerfSummary as videoExtractBreakdown alongside a new tmpPeakBytes workDir size sample.

Why

Phase 2 video extraction has five distinct sub-phases (resolve, HDR probe, HDR preflight, VFR probe, VFR preflight, per-video extract) and today they collapse into a single videoExtractMs stage timing. That makes every subsequent perf PR in this stack immeasurable — you can't tell whether a win came from cache hits, preflight scope reduction, or pure extraction speed.

This PR is foundational for PR #445 (segment-scope HDR preflight) and PR #446 (content-addressed extraction cache).

How

New ExtractionPhaseBreakdown type with resolveMs, hdrProbeMs, hdrPreflightMs/Count, vfrProbeMs, vfrPreflightMs/Count, extractMs, cacheHits, cacheMisses. Populated inline with Date.now() wrappers — overhead is sub-millisecond on every phase.
Returned on ExtractionResult.phaseBreakdown.
Producer extends RenderPerfSummary with videoExtractBreakdown?: ExtractionPhaseBreakdown and tmpPeakBytes?: number. tmpPeakBytes is sampled from the workDir right before cleanup via a new recursive-size helper that swallows errors (purely observational — a missing workDir must never fail the render).

No changes to the capture-lifecycle resource tracking — earlier versions of this instrumentation plumbed injector LRU stats through RenderOrchestrator, which conflicted hard with upstream #371 (buildHdrCaptureOptions refactor). Dropped that piece for a marginal observability loss.

Test plan

Validation on packages/producer/tests/vfr-screen-recording:

"videoExtractBreakdown": {
  "resolveMs": 0, "hdrProbeMs": 0, "hdrPreflightMs": 0, "hdrPreflightCount": 0,
  "vfrProbeMs": 0, "vfrPreflightMs": 166, "vfrPreflightCount": 1,
  "extractMs": 97, "cacheHits": 0, "cacheMisses": 0
},
"tmpPeakBytes": 4578598

Total elapsed within noise of pre-PR baseline (2665 → 2673 → 3228ms across hosts).

Unit test: phase-breakdown assertion added to videoFrameExtractor.test.ts
Lint + format (oxlint + oxfmt)
Typecheck (engine + producer)
Manual perf validation against VFR fixture

jrusso1020 · 2026-04-23T16:15:36Z

This stack of pull requests is managed by Graphite. Learn more about stacking.

miguel-heygen

Review: perf(engine): extraction-phase instrumentation

Clean additive-only change — zero deletions, zero regression risk. The ExtractionPhaseBreakdown type is well-typed, the forward-looking cache placeholder fields (cacheHits, cacheMisses) avoid a type-breaking change in #446, and the sampleDirectoryBytes helper is defensively coded (iterative stack, error-swallowing).

P2 — Should fix

vfrPreflightMs subtraction creates asymmetric semantics.

breakdown.vfrPreflightMs = Date.now() - vfrPreflightStart - breakdown.vfrProbeMs;

hdrPreflightMs wraps the entire HDR preflight block (probe + analysis + conversion), but vfrPreflightMs deliberately excludes probe time. A consumer looking at both *PreflightMs fields would expect analogous scopes. Either:

(a) Rename to vfrTranscodeMs to match what it actually measures, or
(b) Include probe time in vfrPreflightMs (matching hdrPreflightMs) — the separate vfrProbeMs already provides the finer decomposition.

Option (b) is cleaner.

JSDoc references phase3ExtractMs but the field is named extractMs.

*   - phase3ExtractMs wraps the parallel `extractVideoFramesRange` calls

P3 — Suggestions

Test coverage is VFR-only. HDR-path fields (hdrProbeMs, hdrPreflightMs, hdrPreflightCount) are never asserted — even a simple expect(...).toBeGreaterThanOrEqual(0) per field would catch initialization bugs.
sampleDirectoryBytes follows symlinks via statSync. Using lstatSync and skipping symlinks would be more accurate for "bytes on disk in workDir." Minor since the workDir is engine-controlled.
tmpPeakBytes returns 0 when workDir is missing, but the type is number | undefined. Using undefined for the missing-directory case would be more semantically precise.

Verdict: Approve with minor fixes

The instrumentation overhead is sub-ms (a handful of Date.now() calls). The design cleanly enables #445 and #446. Fix the vfrPreflightMs semantics and the JSDoc typo before merge.

Adds per-phase timings and counters to `extractAllVideoFrames`: resolve, HDR probe/preflight, VFR probe/preflight, per-video extract, plus cacheHits/cacheMisses placeholders for the downstream cache PR. Surfaces them on `RenderPerfSummary.videoExtractBreakdown` and samples the render's `workDir` size into `tmpPeakBytes` right before cleanup. These are foundational for measuring every other perf PR in the stack. Overhead is zero (ten Date.now() calls across a multi-second phase). A VFR-fixture breakdown is asserted in videoFrameExtractor.test.ts.

Address Miguel's review feedback on #444: vfrPreflightMs was subtracting vfrProbeMs from the wall-clock measurement, creating asymmetric semantics vs hdrPreflightMs (which includes hdrProbeMs). A consumer looking at both *PreflightMs fields expects analogous scopes — the separate *ProbeMs fields already provide the finer decomposition. Also fix the JSDoc reference to phase3ExtractMs (the field is extractMs).

jrusso1020 · 2026-04-23T22:02:27Z

Pushed 5cfabff (on top of a rebase onto current main). Addressed both P2 items from <https://github.com/heygen-com/hyperframes/pull/444#pullrequestreview-4164953230|your review>:

vfrPreflightMs semantics — went with option (b). vfrPreflightMs is now the full wall-clock from vfrPreflightStart (same shape as hdrPreflightMs), and the separate vfrProbeMs carves out the probe-only time for the finer decomposition. Also added a JSDoc note clarifying that both *PreflightMs fields include their probe-time siblings, so future readers don't make the same "subtract probe" assumption.
JSDoc phase3ExtractMs → extractMs — fixed.

On the P3s, kept as-is for this PR since they're orthogonal to the semantic fix:

HDR-path test assertions — happy to add in a follow-up; the instant-safety version (a single toBeGreaterThanOrEqual(0) on each HDR field) would catch init bugs without requiring an HDR fixture.
sampleDirectoryBytes → lstatSync — agree it's more precise. The engine-controlled workDir is the narrow reason it's fine today, but a one-line swap is cheap.
tmpPeakBytes → undefined on missing workDir — agree semantically. Current zero is a valid proxy for "directory already gone, can't measure" but undefined reads better.

All three would land clean in a single small follow-up PR; happy to do that after this stack merges so the downstream PRs aren't re-validated against moving semantics.

The CLI already ships `render_complete` events to PostHog but only carries a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With #444 adding per-phase breakdown and #446 adding cache counters, the data is sitting in `job.perfSummary.videoExtractBreakdown` and `job.perfSummary.tmpPeakBytes` but never reaching PostHog. This forwards the new fields as flat properties (flat is easier to query with PostHog insights than nested objects): - tmp_peak_bytes - stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms, stage_capture_ms, stage_encode_ms, stage_assemble_ms - extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms, extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms, extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits, extract_cache_misses `extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former is just the parallel ffmpeg extract inside Phase 3, the latter is the full stage (resolve + probe + preflight + extract). All fields optional — the Docker subprocess call site (no perfSummary available) still compiles and ships events without them.

miguel-heygen

Re-checked the live head. The vfrPreflightMs semantics and extractMs JSDoc fixes are in place, the latest CI is green, and I found no remaining blockers.

The CLI already ships `render_complete` events to PostHog but only carries a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With #444 adding per-phase breakdown and #446 adding cache counters, the data is sitting in `job.perfSummary.videoExtractBreakdown` and `job.perfSummary.tmpPeakBytes` but never reaching PostHog. This forwards the new fields as flat properties (flat is easier to query with PostHog insights than nested objects): - tmp_peak_bytes - stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms, stage_capture_ms, stage_encode_ms, stage_assemble_ms - extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms, extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms, extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits, extract_cache_misses `extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former is just the parallel ffmpeg extract inside Phase 3, the latter is the full stage (resolve + probe + preflight + extract). All fields optional — the Docker subprocess call site (no perfSummary available) still compiles and ships events without them.

## What Scopes the SDR→HDR preflight re-encode to the segment the composition actually uses, mirroring the existing VFR→CFR segment-scope fix. ## Why `convertSdrToHdr` was re-encoding entire source files, so a 30-minute SDR screen recording contributing a 2-second clip in a mixed HDR/SDR composition ate multi-second preflight time that produced frames no one would ever read. Validated on a mixed 30s-SDR + 2s-HDR fixture: `hdrPreflightMs` drops **87%** (1162→148ms), `videoExtractMs` drops **82%** (1272→231ms), `tmpPeakBytes` drops **45%** (8.2MB→4.5MB). Depends on #444 (phase-level instrumentation) for the measurement surface. ## How - `convertSdrToHdr` gains `startTime` and `duration` parameters ahead of the upstream `targetTransfer` arg added by #370. New signature: `convertSdrToHdr(input, output, startTime, duration, targetTransfer, signal, config)`. `-ss $start -t $duration` is added to the ffmpeg args. - Phase 2 now captures the full `VideoMetadata` per `resolvedVideos` entry (previously just `colorSpace`) so the caller can compute `segDuration` from `video.end - video.start` with a fallback to `metadata.durationSeconds - video.mediaStart` for unbounded (Infinity) clips — without firing another ffprobe. - After a successful convert, `entry.video.mediaStart` is zeroed out via shallow-copy (doesn't mutate the caller's `VideoElement`) so downstream extraction seeks from 0 instead of the original offset. Mirrors what the VFR→CFR path already does. ## Test plan Validation on `/tmp/hf-fixtures/hdr-sdr-mixed-scope`: ``` hdrPreflightMs: >1000 → 150 (gate: <300) ✓ videoExtractMs: 1272 → 237 (-82%) tmpPeakBytes: 8.2MB → 4.5MB (-45%) ``` - [x] Unit test: new regression test synthesizes 10s SDR + 2s HDR fixture inline and asserts the converted file's duration matches the 2s used segment (pre-fix matched the 10s source) - [x] Lint + format - [x] Typecheck - [x] Manual perf validation against synthesized fixture

The CLI already ships `render_complete` events to PostHog but only carries a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With #444 adding per-phase breakdown and #446 adding cache counters, the data is sitting in `job.perfSummary.videoExtractBreakdown` and `job.perfSummary.tmpPeakBytes` but never reaching PostHog. This forwards the new fields as flat properties (flat is easier to query with PostHog insights than nested objects): - tmp_peak_bytes - stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms, stage_capture_ms, stage_encode_ms, stage_assemble_ms - extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms, extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms, extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits, extract_cache_misses `extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former is just the parallel ffmpeg extract inside Phase 3, the latter is the full stage (resolve + probe + preflight + extract). All fields optional — the Docker subprocess call site (no perfSummary available) still compiles and ships events without them.

## What Adds a content-addressed cache for extracted video frames, keyed on the tuple `(path, mtime, size, mediaStart, duration, fps, format)`. Repeat renders of the same composition (studio edit → re-render, preview → final) skip the ffmpeg extraction entirely. ## Why Video frame extraction is the dominant non-capture phase for video-heavy compositions. Studio iteration workflows extract the same frames over and over — each render burns ffmpeg time that adds no value. Validated on `/tmp/hf-fixtures/cfr-sdr-cache`: ``` Cold (miss): extractMs=69, videoExtractMs=70, totalElapsedMs=2052 Warm (hit): extractMs=1, videoExtractMs=2, totalElapsedMs=1964 cacheHits: 0→1, cacheMisses: 1→0 ``` The fixture is tiny (3s CFR SDR @ 30fps), so the wall-clock delta is small; the extraction-time delta (69→1ms, 98%) scales linearly with source length. For heavy-iteration workflows (a user rendering the same composition while tuning encoding params), extraction time goes to zero on every repeat render. Depends on #444 (instrumentation surface) and #445 (segment-scope HDR preflight — otherwise cache keys would be unstable across renders on mixed-HDR compositions). ## How - New `packages/engine/src/services/extractionCache.ts`: - SHA-256 key over a stable JSON encoding of `(path, mtime_ms, size, mediaStart, duration, fps, format)`. Infinity duration is normalized to `-1` so unresolved natural-duration sources still produce stable keys. - Truncates to 16 hex chars in the entry directory name — 64 bits of entropy is plenty at cache scale and keeps `ls` output short. - `hfcache-v2-` schema prefix — bumping it invalidates old entries (callers own gc policy; the cache owns keys). - `.hf-complete` dotfile sentinel. An entry dir without the sentinel is treated as a miss (covers crash-mid-extract and abandoned writes); the next render re-extracts over the partial frames with `-y`. - `FRAME_FILENAME_PREFIX = "frame_"` shared with the extractor — future refactors only need to touch one place to rename frames. - `EngineConfig.extractCacheDir` (env: `HYPERFRAMES_EXTRACT_CACHE_DIR`) gates the feature. Undefined disables caching — extraction runs into the render's workDir and cleanup removes it on render end, preserving the prior behaviour exactly. No default root is chosen by the engine; the caller (CLI, app, studio) owns the location policy. - `ExtractedFrames.ownedByLookup` flag prevents `FrameLookupTable.cleanup` from rm'ing a shared cache dir at render end. Set to `true` on both hits and misses (misses own the directory they wrote into, but hand it over to the cache rather than deleting it). - Phase 3 extractor flow: 1. Snapshot `(videoPath, mediaStart, start, end)` per resolved video BEFORE Phase 2a/2b preflight mutates them — so cache keys are stable across renders that use workDir-local normalized files (those files have fresh mtimes every render). 2. Compute key, `lookupCacheEntry`. 3. On hit: rebuild `ExtractedFrames` from the cache dir plus the Phase 2-probed `VideoMetadata` — no re-ffprobe. 4. On miss: `ensureCacheEntryDir`, extract with `extractVideoFramesRange(..., outputDirOverride)`, then `markCacheEntryComplete` (the sentinel write is the last step so a crash leaves the dir un-sentineled). - `extractVideoFramesRange` gains an `outputDirOverride` parameter so cache-miss writes land directly in the keyed dir (no `join(outputDir, videoId)` wrapping). ## Test plan - [x] 19 unit tests in `extractionCache.test.ts` covering key determinism, mtime/size invalidation, format/fps/mediaStart/duration invalidation, Infinity normalization, sentinel semantics, missing-file tolerance - [x] 2 integration tests in `videoFrameExtractor.test.ts`: - "reuses extracted frames on a warm cache hit" — asserts `cacheHits=1`, `extractMs<50ms` on second call against a CFR SDR fixture - "invalidates the cache when fps changes" — different fps on second call forces a new miss - [x] End-to-end validation with `HYPERFRAMES_EXTRACT_CACHE_DIR` set, two runs of the same fixture - [x] Lint + format (oxlint + oxfmt) - [x] Typecheck (engine + producer)

The CLI already ships `render_complete` events to PostHog but only carries a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With #444 adding per-phase breakdown and #446 adding cache counters, the data is sitting in `job.perfSummary.videoExtractBreakdown` and `job.perfSummary.tmpPeakBytes` but never reaching PostHog. This forwards the new fields as flat properties (flat is easier to query with PostHog insights than nested objects): - tmp_peak_bytes - stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms, stage_capture_ms, stage_encode_ms, stage_assemble_ms - extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms, extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms, extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits, extract_cache_misses `extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former is just the parallel ffmpeg extract inside Phase 3, the latter is the full stage (resolve + probe + preflight + extract). All fields optional — the Docker subprocess call site (no perfSummary available) still compiles and ships events without them.

…#454) ## What Forwards the new per-phase extraction breakdown and `tmpPeakBytes` fields from `RenderPerfSummary` (added in #444 and #446) to PostHog via the CLI's existing `render_complete` telemetry event. ## Why The CLI already ships `render_complete` events to PostHog (`packages/cli/src/telemetry/client.ts`), but `events.ts:trackRenderComplete` only carried a subset of `RenderPerfSummary` — top-level timings, composition dims, and memory snapshots. After #444 added per-phase extraction breakdown (`videoExtractBreakdown`) and #446 added cache hit/miss counters, the data lives on `job.perfSummary` at render-complete but never reaches PostHog dashboards. Without this, any PostHog insight built around "how often are we hitting the cache?", "what's the median HDR preflight cost?", or "where in the extract phase do compositions spend time?" has to be answered by Datadog log scraping instead. ## How - **`packages/cli/src/telemetry/events.ts`** — extend `trackRenderComplete` props with 17 new optional fields: `tmpPeakBytes`, the six named stage timings, and the ten `videoExtractBreakdown` fields. All sent as flat properties (`extract_cache_hits`, `stage_capture_ms`, etc.) — PostHog insights query flat keys more ergonomically than nested objects. - **`packages/cli/src/commands/render.ts`** — wire `job.perfSummary.videoExtractBreakdown` / `stages` / `tmpPeakBytes` into the `trackRenderMetrics` → `trackRenderComplete` hand-off. - Naming: `extract_phase3_ms` deliberately disambiguates from `stage_video_extract_ms` — the former is just the parallel ffmpeg extract inside Phase 3; the latter is the full stage (resolve + probe + preflight + extract). - All new fields are optional. The Docker-subprocess branch of `render.ts` that doesn't have a local `perfSummary` still compiles and ships events without them. ## Test plan - [x] `bun run --cwd packages/cli test` — 161/161 pass - [x] `bunx tsc -p packages/cli/tsconfig.json --noEmit` — no errors - [x] `bunx oxlint` + `bunx oxfmt` — clean - [ ] Once merged, verify PostHog receives the new properties on a real render event (run `hyperframes render` against a fixture and watch PostHog ingestion — telemetry auto-disables in CI, so this requires a local dev render with `HYPERFRAMES_NO_TELEMETRY` unset). ## Stack Depends on #444 (adds the `videoExtractBreakdown` + `tmpPeakBytes` fields to `RenderPerfSummary`) and transitively on #445 → #446. ## Future work (not in this PR) - The HeyGen internal producer server (`hyperframes-internal/packages/producer/src/server.ts`) logs `perfSummary` to Datadog via `log.info` but has no PostHog integration. Production renders are the bulk of the traffic — separate PR to either ship perfSummary to PostHog from the internal server, or materialize Datadog log-based metrics for per-phase timings.

This was referenced Apr 23, 2026

perf(engine): segment-scope SDR→HDR preflight #445

Merged

perf(engine): content-addressed extraction cache for video frames #446

Merged

perf(producer): phase-level render instrumentation #430

Closed

miguel-heygen reviewed Apr 23, 2026

View reviewed changes

jrusso1020 mentioned this pull request Apr 23, 2026

feat(cli): forward perf breakdown + tmpPeakBytes to PostHog telemetry #454

Merged

4 tasks

jrusso1020 added 2 commits April 23, 2026 22:00

jrusso1020 force-pushed the perf/v2/phase-instrumentation branch from 76948c9 to 5cfabff Compare April 23, 2026 22:02

miguel-heygen approved these changes Apr 23, 2026

View reviewed changes

vanceingalls approved these changes Apr 24, 2026

View reviewed changes

jrusso1020 merged commit 31354d5 into main Apr 24, 2026
34 checks passed

jrusso1020 deleted the perf/v2/phase-instrumentation branch April 24, 2026 03:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(engine): extraction-phase instrumentation#444

perf(engine): extraction-phase instrumentation#444
jrusso1020 merged 2 commits intomainfrom
perf/v2/phase-instrumentation

jrusso1020 commented Apr 23, 2026 •

edited

Loading

Uh oh!

jrusso1020 commented Apr 23, 2026 •

edited

Loading

Uh oh!

miguel-heygen left a comment

Uh oh!

jrusso1020 commented Apr 23, 2026

Uh oh!

miguel-heygen left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jrusso1020 commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

How

Test plan

Uh oh!

jrusso1020 commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

miguel-heygen left a comment

Choose a reason for hiding this comment

Review: perf(engine): extraction-phase instrumentation

P2 — Should fix

P3 — Suggestions

Verdict: Approve with minor fixes

Uh oh!

jrusso1020 commented Apr 23, 2026

Uh oh!

miguel-heygen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jrusso1020 commented Apr 23, 2026 •

edited

Loading

jrusso1020 commented Apr 23, 2026 •

edited

Loading