feat: Redis cache + pre-warm for dashboard summary endpoints (Phase C of #20)#28

Open
Boanerges1996 wants to merge 5 commits into peermetrics:master from Boanerges1996:feat/summary-redis-cache-phase-c
Conversation

@Boanerges1996
Contributor

Summary

Stacked on top of #26 and #27 — PR diff will show all Phase 2-5 + Phase C commits until those upstream PRs merge, then rebase down to just the cache commit (`6b2a3a6`).

Phases 0-5 moved dashboard aggregation to SQL. Phase C caches the results: one Redis entry per (endpoint + filter params), 60s TTL, pre-warmed so first visitors don't pay cold-query cost.

  • `app/summary_cache.py` — `cached_json(endpoint, request, compute)` wraps any summary view. Reads Redis; on miss, runs the compute closure and writes back with 60s TTL. Uses existing Django-redis cache backend with `IGNORE_EXCEPTIONS=True`, so a Redis outage degrades to the pre-cache behavior, never breaks.
  • All 8 summary views refactored to pass their compute bodies through the helper. No change to JSON shape or query logic.
  • `manage.py prewarm_summaries` — iterates apps with recent traffic and warms every summary view for the 30-day window the dashboard requests by default. Intended as an ECS scheduled task at ~30s cadence.
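
As a rough sketch of the cache-aside shape in `app/summary_cache.py` (an in-memory dict stands in for the Redis backend here, and the key scheme is illustrative, not the repo's actual format):

```python
import hashlib
import json
import time

# In-memory stand-in for the Redis backend; keys map to (expires_at, payload).
_store = {}

TTL_SECONDS = 60  # matches the 60s TTL described above


def _cache_key(endpoint, params):
    """Hash (endpoint + sorted filter params) into a short, stable key."""
    raw = endpoint + "?" + json.dumps(params, sort_keys=True)
    return "summary:" + hashlib.sha1(raw.encode()).hexdigest()[:16]


def cached_json(endpoint, params, compute, ttl=TTL_SECONDS):
    """Cache-aside: return the cached payload, or compute and store it.

    Cache errors are swallowed on both the read and write path, so a
    Redis outage degrades to the uncached behavior instead of failing.
    """
    key = _cache_key(endpoint, params)
    try:
        hit = _store.get(key)
        if hit is not None and hit[0] > time.monotonic():
            return hit[1]
    except Exception:
        pass  # degrade to the uncached path on cache read errors
    payload = compute()
    try:
        _store[key] = (time.monotonic() + ttl, payload)
    except Exception:
        pass  # a failed write only costs the next caller a recompute
    return payload
```

In the real helper, Django's `cache.get`/`cache.set` with django-redis and `IGNORE_EXCEPTIONS=True` would play the role of the try/except blocks here.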

Local benchmark (7-day Production clone, ~18k conferences / 38k sessions / 38k connections)

| endpoint | cold | warm | speedup |
| --- | --- | --- | --- |
| conferences/summary | 391ms | 12ms | 33× |
| sessions/summary | 748ms | 11ms | 68× |
| connections/setup-time-summary | 373ms | 11ms | 34× |
| conferences/participant-count-summary | 216ms | 7ms | 31× |
| issues/gum-summary | 107ms | 6ms | 18× |
| connections/summary | 57ms | 6ms | 9.5× |
| issues/summary | 45ms | 86ms | noise (both <100ms) |
| conferences/duration-summary | 19ms | 8ms | 2.3× |

Total warm dashboard cost ≈ 150ms across all 8 endpoints (vs ~2s cold).

Test plan

  • Verify `/v1/conferences/summary` returns identical JSON before and after enabling cache
  • Flush Redis, hit all 8 endpoints, confirm 8 keys appear at `:1:summary:*`
  • Hit the same endpoints again — confirm p99 < 50ms
  • Set `SUMMARY_CACHE_TTL=5`, wait 6s, confirm re-query triggers a new compute
  • Kill Redis, confirm endpoints still return correct data (just slower)
  • Run `manage.py prewarm_summaries` — confirm 8 keys/app written, slow-query log line for any >500ms compute

Follow-ups (not in this PR)

  • Hook `prewarm_summaries` into the ECS scheduled task config (infra repo).
  • Consider surfacing `X-Cache: HIT|MISS` response header for ops visibility.
  • Pre-existing: local `docker-compose` sets `REDIS_HOST=redis://127.0.0.1:6379` which doesn't resolve inside the api container. Added a local override file; worth a one-liner fix in compose.
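
If the `X-Cache` follow-up lands, the helper could report hit/miss alongside the payload and let each view set the header; a minimal sketch (plain dict standing in for Redis, function name hypothetical):

```python
def cached_json_with_header(key, store, compute):
    """Cache lookup that also reports hit/miss for an X-Cache header (sketch)."""
    cached = store.get(key)
    if cached is not None:
        return cached, {"X-Cache": "HIT"}
    payload = compute()
    store[key] = payload
    return payload, {"X-Cache": "MISS"}
```

A Django view would copy the returned header dict onto the `JsonResponse` before returning it.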

🤖 Generated with Claude Code

Boanerges1996 and others added 3 commits April 20, 2026 21:47
Five new endpoints for the remaining dashboard charts that fetch raw data:

- GET /v1/conferences/duration-summary
    Returns conference counts bucketed by duration range (< 1m, 1-3m, etc.)
- GET /v1/conferences/participant-count-summary
    Returns distribution of conferences by participant count
- GET /v1/issues/summary
    Returns issue counts grouped by code with titles
- GET /v1/issues/gum-summary
    Returns getusermedia_error issue counts grouped by error name

Also adds three new filter params to /v1/conferences for click-to-detail
modals on these charts:
- duration_gte, duration_lt (for duration chart)
- issue_code (for most-common-issues chart)

All endpoints accept appId, created_at_gte, created_at_lte and handle
both Python native ISO format and JavaScript's toISOString Z suffix.

Phases 2 and 3 of peermetrics#20 — eliminates the need for the dashboard to
download all conferences (~38MB) and all issues (~73MB).
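
The dual ISO-format handling mentioned above could look like this (a sketch only; the actual view code may differ, and note that `datetime.fromisoformat` accepts a trailing `Z` natively only from Python 3.11 on):

```python
from datetime import datetime, timezone


def parse_client_timestamp(value):
    """Accept both Python-native ISO strings and JS toISOString() output.

    datetime.fromisoformat() rejects the trailing 'Z' before Python 3.11,
    so normalize it to an explicit UTC offset first.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assume UTC for naive inputs
    return dt
```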
…ermetrics#20)

Adds three new aggregation endpoints that let the dashboard stop
downloading full /connections and /sessions payloads to build charts
client-side:

- GET /v1/connections/summary — relay vs direct connection counts
  (replaces the Relayed-connections pie chart's client-side reduce)
- GET /v1/connections/setup-time-summary — connection setup-time
  buckets with per-bucket conference_ids for click-to-detail
- GET /v1/sessions/summary — browsers, OS, country, and city/geo
  aggregates (powers Browsers, OS, and Map charts in one roundtrip)

Also accepts `conference_ids=a,b,c` on /conferences so the setup-time
chart can page through matched conferences on click.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…se C of peermetrics#20)

With Phases 0-5 merged, every dashboard chart reads from a server-side
aggregation endpoint. The SQL is fast with indexes, but the same ~8
queries run on every page load, and the heavy ones (sessions.summary,
connections.setup_time_summary) still cost 400-800ms on a live tenant.

Adds a thin caching layer in front of each summary view:

- `app/summary_cache.py` — `cached_json(endpoint, request, compute)`
  hashes (endpoint + filter params) into a short key, reads Redis,
  falls through to `compute()` on miss, and writes back with a 60s TTL.
  Redis failures are tolerated (settings already has IGNORE_EXCEPTIONS).
- Each of the eight summary views moves its existing compute body into
  a local `compute()` closure and returns through the helper. No change
  to the JSON shape, query logic, or error handling.
- `manage.py prewarm_summaries` — scheduled command that iterates apps
  with recent traffic (default: any conference in the last 2 days) and
  runs every summary view with the 30d-window filters the dashboard
  sends by default. Intended to run every ~30s as an ECS scheduled
  task so first visitors never see a cold miss.

Measured locally against a 7-day Production clone (~18k conferences /
38k sessions / 38k connections):

  endpoint                               cold      warm
  conferences/summary                    391ms  →   12ms   (33x)
  sessions/summary                       748ms  →   11ms   (68x)
  connections/setup_time_summary         373ms  →   11ms   (34x)
  conferences/participant_count_summary  216ms  →    7ms   (31x)
  issues/gum_summary                     107ms  →    6ms   (18x)
  connections/summary                     57ms  →    6ms   (9.5x)
  issues/summary                          45ms  →   86ms   (noise; both <100ms)
  conferences/duration_summary            19ms  →    8ms   (2.3x)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@agonza1
Contributor

agonza1 commented Apr 24, 2026

Some additional feedback:

P1 — real bug: GET /v1/conferences?issue_code=... can return the same conference multiple times when several issues share that code, which breaks pagination and count for dashboard drilldowns (each page should be unique conferences, aligned with aggregated chart semantics).

P2 — policy / correctness: issue_code should not match soft-deleted issues. Per existing BaseModel / API patterns, issue_code should only consider active issues → add issues__is_active=True whenever issue_code is applied.

P2/P3 — hardening: GET /v1/issues/gum-summary walks Issue.data with .get(). If data is not a dict (null is fine; bad legacy JSON is not), the view can 500. Skip non-dict rows and keep aggregating the rest.

Suggested fix direction:

  • ConferencesView.filter (app/views/conference_view.py):
    When issue_code is present: filter with issues__code (already mapped) and issues__is_active=True, then .distinct() on the conference queryset (scoped to the issue_code path so other filters stay unchanged).

  • GetUserMediaSummaryView (app/views/issue_summary_view.py):
    In the loop over Issue.data, continue if data is missing or not isinstance(data, dict).
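
The queryset side is a one-line `.distinct()` change on the `issue_code` path; the `Issue.data` hardening can be sketched framework-free (the `error` key is an assumption for illustration, not necessarily the repo's actual schema):

```python
from collections import Counter


def gum_error_counts(issue_data_rows):
    """Aggregate getUserMedia error names, skipping rows whose `data`
    is missing or not a dict, so bad legacy JSON cannot 500 the view."""
    counts = Counter()
    for data in issue_data_rows:
        if not isinstance(data, dict):
            continue  # null or malformed legacy JSON: skip, keep aggregating
        name = data.get("error")
        if name:
            counts[name] += 1
    return dict(counts)
```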

Follow-ups: regression tests for the three bullets above; one-line README under private /conferences for issue_code + “one row per conference.” No migrations required (behavior-only).

@agonza1
Contributor

agonza1 commented Apr 24, 2026

Filtering conferences by issue code joins issues, so one conference can show up many times (e.g. a camera issue occurring 5 times would be counted 5 times and repeated in the dashboard), breaking pagination and counts. Deduplicate and only match active issues, as done elsewhere.

The GUM chart reads Issue.data as a dict; if a row isn't one, the handler can crash. Skipping those rows is better than failing the whole request.

agonza1 added 2 commits April 24, 2026 19:23
- Unit-test cache key rules, hit/miss, TTL override, and soft-fail on get/set errors.
- Smoke-test prewarm_summaries for zero apps and one recent app (8 views).

Made-with: Cursor
@agonza1 force-pushed the feat/summary-redis-cache-phase-c branch from a16d91f to e7e51f7 on April 24, 2026 23:41
@agonza1
Contributor

agonza1 commented Apr 24, 2026

Ready for one last run, @Boanerges1996. If you confirm it still works for you, we can merge the changes here along with the previous PRs we used as a base.
