use has()/hasAny() on KV token columns for SQL/facet map filters#2295
use has()/hasAny() on KV token columns for SQL/facet map filters#2295vinzee wants to merge 1 commit into
Conversation
🦋 Changeset detectedLatest commit: 26fc055 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
@vinzee is attempting to deploy a commit to the HyperDX Team on Vercel. A member of the Team first needs to authorize it. |
PR ReviewSolid PR with comprehensive test coverage for the regex-based rewrite. SQL injection risks are mitigated by
No blocking issues — the word-boundary fix is the only one I'd push for before merge. |
7555402 to
501593a
Compare
Deep Review🔴 P0/P1 -- must fix
🟡 P2 -- recommended
🔵 P3 nitpicks (8)
Reviewers (6): correctness, performance, maintainability, testing, security, kieran-typescript. Testing gaps:
|
SQL and facet filters such as ResourceAttributes['key'] IN ('value')
were passed through unchanged, so ClickHouse used map subscript/IN and
could not use text-index direct read on materialized key=value tokens.
Changes:
- Add rewriteSqlWithKvItems() for Map['key'] IN (=) patterns
- Add Metadata.getKvItemsLookup() (shared with Lucene kvItems path)
- Export KV_ITEMS_STRATEGIES and normalizeChExpression from queryParser
When a materialized KV column (e.g. ResourceAttributeTokens) has a
text(tokenizer=array) skip index, SQL filters are rewritten to
has()/hasAny() with concat(key, separator, value), matching the
existing Lucene quoted-term optimization.
501593a to
26fc055
Compare
Summary
Before this change, only quoted Lucene terms could use KV-items direct read, and only when HyperDX successfully detected the materialized token column at startup. SQL and facet filters were always emitted as Map['key'] IN (...), which cannot use the text index.
This PR adds the same has()/hasAny() rewrite for SQL filters. Lucene behavior is unchanged and still depends on upstream kvItems detection (hyperdx#2248) plus ResourceAttributeTokens and its text(tokenizer=array) index on the cluster.
How to test on Vercel preview
N/A — non-UI change (@hyperdx/common-utils only).
References