fix(core): normalize boolean range predicates in Gremlin filters#2991
fix(core): normalize boolean range predicates in Gremlin filters#2991contrueCT wants to merge 9 commits intoapache:masterfrom
Conversation
imbajin
left a comment
There was a problem hiding this comment.
整体设计合理,规范化路径(TraversalUtil 层提前将 boolean 范围谓词转换为 eq/in/empty)是正确做法,测试覆盖也较为全面。有两处中等优先级问题需要决策:compareBoolean 的 null 处理语义,以及 neq 未被规范化。
| } | ||
|
|
||
| private static int compareBoolean(Object first, Boolean second) { | ||
| if (first == null) { |
There was a problem hiding this comment.
compareDate 遇到非 Date 的 first 时直接 throw,但 compareBoolean 将 null 静默设为 false。这会掩盖两类问题:
- 属性不存在(null = 未设置)与属性值为 false 在语义上是不同的
- 如果
first是 null,后面的first.getClass().getSimpleName()本会 NPE,这里只是绕过了 NPE,没有真正解决问题
建议对齐 compareDate 的风格,同时修复 NPE 隐患:
private static int compareBoolean(Object first, Boolean second) {
if (first instanceof Boolean) {
return Boolean.compare((Boolean) first, second);
}
throw new IllegalArgumentException(String.format(
"Can't compare between %s(%s) and %s(%s)",
first, first == null ? "null" : first.getClass().getSimpleName(),
second, second.getClass().getSimpleName()));
}如果业务上确实需要 null→false 的语义(某些 backend 存储缺省值为 null),请在注释中说明原因。
| } | ||
|
|
||
| private static int compareBoolean(Object first, Boolean second) { | ||
| if (first == null) { |
There was a problem hiding this comment.
hugegraph-struct 的 compareBoolean 有相同的 null→false 静默处理问题。两份拷贝需保持一致,建议同步修改。
| Boolean value) { | ||
| switch (compare) { | ||
| case eq: | ||
| return Condition.eq(key, value); |
There was a problem hiding this comment.
neq 未被规范化,可能无法利用二级索引
neq(true) 语义等价于 eq(false),neq(false) 等价于 eq(true)。当前直接透传 Condition.neq(key, value) 到 backend,而 neq 通常走全扫描 + 过滤,无法命中 secondary index(例如 strikeByArrested)。
该 PR 的目标之一是让 boolean 谓词走正确的索引路径,neq 也应一并处理:
case neq:
return Condition.eq(key, !value);这样 neq(true) → eq(false),可以命中 secondary index。
| lteTrueIds.add(edge.value("id")); | ||
| } | ||
| Assert.assertEquals(ImmutableSet.of(1, 2), lteTrueIds); | ||
|
|
There was a problem hiding this comment.
🧹 缺少 neq 谓词的端到端测试,且仅覆盖了 Edge
P.neq(true)/P.neq(false)没有集成测试,而上面的 neq 实现路径与其他谓词不同(如果接受上述 neq 规范化建议,需要补充验证)- 所有新增集成测试均在
EdgeCoreTest,建议同步在VertexCoreTest中添加对应的 boolean range 覆盖,确认TraversalUtil.convCompare2UserpropRelation对HugeType.VERTEX也正常工作
|
@imbajin 这次提交去掉了 compareBoolean 里 null -> false 的静默转换,现在 boolean 比较遇到 null 会显式失败;hugegraph-struct 中对应实现也已同步修改,保持两边语义一致。另外,TraversalUtil 中已将 boolean neq(value) 规范化为 eq(!value),让它可以和 boolean equality 一样走 secondary index 路径。测试也补充了 null 比较异常、Edge 上的 P.neq(true/false),以及 Vertex 上对应的 boolean neq 路径覆盖。 |
Purpose of the PR
This PR fixes HugeGraph's inconsistent handling of boolean range predicates and unifies the internal ordering semantics as false < true. Previously, filter-step queries such as where(__.has("xxx", P.lt(true))) were pushed down through HugeGraph's condition/query pipeline and incorrectly treated as numeric-style range conditions, while equivalent match() queries could still return results normally. As a result, semantically equivalent traversals behaved differently.
Main Changes
The fix has two parts. First, boolean support is added to backend condition comparison and range-flattening logic so gt/gte/lt/lte can be evaluated consistently. Second, runtime Gremlin boolean range predicates are normalized into equivalent eq/in/empty conditions before query planning, so they are no longer misclassified as ordinary numeric range queries that require a range index. This preserves the intended boolean ordering semantics while making has(), where(), and match() behave consistently for boolean range predicates.
Verifying these changes
Regression coverage was also added for:
Does this PR potentially affect the following parts?
Documentation Status
Doc - TODODoc - DoneDoc - No Need