Skip to content

fix(core): normalize boolean range predicates in Gremlin filters#2991

Open
contrueCT wants to merge 9 commits intoapache:masterfrom
contrueCT:task/fix-2934
Open

fix(core): normalize boolean range predicates in Gremlin filters#2991
contrueCT wants to merge 9 commits intoapache:masterfrom
contrueCT:task/fix-2934

Conversation

@contrueCT
Copy link
Copy Markdown
Contributor

Purpose of the PR

This PR fixes HugeGraph's inconsistent handling of boolean range predicates and unifies the internal ordering semantics as false < true. Previously, filter-step queries such as where(__.has("xxx", P.lt(true))) were pushed down through HugeGraph's condition/query pipeline and incorrectly treated as numeric-style range conditions, while equivalent match() queries could still return results normally. As a result, semantically equivalent traversals behaved differently.

Main Changes

The fix has two parts. First, boolean support is added to backend condition comparison and range-flattening logic so gt/gte/lt/lte can be evaluated consistently. Second, runtime Gremlin boolean range predicates are normalized into equivalent eq/in/empty conditions before query planning, so they are no longer misclassified as ordinary numeric range queries that require a range index. This preserves the intended boolean ordering semantics while making has(), where(), and match() behave consistently for boolean range predicates.

Verifying these changes

Regression coverage was also added for:

  • low-level boolean lt/lte/gt/gte comparison semantics
  • boolean range merge and conflicting-range flattening
  • end-to-end consistency across has(), where(), and match()
  • empty-result edge cases such as lt(false) and gt(true)
  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    • mvn -Drat.skip=true -P core-test,memory -pl hugegraph-server/hugegraph-test,hugegraph-struct -am -DfailIfNoTests=false '-Dtest=ConditionTest,ConditionQueryFlattenTest,EdgeCoreTest#testQueryEdgeByBooleanRangePredicate' clean test

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working gremlin TinkerPop gremlin tests Add or improve test cases labels Apr 12, 2026
Comment thread hugegraph-struct/src/main/java/org/apache/hugegraph/query/Condition.java Outdated
Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

整体设计合理,规范化路径(TraversalUtil 层提前将 boolean 范围谓词转换为 eq/in/empty)是正确做法,测试覆盖也较为全面。有两处中等优先级问题需要决策:compareBoolean 的 null 处理语义,以及 neq 未被规范化。

}

private static int compareBoolean(Object first, Boolean second) {
if (first == null) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ null 静默转为 false,与 compareDate 行为不一致

compareDate 遇到非 Date 的 first 时直接 throw,但 compareBooleannull 静默设为 false。这会掩盖两类问题:

  1. 属性不存在(null = 未设置)与属性值为 false 在语义上是不同的
  2. 如果 first 是 null,后面的 first.getClass().getSimpleName() 本会 NPE,这里只是绕过了 NPE,没有真正解决问题

建议对齐 compareDate 的风格,同时修复 NPE 隐患:

private static int compareBoolean(Object first, Boolean second) {
    if (first instanceof Boolean) {
        return Boolean.compare((Boolean) first, second);
    }
    throw new IllegalArgumentException(String.format(
            "Can't compare between %s(%s) and %s(%s)",
            first, first == null ? "null" : first.getClass().getSimpleName(),
            second, second.getClass().getSimpleName()));
}

如果业务上确实需要 null→false 的语义(某些 backend 存储缺省值为 null),请在注释中说明原因。

}

private static int compareBoolean(Object first, Boolean second) {
if (first == null) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ 同上:hugegraph-structcompareBoolean 有相同的 null→false 静默处理问题。两份拷贝需保持一致,建议同步修改。

Boolean value) {
switch (compare) {
case eq:
return Condition.eq(key, value);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ neq 未被规范化,可能无法利用二级索引

neq(true) 语义等价于 eq(false)neq(false) 等价于 eq(true)。当前直接透传 Condition.neq(key, value) 到 backend,而 neq 通常走全扫描 + 过滤,无法命中 secondary index(例如 strikeByArrested)。

该 PR 的目标之一是让 boolean 谓词走正确的索引路径,neq 也应一并处理:

case neq:
    return Condition.eq(key, !value);

这样 neq(true)eq(false),可以命中 secondary index。

lteTrueIds.add(edge.value("id"));
}
Assert.assertEquals(ImmutableSet.of(1, 2), lteTrueIds);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 缺少 neq 谓词的端到端测试,且仅覆盖了 Edge

  • P.neq(true) / P.neq(false) 没有集成测试,而上面的 neq 实现路径与其他谓词不同(如果接受上述 neq 规范化建议,需要补充验证)
  • 所有新增集成测试均在 EdgeCoreTest,建议同步在 VertexCoreTest 中添加对应的 boolean range 覆盖,确认 TraversalUtil.convCompare2UserpropRelationHugeType.VERTEX 也正常工作

@contrueCT
Copy link
Copy Markdown
Contributor Author

@imbajin 这次提交去掉了 compareBoolean 里 null -> false 的静默转换,现在 boolean 比较遇到 null 会显式失败;hugegraph-struct 中对应实现也已同步修改,保持两边语义一致。另外,TraversalUtil 中已将 boolean neq(value) 规范化为 eq(!value),让它可以和 boolean equality 一样走 secondary index 路径。测试也补充了 null 比较异常、Edge 上的 P.neq(true/false),以及 Vertex 上对应的 boolean neq 路径覆盖。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working gremlin TinkerPop gremlin size:L This PR changes 100-499 lines, ignoring generated files. tests Add or improve test cases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Unexpected NumberFormatException for filter-step

2 participants