[c++] Add nullability support for data type in c++#525
[c++] Add nullability support for data type in c++#525charlesdong1991 wants to merge 3 commits intoapache:mainfrom
Conversation
leekeiabstraction
left a comment
There was a problem hiding this comment.
TY for the PR, left a question.
| nullable: true, | ||
| element_nullable: true, |
There was a problem hiding this comment.
Can you elaborate why and the impact of hardcoding nullability here? Also other places where element nullability is hardcoded to true.
There was a problem hiding this comment.
hardcoding here is on purpose actually, since the spec struct has 2 nullability concept, where nullable is the nullability of this constructed type where element_nullable is for the leaf element of an array.
and function like build_array_type_from_leaf recursively calls to construct leaf which is scaler, so element_nullable is unused.
ArrayWriter it only needs the type for encoding i believe, and putting default as true is aligned with Fluss default, which means all types are nullable unless explicitly marked.
I think probably best to add some comments in case of questions in future, WDYT? @leekeiabstraction
|
thanks for the question. i replied, let me know if further question @leekeiabstraction if find time, would be great to give a look @fresh-borzoni @leekeiabstraction 🙏 |
fresh-borzoni
left a comment
There was a problem hiding this comment.
@charlesdong1991 Ty for the PR, overall LGTM
Left some comments, PTAL
| pub const DATA_TYPE_BINARY: i32 = 16; | ||
| pub const DATA_TYPE_ARRAY: i32 = 17; | ||
|
|
||
| struct FfiDataTypeSpec { |
There was a problem hiding this comment.
Not sure this struct is the right shape. One struct lumps two jobs (build scalar / build array). Most call sites zero out the majority of the fields. Shall we split into separate Scalar/Array variants?
| element_data_type: i32, | ||
| element_precision: i32, | ||
| element_scale: i32, | ||
| element_nullable: bool, |
There was a problem hiding this comment.
If we split FfiDataTypeSpec to Scalar/Array, this field can go away too = just make array_nullability one entry longer (last = leaf)
| bool nullable = true; | ||
| if (static_cast<size_t>(i) < flat.array_nullability.size()) { | ||
| nullable = flat.array_nullability[static_cast<size_t>(i)] != 0; | ||
| } else if (i == 0) { |
There was a problem hiding this comment.
but do we have the case for this?
| array_nesting: u32, | ||
| array_nullability: Vec<u8>, | ||
| nullable: bool, | ||
| element_nullable: bool, |
There was a problem hiding this comment.
is it only for backward compatibility, mb we can drop it?
Purpose
Preserve nullability when constructing Fluss schemas through the C++ binding. Previously, the C++
DataTypeclass had no nullable flag, so all types defaulted to nullable regardless of what the user specified.Linked issue: close #510
Tests
All tests have passed locally
Documentation
Docs are updated accordingly.