[FLINK-39531][table-planner] Fix duplicated Python UDF extraction in calc split rule#28037

Open
auroflow wants to merge 2 commits into apache:master from auroflow:auroflow/fix-udf

@auroflow
Contributor

What is the purpose of the change

Currently, ScalarFunctionSplitter keeps track of the RexNodes it extracts so that identical remote function calls can be deduplicated during calc splitting. However, this bookkeeping is keyed on the RexInputRef rather than on the original RexNode. As a result, when the same Python or async UDF call appears as an input to multiple projections of one SELECT, every occurrence is extracted separately and the UDF is invoked once per occurrence.

This PR fixes the problem by keying the bookkeeping on the original RexNode instead of the RexInputRef.
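
The difference between the two keying strategies can be illustrated with a small standalone sketch. This is a simplified model, not the actual ScalarFunctionSplitter code: calls are represented as plain strings such as `pyFunc($0)`, and the class and method names (`ExtractionDedupSketch`, `extractKeyedOnRef`, `extractKeyedOnCall`) are hypothetical. It assumes that equal calls compare equal as map keys, and it shows one plausible way a reference-keyed map fails to deduplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of extraction bookkeeping during calc splitting.
class ExtractionDedupSketch {

    // Ref-keyed variant: each occurrence is assigned a fresh reference index,
    // so the lookup never hits and identical calls are extracted repeatedly.
    static List<String> extractKeyedOnRef(List<String> calls, int inputFieldCount) {
        Map<Integer, String> seen = new HashMap<>(); // ref index -> call
        List<String> extracted = new ArrayList<>();
        for (String call : calls) {
            int ref = inputFieldCount + extracted.size(); // new index per occurrence
            if (!seen.containsKey(ref)) {
                seen.put(ref, call);
                extracted.add(call);
            }
        }
        return extracted;
    }

    // Call-keyed variant: the original call is the key, so a repeated call
    // reuses the reference created for its first occurrence.
    static List<String> extractKeyedOnCall(List<String> calls, int inputFieldCount) {
        Map<String, Integer> seen = new HashMap<>(); // call -> ref index
        List<String> extracted = new ArrayList<>();
        for (String call : calls) {
            if (!seen.containsKey(call)) {
                seen.put(call, inputFieldCount + extracted.size());
                extracted.add(call);
            }
        }
        return extracted;
    }

    public static void main(String[] args) {
        List<String> calls = Arrays.asList("pyFunc($0)", "pyFunc($0)", "pyFunc($1)");
        System.out.println(extractKeyedOnRef(calls, 2));  // [pyFunc($0), pyFunc($0), pyFunc($1)]
        System.out.println(extractKeyedOnCall(calls, 2)); // [pyFunc($0), pyFunc($1)]
    }
}
```

In this model, the number of extracted entries stands in for the number of UDF invocations per row: the ref-keyed variant evaluates `pyFunc($0)` twice, while the call-keyed variant evaluates it once.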

Brief change log

  • Key the bookkeeping in ScalarFunctionSplitter on the original RexNode instead of the RexInputRef.

Verifying this change

This change adds tests and can be verified as follows:

  • Added a test on duplicated Python async functions

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: Claude Opus 4.7

@flinkbot
Collaborator

flinkbot commented Apr 26, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@auroflow
Contributor Author

@flinkbot run azure

auroflow force-pushed the auroflow/fix-udf branch 2 times, most recently from 036f1a2 to 1a3daf5 (April 28, 2026 05:56)