[AURON #2217] Support Iceberg _spec_id metadata in native scan #2218

Open
weimingdiit wants to merge 1 commit into apache:master from weimingdiit:feat/support-iceberg-spec-id-metadata

@weimingdiit
Contributor

Which issue does this PR close?

Closes #2217

Rationale for this change

The native Iceberg scan currently supports projecting the _file metadata column but falls back to Spark when _spec_id is requested. Because _spec_id is a file-level Iceberg metadata column, the native scan can materialize it as a per-file constant value, just as it does for _file.

What changes are included in this PR?

This PR adds native scan support for the Iceberg _spec_id metadata column.

  • Allows _spec_id in Iceberg native scan metadata column validation.
  • Materializes _spec_id from FileScanTask.file().specId() as a per-file partition value.
  • Adds integration tests for _spec_id projection and mixed data/metadata projection with _file and _spec_id.
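
The first two bullets can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `SpecIdMetadataSketch`, `DataFileInfo`, `canUseNativeScan`, and `constantsFor` are hypothetical names, and `DataFileInfo` is only a stand-in for the values that would come from `FileScanTask.file()`. The allowlist holds just the two columns named in this description; the real validation may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SpecIdMetadataSketch {

    // Hypothetical stand-in for the per-file values read from
    // FileScanTask.file(): the data file path and its partition spec id.
    static final class DataFileInfo {
        final String path;
        final int specId;

        DataFileInfo(String path, int specId) {
            this.path = path;
            this.specId = specId;
        }
    }

    // Assumption: metadata columns accepted by the native scan after this change.
    static final Set<String> SUPPORTED_METADATA_COLUMNS =
            new HashSet<>(Arrays.asList("_file", "_spec_id"));

    // Validation step: the native path is used only when every requested
    // metadata column is supported; otherwise the query falls back.
    static boolean canUseNativeScan(List<String> metadataColumns) {
        return SUPPORTED_METADATA_COLUMNS.containsAll(metadataColumns);
    }

    // Materialization step: each metadata column becomes one constant per
    // file, so it can be attached to the file like a partition value.
    static Map<String, Object> constantsFor(DataFileInfo file, List<String> metadataColumns) {
        Map<String, Object> constants = new LinkedHashMap<>();
        for (String column : metadataColumns) {
            if ("_file".equals(column)) {
                constants.put(column, file.path);
            } else if ("_spec_id".equals(column)) {
                constants.put(column, file.specId);
            } else {
                throw new IllegalArgumentException("Unsupported metadata column: " + column);
            }
        }
        return constants;
    }

    public static void main(String[] args) {
        DataFileInfo file = new DataFileInfo("warehouse/db/tbl/data/part-0.parquet", 1);
        List<String> requested = Arrays.asList("_file", "_spec_id");
        System.out.println(canUseNativeScan(requested));
        System.out.println(constantsFor(file, requested));
    }
}
```

Because the spec id is fixed for a given data file, every row read from that file carries the same `_spec_id`, which is why a per-file constant is sufficient and no per-row computation is needed.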

Are there any user-facing changes?

Yes. Queries that project Iceberg _spec_id can now use the native Iceberg scan path instead of falling back to Spark.

How was this patch tested?

CI.

Signed-off-by: weimingdiit <weimingdiit@gmail.com>
@weimingdiit weimingdiit marked this pull request as ready for review April 28, 2026 02:34

Development

Successfully merging this pull request may close these issues.

Support Iceberg _spec_id metadata column in native scan
