Add design for ALTER TABLE EXPORT + design review skill #1673
hodgesrm wants to merge 5 commits into Altinity:antalya-26.1
Conversation
Force-pushed 06d5b84 to 227f7b1
arthurpassos left a comment
I see some mistakes made by Claude here regarding plain object storage vs Iceberg exports. I assume it is because it consumed the antalya-26.1 branch docs, which do not mention Iceberg at all. Those docs are in my PR #1618.
Halfway through, still more to come.
| -- Split large part across multiple Parquet files
| ALTER TABLE big EXPORT PART '2025_0_32_3' TO TABLE big_dest
| SETTINGS allow_experimental_export_merge_tree_part = 1,
|     export_merge_tree_part_max_bytes_per_file = 10000000,
`export_merge_tree_part_max_bytes_per_file`
We have a funny situation with this. When I added this setting, ClickHouse did not have the ability to split output Parquet files. Plus, we did not support exporting to Iceberg.
The Iceberg engine now has its own settings. Which one should we respect?
Good point. I vote we go with the Iceberg engine settings for consistency and to simplify EXPORT commands. Questions:

1. `iceberg_insert_max_bytes_in_data_file` should supersede `export_merge_tree_part_max_bytes_per_file`, correct?
2. Export should also obey `iceberg_insert_max_rows_in_data_file`, as it does in your example?
3. What about settings for row groups (e.g., `output_format_parquet_row_group_size` and `output_format_parquet_row_group_size_bytes`)? I assume we observe all of these.

It also seems like you would want to set them at the table level for sanity, since it's a lot to remember if you export more than once.
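For concreteness, here is what observing all of these at once might look like. This is a sketch only: the values are illustrative, and which settings actually take effect is exactly the open question above.

```sql
-- Sketch: illustrative values; precedence among these is the open question.
ALTER TABLE big EXPORT PART '2025_0_32_3' TO TABLE big_dest
SETTINGS allow_experimental_export_merge_tree_part = 1,
         iceberg_insert_max_bytes_in_data_file = 10000000,      -- Iceberg data file byte cap
         iceberg_insert_max_rows_in_data_file = 1000000,        -- Iceberg data file row cap
         output_format_parquet_row_group_size = 100000,         -- rows per row group
         output_format_parquet_row_group_size_bytes = 10000000; -- bytes per row group
```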
TLDR: Your expectations are correct, but that's not quite how it works today.

Full answer: you touched on an important topic. There is a subtle difference in how settings are handled between PART and PARTITION export.

For PART export, the settings in effect at the moment of the query are all preserved and stored in the background export task.

For PARTITION export, it is harder to preserve the settings because we need to serialize them in ZooKeeper. I have done that only for a few very important settings, including `export_merge_tree_part_max_bytes_per_file`.

Some settings are not preserved simply because I did not have the time; I was trying to find a better way to handle this case, which I have not found yet.
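To illustrate the asymmetry with a sketch (the PARTITION syntax and the partition ID are assumptions patterned on the PART example above):

```sql
-- PART export: every setting in effect at query time is preserved and
-- replayed in the background export task.
ALTER TABLE big EXPORT PART '2025_0_32_3' TO TABLE big_dest
SETTINGS export_merge_tree_part_max_bytes_per_file = 10000000;

-- PARTITION export: settings must be serialized into ZooKeeper, so only a
-- small preserved subset (including export_merge_tree_part_max_bytes_per_file)
-- survives into the background task; the rest currently fall back to defaults.
ALTER TABLE big EXPORT PARTITION '2025' TO TABLE big_dest
SETTINGS export_merge_tree_part_max_bytes_per_file = 10000000;
```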
Just pushed a change to #1618 that preserves the Iceberg max-rows and max-bytes settings.
What do you think about renaming `export_merge_tree_max_bytes_per_file` to `output_format_parquet_file_size_bytes`?
Hmm, I don't like this idea very much. It implies the setting would have to be implemented for regular inserts as well, which is extra work and introduces some conflicts with upstream.
Additionally, by putting "parquet" in the setting name you are scoping it to Parquet files. The code that splits the files is agnostic to the format. It would also conflict with the Iceberg max size setting, etc.
OK. `export_merge_tree_max_bytes_per_file` overrides more specific values (in the design).
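In other words, a sketch of the agreed precedence, using the full setting name from the examples above (values are illustrative):

```sql
ALTER TABLE big EXPORT PART '2025_0_32_3' TO TABLE big_dest
SETTINGS export_merge_tree_part_max_bytes_per_file = 10000000, -- generic cap: takes effect
         iceberg_insert_max_bytes_in_data_file = 50000000;     -- more specific: overridden
```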
Force-pushed 227f7b1 to bcb2b8c
arthurpassos left a comment
I'll stop my review here (https://github.com/Altinity/ClickHouse/pull/1673/changes#diff-59e17f10a30e7277ca6b00d367036de94c72a8e1ddcdb063ad9255e3aa707a81R516) as we'll discuss these things in a meeting shortly. It is a long doc lol.
Signed-off-by: Robert Hodges <rhodges@altinity.com>
Also incorporated design feedback. Signed-off-by: Robert Hodges <rhodges@altinity.com>
Signed-off-by: Robert Hodges <rhodges@altinity.com>
Signed-off-by: Robert Hodges <rhodges@altinity.com>
Force-pushed ebbd2e2 to 915df13
Signed-off-by: Robert Hodges <rhodges@altinity.com>
arthurpassos left a comment
LGTM - I did not review the testing section, will leave that for QA
| - EXPORT PARTITION for MergeTree tables. Must work without Keeper installation.
| - Export history. Provide a system table to track the history of part exports.
| - Flexible casting that addresses issues like the following.
|   - Handling potentially lossy casts like INSERT SELECT: int64 -> int32.
By handling, do you mean throw, accept, or a setting that allows the user to decide?
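For concreteness, a minimal illustration of the cast in question, with hypothetical table names:

```sql
-- Int64 -> Int32 narrows the type; values outside the Int32 range are the
-- problem case the design needs to decide on.
CREATE TABLE src (v Int64) ENGINE = MergeTree ORDER BY v;
CREATE TABLE dst (v Int32) ENGINE = MergeTree ORDER BY v;

INSERT INTO src VALUES (5000000000);  -- does not fit in Int32
INSERT INTO dst SELECT v FROM src;    -- throw? wrap around? setting-controlled?
```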
| swap in `DatabaseIceberg` or an `iceberg(...)` catalog-backed table
| function to route through REST / Glue / Unity.
|
| **Open Issue** We need to specify how to commit via a catalog.
What do you mean by open issue? Exporting to Iceberg tables backed by a catalog is already supported.
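For example, something along these lines should already work. This is a sketch only: the engine spelling, endpoint, and arguments are assumptions, not the reviewed design.

```sql
-- Hypothetical catalog-backed Iceberg database; the export then targets one
-- of its tables, committing through the catalog (REST in this example).
CREATE DATABASE ice
ENGINE = DatabaseIceberg('http://rest-catalog:8181/v1')  -- endpoint is made up
SETTINGS catalog_type = 'rest', warehouse = 'warehouse';

ALTER TABLE big EXPORT PART '2025_0_32_3' TO TABLE ice.big_dest;
```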
| The following notes expand on expected behavior of commands.
|
| 1. When writing to object storage using `partition_strategy = 'wildcard'`, either wildcard
Writing here is weird:
"when writing with partition_strategy = 'wildcard', either wildcard and hive can be used"
I think you meant "when writing to plain object storage, either wildcard or hive can be used" - or something of the sort.
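To spell out the two write layouts, a sketch (the S3 engine arguments and paths are illustrative assumptions, not from the design doc):

```sql
-- 'wildcard': the path template itself places each partition via {_partition_id}.
CREATE TABLE dest_wildcard (key UInt32, v String)
ENGINE = S3('https://bucket.s3.amazonaws.com/export/{_partition_id}/data.parquet', 'Parquet')
PARTITION BY key;

-- 'hive': key=value directories are generated under a plain prefix.
CREATE TABLE dest_hive (key UInt32, v String)
ENGINE = S3('https://bucket.s3.amazonaws.com/export/', 'Parquet', partition_strategy = 'hive')
PARTITION BY key;
```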
| - For best results `EXPORT PARTITION` requires a ZooKeeper / `clickhouse-keeper` ensemble with
|   the `multi_read` feature flag. This reduces API calls and ensures transactional consistency
|   when reading multiple fields.
|
| governed by the destination table's Iceberg partition spec instead.
Minor formatting issue.
| - **Plain object storage.** One Parquet data file per part (or per chunk when split by
|   size/rows) plus one commit file per transaction — `<dest>/<partition>/<part>_<checksum>.<N>.parquet`
|   and `<dest>/commit_<...>`. Readers that want atomicity filter by commit. This is the default path,
|   but it can be customized to handle sharding, which is not covered by the default case. See the
|
| export artifacts; the Iceberg sidecars (`*_clickhouse_export_part_sidecar.avro`)
| are safe to delete once the commit has landed.
|
| 9. **Settings in `PART` vs `PARTITION` export.** There is a subtle difference in
|
| Both commands accept two destination families:
|
| 1. **Plain object storage** (`S3`, `AzureBlobStorage`, equivalents).
Is AzureBlobStorage something we want to explicitly test? Most of the regression tests are S3-centric.
| This design only covers user-visible behavior. It does not cover internal implementation
| details. The implementation section is omitted.
|
| ## 4. Test plan
Is this section targeted at tests written by devs, a guide for regression test development, or both?
I've been wondering for a while, actually: what is the difference between the regression repo and the tests written by devs in the CH repo under tests/? I'm aware that regression has deeper coverage and some overlap with the dev tests, but beyond that, why do we have both, and what is the purpose of each?
> why do we have both, what is the purpose of each

I summon @vzakaznikov in attack position
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added a skill for Antalya design creation and review. Also added a first-cut design spec for ALTER TABLE EXPORT based on current documentation and tests.