When RisingWave streams data into Iceberg, the process can generate many small data files and delete files. These files accumulate over time, increasing storage overhead and slowing down queries.

Periodic compaction

Iceberg compaction for sinks and engine tables periodically merges small data files and delete files into larger, optimized data files. These operations run in the background at user-defined intervals, ensuring efficient storage management with minimal disruption. To configure Iceberg compaction when creating a sink or table, specify the following parameters in the WITH clause:

| Parameter | Description |
| --- | --- |
| enable_compaction | Whether to enable Iceberg compaction (true/false). |
| compaction_interval_sec | Interval (in seconds) between two compaction runs. Defaults to 3600 seconds. |

For the full syntax and all configuration options, see Compaction for Iceberg sink and Compaction for Iceberg engine table.
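
As a rough illustration of where these two parameters go, the sketch below creates an Iceberg sink with compaction enabled and a 30-minute interval. The sink name, source materialized view, bucket path, catalog settings, and table names are all placeholders chosen for this example, and the connection parameters a real deployment needs (credentials, region, catalog type) will likely differ. Only enable_compaction and compaction_interval_sec come from the table above.

```sql
-- Minimal sketch: an Iceberg sink with periodic compaction turned on.
-- orders_iceberg_sink, orders_mv, and every path/catalog value are hypothetical.
CREATE SINK orders_iceberg_sink FROM orders_mv
WITH (
    connector = 'iceberg',
    type = 'append-only',
    warehouse.path = 's3://example-bucket/warehouse',  -- placeholder location
    catalog.type = 'storage',                          -- assumed catalog setup
    database.name = 'demo_db',                         -- placeholder
    table.name = 'orders',                             -- placeholder
    -- Compaction settings described in this section:
    enable_compaction = true,
    compaction_interval_sec = 1800  -- run every 30 minutes instead of the 3600 s default
);
```

A shorter interval keeps file counts lower at the cost of more frequent background work, so the right value depends on how quickly the sink produces small files.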

Manual compaction

In addition to periodic background compaction, you can trigger compaction manually. When you run the VACUUM FULL command, RisingWave compacts small files and expires old snapshots in the same operation. This gives you finer control over storage cleanup and is useful when you want to reduce storage overhead immediately or reset snapshot history.
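
A manual run might look like the following sketch. The table name is hypothetical, and the example assumes that VACUUM FULL takes the name of the Iceberg table or sink as its argument. Check the command reference for the exact syntax supported by your version.

```sql
-- Minimal sketch: compact small files and expire old snapshots on demand.
-- my_iceberg_table is a hypothetical Iceberg engine table.
VACUUM FULL my_iceberg_table;
```

Because this also expires old snapshots, run it only when you no longer need to query earlier snapshot history.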