This guide explains how to deliver processed data from RisingWave into existing Iceberg tables. Use it when your Iceberg tables are managed by external systems and you want RisingWave to write processed results into them.

Prerequisites

  • An upstream source, table, or materialized view in RisingWave to output data from.
  • Existing Iceberg tables that you can deliver to, or the ability to create them via external systems.
  • Appropriate permissions to deliver to the target Iceberg catalog and storage.
  • Access credentials for the underlying object storage (e.g., S3 access key and secret key).

Create an Iceberg sink

To write data to an external Iceberg table, create a SINK. This statement defines how data from an upstream object should be formatted and delivered to the target Iceberg table.
CREATE SINK my_iceberg_sink FROM processed_events
WITH (
    connector = 'iceberg',
    type = 'append-only',
    warehouse.path = 's3://my-data-lake/warehouse',
    database.name = 'analytics',
    table.name = 'processed_user_events',
    catalog.type = 'glue',
    catalog.name = 'my_glue_catalog',
    s3.access.key = 'your-access-key',
    s3.secret.key = 'your-secret-key',
    s3.region = 'us-west-2'
);

Configuration parameters

Parameter | Required | Description
connector | Yes | Must be 'iceberg'.
type | Yes | Sink mode. 'append-only' for new records only; 'upsert' to handle updates and deletes.
database.name | Yes | The name of the target Iceberg database.
table.name | Yes | The name of the target Iceberg table.
primary_key | Yes, if type is upsert | A comma-separated list of columns that form the primary key.
force_append_only | No | If true, converts an upsert stream to append-only. Updates become inserts and deletes are ignored. Default: false.
is_exactly_once | No | If true, enables exactly-once delivery semantics. This provides stronger consistency but may impact performance. Default: false.
commit_checkpoint_interval | No | The number of checkpoints between commits. The approximate time to commit is barrier_interval_ms × checkpoint_frequency × commit_checkpoint_interval. Default: 60.
commit_retry_num | No | The number of times to retry a failed commit. Default: 8.
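
The parameters above can be combined into an upsert sink. The following is a minimal sketch that reuses the catalog and storage settings from the append-only example; the upstream object user_profiles, its user_id column, and the target table name are hypothetical and only illustrate where primary_key fits.

```sql
-- Assumption: `user_profiles` is an upstream table or materialized view in
-- RisingWave with a `user_id` column, and an Iceberg table of the same name
-- already exists in the target catalog. These names are illustrative.
CREATE SINK my_iceberg_upsert_sink FROM user_profiles
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'user_id',  -- required because type is 'upsert'
    warehouse.path = 's3://my-data-lake/warehouse',
    database.name = 'analytics',
    table.name = 'user_profiles',
    catalog.type = 'glue',
    catalog.name = 'my_glue_catalog',
    s3.access.key = 'your-access-key',
    s3.secret.key = 'your-secret-key',
    s3.region = 'us-west-2'
);
```

If the target table should only ever receive inserts even though the upstream object emits updates and deletes, the table above suggests using type = 'append-only' together with force_append_only = 'true' instead.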
For detailed storage and catalog configuration, see the RisingWave documentation on Iceberg storage and catalog configuration.
You can configure commit_checkpoint_interval and commit_retry_num to manage commit frequency and retry behavior. The approximate time to commit is calculated as:
time = barrier_interval_ms × checkpoint_frequency × commit_checkpoint_interval
For details about barrier_interval_ms and checkpoint_frequency, see ALTER DATABASE.
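
As a worked example of the formula, the sketch below lowers the commit interval for a sink. The values barrier_interval_ms = 1000 and checkpoint_frequency = 1 are assumptions made for the arithmetic; check your own database settings before relying on the result. The sink name is hypothetical.

```sql
-- Assumed settings (verify for your deployment):
--   barrier_interval_ms  = 1000
--   checkpoint_frequency = 1
-- With commit_checkpoint_interval = 10, the approximate time to commit is
--   1000 ms × 1 × 10 = 10,000 ms, or about 10 seconds.
CREATE SINK my_iceberg_sink_frequent_commits FROM processed_events
WITH (
    connector = 'iceberg',
    type = 'append-only',
    warehouse.path = 's3://my-data-lake/warehouse',
    database.name = 'analytics',
    table.name = 'processed_user_events',
    catalog.type = 'glue',
    catalog.name = 'my_glue_catalog',
    s3.access.key = 'your-access-key',
    s3.secret.key = 'your-secret-key',
    s3.region = 'us-west-2',
    commit_checkpoint_interval = 10,  -- commit every 10 checkpoints
    commit_retry_num = 8              -- the documented default, shown explicitly
);
```

Under the same assumptions, the default commit_checkpoint_interval of 60 gives roughly 1000 ms × 1 × 60 = 60 seconds between commits. More frequent commits reduce latency but produce more small files and snapshots in the Iceberg table.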

Table maintenance

When you continuously sink data to an Iceberg table, it is important to perform periodic maintenance, including compaction and snapshot expiration, to maintain good query performance and manage storage costs. RisingWave provides both automatic and manual maintenance options. For complete details, see the Iceberg table maintenance guide.

Monitoring

-- Check sink status
SHOW SINKS;

-- View sink details
DESCRIBE SINK my_iceberg_sink;
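
Sink metadata can also be inspected through the system catalog. This is a sketch that assumes the rw_catalog.rw_sinks system table is available in your RisingWave version and exposes a name column; the set of other columns may differ between releases, so the query selects all of them.

```sql
-- Assumption: rw_catalog.rw_sinks exists and has a `name` column.
SELECT *
FROM rw_catalog.rw_sinks
WHERE name = 'my_iceberg_sink';
```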