Prerequisites
- An upstream source, table, or materialized view in RisingWave to output data from.
- Existing Iceberg tables that you can deliver to, or the ability to create them via external systems.
- Appropriate permissions to deliver to the target Iceberg catalog and storage.
- Access credentials for the underlying object storage (e.g., S3 access key and secret key).
Create an Iceberg sink
To write data to an external Iceberg table, create a sink with the CREATE SINK statement. This statement defines how data from an upstream object is formatted and delivered to the target Iceberg table.
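A minimal sketch, assuming an upstream materialized view named mv_orders and an S3-backed storage catalog (all object names, paths, and credentials below are placeholders; see the configuration sections that follow for the full parameter set):

```sql
-- Illustrative append-only Iceberg sink; replace names, paths, and
-- credentials with your own catalog and bucket details.
CREATE SINK orders_iceberg_sink FROM mv_orders
WITH (
    connector = 'iceberg',
    type = 'append-only',
    database.name = 'analytics_db',
    table.name = 'orders',
    warehouse.path = 's3://my-bucket/warehouse',
    s3.access.key = 'xxx',
    s3.secret.key = 'xxx',
    s3.region = 'us-east-1'
);
```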
Configuration parameters
| Parameter | Required | Description |
|---|---|---|
| connector | Yes | Must be 'iceberg'. |
| type | Yes | Sink mode. 'append-only' for new records only; 'upsert' to handle updates and deletes. |
| database.name | Yes | The name of the target Iceberg database. |
| table.name | Yes | The name of the target Iceberg table. |
| primary_key | Yes, if type is 'upsert' | A comma-separated list of columns that form the primary key. |
| force_append_only | No | If true, converts an upsert stream to append-only. Updates become inserts and deletes are ignored. Default: false. |
| is_exactly_once | No | If true, enables exactly-once delivery semantics. This provides stronger consistency but may impact performance. Default: false. |
| commit_checkpoint_interval | No | The number of checkpoints between commits. The approximate commit interval is barrier_interval_ms × checkpoint_frequency × commit_checkpoint_interval. Default: 60. |
| commit_retry_num | No | The number of times to retry a failed commit. Default: 8. |
For storage- and catalog-specific parameters, see:

- Object storage: Object storage configuration
- Catalogs: Catalog configuration
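For dimensional data that receives updates and deletes, use the upsert mode with a primary key. A sketch, assuming an upstream materialized view users_mv whose rows are keyed by a user_id column (all names and credentials are placeholders):

```sql
-- Illustrative upsert sink; user_id is an assumed primary key column.
CREATE SINK users_iceberg_sink FROM users_mv
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'user_id',
    database.name = 'analytics_db',
    table.name = 'users',
    warehouse.path = 's3://my-bucket/warehouse',
    s3.access.key = 'xxx',
    s3.secret.key = 'xxx',
    s3.region = 'us-east-1'
);
```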
Advanced features
Exactly-once delivery
Enable exactly-once delivery semantics for critical data pipelines by setting is_exactly_once to true. This provides stronger consistency guarantees but may impact performance due to the additional coordination overhead.
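A sketch combining the exactly-once flag with the commit-retry option from the parameter table above (object names and credentials are placeholders):

```sql
-- Illustrative exactly-once sink for a payments pipeline.
CREATE SINK payments_sink FROM payments_mv
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'payment_id',
    database.name = 'finance_db',
    table.name = 'payments',
    -- stronger delivery guarantee, at some performance cost
    is_exactly_once = 'true',
    -- retry a failed commit up to 16 times before failing
    commit_retry_num = '16',
    warehouse.path = 's3://my-bucket/warehouse',
    s3.access.key = 'xxx',
    s3.secret.key = 'xxx',
    s3.region = 'us-east-1'
);
```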
Commit configuration
Control commit frequency and retry behavior with the commit_checkpoint_interval and commit_retry_num parameters described above. Longer commit intervals produce fewer, larger files; shorter intervals reduce end-to-end latency.

Table maintenance
When you continuously sink data to an Iceberg table, it is important to perform periodic maintenance, including compaction and snapshot expiration, to maintain good query performance and manage storage costs. RisingWave provides both automatic and manual maintenance options. For complete details, see the Iceberg table maintenance guide.

Integration patterns
Real-time analytics pipeline
Stream aggregated results to analytics tables.

Change data capture
Stream database changes to the data lake.

Best practices
- Choose appropriate sink mode: Use append-only for event logs, upsert for dimensional data.
- Configure commit intervals: Balance latency vs file size based on your requirements.
- Enable exactly-once for critical data: Use it for financial transactions and other data where duplicate or lost records are unacceptable.
- Monitor sink lag: Track how far behind your sink is from the source data.
- Design proper partitioning: Ensure target tables are properly partitioned for query performance.
- Handle backpressure: Monitor sink performance and adjust resources as needed.
Monitoring and troubleshooting
Monitor sink performance
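As a sketch, sink status can be inspected through SQL: SHOW SINKS lists the sinks in the current schema, and the rw_catalog system views expose sink metadata. Exact view and column names may vary by RisingWave version, so treat the catalog query below as an assumption to verify against your version's documentation:

```sql
-- List sinks visible in the current schema.
SHOW SINKS;

-- Inspect sink metadata via the system catalog
-- (view and column names are assumptions; check your version's docs).
SELECT name, connector, definition
FROM rw_catalog.rw_sinks;
```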
Limitations
- Schema evolution: Limited support for automatic schema changes.
- Concurrent writers: Coordinate with other systems writing to the same tables.
Next steps
- Ingest from Iceberg: Set up sources with Ingest from Iceberg.
- Configure catalogs: Review Catalog configuration for your setup.
- Storage setup: Configure your object storage in Object storage configuration.