Prerequisites
Before you begin, make sure you have:
- A running RisingWave cluster.
- (Optional) An Iceberg compactor if you plan to sink upsert streams. Contact our support team or sales team if you need this.
- A Databricks cluster.
- An Amazon S3 bucket.
- AWS Glue.
Iceberg catalog and warehouse
The Iceberg catalog should be AWS Glue. For the warehouse, we recommend using Amazon S3.
Sink data from RisingWave into Iceberg
Follow the instructions to create a sink that writes your data into an Iceberg table. Below are two examples.
Glue + S3 (append-only)
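A minimal sketch of an append-only sink that uses AWS Glue as the catalog and Amazon S3 as the warehouse. The sink name, source name, region, bucket path, catalog, database, and table names are placeholders assumed for illustration, and the credentials must be replaced with your own. Check the parameter names against the Iceberg sink reference for your RisingWave version.

```sql
-- Hypothetical names: glue_append_sink, my_mv, my_catalog, my_db, my_table.
CREATE SINK glue_append_sink FROM my_mv
WITH (
    connector = 'iceberg',
    type = 'append-only',
    -- Amazon S3 warehouse location and credentials (placeholders).
    warehouse.path = 's3://my-bucket/my-warehouse',
    s3.region = 'us-east-1',
    s3.access.key = '<your-access-key>',
    s3.secret.key = '<your-secret-key>',
    -- AWS Glue as the Iceberg catalog.
    catalog.type = 'glue',
    catalog.name = 'my_catalog',
    database.name = 'my_db',
    table.name = 'my_table'
);
```

If the upstream object is not append-only, this statement is expected to be rejected unless the sink is explicitly forced into append-only mode, so an upsert sink may be the better fit in that case.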
Glue + S3 (upsert)
For the upsert type, since Databricks doesn’t support reading position delete and equality delete files, use Copy-on-Write mode (write_mode = 'copy-on-write') and enable Iceberg compaction as well. Because Copy-on-Write mode relies on Iceberg compaction, compaction_interval_sec determines the freshness of the Iceberg table.
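A minimal sketch of an upsert sink in Copy-on-Write mode with compaction enabled. The write_mode and compaction_interval_sec settings come from the text above. The enable_compaction option name, the primary key column id, the 300-second interval, and all object names and paths are assumptions for illustration, so confirm them against the Iceberg sink reference for your RisingWave version.

```sql
-- Hypothetical names: glue_upsert_sink, my_mv, my_catalog, my_db, my_table.
CREATE SINK glue_upsert_sink FROM my_mv
WITH (
    connector = 'iceberg',
    type = 'upsert',
    -- Assumed primary key column; upsert sinks need a key to match rows.
    primary_key = 'id',
    -- Copy-on-Write avoids the position and equality delete files
    -- that Databricks cannot read.
    write_mode = 'copy-on-write',
    -- Assumed option name for turning on Iceberg compaction.
    enable_compaction = true,
    -- Assumed value: compact every 300 seconds. A smaller interval
    -- makes the table fresher at the cost of more compaction work.
    compaction_interval_sec = 300,
    -- Amazon S3 warehouse location and credentials (placeholders).
    warehouse.path = 's3://my-bucket/my-warehouse',
    s3.region = 'us-east-1',
    s3.access.key = '<your-access-key>',
    s3.secret.key = '<your-secret-key>',
    -- AWS Glue as the Iceberg catalog.
    catalog.type = 'glue',
    catalog.name = 'my_catalog',
    database.name = 'my_db',
    table.name = 'my_table'
);
```

With these settings, changes written by RisingWave become visible to Databricks only after a compaction run, so the interval is effectively the upper bound on how stale the table can be.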