- Databricks Unity Catalog: Write data from RisingWave directly to a Databricks-managed Iceberg table.
- AWS Glue as a federated catalog: Write data from RisingWave to an Iceberg table that uses AWS Glue as its catalog, and then connect Databricks to Glue.
This pattern is ideal when you want to manage your Iceberg tables centrally within the Databricks ecosystem. RisingWave acts as a streaming ETL engine, writing data directly into tables governed by Unity Catalog.
How it works
RisingWave → Iceberg table on S3 → Databricks Unity Catalog
Prerequisites
- A running RisingWave cluster.
- A Databricks workspace with Unity Catalog enabled.
- Permissions to create and access credentials for external access in Unity Catalog.
Step 1: Configure Unity Catalog for external access
- Follow the Databricks documentation to configure your Unity Catalog metastore to allow external clients like RisingWave to access it.
- Acquire the credentials needed to connect. You will need the following parameters for your sink:
  - `catalog.uri`: The REST endpoint for your Unity Catalog.
  - `catalog.oauth2_server_uri`: The OAuth token endpoint.
  - `catalog.credential`: Your client ID and secret, formatted as `<oauth_client_id>:<oauth_client_secret>`.
  - `warehouse.path`: The name of the catalog in Unity Catalog.
Step 2: Sink data from RisingWave to Unity Catalog
Create a `SINK` in RisingWave that writes to your Databricks-managed table. Note that currently, only `append-only` sinks are supported for this integration.
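As a sketch, the sink definition might look like the following. The source name, catalog name, schema, and table names are placeholders, and the host-specific values (the Unity Catalog REST endpoint and OAuth token endpoint) must be replaced with the values you acquired in Step 1; the exact set of `WITH` parameters may vary with your RisingWave version, so check your version's Iceberg sink reference.

```sql
CREATE SINK databricks_sink
FROM my_materialized_view  -- hypothetical source; any table or materialized view works
WITH (
    connector = 'iceberg',
    type = 'append-only',                -- only append-only sinks are supported here
    catalog.type = 'rest',
    catalog.uri = 'https://<workspace-host>/api/2.1/unity-catalog/iceberg',
    catalog.oauth2_server_uri = 'https://<workspace-host>/oidc/v1/token',
    catalog.credential = '<oauth_client_id>:<oauth_client_secret>',
    warehouse.path = 'my_unity_catalog', -- the catalog name in Unity Catalog
    database.name = 'my_schema',         -- the target schema in that catalog
    table.name = 'my_table'              -- the target Iceberg table
);
```

Once the sink is created, RisingWave continuously appends new rows from the upstream relation into the Iceberg table, and the data becomes queryable from Databricks through Unity Catalog.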