This guide shows how to sink data from RisingWave into Databricks managed Iceberg tables using Unity Catalog.

Enable external data access in Unity Catalog

  1. Configure your Unity Catalog metastore to allow external data access. See Enable external data access on the metastore for more details.
  2. Grant the principal that RisingWave authenticates as the EXTERNAL USE SCHEMA privilege on the target schema in Databricks.
    -- Users
    GRANT EXTERNAL USE SCHEMA ON SCHEMA catalog_name.schema_name TO `user@company.com`
    -- Service principal
    GRANT EXTERNAL USE SCHEMA ON SCHEMA catalog_name.schema_name TO `32ab2e99-69a0-45bc-a110-123456eae110`
    
  3. Acquire Databricks credentials. See Access Databricks tables from Apache Iceberg clients for more information. You need to fetch these parameters:
  • catalog.uri
    • Format: https://<workspace-url>/api/2.1/unity-catalog/iceberg-rest
    • Value: <workspace-url> is the Databricks workspace URL, written here without the https:// scheme.
  • catalog.oauth2_server_uri
    • Format: https://<workspace-url>/oidc/v1/token
  • catalog.credential
    • Format: <oauth_client_id>:<oauth_client_secret>
    • Value: <oauth_client_id> is the OAuth client ID of the authenticating principal; <oauth_client_secret> is the OAuth client secret of the same principal.
  • catalog.scope: all-apis
  • warehouse.path
    • Format: <uc-catalog-name>
    • Value: The name of the catalog in Unity Catalog that contains your tables.
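
Before plugging these parameters into a sink, it can help to confirm that the grant from step 2 is in place. The following is a minimal sketch to run in Databricks SQL; the catalog, schema, and principal names are the same placeholders used in step 2, not real values.

```sql
-- Run in Databricks SQL, not in RisingWave.
-- Lists the privileges the principal holds on the schema;
-- EXTERNAL USE SCHEMA should appear in the result if step 2 succeeded.
SHOW GRANTS `user@company.com` ON SCHEMA catalog_name.schema_name;
```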

Sink data into a Databricks managed Iceberg table

create table t_test1 (
  a int primary key,
  b int
) append only;

insert into t_test1 values (1, 2), (3, 4);

create sink ice_sink from t_test1
with (
  primary_key = 'a',
  type = 'append-only',
  connector = 'iceberg',
  create_table_if_not_exists = true,
  s3.region = 'ap-southeast-1',
  catalog.type = 'rest_rust',
  catalog.uri = 'https://<workspace-url>/api/2.1/unity-catalog/iceberg-rest',
  warehouse.path = '<uc-catalog-name>',
  database.name = 'default',
  table.name = 't_test1',
  catalog.oauth2_server_uri = 'https://<workspace-url>/oidc/v1/token',
  catalog.credential = '<oauth_client_id>:<oauth_client_secret>',
  catalog.scope = 'all-apis',
  commit_checkpoint_interval = 3
);
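
To check that the sink exists and to push a few more rows through it, a follow-up in RisingWave might look like the sketch below. The extra rows are illustrative values only.

```sql
-- Run in RisingWave.
-- Confirm that ice_sink is listed among the existing sinks.
show sinks;

-- Append two more rows; their keys do not collide with the rows inserted earlier.
insert into t_test1 values (5, 6), (7, 8);

-- Wait until the pending writes have been persisted by RisingWave.
flush;
```

With commit_checkpoint_interval = 3, RisingWave commits to the Iceberg table once every three checkpoints, so new rows may take a few checkpoint intervals to become visible in Databricks.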

Query the Databricks managed Iceberg table

Query from Databricks:
select * from <uc-catalog-name>.default.t_test1

Limitation

Only append-only sinks are supported for Databricks managed Iceberg tables.
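
If the upstream table is not append-only, one possible workaround is to force the sink into append-only mode. The sketch below assumes that the force_append_only option of RisingWave sinks applies to this setup; the names t_upsert, ice_sink_forced, and t_upsert_log are hypothetical, and the connection parameters are the same placeholders as in the example above.

```sql
-- Run in RisingWave.
-- Hypothetical upstream table that accepts updates and deletes.
create table t_upsert (
  a int primary key,
  b int
);

-- Assumption: force_append_only makes the sink emit append-only output
-- even though the upstream table is not append-only.
create sink ice_sink_forced from t_upsert
with (
  connector = 'iceberg',
  type = 'append-only',
  force_append_only = 'true',
  create_table_if_not_exists = true,
  s3.region = 'ap-southeast-1',
  catalog.type = 'rest_rust',
  catalog.uri = 'https://<workspace-url>/api/2.1/unity-catalog/iceberg-rest',
  warehouse.path = '<uc-catalog-name>',
  database.name = 'default',
  table.name = 't_upsert_log',
  catalog.oauth2_server_uri = 'https://<workspace-url>/oidc/v1/token',
  catalog.credential = '<oauth_client_id>:<oauth_client_secret>',
  catalog.scope = 'all-apis',
  commit_checkpoint_interval = 3
);
```

In this mode the Iceberg table is a log of appended rows rather than a mirror of the upstream table, so upstream updates and deletes are not reflected as in-place changes.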