Learn how to use the Hosted Iceberg Catalog in RisingWave. This guide shows you how to create and manage Iceberg tables without the need to configure or maintain an external catalog service like AWS Glue or a JDBC database.
The Hosted Iceberg Catalog is a built-in catalog service for the Iceberg table engine. It removes the need to set up, configure, and manage an external catalog service such as AWS Glue, a separate JDBC database, or a REST service. For users new to Iceberg, or those who want to reduce operational overhead, the hosted catalog is the quickest way to get started with the Iceberg table engine in RisingWave.
When you enable the hosted catalog in an Iceberg connection, RisingWave uses its internal metastore (typically a PostgreSQL instance) as a standard Iceberg JDBC catalog. Metadata for tables created with the `ENGINE = iceberg` clause is stored in two system views in RisingWave: `iceberg_tables` and `iceberg_namespace_properties`.
To use the hosted catalog, create an Iceberg connection and set the `hosted_catalog` parameter to `true`.
Field | Description |
---|---|
hosted_catalog | Required. Set to true to enable the Hosted Iceberg Catalog for this connection. This instructs RisingWave to manage the catalog metadata internally. |
warehouse.path | Required. The S3 path to your Iceberg warehouse, where table data and metadata will be stored. |
s3.region | Required. The AWS region of your S3 bucket. |
s3.access.key | Required. The Access Key ID for your IAM user. For enhanced security, you can use secrets. |
s3.secret.key | Required. The Secret Access Key for your IAM user. For enhanced security, you can use secrets. |
s3.endpoint | Optional. The S3-compatible object storage endpoint. |
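The table above suggests secrets for the S3 credentials. A minimal sketch of that approach, assuming RisingWave's secret management syntax (`CREATE SECRET ... WITH (backend = 'meta')`); the secret names and values here are hypothetical placeholders:

```sql
-- Hypothetical secret names; replace the placeholder values with your own credentials.
CREATE SECRET s3_access_key_secret WITH (backend = 'meta') AS 'your-access-key-id';
CREATE SECRET s3_secret_key_secret WITH (backend = 'meta') AS 'your-secret-access-key';
```

A connection could then reference a secret, for example `s3.secret.key = secret s3_secret_key_secret`, instead of embedding the literal key. Check the secrets documentation for the exact syntax supported by your RisingWave version.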
Here is a complete walkthrough of creating a connection, setting it as the default for the Iceberg engine, and creating a table.
Create an Iceberg connection with the hosted catalog enabled.
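A sketch of what this step might look like, using the parameters from the table above. The connection name `my_hosted_catalog_conn`, the bucket path, the region, and the credentials are placeholders, and `type = 'iceberg'` is assumed to be how the connection type is declared:

```sql
-- Sketch only: connection name, bucket, region, and credentials are placeholders.
CREATE CONNECTION my_hosted_catalog_conn WITH (
    type = 'iceberg',                                      -- assumed connection type declaration
    warehouse.path = 's3://your-bucket/iceberg-warehouse', -- where table data and metadata are stored
    s3.region = 'us-west-2',
    s3.access.key = 'your-access-key-id',
    s3.secret.key = 'your-secret-access-key',
    hosted_catalog = true                                  -- let RisingWave manage catalog metadata
);
```

Add `s3.endpoint` only if you use S3-compatible object storage rather than AWS S3.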
Configure the Iceberg table engine to use the connection you just created.
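One way this step might look, assuming the session variable is named `iceberg_engine_connection` and takes a schema-qualified connection name. The connection name `my_hosted_catalog_conn` is a hypothetical placeholder for a connection created in the `public` schema:

```sql
-- Assumes a connection named my_hosted_catalog_conn (hypothetical) exists in schema "public".
SET iceberg_engine_connection = 'public.my_hosted_catalog_conn';
```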
Now you can create a table using `ENGINE = iceberg`. RisingWave will use the hosted catalog to manage its metadata.
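For instance, a table named `t_hosted_catalog` could be created as sketched here. The column definitions are illustrative assumptions, not a required schema:

```sql
-- Illustrative columns; only the ENGINE = iceberg clause is essential to this step.
CREATE TABLE t_hosted_catalog (
    id INT PRIMARY KEY,
    name VARCHAR
) ENGINE = iceberg;
```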
Since the hosted catalog is a standard JDBC catalog, you can connect external query engines to read the tables managed by RisingWave.
External tools can connect to the hosted catalog using a RisingWave database user account and password. The SELECT privileges of the user account determine which Iceberg tables can be accessed.
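As a sketch of how access might be scoped, the statements below create a dedicated read-only user and grant it `SELECT` on a single table. The user name `iceberg_reader` and the password are hypothetical, and the table name assumes a table called `t_hosted_catalog` exists:

```sql
-- Hypothetical user for external engines; choose your own name and password.
CREATE USER iceberg_reader WITH PASSWORD 'choose-a-strong-password';

-- Only tables the user can SELECT from are accessible through the hosted catalog.
GRANT SELECT ON TABLE t_hosted_catalog TO iceberg_reader;
```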
When accessing the tables from an external tool, use the following mapping between Iceberg and RisingWave naming conventions:
Iceberg | RisingWave |
---|---|
catalog name | database name |
namespace | schema |
table | table |
You can configure a Spark session to read the `t_hosted_catalog` table created in the example above. The Spark configuration assumes that the table was created in the RisingWave database named `dev` and schema `public`.
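One possible configuration, expressed as Spark SQL `SET` statements, is sketched here. It assumes the Iceberg Spark runtime, the Iceberg AWS bundle, and the PostgreSQL JDBC driver are already on the Spark classpath, that RisingWave is reachable over the PostgreSQL protocol at the placeholder address `risingwave-host:4566`, and that S3 credentials are available to Spark through its usual mechanisms. The user, password, and warehouse path are hypothetical; the property names are standard Iceberg catalog properties, but the exact set RisingWave expects may differ.

```sql
-- Run in a Spark SQL session before the catalog "dev" is first referenced.
-- Keys containing hyphens are backquoted so that Spark's SET parser accepts them.
SET spark.sql.catalog.dev=org.apache.iceberg.spark.SparkCatalog;
SET `spark.sql.catalog.dev.catalog-impl`=org.apache.iceberg.jdbc.JdbcCatalog;
-- The Spark catalog name (dev) maps to the RisingWave database name in the JDBC URI.
SET spark.sql.catalog.dev.uri=jdbc:postgresql://risingwave-host:4566/dev;
SET spark.sql.catalog.dev.jdbc.user=iceberg_reader;
SET spark.sql.catalog.dev.jdbc.password=choose-a-strong-password;
SET spark.sql.catalog.dev.warehouse=s3://your-bucket/iceberg-warehouse;
SET `spark.sql.catalog.dev.io-impl`=org.apache.iceberg.aws.s3.S3FileIO;
```

The same properties can be passed as `--conf` options when launching Spark if you prefer to fix them at startup.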
Once connected, you can query the table using its three-part name (`catalog.namespace.table`), for example `dev.public.t_hosted_catalog`.
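Following the naming mapping above, a query from Spark might look like this, assuming the Spark catalog is registered under the name `dev` and the table lives in schema `public`:

```sql
-- catalog = RisingWave database, namespace = schema, table = table
SELECT * FROM dev.public.t_hosted_catalog;
```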
You can query the system views `iceberg_tables` and `iceberg_namespace_properties` directly in RisingWave to see the catalog’s metadata. The catalog exposed by RisingWave is read-only.
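A quick way to inspect that metadata from a RisingWave session, assuming the views can be referenced without a schema qualifier:

```sql
-- Lists the Iceberg tables registered in the hosted catalog.
SELECT * FROM iceberg_tables;

-- Lists namespace-level properties tracked by the hosted catalog.
SELECT * FROM iceberg_namespace_properties;
```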