Create an internal Iceberg table
Creating and using an internal Iceberg table is a two-step process: first, you define the storage and catalog details in a CONNECTION object, and then you create the table itself.
Step 1: Create an Iceberg Connection
An Iceberg CONNECTION defines the catalog and object storage configuration.
You must specify the type and warehouse.path parameters, along with the required parameters for your catalog and object storage. To use the JDBC-based hosted catalog, set hosted_catalog to true.
You can also set the optional commit_checkpoint_interval parameter to control the commit frequency. For example, setting it to 10 means RisingWave will commit data every 10 checkpoints.
The following tabs show examples for different catalog types. For a complete list of parameters, refer to Catalog configuration.
- Hosted catalog - JDBC
- JDBC catalog
- Glue catalog
- REST catalog
- S3 Tables catalog
For the simplest setup, use RisingWave’s built-in JDBC-based hosted catalog. This requires no external dependencies. For more details, see Hosted Iceberg catalog.
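As an illustrative sketch, a hosted-catalog connection might look like the following. The connection name, bucket path, region, and credentials are placeholders, and the exact parameter set depends on your object storage:

```sql
CREATE CONNECTION my_iceberg_conn
WITH (
    type = 'iceberg',
    warehouse.path = 's3://my-bucket/warehouse/',   -- placeholder bucket
    s3.access.key = 'xxxxxxxx',                     -- placeholder credentials
    s3.secret.key = 'yyyyyyyy',
    s3.region = 'us-east-1',
    hosted_catalog = true,                          -- use the built-in JDBC-based catalog
    commit_checkpoint_interval = 10                 -- optional: commit every 10 checkpoints
);
```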
Step 2: Create an internal Iceberg table
Create an internal Iceberg table using the ENGINE = iceberg clause and associate it with your connection. To simplify creation, you can set a default connection for your session.
Optionally, specify a partition strategy with the partition_by option in the WITH clause to optimize query performance, for example bucket(n, column) or truncate(n, column). The partition key must be a prefix of the primary key.
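A minimal sketch of this step, reusing the connection from Step 1; the table name, schema, and the iceberg_engine_connection session variable shown here are illustrative assumptions:

```sql
-- Optionally set a default connection for the session (name is illustrative).
SET iceberg_engine_connection = 'public.my_iceberg_conn';

CREATE TABLE user_events (
    user_id INT,
    event_time TIMESTAMP,
    payload VARCHAR,
    PRIMARY KEY (user_id, event_time)
)
WITH (
    -- Partition key must be a prefix of the primary key.
    partition_by = 'user_id'
)
ENGINE = iceberg;
```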
Work with internal tables
Once created, an internal Iceberg table behaves like any other table in RisingWave.
Ingest data
You can ingest data using standard INSERT statements or by streaming data from a source using CREATE SINK ... INTO.
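A sketch of both ingestion paths; the table, sink, and source names are illustrative, and the Kafka source is assumed to exist already:

```sql
-- Direct insert into the Iceberg-backed table.
INSERT INTO user_events VALUES (1, '2024-01-01 00:00:00', 'login');

-- Stream rows from an existing source into the table.
CREATE SINK events_sink INTO user_events FROM my_kafka_source;
```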
Query data
Query the table directly with SELECT or use it as a source for a materialized view.
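For example, using the illustrative table from earlier, both query patterns look like ordinary RisingWave SQL:

```sql
-- Ad-hoc query.
SELECT count(*) FROM user_events;

-- Incrementally maintained view on top of the Iceberg table.
CREATE MATERIALIZED VIEW events_per_user AS
SELECT user_id, count(*) AS event_count
FROM user_events
GROUP BY user_id;
```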
Time travel
Query historical snapshots of the table using FOR SYSTEM_TIME AS OF or FOR SYSTEM_VERSION AS OF.
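A sketch of both time-travel forms; the timestamp and snapshot ID are placeholders:

```sql
-- Query the table as of a point in time.
SELECT * FROM user_events FOR SYSTEM_TIME AS OF '2024-01-01 12:00:00';

-- Query the table as of a specific Iceberg snapshot ID.
SELECT * FROM user_events FOR SYSTEM_VERSION AS OF 1234567890123456789;
```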
Partition strategy
RisingWave’s Iceberg table engine supports table partitioning using the partition_by option. Partitioning helps organize data for efficient storage and query performance. You can partition by one or multiple columns, separated by commas, and optionally apply a Transform function to each column to customize partitioning.
Supported transformations include identity, truncate(n), bucket(n), year, month, day, hour, and void. For more details on Iceberg partitioning, see Partition transforms.
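As a sketch of combining transforms, the following partitions by a bucket of the first primary-key column and the day of the second; the schema and the exact transform spelling are illustrative assumptions based on the patterns above:

```sql
CREATE TABLE metrics (
    device_id INT,
    ts TIMESTAMP,
    value DOUBLE PRECISION,
    PRIMARY KEY (device_id, ts)
)
WITH (
    -- Two partition columns, each with a transform; the partition
    -- key (device_id, ts) is a prefix of the primary key.
    partition_by = 'bucket(16, device_id), day(ts)'
)
ENGINE = iceberg;
```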
Table maintenance
To maintain good performance and manage storage costs, internal Iceberg tables require periodic maintenance, including compaction and snapshot expiration. RisingWave provides both automatic and manual maintenance options. For complete details, see the Iceberg table maintenance guide.
External access
Because internal tables are standard Iceberg tables, they can be read by external query engines such as Spark or Trino, configured with the same catalog and storage settings.
Limitations
- Advanced schema evolution operations are not yet supported.
- To ensure data consistency, only RisingWave should write to internal Iceberg tables.