- Continuous ingestion (default): Create an Iceberg source with the CREATE SOURCE statement for continuous, streaming ingestion of append-only data.
- Periodic full reload: Create an Iceberg table with refresh_mode = 'FULL_RELOAD' for scheduled full table refreshes. Note that you must use CREATE TABLE (not CREATE SOURCE), and data is loaded only after a refresh is triggered, either manually or via the configured schedule.
Prerequisites
- An existing Apache Iceberg table managed by external systems.
- Access credentials for the underlying storage system (e.g., S3 access key and secret key).
- Network connectivity between RisingWave and your storage system.
- Knowledge of your Iceberg catalog type and configuration.
Continuous ingestion with CREATE SOURCE
Basic connection example
The following example creates a source for a table in S3 using AWS Glue as the catalog:
Parameters
| Parameter | Description | Example |
|---|---|---|
| connector | Required. For Iceberg sources, it must be 'iceberg'. | 'iceberg' |
| database.name | Required. The Iceberg database/namespace name. | 'analytics' |
| table.name | Required. The Iceberg table name. | 'user_events' |
| commit_checkpoint_interval | Optional. Determines the commit frequency (RisingWave commits every N checkpoints). | 60 |
You must also specify object storage and catalog parameters in the CREATE SOURCE statement. Because these parameters are shared across all Iceberg objects (sources, sinks, and internal Iceberg tables), they are documented separately.
- Object storage: Object storage configuration
- Catalogs: Catalog configuration
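As a rough sketch of how the connector parameters combine with catalog and object storage settings, the statement below creates a source for a table in S3 with AWS Glue as the catalog. The source name, warehouse path, catalog name, region, and credentials are placeholders, and the catalog and storage parameter names (warehouse.path, catalog.type, catalog.name, s3.*) are assumptions here; check them against the linked configuration pages before use.

```sql
-- Sketch only: all values are placeholders, and the catalog/storage
-- parameter names are assumed rather than taken from this page.
CREATE SOURCE my_iceberg_source
WITH (
    connector = 'iceberg',                            -- required for Iceberg sources
    warehouse.path = 's3://my-data-lake/warehouse',   -- assumed storage parameter
    database.name = 'analytics',                      -- Iceberg database/namespace
    table.name = 'user_events',                       -- Iceberg table
    catalog.type = 'glue',                            -- assumed catalog parameter
    catalog.name = 'my_glue_catalog',                 -- assumed catalog parameter
    s3.access.key = 'your-access-key',                -- assumed storage parameter
    s3.secret.key = 'your-secret-key',                -- assumed storage parameter
    s3.region = 'us-west-2'                           -- assumed storage parameter
);
```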
Source example
For a REST catalog:
Periodic full reload with CREATE TABLE
Added in v2.7.0. This feature is currently in technical preview.
To reload an Iceberg table in full on a schedule, create a table with refresh_mode = 'FULL_RELOAD'. This mode is useful when:
- The external Iceberg table supports mutable data (updates and deletes).
- You need a point-in-time snapshot of the entire table.
- Periodic full reloads fit your use case better than continuous streaming.
Create a refreshable table
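A minimal sketch of a refreshable table, assuming a storage catalog on S3. The table name, column list, warehouse path, and credentials are hypothetical, and the catalog and storage parameter names are assumptions; only refresh_mode and refresh_interval_sec are taken from the parameter table in this section. The column definitions would need to match the schema of the external Iceberg table.

```sql
-- Sketch only: names, columns, paths, and credentials are placeholders.
CREATE TABLE iceberg_batch_table (
    id INT PRIMARY KEY,      -- hypothetical column
    name VARCHAR             -- hypothetical column
) WITH (
    connector = 'iceberg',
    catalog.type = 'storage',                         -- assumed catalog parameter
    warehouse.path = 's3://my-data-lake/warehouse',   -- assumed storage parameter
    s3.access.key = 'your-access-key',                -- assumed storage parameter
    s3.secret.key = 'your-secret-key',                -- assumed storage parameter
    s3.region = 'us-west-2',                          -- assumed storage parameter
    database.name = 'public',
    table.name = 'my_iceberg_table',
    refresh_mode = 'FULL_RELOAD',                     -- enables periodic full reload
    refresh_interval_sec = '60'                       -- optional automatic refresh interval
);
```

No data is loaded when the table is created; rows appear only after the first refresh, whether triggered manually or by the schedule.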
Parameters
| Parameter | Description | Required | Example |
|---|---|---|---|
| refresh_mode | Must be set to 'FULL_RELOAD' to enable periodic refresh functionality. | Yes | 'FULL_RELOAD' |
| refresh_interval_sec | Interval in seconds between automatic refresh operations. | No | '60' |
Automatic refreshes are triggered by a scheduler that runs at its own fixed interval (configured with stream_refresh_scheduler_interval_sec in the RisingWave configuration file). Setting a refresh_interval_sec value lower than this scheduler interval may result in refresh triggers not occurring at the expected frequency.
Manual refresh
You can manually trigger a refresh at any time using the REFRESH TABLE command:
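For instance, assuming a refreshable table with the hypothetical name iceberg_batch_table, the command might look like this:

```sql
-- iceberg_batch_table is a placeholder; use the name of your refreshable table.
REFRESH TABLE iceberg_batch_table;
```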
Monitor refresh status
Query the rw_catalog.rw_refresh_table_state system catalog to monitor refresh operations:
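A minimal query is sketched below. It selects every column rather than naming specific ones, because the only column this page confirms is current_status.

```sql
-- Lists the refresh state of refreshable tables.
SELECT * FROM rw_catalog.rw_refresh_table_state;
```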
The current_status field shows the current state of the refresh job:
- IDLE: No refresh operation is currently in progress.
- REFRESHING: A refresh operation is in progress.