RisingWave Managed Iceberg

RisingWave Managed Iceberg refers to scenarios where RisingWave is the primary owner and manager of Iceberg tables. In this approach, you create tables directly within RisingWave that store their data in the Apache Iceberg format on object storage, while RisingWave handles the table lifecycle, schema management, and write operations.

When to use RisingWave Managed Iceberg

Choose this approach when:

You want RisingWave to be the primary data platform: RisingWave creates, owns, and manages the Iceberg tables.
Simplified architecture: No need to set up separate external systems to manage Iceberg metadata.
Streaming-first workflows: Data flows directly from streaming sources into Iceberg format without additional ETL steps.
Quick start: Get started with Iceberg without setting up external catalogs or table management systems.
Interoperability desired: Want tables that can be read by other Iceberg-compatible engines (Spark, Trino, Flink) while being managed by RisingWave.

Key capabilities

Iceberg Table Engine

Create tables using ENGINE = iceberg to store data natively in the Iceberg format:

CREATE TABLE user_events (
    user_id INT,
    event_type VARCHAR,
    timestamp TIMESTAMPTZ,
    PRIMARY KEY (user_id, timestamp)
) ENGINE = iceberg
WITH (connection = 'my_iceberg_connection');

Native management: Tables behave like any other RisingWave table for queries, inserts, and materialized views
Iceberg format: Data is stored according to Iceberg specification for ecosystem compatibility
Time travel: Query historical versions of your data
External access: Tables can be read by external Iceberg-compatible tools

Hosted Iceberg Catalog

Use RisingWave’s built-in catalog service to eliminate external catalog setup:

CREATE CONNECTION my_connection WITH (
    type = 'iceberg',
    warehouse.path = 's3://my-bucket/warehouse/',
    s3.access.key = 'your-access-key',
    s3.secret.key = 'your-secret-key',
    s3.region = 'us-west-2',
    hosted_catalog = true  -- Use RisingWave's built-in catalog
);

Zero external dependencies: No need for AWS Glue, JDBC databases, or REST catalog services
Standard compliance: Uses standard Iceberg JDBC catalog protocol for compatibility
Quick setup: Get started immediately without catalog infrastructure

Architecture benefits

Simplified data pipeline: Streaming data → RisingWave processing → Iceberg storage (all in one platform).
Reduced operational overhead: Fewer external systems to manage and monitor.
Consistent interface: Use familiar RisingWave SQL for all table operations.
Ecosystem compatibility: Standard Iceberg tables accessible to the broader ecosystem.

What’s included in this section

Iceberg Table Engine: Complete guide to creating and using Iceberg tables in RisingWave.
Hosted Iceberg Catalog: How to use RisingWave’s built-in catalog service.
Configuration: Setup and configuration options for managed scenarios.

Next steps

Start with hosted catalog: Use the Hosted Iceberg Catalog for the quickest setup.
Create your first table: Follow the Iceberg Table Engine guide.
Consider external catalogs: If you need integration with existing infrastructure, see configuration options.

Comparing approaches: If you already have Iceberg tables managed by other systems and want to read from or write to them, see Bring Your Own Iceberg instead.

Iceberg

​When to use RisingWave Managed Iceberg

​Key capabilities

​Iceberg Table Engine

​Hosted Iceberg Catalog

​Architecture benefits

​What’s included in this section

​Next steps