This guide shows you the essential steps to build a simple, streaming Iceberg table from scratch with RisingWave. It’s ideal for users who are new to RisingWave and Iceberg and want to quickly understand the process of building an Iceberg-based streaming lakehouse. The core process involves three simple steps:
  1. Create a connection: Tell RisingWave where to store your Iceberg data and metadata.
  2. Create a table: Define your table with the ENGINE = iceberg clause.
  3. Stream and query data: Insert data and query it in real time.
For a complete, runnable demo with a pre-configured environment, please see the full tutorial in our demo repository.

Hands-on Tutorial: Streaming Iceberg Quickstart

This end-to-end tutorial provides a Docker Compose file to instantly set up the environment and includes all the code you need to run the examples below.

Step 1: Create a connection with the hosted catalog

First, you need to tell RisingWave where to store the table files and metadata. For the simplest setup, you can use RisingWave’s built-in hosted catalog, which manages the metadata for you without requiring any external services like AWS Glue or a separate database.
While you can also use an external catalog, such as AWS Glue or a JDBC database, to create native Iceberg tables, this tutorial uses the hosted catalog because it requires no additional setup (a sketch of an external-catalog connection follows the example below). For details on all available options, see the Catalogs guide.
CREATE CONNECTION my_iceberg_connection
WITH (
    type                 = 'iceberg',
    warehouse.path       = 's3://icebergdata/demo',
    s3.access.key        = 'minioadmin',
    s3.secret.key        = 'minioadmin',
    s3.endpoint          = 'http://minio:9000',
    hosted_catalog       = 'true'
);
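If you prefer an external catalog, the connection looks similar: you replace the hosted_catalog flag with catalog.* properties. The sketch below assumes a JDBC catalog backed by a PostgreSQL instance reachable at postgres:5432; the hostname, database name, and credentials are placeholders, so adjust them to your environment:
CREATE CONNECTION my_jdbc_connection
WITH (
    type                  = 'iceberg',
    warehouse.path        = 's3://icebergdata/demo',
    s3.access.key         = 'minioadmin',
    s3.secret.key         = 'minioadmin',
    s3.endpoint           = 'http://minio:9000',
    catalog.type          = 'jdbc',
    catalog.uri           = 'jdbc:postgresql://postgres:5432/metadata',  -- placeholder host/database
    catalog.jdbc.user     = 'postgres',                                  -- placeholder credentials
    catalog.jdbc.password = 'postgres',
    catalog.name          = 'demo'
);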

Step 2: Create a native Iceberg table

Next, create the table using the ENGINE = iceberg clause, which tells RisingWave to store the data in the Iceberg format. The commit_checkpoint_interval option controls how many checkpoints elapse between Iceberg commits; setting it to 1 commits on every checkpoint (roughly once per second with default settings), which is ideal for low-latency streaming workloads.
-- Set the connection for the session
SET iceberg_engine_connection = 'public.my_iceberg_connection';

-- Define the streaming table
CREATE TABLE machine_sensors (
  sensor_id   INT PRIMARY KEY,
  temperature DOUBLE PRECISION,
  reading_ts  TIMESTAMP
)
WITH (commit_checkpoint_interval = 1)  -- For low-latency commits
ENGINE = iceberg;
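
To confirm the table was created as expected, you can inspect its definition with SHOW CREATE TABLE (standard RisingWave SQL); the output should reflect the Iceberg engine and options you specified:
-- Verify the table definition
SHOW CREATE TABLE machine_sensors;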

Step 3: Stream data in and query it

The table is now ready to accept streaming data. You can insert data into the table, and it will be committed to Iceberg in near real time.
-- Insert data into the table
INSERT INTO machine_sensors
VALUES
  (101, 25.5, NOW()),
  (102, 70.2, NOW());

-- Query the table to verify the commit
SELECT * FROM machine_sensors;
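Because the table has a primary key, you can also update or delete rows. Assuming the Iceberg table engine handles DML the same way regular RisingWave tables do, the change is committed to Iceberg on the next checkpoint:
-- Correct a reading; the change streams into Iceberg on the next commit
UPDATE machine_sensors
SET temperature = 26.1
WHERE sensor_id = 101;

-- Confirm the updated value
SELECT * FROM machine_sensors;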
Because the table is stored in the open Iceberg format, you can immediately query it from other engines like Spark or Trino by pointing them to the same warehouse path and catalog.
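For example, once another engine is configured with an Iceberg catalog pointing at the same warehouse, the table is directly queryable there. The snippet below is a hypothetical Spark SQL query, assuming a catalog named demo has already been set up against this warehouse (the catalog configuration itself varies by engine and is not shown here):
-- From Spark SQL (or, analogously, Trino), with an Iceberg catalog
-- named 'demo' pointing at the same warehouse and catalog:
SELECT * FROM demo.public.machine_sensors;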

Next steps