# RisingWave storage overview

> Understand how RisingWave stores data: Hummock row-based storage on object storage (S3, GCS) for streaming state, plus a native Iceberg table engine for columnar analytics.

RisingWave uses **Hummock**, a row-based LSM-tree storage engine that stores tables, materialized views, and streaming state in object storage (S3, GCS, Azure Blob). For analytics workloads requiring columnar storage, RisingWave also supports a native **Iceberg table engine** that stores data in the Apache Iceberg format. This page explains what data gets persisted, where it lives, and how to choose between the two options.

## What gets stored (and what doesn’t)

RisingWave persists data for:

* **Tables**
* **Materialized views (MVs)**

By contrast, a **Source** is just a connection to an external system and **does not store data inside RisingWave**. If you want RisingWave to keep a durable copy of ingested data, use a connector-backed table (`CREATE TABLE ... WITH (connector=...)`).
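
As a minimal sketch of the difference, the statements below declare the same Kafka stream first as a source and then as a connector-backed table. The topic name, broker address, column list, and object names are hypothetical placeholders; adjust the connector parameters to your deployment.

```sql
-- Source: only a connection definition; RisingWave does not persist the data.
CREATE SOURCE orders_src (
    order_id BIGINT,
    customer_id BIGINT,
    amount DOUBLE PRECISION
)
WITH (
    connector = 'kafka',
    topic = 'orders',                              -- hypothetical topic
    properties.bootstrap.server = 'broker:9092',   -- hypothetical broker address
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE JSON;

-- Connector-backed table: same stream, but ingested rows are persisted in RisingWave.
CREATE TABLE orders (
    order_id BIGINT,
    customer_id BIGINT,
    amount DOUBLE PRECISION
)
WITH (
    connector = 'kafka',
    topic = 'orders',                              -- hypothetical topic
    properties.bootstrap.server = 'broker:9092',   -- hypothetical broker address
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE JSON;
```

The only syntactic change is `CREATE SOURCE` versus `CREATE TABLE`; the storage behavior is what differs.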

For a practical comparison, see [CREATE SOURCE vs. CREATE TABLE](/ingestion/create-source-vs-create-table) and [Source, Table, MV, and Sink](/get-started/source-table-mv-sink).

## Where the data is persisted

When you create tables or MVs, RisingWave persists their internal state in your configured **object store** (for example, Amazon S3). Compute nodes may cache hot data locally for performance, but the durable copy is stored in the object store.

## Two storage options

RisingWave offers two ways to persist data:

* **Row-based storage (default)** via the Hummock storage engine
* **Columnar storage** using **Apache Iceberg**

### Quick guide: which one should I use?

| Workload / requirement                                               | Recommended storage     |
| -------------------------------------------------------------------- | ----------------------- |
| Low-latency point lookups, “latest state”, frequent updates/deletes  | **Row-based (Hummock)** |
| Streaming pipelines with MVs that need fast incremental maintenance  | **Row-based (Hummock)** |
| Large scans, long-range analytics, and lakehouse interoperability    | **Iceberg (columnar)**  |
| Need to query the same dataset from other engines (Spark/Trino/etc.) | **Iceberg (columnar)**  |

## Row-based storage (Hummock)

By default, tables and materialized views are stored in row-based storage using [Hummock](https://risingwave.com/blog/hummock-a-storage-engine-designed-for-stream-processing/), a storage engine designed for streaming updates.

**Best for**:

* Serving **up-to-date results** with low latency.
* Workloads dominated by **point queries** and **short-range scans**.
* Pipelines with frequent incremental updates (CDC, upserts, streaming aggregations).

**Trade-offs**:

* Less efficient than columnar formats for very large **full-table scans**.
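
To make the access patterns concrete, here is a small sketch with a hypothetical `user_profiles` table. The first query is the kind of primary-key lookup that row-based storage serves well; the second touches every row and is the kind of query where a columnar format tends to be the better fit as data grows.

```sql
-- Hypothetical table stored in the default row-based (Hummock) storage.
CREATE TABLE user_profiles (
    user_id BIGINT PRIMARY KEY,
    plan VARCHAR
);

INSERT INTO user_profiles VALUES (42, 'pro'), (43, 'free');

-- Point lookup by primary key: reads a single row.
SELECT plan FROM user_profiles WHERE user_id = 42;

-- Full-table aggregation: scans all rows.
SELECT plan, count(*) AS users
FROM user_profiles
GROUP BY plan;
```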

## Columnar storage (Apache Iceberg)

RisingWave can store analytical datasets in Apache Iceberg, an open table format widely adopted in the lakehouse ecosystem whose data files are typically columnar (for example, Parquet).

**Best for**:

* Analytical queries that scan lots of data (reporting, dashboards, ad-hoc BI).
* Sharing the same tables with external engines that understand Iceberg.
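
The following is a hedged sketch of what creating a table on the Iceberg table engine can look like. It assumes an Iceberg connection (catalog and warehouse location) has already been created and selected for the session; that setup is omitted here, and the table and column names are hypothetical.

```sql
-- Assumes an Iceberg connection is already configured for this session.
CREATE TABLE sales_history (
    sale_id BIGINT PRIMARY KEY,
    region VARCHAR,
    amount DOUBLE PRECISION,
    sold_at TIMESTAMPTZ
)
ENGINE = iceberg;

-- A scan-heavy analytical query of the kind this storage option targets.
SELECT region, sum(amount) AS total_amount
FROM sales_history
GROUP BY region;
```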

To learn how Iceberg storage works in RisingWave and how to manage it, see:

* [Apache Iceberg storage overview](/iceberg/overview)
* [Create and manage internal Iceberg tables](/iceberg/internal-iceberg-tables)

## Common patterns

* **Retain raw data + compute derived results**: Ingest into a table for durable retention, then build MVs for continuously maintained aggregations.
* **Explore first, persist later**: Start with `CREATE SOURCE` for quick exploration; switch to a connector-backed table when you need durability or performance.
* **Hybrid analytics**: Keep operational/serving state in Hummock, and keep large analytical datasets (or shared lakehouse tables) in Iceberg.
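
The first pattern can be sketched as follows. The table and MV names are hypothetical, and a plain table with manual inserts stands in for a connector-backed table to keep the example short.

```sql
-- Raw data: retained durably in a table.
CREATE TABLE page_views (
    view_id BIGINT PRIMARY KEY,
    page_id BIGINT,
    viewed_at TIMESTAMPTZ
);

-- Derived result: an MV that RisingWave maintains incrementally as rows arrive.
CREATE MATERIALIZED VIEW page_view_counts AS
SELECT page_id, count(*) AS views
FROM page_views
GROUP BY page_id;

-- Both the raw rows and the aggregated result remain queryable.
SELECT * FROM page_view_counts WHERE page_id = 7;
```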

## What’s next?

* If you’re choosing between connector objects, start with [CREATE SOURCE vs. CREATE TABLE](/ingestion/create-source-vs-create-table).
* If you want to expose results to tools and applications, see [Access overview](/serve/overview).
