Unified platform for streaming data

RisingWave simplifies end-to-end development of real-time data pipelines and applications, going beyond what traditional stream processors offer.

Like other stream processors, RisingWave supports:

  • Ingestion: Ingest millions of events per second from streaming and batch sources.
  • Stream processing: Perform real-time incremental processing to join and analyze live data with historical tables.
  • Delivery: Deliver fresh, consistent results to data lakes (e.g., Apache Iceberg) or any destination.

But RisingWave does more. It provides both online and offline storage:

  • Online serving: Row-based storage for ad-hoc point/range queries with single-digit millisecond latency.
  • Offline persistence: Apache Iceberg-based storage that persists streaming data at low cost, enabling open access by other query engines.

Key design decisions

RisingWave is built for ease of use and cost efficiency.

PostgreSQL compatibility

RisingWave is wire-compatible with PostgreSQL, enabling:

  • Expressive SQL: Supports structured, semi-structured, and unstructured data with a familiar SQL dialect.
  • Seamless integration: Works with psql, JDBC, and any PostgreSQL-compatible tool via the PostgreSQL wire protocol.
  • No manual state tuning: Eliminates the need for complex state management configurations.
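
Because RisingWave speaks the PostgreSQL wire protocol, a standard PostgreSQL connection URI is usually all a client needs. The following is a minimal sketch rather than an official client: the helper name `risingwave_uri` is hypothetical, and the defaults (host `localhost`, port `4566`, user `root`, database `dev`) are assumptions about a default local deployment that you should adjust for your own setup.

```python
from urllib.parse import quote


def risingwave_uri(host="localhost", port=4566, user="root",
                   password=None, database="dev"):
    """Build a PostgreSQL-style connection URI (hypothetical helper).

    The default values are assumptions about a local deployment.
    """
    # Percent-encode credentials so special characters survive in the URI.
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    return f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}"


print(risingwave_uri())  # postgresql://root@localhost:4566/dev
```

A URI of this shape can be passed to psql or to a PostgreSQL driver in the same way as for a PostgreSQL server.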

S3 as primary storage

RisingWave stores tables, materialized views, and internal states of stream processing jobs in S3 (or equivalent object storage), providing:

  • High performance: Optimized for complex queries, including joins and time windowing.
  • Fast recovery: Restores from system failures within seconds.
  • Dynamic scaling: Adjusts resources within seconds to handle workload spikes.

Pluggable disk cache

Beyond caching hot data in memory, RisingWave supports a pluggable disk cache that uses local disks or EBS volumes to keep frequently accessed data close to compute.
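
One way to picture the resulting read path is as a tiered lookup: memory first, then the disk cache, then object storage. The sketch below is a toy model under stated assumptions, not RisingWave's implementation: the class `TieredStore`, its LRU eviction policy, and the use of in-process dictionaries to stand in for local disk and S3 are all illustrative.

```python
from collections import OrderedDict


class TieredStore:
    """Toy read-through cache: memory -> disk cache -> object storage."""

    def __init__(self, object_store, memory_capacity=2, disk_capacity=4):
        self.object_store = object_store   # stands in for S3 (source of truth)
        self.memory = OrderedDict()        # hot tier, kept in LRU order
        self.disk = OrderedDict()          # stands in for local disk or EBS
        self.memory_capacity = memory_capacity
        self.disk_capacity = disk_capacity
        self.object_store_reads = 0        # counts the slow, remote reads

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)   # mark as most recently used
            return self.memory[key]
        if key in self.disk:
            value = self.disk.pop(key)     # promote from disk to memory
        else:
            value = self.object_store[key]
            self.object_store_reads += 1
        self._admit(key, value)
        return value

    def _admit(self, key, value):
        self.memory[key] = value
        if len(self.memory) > self.memory_capacity:
            # Demote the least recently used entry to the disk tier.
            old_key, old_value = self.memory.popitem(last=False)
            self.disk[old_key] = old_value
            if len(self.disk) > self.disk_capacity:
                # Safe to drop: the object store still holds the data.
                self.disk.popitem(last=False)


store = TieredStore({"a": 1, "b": 2, "c": 3})
for key in ["a", "b", "c", "a"]:
    store.get(key)
print(store.object_store_reads)  # 3: the second read of "a" hits the disk tier
```

In this model, a larger disk tier reduces how often a read has to go all the way to object storage, which is the motivation for adding a disk cache behind the memory cache.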

In what use cases does RisingWave excel?

RisingWave is particularly effective for the following use cases:

  • Streaming analytics: Achieve sub-second data freshness in live dashboards, ideal for high-stakes scenarios like stock trading, sports betting, and IoT monitoring.
  • Event-driven applications: Develop sophisticated monitoring and alerting systems for critical applications such as fraud and anomaly detection.
  • Real-time data enrichment: Continuously ingest data from diverse sources, conduct real-time data enrichment, and efficiently deliver the results to downstream systems.
  • Feature engineering: Transform batch and streaming data into features for your machine learning models using a unified codebase, ensuring seamless integration and consistency.

Comparing RisingWave with other systems

RisingWave is not simply an “alternative” to any existing product, but it is often compared with stream processors, analytical databases, and operational databases.

Stream processors

Stream processors like ksqlDB, Spark Structured Streaming, and Flink SQL are frequently compared to RisingWave. While these systems have their strengths, RisingWave offers an exceptionally simple, PostgreSQL-style user experience, and eliminates the need for manual state management. It excels in:

  • Handling complex queries like joins, aggregations, and time windows with high performance.
  • Transparent dynamic scaling, allowing for scaling in and out within seconds rather than minutes or hours.
  • Instant failure recovery, where RisingWave recovers in seconds rather than minutes or hours.

Additionally, RisingWave greatly simplifies the overall architecture; see “How does RisingWave simplify your event-driven architecture?” below. Unlike these stream processors, however, RisingWave does not offer low-level Java and Scala APIs. It compensates by offering UDFs in several languages and SDKs.

Analytical databases

Modern analytical databases, such as ClickHouse with materialized views, Snowflake with dynamic tables, BigQuery with continuous queries, and Databricks with Delta Live Tables, offer continuous processing capabilities. RisingWave surpasses these solutions in continuous processing by:

  • Offering a rich feature set for stream processing, including time windowing, watermarks, and more.
  • Being particularly optimized for handling complex streaming joins.
  • Allowing data ingestion from and delivery to any system, without locking you into a specific ecosystem.

Moreover, RisingWave’s transparent dynamic scaling and instant failure recovery mechanisms are superior to those of other analytical databases.

However, RisingWave does not feature columnar storage. If your workloads mostly involve ad-hoc, long-range scans rather than predefined queries, an analytical database might be a better fit.

Operational databases

RisingWave is PostgreSQL wire-compatible, enabling seamless integration with most tools in the PostgreSQL ecosystem. RisingWave is designed specifically for storing and processing streaming data, making it particularly well-suited for managing metrics and events rather than transactional data.

Note that RisingWave does not use the PostgreSQL engine internally, so certain PostgreSQL tools are not supported. Additionally, RisingWave does not support read-write transactions.

How does RisingWave simplify your event-driven architecture?

RisingWave aims to simplify event-driven architecture. You can think of it as a unified system that combines event streaming, stream processing, storage, and serving capabilities. Developers can express intricate stream-processing logic through cascaded materialized views, that is, materialized views defined on top of other materialized views. RisingWave also lets users persist data directly within the system, eliminating the need to deliver results to external databases for storage and query serving.
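
To make the idea of cascaded materialized views more concrete, the sketch below models two views in Python. The first maintains a running total per user from an event stream; the second is defined over the first and maintains the number of users whose total has reached a threshold. Each event updates both views incrementally instead of recomputing them from scratch. This is a conceptual model only: the class names, the `threshold` parameter, and the delta format are hypothetical, and in RisingWave such logic would be written in SQL rather than Python.

```python
class TotalsView:
    """First view: running total of `amount` per user."""

    def __init__(self):
        self.totals = {}
        self.downstream = []   # views defined on top of this one

    def on_event(self, user, amount):
        old = self.totals.get(user)          # None if the user is new
        new = (old or 0) + amount
        self.totals[user] = new
        # Propagate only the change, not the whole table.
        for view in self.downstream:
            view.on_delta(user, old, new)


class BigSpendersView:
    """Second view, cascaded on the first: users with total >= threshold."""

    def __init__(self, upstream, threshold):
        self.threshold = threshold
        self.count = 0
        upstream.downstream.append(self)

    def on_delta(self, user, old, new):
        was_in = old is not None and old >= self.threshold
        is_in = new >= self.threshold
        # Adjust the count by -1, 0, or +1 depending on the transition.
        self.count += int(is_in) - int(was_in)


totals = TotalsView()
big_spenders = BigSpendersView(totals, threshold=100)
for user, amount in [("alice", 60), ("bob", 120), ("alice", 50), ("bob", -30)]:
    totals.on_event(user, amount)

print(totals.totals)       # {'alice': 110, 'bob': 90}
print(big_spenders.count)  # 1
```

Queries against either view read the already-maintained result, which is why serving can happen from the same system that does the processing.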