RisingWave vs. Flink: Feature-by-feature comparison
A comprehensive comparison of features between RisingWave and Apache Flink, covering SQL capabilities, data types, streaming operations, and system functionalities.
This document provides a detailed feature-by-feature comparison between RisingWave (targeting v2.0) and Apache Flink (targeting v1.20). While both systems excel at stream processing, they have different architectural approaches and feature sets resulting from their different designs: RisingWave as a streaming database and Flink as a stream processing framework. This comparison helps you understand their similarities and differences across various features.
Features noted as “RisingWave-specific” highlight capabilities tied to its database architecture, while “Flink-specific” features reflect its framework nature. Version numbers (RW 2.0, Flink 1.20) are targets; some features might have evolved slightly around these versions. For definitive support details, always consult the official Flink and RisingWave documentation.
Fundamental concepts
RisingWave and Flink share the following core stream processing concepts:
- Dynamic Tables: Both represent streams as tables that change over time.
- Continuous Queries: Both execute SQL queries continuously, producing results as input data changes.
- Time attributes: Both support Event Time (timestamps embedded in data) and Processing Time (the system clock time) for operations such as windowing.
- Result Update Modes: Both use Append, Update, and Delete semantics to handle changes, which allows them to correctly process changelog streams and maintain state.
- Deterministic Queries: Both systems aim for deterministic results given the same ordered input events when using event time.
While sharing these fundamentals, their architectures differ: RisingWave stores state internally within its database storage layer, treating streams and materialized views as first-class database objects. Flink typically uses pluggable state backends (such as RocksDB or filesystems), and its connectors manage external storage.
Data types
Both systems support a wide range of standard SQL data types. The following table highlights common types and key differences.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Fixed-Length | |||
CHAR | Supported | Not Supported | Fixed-length character string. |
BINARY | Supported | Not Supported | Fixed-length binary string. |
Variable-Length | |||
VARCHAR /STRING | Supported | Supported | Flink allows specifying a maximum length; RisingWave does not. |
VARBINARY /BYTES | Supported | BYTEA | Variable-length binary string. |
Numeric | |||
DECIMAL | Supported | DECIMAL /NUMERIC | Arbitrary precision numbers. |
TINYINT | Supported | Not Supported | 1-byte integer. |
SMALLINT | Supported | Supported | 2-byte integer. |
INT | Supported | Supported | 4-byte integer. |
BIGINT | Supported | Supported | 8-byte integer. |
RW_INT256 | Not Supported | Supported | 32-byte integer (RisingWave-specific). |
FLOAT | Supported | REAL | 4-byte floating-point. |
DOUBLE | Supported | DOUBLE PRECISION | 8-byte floating-point. |
Temporal | |||
DATE | Supported | Supported | Calendar date. |
TIME | Supported | Supported | Time of day (without timezone). |
TIMESTAMP | Supported | Supported | Timestamp (without timezone). |
TIMESTAMP_LTZ | Supported | TIMESTAMPTZ | Timestamp with local timezone interpretation (Flink); Timestamp with timezone, stored as UTC (RisingWave). |
INTERVAL | Supported | Supported | Time interval. |
Structured | |||
STRUCT | Supported | ROW | Row type with named fields. |
ARRAY | Supported | Supported | Ordered collection of elements. |
MAP | Supported | Supported | Key-value pairs. |
MULTISET | Supported | Not Supported | Unordered collection allowing duplicates (Flink). |
JSON | Via Functions | JSONB | Flink typically processes JSON using string functions, while RisingWave offers a native JSONB (binary JSON) type. |
Other | |||
BOOLEAN | Supported | Supported | True/False. |
Summary: Flink offers fixed-length types and MULTISET
. RisingWave provides native JSONB
, a large integer type (RW_INT256
), and PostgreSQL-compatible naming (REAL
, TIMESTAMPTZ
, BYTEA
). Both support the core data types required for streaming analytics. Timezone handling differs (TIMESTAMP_LTZ
vs. TIMESTAMPTZ
).
SQL query capabilities
Common SELECT clauses
Flink SQL and RisingWave both support the standard clauses of a SELECT
query:
WITH
(Common Table Expressions)SELECT
(includingDISTINCT
)FROM
WHERE
GROUP BY
(includingGROUPING SETS
,ROLLUP
, andCUBE
)HAVING
ORDER BY
LIMIT
Key differences in SELECT
DISTINCT ON
: RisingWave supports the PostgreSQLSELECT DISTINCT ON (...)
syntax. Flink 1.20 has limited or no support for this specific syntax.- Complex Subqueries: Both support subqueries (e.g., in
WHERE IN (...)
,WHERE EXISTS (...)
, derived tables inFROM
). Flink’s SQL planner might optimize certain complex or correlated subquery patterns more effectively. - Pattern Matching: Flink provides the
MATCH_RECOGNIZE
clause for Complex Event Processing (CEP) directly within SQL. RisingWave does not supportMATCH_RECOGNIZE
.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
DISTINCT ON | Limited/No | Supported | PostgreSQL-style unique row selection. |
MATCH_RECOGNIZE | Supported | Not Supported | SQL standard for Complex Event Processing. |
Windowing operations
Both systems provide essential windowing capabilities for analyzing data over time or rows.
Window types (Table Value Functions)
Flink uses Table Value Functions (TVFs) like TUMBLE
, HOP
, SESSION
, CUMULATE
for windowing aggregations. RisingWave primarily uses standard SQL GROUP BY
with time functions or dedicated window syntax where applicable. Support for TVFs is also being added to RisingWave.
Window Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Tumbling Window | Supported (TVF) | Supported | Fixed-size, non-overlapping windows. |
Sliding Window (Hop) | Supported (TVF) | Supported | Fixed-size, overlapping windows. |
Session Window | Supported (TVF) | Supported | Windows defined by gaps of inactivity. |
Cumulative Window | Supported (TVF) | Not Supported | Windows expanding from a start time to window_end . |
Late Data Handling | Supported | Supported | Mechanisms to handle data arriving after window close. |
Window Offset | Supported | Supported | Adjusting window start/end times. |
Window functions (OVER clause)
Both support standard SQL window functions using the OVER
clause for calculations across sets of table rows.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
OVER (...) | Supported | Supported | Defines the window specification. |
Ranking | Supported | Supported | RANK , DENSE_RANK , ROW_NUMBER , NTILE . |
Navigation | Supported | Supported | LEAD , LAG . |
Value | Supported | Supported | FIRST_VALUE , LAST_VALUE . |
Note: Flink’s NTILE
is a window function, not directly related to percentile aggregates.
Joins
Both systems support various SQL join types for combining data from multiple streams or tables.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Regular Joins | |||
INNER JOIN | Supported | Supported | |
LEFT JOIN | Supported | Supported | |
RIGHT JOIN | Supported | Supported | |
FULL JOIN | Supported | Supported | |
State Handling Note | Configurable TTL | Internal Storage | Flink requires you to configure Time-To-Live (TTL) for join state. RisingWave manages state internally. |
Interval Join | Supported | Supported | Joins streams based on time proximity (e.g., t1.time BETWEEN t2.time - INTERVAL '5' SECOND AND t2.time + INTERVAL '5' SECOND ). |
Temporal Join | Joins a stream against a versioned table/changelog source. | ||
Event Time Temporal Join | Supported (FOR ... AS OF ) | Supported (Implicit / TEMPORAL JOIN ) | Flink requires explicit syntax. RisingWave support depends on the source type and query structure. |
Processing Time Temporal Join | Supported (FOR ... AS OF ) | Supported (Lookup Joins) | Flink requires explicit syntax. RisingWave typically achieves this using lookup joins against external tables. |
Window Join | Supported | Supported | Joins aggregated results from windows. |
Lookup Join | Supported | Supported | Joins a stream against an external dimension table (often async). |
Set operations
Both support standard SQL set operations to combine or compare result sets.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
UNION | Supported | Supported | Combines results, removes duplicates. |
UNION ALL | Supported | Supported | Combines results, keeps duplicates. |
INTERSECT | Supported | Supported | Returns common rows, removes duplicates. |
INTERSECT ALL | Supported | Not Supported | Returns common rows, keeps duplicates. |
EXCEPT | Supported | Supported | Returns unique rows from first set not in second. |
EXCEPT ALL | Supported | Not Supported | Returns rows from first set not in second, keeps duplicates. |
IN (Subquery) | Supported | Supported | Checks for membership in a subquery result. |
EXISTS (Subquery) | Supported | Supported | Checks if a subquery returns any rows. |
CORRESPONDING | Not Supported | Supported | Performs set ops on specified columns by name (RisingWave-specific). |
DDL statements (Data Definition Language)
As a streaming database, RisingWave has a broader set of Data Definition Language (DDL) commands, especially for managing sources, sinks, users, and connections.
Common DDL
Operation | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
CREATE/DROP DATABASE | Supported | Supported | Logical grouping of objects. |
CREATE/DROP VIEW | Supported | Supported | Defines a logical view based on a query. |
CREATE/DROP FUNCTION | Supported | Supported | Defines user-defined functions. |
CREATE/DROP TABLE | Supported | Supported | Flink: Defines the schema and connector for a stream or table. RisingWave: Defines the schema for a source or internal table. |
CREATE MATERIALIZED VIEW | Supported | Supported | Defines a query whose results are stored. This is a core concept in RisingWave. |
DROP MATERIALIZED VIEW | Supported | Supported | Removes a materialized view. |
ALTER DATABASE/VIEW/FUNCTION | Supported | Supported | Modifies existing objects (scope varies). |
RisingWave-specific DDL
RisingWave includes the following DDL commands to manage its database-specific entities:
CREATE/DROP/ALTER SOURCE
: Define connections to external data sources (e.g., Kafka, Kinesis).CREATE/DROP/ALTER SINK
: Define destinations for outputting data.CREATE/DROP/ALTER CONNECTION
: Reusable connection configurations for sources/sinks.CREATE/DROP/ALTER SCHEMA
: Organize objects within a database.CREATE/DROP/ALTER USER
: Manage database users.CREATE/DROP SECRET
: Securely store credentials.CREATE/DROP/ALTER INDEX
: Create indexes on materialized views/tables for faster lookups.CREATE/DROP AGGREGATE
: Define custom aggregate functions.ALTER SYSTEM SET
: Modify system-level configuration parameters.
Flink-specific DDL aspects
CREATE OR REPLACE TABLE
: Flink supports this atomic replacement syntax.ALTER TABLE
: Flink’sALTER TABLE
often has more capabilities for schema evolution depending on the connector and format, compared to RisingWave’s more restricted alteration of stateful sources/MVs. RWALTER TABLE
mainly supports renaming.
DML statements (Data Manipulation Language)
Data Manipulation Language (DML) statements interact with data in tables or trigger updates to materialized views.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
INSERT INTO ... SELECT ... | Supported | Supported | Inserts data from a query result. Triggers MV updates. |
INSERT INTO ... VALUES ... | Supported | Supported | Inserts explicit rows. Triggers MV updates. |
UPDATE ... SET ... WHERE ... | Supported | Supported | Updates existing rows based on conditions. Triggers MV updates. |
DELETE FROM ... WHERE ... | Supported | Supported | Deletes rows based on conditions. Triggers MV updates. |
Multi-table INSERT | Supported | Not Supported | Flink can insert into multiple tables from a single source query. |
FLUSH | Not Supported | Supported | RisingWave-specific: Ensures visibility of preceding DML changes in subsequent queries within a session (mainly for sinks). |
Introspection and utility statements
Commands for exploring metadata, explaining queries, and managing sessions.
Common statements
SHOW DATABASES | TABLES | VIEWS | FUNCTIONS | JOBS
: List common objects or entities (supported by both).DESCRIBE <object>
: Shows object metadata (e.g., columns, types). In Flink,<object>
can be a table, view, or catalog. In RisingWave,<object>
can be a table, source, view, sink, or materialized view.EXPLAIN [statement]
: Shows the logical and physical execution plan for a query.USE [database/catalog]
: Sets the current context for queries.SET [key = value]
: Modifies session configuration settings.
RisingWave-specific SHOW commands
Reflecting its database architecture, RisingWave offers additional SHOW
commands for system introspection:
SHOW CLUSTER
,SHOW PROCESSLIST
,SHOW PARAMETERS
SHOW CONNECTIONS
,SHOW SOURCES
,SHOW SINKS
,SHOW SCHEMAS
SHOW INDEX
,SHOW MATERIALIZED VIEWS
,SHOW INTERNAL TABLES
SHOW CREATE ...
for various RW objects (Sources, Sinks, MVs, etc.)SHOW CURSORS
,SHOW SUBSCRIPTION CURSORS
Flink-specific SHOW commands
SHOW JARS
: Lists user-uploaded JAR files (relevant for Java UDFs).SHOW PARTITIONS
: Shows partitions for partitioned tables (common in batch/Hive integration).SHOW CURRENT DATABASE
: Flink command. RW usesSELECT current_database()
.
JAR management (Flink-specific)
Flink requires managing JAR files containing UDFs or connectors.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
ADD JAR | Supported | Not Supported | Adds a JAR to the SQL classpath. |
SHOW JARS | Supported | Not Supported | Lists added JAR files. |
REMOVE JAR | Supported | Not Supported | Removes a JAR (often session-specific). |
Job management
Both systems allow managing running streaming queries/jobs.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
SHOW JOBS | Supported | Supported | Lists active streaming jobs/pipelines. |
CANCEL JOB [id] | Supported | Supported | Stops a specific running job/pipeline. |
Materialized views
While both support the concept, their role differs significantly.
- RisingWave: Materialized Views are a core architectural concept. Queries are typically defined as MVs, which RisingWave keeps incrementally updated and stores efficiently. State management is built around MVs.
- Flink: Materialized Views (or Materialized Tables) are an optional feature (experimental/evolving in 1.20). Flink can materialize query results using specific connectors, but it’s not the default way queries are executed or state is managed.
Access control (RisingWave-specific)
As a database system, RisingWave provides standard SQL access control.
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
GRANT | Not Supported | Supported | Grants privileges on objects. |
REVOKE | Not Supported | Supported | Revokes privileges from users. |
Flink typically relies on external systems (like cluster managers or catalog providers) for access control.
User-defined Functions (UDFs)
Both systems allow you to extend SQL with custom logic using User-defined Functions (UDFs).
- Supported Languages:
- Flink (1.20): Java, Scala, and Python (primarily via an external Remote Procedure Call (RPC) mechanism).
- RisingWave (2.0): SQL, Python, Java (via Java Native Interface (JNI)), JavaScript, and Rust (the latter via embedded WebAssembly (WASM) or Foreign Function Interface (FFI)).
- Execution Models:
- Flink: Embedded JVM functions (Java/Scala), External RPC functions (Python).
- RisingWave: Embedded SQL functions, Embedded language functions via WASM/FFI/JNI.
- Function Types: Both support Scalar Functions (UDF), Aggregate Functions (UDAF), and Table Functions (UDTF), although the specific implementation details and language support for each type can vary.
- SQL UDFs: RisingWave allows you to define simple scalar functions directly using SQL expressions; Flink does not offer this capability.
Summary: Both offer extensibility, but RisingWave provides native SQL UDFs and uses WASM/FFI for several embedded languages, while Flink primarily uses JVM languages and an RPC mechanism for Python.
Built-in functions
Both Flink and RisingWave provide a rich library of built-in functions covering standard SQL categories. Listing every function is impractical here; instead, we highlight general coverage and key differences.
-
Common Coverage: Both systems offer extensive support for:
- Comparison operators (
=
,>
,<
, etc.) and predicates (IS NULL
,BETWEEN
). - Logical operators (
AND
,OR
,NOT
). - Standard arithmetic operators (
+
,-
,*
,/
,%
) and functions (POWER
,SQRT
,ABS
,ROUND
,CEIL
,FLOOR
). - Common string manipulation functions (
CONCAT
,SUBSTRING
,UPPER
,LOWER
,TRIM
,REPLACE
,LENGTH
,POSITION
,LIKE
). - Basic temporal functions (
EXTRACT
,DATE_FORMAT
/TO_CHAR
, casting between date/time types). - Standard aggregate functions (
COUNT
,SUM
,AVG
,MAX
,MIN
,ARRAY_AGG
,STRING_AGG
). - Conditional expressions (
CASE WHEN ... END
,COALESCE
,NULLIF
). - Type casting (
CAST
). - Basic Array and Map functions/operators.
- Set returning functions like
generate_series
.
- Comparison operators (
-
Key Differences & Specific Functions:
- Flink Specific:
- URL parsing:
PARSE_URL
function. - Safe Casting:
TRY_CAST
(returns NULL on failure instead of error). - More extensive built-in time functions:
LOCALTIME
,CURRENT_TIME
, etc. - Potentially more advanced collection functions (e.g.,
ARRAY_TRANSFORM
). - Specific functions related to
MATCH_RECOGNIZE
.
- URL parsing:
- RisingWave Specific:
- Native
JSONB
operators (->
,->>
,#>
, etc.) and functions. - PostgreSQL compatibility functions (e.g.,
pg_typeof
,pg_sleep
). - Encryption functions:
encrypt()
,decrypt()
. TIMESTAMPTZ
handling functions.- Uses standard
CAST
which errors on failure in queries (but NULLs in MVs).
- Native
- Flink Specific:
Recommendation: Always consult the official Flink SQL and RisingWave SQL documentation for the most current and comprehensive list of supported built-in functions.
System management & information
- RisingWave: Provides SQL functions and
SHOW
commands specifically designed for managing and monitoring the streaming database system itself, providing information such as cluster status, internal tables, process lists, and system parameters. - Flink: Relies on its framework APIs, metrics system, and external tools (like its Web UI or platform integrations) for system management and monitoring, rather than using built-in SQL functions for these tasks.
Other features
Feature | Flink (v1.20) | RisingWave (v2.0) | Notes |
---|---|---|---|
Catalogs | Supported | Supported | Manage metadata namespaces (e.g., connecting to external catalogs). |
SQL Client | Supported | Supported | Interactive command-line interface for SQL execution. |
SQL Gateway | Supported | Supported | Service for accepting remote SQL submissions (e.g., via REST/JDBC). |
Hive Compatibility | Supported | Not Supported | Flink has strong integration with Apache Hive (catalog, formats). |
Conclusion
RisingWave and Flink SQL, while both powerful stream processing tools, cater to slightly different needs and design philosophies, which are reflected in their SQL dialects and features.
- Flink SQL (1.20) is a flexible processing framework with strong Hive integration,
MATCH_RECOGNIZE
for CEP, and specific features likeTRY_CAST
and JAR management. - RisingWave SQL (2.0) offers a PostgreSQL-compatible experience tailored for a streaming database, featuring core materialized views, native
JSONB
, built-in sources/sinks management, access control, and unique DDL/system management capabilities.
Your choice between them depends on whether you need a standalone streaming database solution (RisingWave) or a flexible stream processing framework that integrates with diverse ecosystems (Flink).
Was this page helpful?