Workload isolation and interaction
Manage multiple workloads efficiently in RisingWave using database isolation, resource groups for dedicated compute resources, and cross-database queries for unified access.
Running multiple distinct streaming applications, managing data for different teams, or separating environments (like development and staging) within a single RisingWave cluster offers significant operational advantages. However, it also presents challenges: ensuring stability by preventing interference between workloads, managing resource allocation effectively, and providing ways to access data across logical boundaries when necessary.
From v2.3, RisingWave provides a comprehensive suite of features to address these challenges, enabling robust workload isolation and flexible interaction:
- Database isolation: Core improvements providing inherent resilience between databases by default.
- Resource groups: Assign databases to dedicated pools of compute nodes for strong fault isolation, performance tuning, and resource governance.
- Cross-database queries: Query tables and materialized views across different databases, enabling unified analytics without data duplication, especially powerful when combined with isolated environments.
PREMIUM EDITION FEATURE
Database isolation, resource groups, and cross-database queries are Premium Edition features. All Premium Edition features are available out of the box without additional cost on RisingWave Cloud. For self-hosted deployments, users need to purchase a license key to access this feature. To purchase a license key, please contact sales team at sales@risingwave-labs.com.
For a full list of Premium Edition features, see RisingWave Premium Edition.
PUBLIC PREVIEW
Database isolation, resource groups, and cross-database queries are currently in public preview. While nearing the final product, it may still evolve. Please refer to our Public Preview Feature List for the latest status or reach out via our Slack channel with feedback or issues.
This topic explains these capabilities and how they work together.
Database isolation
RisingWave incorporates foundational improvements that enhance the operational isolation between different databases within the same cluster, providing a more resilient environment out-of-the-box:
- Independent checkpoints: The checkpointing process, vital for maintaining consistent state, generally operates independently per database. An issue triggering a checkpoint failure for a job in
database_A
is significantly less likely to halt or affect the checkpoint process for unrelated jobs indatabase_B
, assuming no direct cross-database dependencies are created (e.g., via cross-database materialized views, see below). - Localized job error recovery: When a streaming job within a database fails (due to internal errors, UDF issues, etc.), the recovery process is primarily contained within that specific database. This prevents localized job failures from unnecessarily disrupting unrelated streaming jobs in other databases.
Benefits:
- Increased stability: Reduces the risk of cascading failures across independent workloads.
- Improved availability: Localized issues have a smaller impact radius, allowing unaffected parts of the cluster to continue functioning.
While database isolation improves logical and operational separation, databases might still share the same underlying pool of compute nodes (CNs) by default. This means they are not isolated from hardware failures affecting that shared pool or from resource contention (CPU, memory). For stronger infrastructure isolation and dedicated resource allocation, use Resource Groups.
Resource groups
Resource groups offer a powerful mechanism for fine-grained control over compute resource allocation and fault domains, essential for critical workloads, multi-tenant scenarios, and performance optimization.
Concept
A resource group is a named logical pool consisting of specific compute nodes (CNs) within your RisingWave cluster. Administrators typically define these groups and their associated CNs during cluster setup or configuration.
How resource groups provide isolation and control
When you assign a database to a resource group, all streaming jobs belonging to that database are constrained to run only on the compute nodes within that designated group. This provides two primary advantages:
-
Granular fault isolation (limiting blast radius):
- By assigning independent databases to resource groups backed by distinct sets of compute nodes, you achieve robust infrastructure fault isolation.
- Example: If a compute node within
resource_group_A
crashes, it only impacts databases assigned toresource_group_A
. Databases inresource_group_B
(running on different CNs) remain unaffected by that specific hardware failure. This is crucial for ensuring the availability of critical applications.
-
Workload matching & performance optimization:
- You can tailor resource allocation to workload needs by creating resource groups backed by specialized hardware.
- Example: Create
rg_high_cpu
using CNs optimized for CPU andrg_high_memory
using CNs with large RAM. Assign CPU-intensive jobs (like complex pattern matching) torg_high_cpu
and memory-intensive jobs (like large stateful aggregations) torg_high_memory
for optimal performance and efficient resource utilization. - This also provides Resource Governance, offering clearer control and visibility over how compute resources are consumed by different applications or tenants.
Using and monitoring resource groups
1. Assigning a database to a resource group:
Use the WITH RESOURCE_GROUP
clause when creating a database using CREATE DATABASE
. The named resource group must exist beforehand.
If the WITH resource_group
clause is omitted, the database is typically assigned to a default resource group, potentially sharing resources more broadly with other databases not explicitly assigned elsewhere.
2. Monitoring resource group usage:
Use system catalogs to inspect resource group configuration and job assignments:
-
rw_resource_groups
: Lists defined groups and summarizes their assigned workers, parallelism, and the number of databases/jobs. -
rw_streaming_jobs
: Shows active streaming jobs, including theirdatabase_id
and assignedresource_group
.
Cross-database queries
While isolating databases and resources using Resource Groups is crucial for stability and organization, you often need to query or join data residing in different databases. Cross-Database Queries allow you to do this directly within RisingWave (starting v2.3), avoiding cumbersome data duplication or external pipelines. This enables unified analytics even across isolated environments.
Capabilities
RisingWave’s cross-database functionality supports:
- Querying tables and materialized views in other databases using
SELECT
statements. - Creating streaming Materialized Views (
CREATE MATERIALIZED VIEW
) and Sinks (CREATE SINK
) that read from tables or MVs in other databases. - Enforcement of necessary permissions (
CONNECT
,SELECT
, etc.) for secure access.
Permissions requirements
To perform cross-database operations, users generally need:
- The
CONNECT
privilege on the target (remote) database(s) they need to access. - Standard object-level privileges (e.g.,
SELECT
orUSAGE
) on the specific tables, views, or schemas being accessed in the remote database. - Alternatively, being a
superuser
bypasses these specific checks.
Ad-hoc queries across databases (SELECT
)
You can query tables or views in another database using the standard three-part naming convention: database_name.schema_name.object_name
.
Interaction with Resource Groups: When you run an ad-hoc cross-database query, the compute resources used for executing that specific query are typically managed within the resource group associated with the database you are currently connected to.
Cross-database streaming DDL (CREATE MV
, CREATE SINK
) - Requires subscriptions
The real power for continuous analysis comes from creating Materialized Views (MVs) or Sinks that process data across database boundaries. This allows you to maintain incrementally updated, unified views or data streams originating from different sources.
Critical requirement: CREATE SUBSCRIPTION
- Why? Streaming MVs and Sinks in RisingWave update incrementally by consuming change streams (changelogs) from their upstream sources (tables or MVs). For an MV/Sink in
database_A
to react to ongoing changes in a table/MV fromdatabase_B
, it needs durable, cross-boundary access to the changelog of the source object indatabase_B
. - How? The
CREATE SUBSCRIPTION
command makes this change stream persistently available across databases. Before creating an MV or Sink that reads from a table or MV in another database, you must first create at least one subscription for that specific upstream object in the source database. The upstream object must also have a primary key defined. - Verification: RisingWave verifies the existence of these necessary log stores (enabled by subscriptions) for all cross-database upstreams during
CREATE MATERIALIZED VIEW
orCREATE SINK
operations.
Example workflow:
-
In the source database (
d1
), create a subscription for the upstream table (t1
): -
In the target database (
d2
), create the cross-database MV:
Creating a cross-database Materialized View or Sink introduces a dependency between the databases involved regarding checkpointing and job execution. Actors processing the MV/Sink might run within its database’s resource group while consuming data originating from another resource group’s source database via the subscription.
Additionally, RisingWave provides safeguards: DROP MATERIALIZED VIEW ... CASCADE
or DROP SINK ... CASCADE
operations are rejected if they have cross-database dependencies to prevent accidental data loss in the upstream database.
Common use cases and benefits summary
Leveraging database isolation, resource groups, and cross-database queries together enables several powerful patterns:
- Multi-tenancy: Isolate tenants in separate databases/resource groups for security and stability, while using cross-database queries for aggregated administrative reporting.
- Domain-driven design: Separate microservice or business domain data (e.g.,
orders_db
,customers_db
) for clarity and independent scaling, then use cross-database MVs in ananalytics_db
for holistic views. - Critical workload protection: Guarantee resources and isolate critical systems (like payments or fraud detection) from less critical workloads using dedicated resource groups.
- Performance optimization: Assign resource-intensive jobs to appropriately configured hardware pools via resource groups.
- Simplified architecture: Avoid complex external ETL/replication pipelines for integrating data that already resides within the same RisingWave cluster. Query it directly.
- Data consistency: Access the single source of truth via direct queries, eliminating stale data issues common with duplication methods.
Related topics
CREATE DATABASE
(includesWITH RESOURCE_GROUP
)- Manage subscriptions
- RisingWave system catalogs (see
rw_resource_groups
,rw_streaming_jobs
) - RisingWave Premium
Was this page helpful?