Performance highlights

RisingWave demonstrates low-latency query responses and high throughput for various OLAP workloads, even under high concurrency. Key takeaways from the sysbench results include:

  • Low latency: RisingWave consistently delivers low-latency query responses, with average latencies as low as 4.96 ms for point selects and between 13.84 ms and 16.40 ms for the more complex random-point and random-range queries. P95 latencies stay between 8.90 ms and 24.38 ms across all tests, indicating predictable performance for most requests.
  • High throughput: RisingWave achieves high query throughput, processing up to 25,814 queries per second (QPS) for point selects and over 8,000 QPS for random point and range selects.
  • Stable performance: RisingWave exhibits stable performance even when the data size increases from 1 million rows to 10 million rows, as long as the data can be cached in memory.

Benchmark methodology

The performance tests were conducted using a modified version of the sysbench benchmark, a widely recognized tool for evaluating database performance. The RisingWave team forked the official sysbench repository and made necessary adjustments for compatibility.

Environment

  • Hardware:
    • One AWS EC2 instance (8 vCPUs, 16GB memory) for the RisingWave compute node.
    • One AWS EC2 instance (8 vCPUs, 16GB memory) for the RisingWave frontend node.
    • One AWS EC2 instance (8 vCPUs, 16GB memory) for the sysbench client (query generator).
  • Software: RisingWave. The report does not state which version was tested.

Test procedure

  1. A table named sbtest was created in RisingWave with the following schema:

    CREATE TABLE sbtest(
      id INTEGER,
      k INTEGER,
      c VARCHAR,
      pad VARCHAR,
      PRIMARY KEY (id)
    );
    
  2. Data was inserted into the sbtest table. Two data sizes were tested: 1 million rows (approximately 186MB) and 10 million rows (approximately 1.86GB).

  3. 128 threads were used to concurrently issue queries against the table. This level of concurrency is sufficient to saturate the CPU of the RisingWave deployment.

  4. Results, including latency and throughput, were collected.
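
Steps 3 and 4 describe a closed-loop load test: a fixed pool of client threads issues queries back to back while latencies are recorded. The following Python sketch illustrates that pattern only; it is not the sysbench implementation. `fake_query` is a hypothetical stand-in for a real database call, and the nearest-rank percentile is an assumption (sysbench derives its percentiles from a histogram).

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (assumed method)."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def worker(run_query, iterations):
    """Issue queries back to back and return per-query latencies in ms."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def run_benchmark(run_query, threads=128, iterations=100):
    """Run `threads` concurrent workers and summarize latency and throughput."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker, run_query, iterations) for _ in range(threads)]
        samples = sorted(ms for f in futures for ms in f.result())
    elapsed = time.perf_counter() - wall_start
    return {
        "queries": len(samples),
        "min_ms": samples[0],
        "avg_ms": sum(samples) / len(samples),
        "p95_ms": percentile(samples, 95),
        "p99_ms": percentile(samples, 99),
        "max_ms": samples[-1],
        "qps": len(samples) / elapsed,
    }


def fake_query():
    """Hypothetical stand-in for a real query: sleeps about 1 ms."""
    time.sleep(0.001)


if __name__ == "__main__":
    print(run_benchmark(fake_query, threads=8, iterations=50))
```

Replacing `fake_query` with a function that sends one statement over a real connection would turn this into a very small load generator, although a Python client is unlikely to drive the server as hard as sysbench does.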

Workload

Three types of queries from the sysbench OLTP workload were used:

  • oltp_point_select:

    SELECT c FROM sbtest WHERE id = ?;
    
  • select_random_points:

    SELECT id, k, c, pad FROM sbtest WHERE k IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?);
    
  • select_random_ranges:

    SELECT count(k) FROM sbtest WHERE k BETWEEN ? AND ? OR k BETWEEN ? AND ?;
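
To make the parameter shapes concrete, the sketch below pairs each statement with randomly chosen bind values. It assumes uniformly distributed integers in 1..table_size for both id and k, and the range width `delta` is a made-up parameter; the distributions actually used by the forked sysbench are not described in this report.

```python
import random

POINT_SELECT = "SELECT c FROM sbtest WHERE id = ?;"
RANDOM_POINTS = (
    "SELECT id, k, c, pad FROM sbtest "
    "WHERE k IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?);"
)
RANDOM_RANGES = (
    "SELECT count(k) FROM sbtest "
    "WHERE k BETWEEN ? AND ? OR k BETWEEN ? AND ?;"
)


def point_select_params(rng, table_size):
    """One primary-key value (assumed uniform over 1..table_size)."""
    return (rng.randint(1, table_size),)


def random_points_params(rng, table_size):
    """Ten values for the IN list (assumed uniform, duplicates allowed)."""
    return tuple(rng.randint(1, table_size) for _ in range(10))


def random_ranges_params(rng, table_size, delta=5):
    """Two (low, high) pairs; `delta` is a hypothetical range width."""
    params = []
    for _ in range(2):
        low = rng.randint(1, table_size)
        params.extend((low, low + delta))
    return tuple(params)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    size = 1_000_000
    for sql, params in (
        (POINT_SELECT, point_select_params(rng, size)),
        (RANDOM_POINTS, random_points_params(rng, size)),
        (RANDOM_RANGES, random_ranges_params(rng, size)),
    ):
        print(sql, params)
```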
    

Detailed benchmark results

Data size: 1 million rows

    Query                   Min (ms)   Avg (ms)   P95 (ms)   P99 (ms)   Max (ms)   Throughput (QPS)
    oltp_point_select           0.40       5.05       9.22      15.27     131.75             25,335
    select_random_points        1.27      15.11      21.89      31.94     131.87              8,467
    select_random_ranges        1.92      13.98      20.00      35.59     233.90              9,156

Data size: 10 million rows

    Query                   Min (ms)   Avg (ms)   P95 (ms)   P99 (ms)   Max (ms)   Throughput (QPS)
    oltp_point_select           0.40       4.96       8.90      18.28     233.90             25,814
    select_random_points        0.84      16.40      24.38      32.53     175.82              8,203
    select_random_ranges        0.86      13.84      19.65      38.94     142.34              9,247
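
As a sanity check on the reported figures, Little's law relates the three quantities in a closed-loop test: concurrency ≈ throughput × average latency. Assuming all 128 client threads stay busy and client-side overhead is negligible (both are assumptions, not statements from the report), the expected throughput is 128 divided by the average latency. The sketch below compares that estimate with the reported QPS.

```python
THREADS = 128  # client concurrency stated in the test procedure

# (data size, query, average latency in ms, reported QPS) from the results tables
RESULTS = [
    ("1M", "oltp_point_select", 5.05, 25_335),
    ("1M", "select_random_points", 15.11, 8_467),
    ("1M", "select_random_ranges", 13.98, 9_156),
    ("10M", "oltp_point_select", 4.96, 25_814),
    ("10M", "select_random_points", 16.40, 8_203),
    ("10M", "select_random_ranges", 13.84, 9_247),
]


def expected_qps(threads, avg_latency_ms):
    """Little's law for a closed loop: throughput = concurrency / latency."""
    return threads / (avg_latency_ms / 1000.0)


def relative_gap(expected, reported):
    """Signed difference of the estimate from the reported value, as a fraction."""
    return (expected - reported) / reported


if __name__ == "__main__":
    for size, query, avg_ms, qps in RESULTS:
        est = expected_qps(THREADS, avg_ms)
        print(f"{size:>3} {query:<22} est={est:9.0f} reported={qps:6d} "
              f"gap={relative_gap(est, qps):+.1%}")
```

Under these assumptions, five of the six estimates land within about 0.1% of the reported throughput. The estimate for select_random_points at 10 million rows is roughly 5% below the reported value.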

Key observations

  • Memory impact: When the memory is sufficient to cache all the queried data, performance remains relatively stable as the table size grows. This highlights the importance of memory capacity for OLAP workloads.
  • Bottlenecks: In the oltp_point_select workload, the frontend node becomes the primary bottleneck due to the lightweight nature of the operations. For select_random_points and select_random_ranges, both the compute and frontend nodes experience CPU saturation.
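
The stability observation can be quantified from the reported figures. The sketch below computes the relative change in average latency and throughput when the table grows from 1 million to 10 million rows. The numbers are copied from the results tables, and the helper names are illustrative.

```python
# query -> (average latency in ms, QPS) as reported for each data size
ROWS_1M = {
    "oltp_point_select": (5.05, 25_335),
    "select_random_points": (15.11, 8_467),
    "select_random_ranges": (13.98, 9_156),
}
ROWS_10M = {
    "oltp_point_select": (4.96, 25_814),
    "select_random_points": (16.40, 8_203),
    "select_random_ranges": (13.84, 9_247),
}


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def compare(small, large):
    """Return {query: (latency change %, QPS change %)} between two runs."""
    return {
        query: (
            pct_change(small[query][0], large[query][0]),
            pct_change(small[query][1], large[query][1]),
        )
        for query in small
    }


if __name__ == "__main__":
    for query, (lat, qps) in compare(ROWS_1M, ROWS_10M).items():
        print(f"{query:<22} avg latency {lat:+.1f}%  QPS {qps:+.1f}%")
```

With these inputs, throughput moves by roughly 1% to 3% in either direction, and the largest shift is an average-latency increase of about 8.5% for select_random_points.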