Skip to main content

Top-N by group

Top-N queries return only the N top-most or the N bottom-most records from a table or view based on a condition.

In RisingWave, a Top-N query includes one or more ranking function clauses and one or more rank filtering conditions. In the ranking function clause, you can include a PARTITION BY clause to fetch top N rows per group.

Syntax

SELECT [column_list] 
FROM (
SELECT [column_list],
ranking_function_clause AS rank1
[ranking_function_clause AS rank2 [, ...]]
FROM table_name)
WHERE rank_range;

The syntax of the ranking_function_clause is:

function_name() OVER ([PARTITION BY col1[, col2...]] 
ORDER BY col1 [ ASC | DESC ][, col2 [ ASC | DESC ]...])
note

rank cannot be included in column_list.

info

You must follow the pattern exactly to construct a valid Top-N query.

ParameterDescription
function_nameRisingWave supports two window functions in top-N queries:
  • row_number(): Returns the sequential row ordinal (1-based) of each row for each ordered partition.
  • rank(): Returns the ordinal (1-based) rank of each row within the ordered partition. All peer rows receive the same rank value. The next row or set of peer rows receives a rank value which increments by the number of peers with the previous rank value.
PARTITION BY clauseSpecifies the partition columns. Each partition will have a Top-N result.
ORDER BY clauseSpecifies how the rows are ordered.
rank_rangeSpecifies the range of the rank number. The rank range is required for the query to be recognized as a top-N query. If multiple ranking function clauses are included, you can include multiple rank filtering conditions as the rank range. For example, rank1 < 5 AND rank2 < 3. The range can be specified in other forms.
Examples: WHERE M < rank AND rank < N or WHERE rank between M and N. Optionally, you can specify any additional conditions to further filter the results.

Example

Create a table
CREATE TABLE t (x int, y int, z int);
Insert data
INSERT INTO t (x, y, z) VALUES
(1, 10, 50),
(1, 10, 60),
(1, 10, 70),
(1, 11, 55),
(1, 11, 65),
(2, 20, 30),
(2, 20, 40),
(2, 21, 25),
(2, 21, 35),
(2, 21, 45);
Run a top-N query
SELECT r1
FROM (
SELECT
*,
row_number() OVER (PARTITION BY x ORDER BY y) r1,
row_number() OVER (PARTITION BY x, y ORDER BY z) r2,
rank() OVER (PARTITION BY x, y ORDER BY z) r3
FROM t
) Q
WHERE Q.r1 < 10 AND Q.r2 < 10 AND Q.r3 < 10;

Help us make this doc better!