Syntax
Indexes can be standard for accelerating lookups and sorts or vector indexes for similarity search in machine learning, AI applications, etc.Standard index syntax
Vector index syntax
Vector index is added in v2.6.0 and is in technical preview stage. Currently, we only support creating vector indexes on append-only inputs, such as append-only tables or materialized views.
Parameters
Parameter or clause | Description |
---|---|
IF NOT EXISTS | This clause is used to check if an index with the specified name already exists before creating a new index. If the index already exists, the clause prevents an error from occurring and the index creation operation is skipped. A notice is issued in this case. Note that there is no guarantee that the existing index is anything like the one that would have been created. Index name is required when IF NOT EXISTS is specified. |
index_name | The name of the index to be created. |
object_name | The name of the table or materialized view where the index is created. |
index_column | The column on which the index is created, or an expression involving one or more columns. For example, you can create an index on a single column (column_name ) or on an expression such as some_function(column_name) . This applies to both standard indexes and vector indexes. |
DESC | Sort the data returned in descending order. |
INCLUDE clause | Specify the columns to include in the index as non-key columns. An index-only query can return the values of non-key columns without having to visit the indexed table thus improving the performance. If you omit the INCLUDE clause, all columns of the table or materialized view will be indexed. This is recommended in RisingWave. If you only want to include the index_column , use CREATE INDEX ON object_name(index_column) INCLUDE(index_column) . See How to decide which columns to include for more information. |
DISTRIBUTED BY clause | Specify the index distribution key. As a distributed database, RisingWave distributes the data across multiple nodes. When an index is created, the distribution key is used to determine how the data should be distributed across these nodes. If you omit the DISTRIBUTED BY clause, the first index column will be be used as the default distribution key.distributed_column has to be the prefix of index_column . See How to decide the index distribution key for more information. |
USING index_method | Specify the index method. Supported methods are default (for standard index), flat (for vector index) and HNSW (for vector index). |
distance_type | Specify the similarity metric for vector indexes. Supported types are l1 , l2 , inner_product , and cosine . |
Examples
Standard index
Let’s create two tables,customers
and orders
.
c_phone
column in the customers
table.
o_custkey
column in the orders
table.
Vector index
In an e-commerce platform, each product can be represented by an embedding vector that captures semantic features derived from its title and description. A vector index can then be used to recommend items similar to a user’s query.- Create a table with vector data. This stores product information along with a semantic embedding vector that represents its features.
- Build a vector index to enable fast similarity search. Include product attributes (name, description) in the index for direct retrieval.
- Given a query vector (e.g., representing “blue running shoes”), retrieve the top 5 most similar products.
Related topics
- RisingWave supports creating indexes on expressions. For more details, see Indexes on expressions.
- Standard indexes
- Vector indexes