> ## Documentation Index
> Fetch the complete documentation index at: https://docs.risingwave.com/llms.txt
> Use this file to discover all available pages before exploring further.

# CREATE INDEX

> Use the `CREATE INDEX` command to construct an [index](/processing/indexes) on a table or a materialized view.

## Syntax

Indexes can be standard for accelerating lookups and sorts or vector indexes for similarity search in machine learning, AI applications, etc.

```sql Standard index syntax theme={null}
CREATE INDEX [ IF NOT EXISTS ] index_name ON object_name ( index_column [ ASC | DESC ], [, ...] )
[ INCLUDE ( include_column [, ...] ) ]
[ DISTRIBUTED BY ( distributed_column [, ...] ) ];
```

```sql Vector index syntax theme={null}
CREATE INDEX [ IF NOT EXISTS ] index_name ON object_name
USING index_method ( index_column )
[INCLUDE ( include_column [, ...] ) ]
WITH (distance_type = '<type>', ...);
```

<Note>
  Vector index is added in v2.6.0 and is in **[technical preview](/changelog/product-lifecycle#product-release-lifecycle)** stage. Currently, we only support creating vector indexes on append-only inputs, such as append-only tables or materialized views.
</Note>

## Parameters

| Parameter or clause       | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| :------------------------ | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **IF NOT EXISTS**         | This clause is used to check if an index with the specified name already exists before creating a new index. If the index already exists, the clause prevents an error from occurring and the index creation operation is skipped. A notice is issued in this case. Note that there is no guarantee that the existing index is anything like the one that would have been created. Index name is required when `IF NOT EXISTS` is specified.                                                                                                                                                        |
| index\_name               | The name of the index to be created.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| object\_name              | The name of the table or materialized view where the index is created.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| index\_column             | The column on which the index is created, or an expression involving one or more columns. For example, you can create an index on a single column (`column_name`) or on an expression such as `some_function(column_name)`. This applies to both [standard indexes](/processing/indexes#indexes-on-expressions) and [vector indexes](/processing/vector-indexes#vector-indexes-on-function-expressions).                                                                                                                                                                                            |
| **DESC**                  | Sort the data returned in descending order.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **INCLUDE** clause        | Specify the columns to include in the index as non-key columns. An index-only query can return the values of non-key columns without having to visit the indexed table thus improving the performance. If you omit the `INCLUDE` clause, all columns of the table or materialized view will be indexed. This is recommended in RisingWave. If you only want to include the `index_column`, use `CREATE INDEX ON object_name(index_column) INCLUDE(index_column)`. See [How to decide which columns to include](/processing/indexes#how-to-decide-which-columns-to-include%3F) for more information. |
| **DISTRIBUTED BY** clause | Specify the index distribution key. As a distributed database, RisingWave distributes the data across multiple nodes. When an index is created, the distribution key is used to determine how the data should be distributed across these nodes. If you omit the `DISTRIBUTED BY` clause, the first index column will be be used as the default distribution `key.distributed_column` has to be the prefix of `index_column`. See [How to decide the index distribution key](/processing/indexes#how-to-decide-the-index-distribution-key%3F) for more information.                                 |
| USING index\_method       | Specify the index method. Supported methods are `default`(for standard index), `flat`(for vector index) and `HNSW`(for vector index).                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| distance\_type            | Specify the similarity metric for vector indexes. Supported types are `l1`, `l2`, `inner_product`, and `cosine`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |

## Backfill and isolation

When creating an index, RisingWave will backfill historical data from the indexed table or materialized view. Snapshot backfill is enabled by default to improve isolation between the backfill phase and streaming phase. This helps prevent resource contention between backfilling historical data and processing new streaming data. To disable it if needed, set `SET streaming_use_snapshot_backfill=false;` before creating the index. For more details, see [View and configure runtime parameters](/operate/view-configure-runtime-parameters).

## Examples

### Standard index

Let's create two tables, `customers` and `orders`.

```sql theme={null}
CREATE TABLE customers (
    c_custkey INTEGER,
    c_name VARCHAR,
    c_address VARCHAR,
    c_nationkey INTEGER,
    c_phone VARCHAR,
    c_acctbal NUMERIC,
    c_mktsegment VARCHAR,
    c_comment VARCHAR,
    PRIMARY KEY (c_custkey)
);

CREATE TABLE orders (
    o_orderkey BIGINT,
    o_custkey INTEGER,
    o_orderstatus VARCHAR,
    o_totalprice NUMERIC,
    o_orderdate DATE,
    o_orderpriority VARCHAR,
    o_clerk VARCHAR,
    o_shippriority INTEGER,
    o_comment VARCHAR,
    PRIMARY KEY (o_orderkey)
);
```

If you want to speed up the query of fetching a customer record by the phone number, you can build an index on the `c_phone` column in the `customers` table.

```sql theme={null}
CREATE INDEX idx_c_phone on customers(c_phone);

SELECT * FROM customers where c_phone = '123456789';

SELECT * FROM customers where c_phone in ('123456789', '987654321');
```

If you want to speed up the query of fetching all the orders of a customer by the customer key, you can build an index on the `o_custkey` column in the `orders` table.

```sql theme={null}
CREATE INDEX idx_o_custkey ON orders(o_custkey);

SELECT * FROM customers JOIN orders ON c_custkey = o_custkey
WHERE c_phone = '123456789';
```

### Vector index

In an e-commerce platform, each product can be represented by an embedding vector that captures semantic features derived from its title and description. A vector index can then be used to recommend items similar to a user’s query.

1. Create a table with vector data. This stores product information along with a semantic embedding vector that represents its features.

```sql theme={null}
CREATE TABLE products (
    id INT PRIMARY KEY,
    name STRING,
    description STRING,
    embedding VECTOR(128)
) APPEND ONLY; -- Must be append only mode here
```

2. Build a vector index to enable fast similarity search. Include product attributes (name, description) in the index for direct retrieval.

```sql theme={null}
CREATE INDEX prod_vec_idx
ON products USING hnsw (embedding)
INCLUDE (name, description)
WITH (distance_type = 'inner_product');
```

3. Given a query vector (e.g., representing “blue running shoes”), retrieve the top 5 most similar products.

```sql theme={null}
SELECT id, name, description
FROM products
ORDER BY embedding <-> $1::vector(128)
LIMIT 5;
```

## Related topics

* RisingWave supports creating indexes on expressions. For more details, see [Indexes on expressions](/processing/indexes#indexes-on-expressions).

* [Standard indexes](/processing/indexes)

* [Vector indexes](/processing/vector-indexes)
