
Sink data from RisingWave to StarRocks

This guide describes how to sink data from RisingWave to StarRocks.

StarRocks is an open-source, massively parallel processing (MPP) database. For details on how to get started with StarRocks, see the Quick start guide.

StarRocks Stream Load does not support sinking struct types.

Prerequisites

Before sinking data from RisingWave to StarRocks, please ensure the following:

  • The StarRocks database you want to sink to is accessible from RisingWave.
  • You have an upstream materialized view or source in RisingWave that you can sink data from.

Syntax

CREATE SINK [ IF NOT EXISTS ] sink_name
[FROM sink_from | AS select_query]
WITH (
connector='starrocks',
connector_parameter = 'value', ...
);

Parameters

All parameters are required unless specified otherwise.

| Parameter name | Description |
| --- | --- |
| starrocks.host | The StarRocks host address. |
| starrocks.query_port | The port to the MySQL server of the StarRocks frontend. |
| starrocks.http_port | The port to the HTTP server of the StarRocks frontend. |
| starrocks.user | The user name used to access the StarRocks database. |
| starrocks.password | The password associated with the user. |
| starrocks.database | The StarRocks database where the target table is located. |
| starrocks.table | The StarRocks table you want to sink data to. |
| starrocks.partial_update | Optional. If set to "true", the partial update optimization feature of StarRocks is enabled. This feature improves ingestion performance when a large number of rows must be updated but only a small number of columns change. You can learn more about this feature in the partial update optimization section of the StarRocks documentation. |
| type | Data format. Allowed formats: append-only (output data with insert operations) and upsert (output data as a changelog stream; the StarRocks target must be a Primary Key table). |
| force_append_only | If true, forces the sink to be append-only, even if it cannot be. |
| primary_key | Required if type is upsert. The primary key of the downstream table. |
| commit_checkpoint_interval | Optional. Use this parameter to decouple the downstream system's commit from RisingWave's commit. Instead of committing data to the downstream system at every barrier, RisingWave commits data only when the specified checkpoint interval is reached. For instance, if commit_checkpoint_interval is set to 5, RisingWave commits data every five checkpoints. The default value is 1. When commit_checkpoint_interval is a positive integer larger than 1, the sink_decouple option is enabled automatically. |
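As an illustration of how the upsert-related parameters fit together, here is a minimal sketch. The table name user_totals, the materialized view user_totals_mv, the columns, and the credentials are all hypothetical and are not taken from this guide. The sketch assumes the StarRocks target is created as a Primary Key table, as the type parameter requires for upsert.

```sql
-- Run in StarRocks: a hypothetical Primary Key table to receive the changelog.
CREATE TABLE demo.user_totals (
    user_id BIGINT NOT NULL,
    total_amount DECIMAL(18, 2)
)
PRIMARY KEY (user_id)
DISTRIBUTED BY HASH (user_id);

-- Run in RisingWave: an upsert sink from a hypothetical materialized view.
-- primary_key names the key column(s) of the downstream table.
-- commit_checkpoint_interval = '5' commits every five checkpoints, which,
-- per the parameter description, also enables sink_decouple automatically.
CREATE SINK user_totals_sink
FROM user_totals_mv WITH (
    connector = 'starrocks',
    type = 'upsert',
    primary_key = 'user_id',
    starrocks.host = 'starrocks-fe',
    starrocks.query_port = '9030',
    starrocks.http_port = '8030',
    starrocks.user = 'root',
    starrocks.password = '123456',
    starrocks.database = 'demo',
    starrocks.table = 'user_totals',
    commit_checkpoint_interval = '5'
);
```

A larger interval trades freshness in StarRocks for fewer, larger commits, so the right value depends on how quickly downstream readers need to see changes.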

Examples

Assume we have a materialized view, bhv_mv.

CREATE SINK bhv_starrocks_sink
FROM bhv_mv WITH (
connector = 'starrocks',
type = 'append-only',
starrocks.host = 'starrocks-fe',
starrocks.query_port = '9030',
starrocks.http_port = '8030',
starrocks.user = 'users',
starrocks.password = '123456',
starrocks.database = 'demo',
starrocks.table = 'demo_bhv_table',
force_append_only='true'
);

Data type mapping

The following table shows the RisingWave data type that corresponds to each StarRocks data type when creating a sink. For details on native RisingWave data types, see Overview of data types.

| StarRocks type | RisingWave type |
| --- | --- |
| BOOLEAN | BOOLEAN |
| SMALLINT | SMALLINT |
| INT | INTEGER |
| BIGINT | BIGINT |
| FLOAT | REAL |
| DOUBLE | DOUBLE |
| DECIMAL | DECIMAL |
| DATE | DATE |
| VARCHAR | VARCHAR |
| No support | TIME |
| DATETIME | TIMESTAMP WITHOUT TIME ZONE |
| No support | TIMESTAMP WITH TIME ZONE (can be converted to TIMESTAMP in RisingWave and then sunk into StarRocks) |
| No support | INTERVAL |
| No support | STRUCT |
| ARRAY | ARRAY |
| No support | BYTEA |
| JSON | JSONB |
| BIGINT | SERIAL |
Note

Before v1.9, inserting data into a StarRocks sink reported an error if a value was "nan" (not a number), "inf" (infinity), or "-inf" (negative infinity). Since v1.9, if a decimal value is out of bounds or is "inf", "-inf", or "nan", RisingWave inserts a null value instead.
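The data type mapping lists TIMESTAMP WITH TIME ZONE as unsupported but convertible in RisingWave before sinking. One possible way to do that is to use the AS select_query form of CREATE SINK and convert the column in the query. The sketch below assumes a hypothetical materialized view events_mv with an event_time column of type TIMESTAMP WITH TIME ZONE, and a StarRocks table demo.events whose event_time column is DATETIME. None of these names come from this guide, and UTC is only one choice of target time zone.

```sql
-- Run in RisingWave.
-- AT TIME ZONE 'UTC' turns the timestamptz value into a TIMESTAMP WITHOUT
-- TIME ZONE expressed in UTC, which maps to DATETIME in StarRocks.
CREATE SINK events_starrocks_sink AS
SELECT
    event_id,
    event_time AT TIME ZONE 'UTC' AS event_time
FROM events_mv
WITH (
    connector = 'starrocks',
    type = 'append-only',
    force_append_only = 'true',
    starrocks.host = 'starrocks-fe',
    starrocks.query_port = '9030',
    starrocks.http_port = '8030',
    starrocks.user = 'root',
    starrocks.password = '123456',
    starrocks.database = 'demo',
    starrocks.table = 'events'
);
```

Because the converted column no longer carries zone information, readers of the StarRocks table need to know which time zone was chosen in the query.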
