This guide describes how to sink data from RisingWave to ClickHouse using the ClickHouse sink connector in RisingWave.
Parameter Names | Description |
---|---|
type | Required. Specify if the sink should be upsert or append-only. If creating an upsert sink, see the Overview on when to define the primary key and Upsert sinks on limitations. |
primary_key | Optional. A string of a list of column names, separated by commas, that specifies the primary key of the ClickHouse sink. |
clickhouse.url | Required. Address of the ClickHouse server that you want to sink data to. Format: http://ip:port . The default port is 8123. |
clickhouse.user | Required. User name for accessing the ClickHouse server. |
clickhouse.password | Required. Password for accessing the ClickHouse server. |
clickhouse.database | Required. Name of the ClickHouse database that you want to sink data to. |
clickhouse.table | Required. Name of the ClickHouse table that you want to sink data to. |
commit_checkpoint_interval | Optional. Commit every N checkpoints (N > 0). Default value is 10. The behavior of this field also depends on the sink_decouple setting:
|
clickhouse.delete.column | Optional. You can run an upsert sink using the ReplacingMergeTree engine. When using the ReplacingMergeTree engine, you can specify the delete column with this parameter. |
append-only
sinks for all ClickHouse engines, support for upsert
sinks is limited. Additionally, for ReplacingMergeTree engines, an append-only
sink will not insert duplicate data.
RisingWave supports upsert
sinks for the following ClickHouse engines:
DELETE
operations are transformed into INSERT with SIGN = -1
.DELETE
operations are transformed into INSERT with SIGN = -1
.DELETE
operations are transformed into INSERT with SIGN = 1
.seq_id
and the ENGINE set to ReplacingMergeTree
. It’s important to emphasize that without using ReplacingMergeTree
or other deduplication techniques, there is a significant risk of duplicate writes to ClickHouse.
Note that only S3-compatible object store is supported, such as AWS S3 or MinIO.
type = append-only
in the CREATE SINK
SQL query.
type = append-only
and force_append_only = true
. This will ignore delete messages in the upstream, and turn upstream update messages into insert messages.
type = upsert
. When the sink type is upsert, be sure to set the primary_key
field to specify the primary key of the downstream ClickHouse table.
RisingWave Data Type | ClickHouse Data Type |
---|---|
boolean | Bool |
smallint | Int16 or UInt16 |
integer | Int32 or UInt32 |
bigint | Int64 or UInt64 |
real | Float32 |
double precision | Float64 |
decimal | Decimal |
character varying | String |
bytea | Not supported |
date | Date32 |
time without time zone | Not supported |
timestamp | Not supported. Please convert timestamp to timestamptz within RisingWave before sinking. |
timestamptz | DateTime64 |
interval | Not supported |
struct | Nested |
array | Array |
JSONB | Not supported |
Nested
data type doesn’t support multiple levels of nesting. Therefore, when sinking RisingWave’s struct
data to ClickHouse, you need to flatten or restructure the nested data to align with ClickHouse’s requirement.
Before v1.9, when inserting data into a ClickHouse sink, an error would be reported if the values were “nan (not a number)”, “inf (infinity)”, or “-inf (-infinity)”. Since v1.9, we have made a change to this behavior. If the ClickHouse column is nullable, we will insert null values in such cases. If the column is not nullable, we will insert 0
instead.
Please be aware that the range of specific values varies among ClickHouse types and RisingWave types. Refer to the table below for detailed information.
ClickHouse type | RisingWave type | ClickHouse range | RisingWave range |
---|---|---|---|
Date32 | DATE | 1900-01-01 to 2299-12-31 | 0001-01-01 to 9999-12-31 |
DateTime64 | TIMESTAMPTZ | 1900-01-01 00:00:00 to 2299-12-31 23:59:59.99999999 | 0001-01-01 00:00:00 to 9999-12-31 23:59:59 |