FORMAT and ENCODE parameters
When creating a source or table using a connector, you need to specify the FORMAT and ENCODE section of the CREATE SOURCE or CREATE TABLE statement. This topic provides an overview of the formats and encoding options. For the complete list of formats we support, see Supported sources and formats.
The following statement creates a Kafka source that reads Protobuf-encoded messages:
CREATE SOURCE src_user WITH (
    connector = 'kafka',
    topic = 'sr_pb_test',
    properties.bootstrap.server = 'message_queue:29092',
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE PROTOBUF(
    schema.registry = 'http://message_queue:8081',
    message = 'test.User');
The FORMAT parameter represents the organization format of the data and includes the following options:
- PLAIN: No specific data format. Data in this format can be imported into RisingWave using CREATE SOURCE and CREATE TABLE.
- UPSERT: UPSERT format, where messages consumed from the message queue perform an UPSERT in RisingWave based on the primary key. To ensure UPSERT correctness, data in UPSERT format from the message queue can only be imported into RisingWave using CREATE TABLE.
- DEBEZIUM, MAXWELL, CANAL, DEBEZIUM_MONGO: Mainstream Change Data Capture (CDC) formats, where messages consumed from the message queue are processed and imported into RisingWave according to the corresponding CDC format's specification. To ensure CDC correctness, data in CDC format from the message queue can only be imported into RisingWave using CREATE TABLE.
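As a minimal sketch of the CDC case, the statement below ingests Debezium change events with CREATE TABLE rather than CREATE SOURCE. The table name, columns, and topic are hypothetical, the broker address is reused from the example above, and the primary key is declared on the assumption that RisingWave needs one to apply updates and deletes to the right row.

```sql
-- Hypothetical table and topic names, for illustration only.
-- CDC formats must be ingested with CREATE TABLE, not CREATE SOURCE.
CREATE TABLE orders_cdc (
    order_id INT PRIMARY KEY,  -- assumed key used to match change events to rows
    status VARCHAR
) WITH (
    connector = 'kafka',
    topic = 'dbserver.public.orders',
    properties.bootstrap.server = 'message_queue:29092',
    scan.startup.mode = 'earliest'
)
FORMAT DEBEZIUM ENCODE JSON;
```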
The ENCODE parameter represents the data encoding and includes the following options:
- JSON: Data serialized in JSON format in the message queue, compatible with all FORMAT options.
- AVRO: Data serialized in Avro format in the message queue, compatible with FORMAT PLAIN / UPSERT / DEBEZIUM.
- PROTOBUF: Data serialized in Protobuf format in the message queue, compatible with FORMAT PLAIN / UPSERT.
- CSV: Data serialized in CSV format in the message queue, compatible with FORMAT PLAIN.
- BYTES: Data exists in the message queue in raw bytes format, compatible with FORMAT PLAIN.
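For contrast with the Protobuf example at the top of this topic, here is a sketch of the simplest pairing, FORMAT PLAIN with ENCODE JSON. It assumes that, with no schema registry involved, the columns are declared explicitly in the statement. The source name, columns, and topic are hypothetical.

```sql
-- Hypothetical source and topic names, for illustration only.
-- JSON carries no registered schema here, so the columns are listed explicitly.
CREATE SOURCE src_events (
    user_id INT,
    event_name VARCHAR
) WITH (
    connector = 'kafka',
    topic = 'events_json',
    properties.bootstrap.server = 'message_queue:29092',
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE JSON;
```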
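- We support FORMAT UPSERT ENCODE PROTOBUF but do not recommend using it, because this may disrupt the order of upserts. For more details, see the documentation of Confluent.
- Please distinguish between the parameters set in the FORMAT and ENCODE options and those set in the WITH clause, and place each in the correct location. In the example at the top of this topic, connector, topic, properties.bootstrap.server, and scan.startup.mode are connector settings in the WITH clause, while schema.registry and message are encoding settings inside ENCODE PROTOBUF(...).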