Syntax
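The general shape of the statement can be sketched as follows, assuming RisingWave's usual CREATE SINK form. Here sink_name, sink_from, and select_query are placeholders, and the options accepted in the WITH clause are the ones listed under Parameters.

```sql
-- Sketch of the statement shape; bracketed parts are optional or alternatives.
CREATE SINK [ IF NOT EXISTS ] sink_name
[ FROM sink_from | AS select_query ]
WITH (
    connector = 's3',
    connector_parameter = 'value', ...
);
```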
Parameters
| Parameter name | Description |
|---|---|
| connector | Required. Only the S3 connector is supported. |
| s3.region_name | Required. The service region. |
| s3.bucket_name | Required. The name of the bucket where the sink data is stored. |
| s3.path | Required. The directory where the sink file is located. |
| s3.credentials.access | Conditional. Your AWS access key ID. Required if enable_config_load is false or not set. |
| s3.credentials.secret | Conditional. Your AWS secret access key. Required if enable_config_load is false or not set. |
| enable_config_load | Optional. Set to true to use the default AWS credential provider chain for authentication. This is useful for environments where credentials are provided through IAM roles or environment variables. If this is set to true, do not provide s3.credentials.access and s3.credentials.secret. You must also set the DISABLE_DEFAULT_CREDENTIAL=false environment variable on the Meta and Streaming Nodes. |
| s3.endpoint_url | Optional. The host URL for an S3-compatible object storage server. This lets you use a different server instead of the standard S3 server. |
| s3.assume_role | Optional. Specifies the ARN of an IAM role to assume when accessing S3. It allows temporary, secure access to S3 resources without sharing long-term credentials. |
| type | Required. Defines the type of the sink. The S3 sink only supports append-only. |
| max_row_count | Optional. Maximum number of rows per output file. Defaults to 10240. A file is finalized as soon as either this threshold or rollover_seconds is reached, whichever comes first. |
| rollover_seconds | Optional. Maximum duration (in seconds) before the current file is finalized and a new one is started. Defaults to 10. A file is finalized as soon as either this threshold or max_row_count is reached, whichever comes first. |
In RisingWave Cloud, the default AWS credential provider chain is disabled. Provide s3.credentials.access and s3.credentials.secret (or use a supported assume-role setup). These credentials cannot be omitted. The enable_config_load option is supported only in self-hosted deployments.
Example
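A minimal sketch of an append-only S3 sink that uses static credentials. The sink name s3_sink, the upstream materialized view mv, and every angle-bracket value are placeholders. The FORMAT PLAIN ENCODE PARQUET clause with force_append_only is an assumption about the encode syntax, so check it against the encoding page referenced under Advanced topics.

```sql
-- Hypothetical names: s3_sink (the sink) and mv (an existing materialized view).
CREATE SINK s3_sink FROM mv
WITH (
    connector = 's3',
    s3.region_name = '<region_name>',
    s3.bucket_name = '<bucket_name>',
    s3.path = '<path>',
    s3.credentials.access = '<aws_access_key_id>',
    s3.credentials.secret = '<aws_secret_access_key>',
    type = 'append-only'
) FORMAT PLAIN ENCODE PARQUET (force_append_only = 'true');  -- assumed encode clause
```

In a self-hosted deployment that relies on the default credential provider chain, the two s3.credentials.* lines would be replaced by enable_config_load = 'true'.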
Advanced topics
For more information about encoding sink data as Parquet or JSON, see Sink data in parquet or json format.
For more information about batching strategy, see Batching strategy for file sink.
Multiple output files during backfill
When a sink is created, RisingWave performs an initial backfill of the upstream table. Because a file is finalized when either rollover_seconds (default: 10) or max_row_count (default: 10240) is reached, a table with more than 10,240 rows will produce multiple output files under the configured s3.path. Seeing roughly 10K rows in a single file does not mean data is missing.
To verify that all rows were delivered, list all objects under the sink path and sum their row counts:
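One way to do the count from inside RisingWave is sketched below. It assumes that the sink is Parquet-encoded and that the file_scan table function in your version accepts a directory location ending in /. All angle-bracket values are placeholders, and the argument order shown (format, storage type, region, access key, secret key, location) should be checked against the file_scan documentation for your release.

```sql
-- Counts rows across every Parquet file under the sink path.
-- Assumption: file_scan reads all files in a directory when the location ends with '/'.
SELECT count(*) AS delivered_rows
FROM file_scan(
    'parquet',
    's3',
    '<region_name>',
    '<aws_access_key_id>',
    '<aws_secret_access_key>',
    's3://<bucket_name>/<path>/'
);
```

Once the upstream table is no longer receiving writes, the result should match a count(*) over that table.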
Adjust max_row_count and rollover_seconds when creating the sink if you need different file sizes.
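As an illustration, the sketch below raises both thresholds so that a file is finalized after 100,000 rows or 60 seconds, whichever comes first. The values, the sink name s3_sink_large_files, and the upstream mv are hypothetical, and the FORMAT PLAIN ENCODE PARQUET clause is an assumption about the encode syntax.

```sql
-- Hypothetical sink with larger files: roll over at 100,000 rows or 60 seconds.
CREATE SINK s3_sink_large_files FROM mv
WITH (
    connector = 's3',
    s3.region_name = '<region_name>',
    s3.bucket_name = '<bucket_name>',
    s3.path = '<path>',
    s3.credentials.access = '<aws_access_key_id>',
    s3.credentials.secret = '<aws_secret_access_key>',
    type = 'append-only',
    max_row_count = '100000',
    rollover_seconds = '60'
) FORMAT PLAIN ENCODE PARQUET (force_append_only = 'true');  -- assumed encode clause
```

With these settings, a backfill of 1,000,000 rows would still be split into at least 10 files, and into more if the 60-second limit is reached before a file fills up.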