Azure Blob Storage, provided by Microsoft Azure, allows you to store and manage large amounts of unstructured data.
Field | Notes |
---|---|
azblob.container_name | Required. The name of the container the data source is stored in. |
azblob.credentials.account_name | Optional. The name of the Azure Blob Storage account. |
azblob.credentials.account_key | Optional. The account key for the Azure Blob Storage account. |
azblob.endpoint_url | Required. The URL of the Azure Blob Storage service endpoint. |
match_pattern | Conditional. Set to find object keys in azblob.container_name that match the given pattern. Standard Unix-style glob syntax is supported. |
compression_format | Optional. Specifies the compression format of the files being read. When set to gzip or gz, the file reader reads only files with the .gz suffix; when set to None or left undefined, the file reader automatically reads and decompresses files with the .gz and .gzip suffixes. |
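As a sketch, the fields above map onto a CREATE SOURCE statement like the following; the source name, columns, container, endpoint, and credentials are hypothetical placeholders.

```sql
CREATE SOURCE azblob_csv_source (
    id INT,
    name VARCHAR
)
WITH (
    connector = 'azblob',
    azblob.container_name = 'my-container',            -- required
    azblob.endpoint_url = 'https://my-account.blob.core.windows.net',  -- required
    azblob.credentials.account_name = 'my-account',    -- optional
    azblob.credentials.account_key = 'my-account-key', -- optional
    match_pattern = '*.csv'                            -- Unix-style glob
) FORMAT PLAIN ENCODE CSV (
    without_header = 'false',
    delimiter = ','
);
```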
Field | Notes |
---|---|
data_format | Supported data format: PLAIN. |
data_encode | Supported data encodes: CSV, JSON, PARQUET. |
without_header | This field applies only to the CSV encode and indicates whether the first line is a header. Accepted values: `true`, `false`. Default is `true`. |
delimiter | How RisingWave splits contents. For JSON encode, the delimiter is `\n`; for CSV encode, the delimiter can be one of `,`, `;`, or `E'\t'`. |
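For comparison, a minimal sketch of a source reading newline-delimited JSON files (source name, columns, and connection values are hypothetical):

```sql
CREATE SOURCE azblob_json_source (
    user_id INT,
    event VARCHAR
)
WITH (
    connector = 'azblob',
    azblob.container_name = 'events',
    azblob.endpoint_url = 'https://example.blob.core.windows.net'
) FORMAT PLAIN ENCODE JSON;  -- records are split on \n
```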
Field | Notes |
---|---|
file | Optional. Contains the name of the file from which the current record comes. |
offset | Optional. Contains the byte offset (record offset for Parquet files) at which the current message begins. |
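The two optional columns above are requested with INCLUDE clauses in the source definition. A minimal sketch, with hypothetical names and connection values:

```sql
CREATE SOURCE azblob_source (
    id INT,
    name VARCHAR
)
INCLUDE file    -- adds a column with the source file name
INCLUDE offset  -- adds a column with the offset where the record begins
WITH (
    connector = 'azblob',
    azblob.container_name = 'my-container',
    azblob.endpoint_url = 'https://example.blob.core.windows.net'
) FORMAT PLAIN ENCODE CSV (without_header = 'false', delimiter = ',');
```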
You can use the table function `file_scan()` to read Parquet files from Azure Blob Storage, either a single file or a directory of Parquet files.
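The exact argument order of `file_scan()` can vary between versions; the sketch below assumes the arguments (file format, storage type, endpoint URL, account name, account key, file location), with all values hypothetical.

```sql
SELECT *
FROM file_scan(
    'parquet',                                 -- file format
    'azblob',                                  -- storage type
    'https://example.blob.core.windows.net',   -- endpoint URL
    'my-account',                              -- account name
    'my-account-key',                          -- account key
    'azblob://my-container/path/to/data/'      -- file or directory location
);
```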