Generally, the performance of streaming jobs can be significantly affected by the sources and sinks.
This guide aims to help you identify problems caused by sources and sinks. To troubleshoot these issues, you usually need to check the upstream or downstream systems and fix the root causes.
To identify a source problem, open the Grafana dashboard (dev) and go to the Streaming section > Source Throughput panel. If the ingestion rate is zero or significantly lower than expected, the source may be the bottleneck.
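As a rough illustration of the check above, the following Python sketch flags a source as possibly stuck when its recent ingestion rate is zero or far below what you expect. The function name, the `expected_rate` argument, and the `low_ratio` threshold are assumptions made for this example; they are not part of RisingWave or Grafana, and the readings are assumed to be copied by hand from the Source Throughput panel.

```python
from statistics import mean


def source_looks_stuck(samples, expected_rate, low_ratio=0.1):
    """Return True if the average ingestion rate is zero or far below expectation.

    samples       -- recent ingestion-rate readings (rows/s), e.g. read off the
                     Source Throughput panel (hypothetical input format)
    expected_rate -- the rate you normally see for this source (rows/s)
    low_ratio     -- assumed threshold: below this fraction of expected_rate,
                     the rate counts as "significantly lower than expected"
    """
    if not samples:
        raise ValueError("need at least one sample")
    avg = mean(samples)
    return avg == 0 or avg < expected_rate * low_ratio
```

A suitable threshold depends on the workload; a source with naturally bursty traffic may need a longer sampling window before a low reading means anything.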
[Figure: Example of a stuck source]
A sink problem is usually more serious because it back-pressures the entire streaming job, causing high barrier latency. To mitigate this, RisingWave has supported sink decoupling since v1.10: data is buffered before it is written to the sink. However, if the problem persists for a long time, the buffer may eventually fill up and block the entire streaming job.
To identify a sink problem, open the Grafana dashboard (dev) and go to the Sink Metrics section > Log Store Lag panel. This panel shows how much data is waiting to be written out. If the lag keeps increasing, the sink may be the bottleneck.
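To make "the lag is increasing" concrete, here is a minimal Python sketch that fits a straight line through a series of lag readings and reports whether the trend is upward. The function name and the `min_points` parameter are assumptions for illustration only; the readings are assumed to be values taken at regular intervals from the Log Store Lag panel.

```python
def lag_is_increasing(lag_samples, min_points=3):
    """Return True if the least-squares slope of the lag readings is positive.

    lag_samples -- lag values sampled at regular intervals, oldest first
                   (hypothetical input, e.g. copied from the Log Store Lag panel)
    min_points  -- assumed minimum number of readings needed to judge a trend
    """
    n = len(lag_samples)
    if n < max(min_points, 2):
        return False  # too few points to call it a trend
    x_mean = (n - 1) / 2
    y_mean = sum(lag_samples) / n
    # Ordinary least-squares slope with x = 0, 1, ..., n-1
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(lag_samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator > 0
```

A short upward blip is normal during traffic spikes; the signal to look for is a lag that keeps growing over a sustained period.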
[Figure: Example of a stuck sink]
We are currently rolling out sink decoupling to all sinks. Track the latest progress here.
It is helpful to check the logs of both RisingWave and the source or sink system, especially when the source or sink is failing instead of being slow.
Source and sink connectors run on RisingWave Compute nodes and sometimes require validation on the Meta node. Search for "risingwave_connector_node" in the RisingWave logs to find related information.
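If the logs have been exported to a file, the search described above can be done with any text tool. As a small sketch in Python, assuming the logs are available as an iterable of text lines (the function name is made up for this example):

```python
def connector_log_lines(lines, keyword="risingwave_connector_node"):
    """Return the log lines that mention the connector keyword.

    lines   -- any iterable of strings, e.g. an open log file
    keyword -- the search term suggested in this guide
    """
    return [line.rstrip("\n") for line in lines if keyword in line]
```

For example, `connector_log_lines(open(path))` would return the matching lines from a log file at a `path` of your choosing.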
Source and sink problems can have many different root causes. You may need to check the documentation or logs of the upstream or downstream system to address them.
Here are some common issues: