Overview
RisingWave provides two system catalogs to monitor Kafka consumer lag:

- `rw_kafka_source_metrics` (table): Raw Kafka metrics for each source partition
- `rw_kafka_job_lag` (view): Computed lag per job, source, fragment, and partition
System catalog: rw_kafka_source_metrics
The rw_kafka_source_metrics table exposes raw Kafka metrics for each source partition.
Schema
| Column | Type | Description |
|---|---|---|
| source_id | int | ID of the Kafka source |
| partition_id | varchar | Kafka partition ID |
| high_watermark | bigint | Latest offset available in the Kafka topic partition (nullable) |
| latest_offset | bigint | Latest offset processed by RisingWave (nullable) |
Example usage
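A minimal query over the catalog might look like the following. This is a sketch; the column names are taken from the schema above:

```sql
-- Inspect raw per-partition Kafka metrics for every source.
SELECT source_id, partition_id, high_watermark, latest_offset
FROM rw_kafka_source_metrics
ORDER BY source_id, partition_id;
```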
View all Kafka source metrics by selecting directly from `rw_kafka_source_metrics`.

System catalog: rw_kafka_job_lag
The rw_kafka_job_lag view provides a summary of Kafka consumption lag, designed to help diagnose consumption issues across materialized views and sinks.
Schema
| Column | Type | Description |
|---|---|---|
| job_id | int | ID of the streaming job (materialized view or sink) |
| source_id | int | ID of the Kafka source |
| fragment_id | int | ID of the fragment processing this partition |
| partition_id | varchar | Kafka partition ID |
| lag_phase | varchar | Phase of consumption: `BACKFILL` (processing historical data) or `LIVE` (processing real-time data) |
| high_watermark | bigint | Latest offset available in the Kafka topic partition (nullable) |
| consumer_offset | bigint | Current consumer offset. During `BACKFILL`, this is the backfill progress offset; during `LIVE`, this is the latest offset processed by RisingWave (nullable) |
| lag | bigint | Number of unprocessed messages, calculated as `greatest(high_watermark - consumer_offset - 1, 0)` (nullable) |
Lag phase
The `lag_phase` column indicates the current phase of Kafka consumption:

- `BACKFILL`: The consumer is processing historical data during initial job creation. In this phase, `consumer_offset` reflects the backfill progress offset.
- `LIVE`: The backfill has completed (or no backfill state exists) and the consumer is processing real-time streaming data. In this phase, `consumer_offset` reflects the latest offset reported by the source reader.
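During the `LIVE` phase, the view's lag value can be reproduced directly from the raw catalog. The following is a sketch that applies the formula stated in the schema above to `rw_kafka_source_metrics`:

```sql
-- Recompute per-partition lag from the raw metrics.
-- The -1 accounts for high_watermark pointing one past the last message;
-- greatest(..., 0) clamps negative intermediate values to zero.
SELECT
  source_id,
  partition_id,
  greatest(high_watermark - latest_offset - 1, 0) AS computed_lag
FROM rw_kafka_source_metrics;
```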
Example usage
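One way to surface lagging partitions is shown below. This is a sketch; the 10000-message threshold is illustrative, not a recommendation, and should be tuned for your workload:

```sql
-- List partitions whose lag exceeds an illustrative threshold,
-- worst offenders first.
SELECT job_id, source_id, fragment_id, partition_id, lag_phase, lag
FROM rw_kafka_job_lag
WHERE lag > 10000
ORDER BY lag DESC;
```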
Check for jobs with significant consumer lag by querying `rw_kafka_job_lag` and filtering on the `lag` column.

Best practices
- Monitor during initial creation: When creating a new materialized view or sink from a Kafka source, monitor the backfill progress to ensure it completes successfully.
- Track lag trends: Regularly query `rw_kafka_job_lag` to identify partitions that consistently show high lag, which may indicate performance issues.
- Alert on high lag: Set up monitoring alerts when lag exceeds acceptable thresholds for your use case.
- Check lag phase transitions: Monitor when jobs transition from `BACKFILL` to `LIVE` phase to understand when real-time processing begins.
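The phase-transition check above can be expressed as a query. This is a sketch using the `lag_phase` values documented earlier; once it returns no rows, every partition of every job has moved to real-time processing:

```sql
-- List partitions still backfilling historical data.
SELECT job_id, source_id, partition_id, consumer_offset, high_watermark
FROM rw_kafka_job_lag
WHERE lag_phase = 'BACKFILL';
```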
Troubleshooting high lag
If you observe high consumer lag:

- Check resource utilization: High lag may indicate that compute nodes are under-provisioned or overloaded.
- Review parallelism settings: Consider adjusting the parallelism of your streaming jobs to increase processing capacity.
- Inspect Kafka broker health: Verify that Kafka brokers are operating normally and not experiencing performance issues.
- Examine query complexity: Complex transformations in materialized views may slow down consumption.
- Check NULL values: If `high_watermark` or `consumer_offset` is NULL, metrics may not yet be available. This can happen shortly after source creation or if the Prometheus endpoint is not configured.
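Partitions with missing metrics can be found with a query like the following sketch, which filters on the nullable columns described in the schema above:

```sql
-- Find partitions whose lag metrics are not yet reported.
SELECT job_id, source_id, partition_id, lag_phase
FROM rw_kafka_job_lag
WHERE high_watermark IS NULL OR consumer_offset IS NULL;
```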