Setup
1. Ingest events from Kafka
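A minimal sketch of this step, assuming a RisingWave-style SQL dialect (the page does not name the engine; `force_append_only` and the CDC-backed table suggest it). The source name `clicks`, its columns, the topic, and the broker address are hypothetical placeholders.

```sql
-- Hypothetical click-event stream; adjust columns to match your payload.
CREATE SOURCE clicks (
    user_id    INT,
    url        VARCHAR,
    event_time TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'clicks',                            -- assumed topic name
    properties.bootstrap.server = 'kafka:9092',  -- assumed broker address
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```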
2. Sync reference data from PostgreSQL via CDC
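A sketch of the CDC step under the same assumption (RisingWave-style syntax). Only the table name `users` and the `segment` column come from this page; the connection values, the source name `pg_source`, and the `user_id` key are illustrative.

```sql
-- Shared CDC connection to the upstream PostgreSQL database (placeholder credentials).
CREATE SOURCE pg_source WITH (
    connector = 'postgres-cdc',
    hostname = 'postgres',
    port = '5432',
    username = 'postgres',
    password = '<password>',
    database.name = 'appdb',
    schema.name = 'public'
);

-- Local table kept in sync with public.users; a primary key is needed for CDC tables.
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    segment VARCHAR
) FROM pg_source TABLE 'public.users';
```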
3. Create an enriched materialized view
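A sketch of the enrichment view, assuming a streaming input named `clicks` and a CDC table named `users` keyed by `user_id`. The view name `enriched_clicks` and the `segment` column are from this page; the other column names are hypothetical.

```sql
-- Join each click with the current reference row for its user.
CREATE MATERIALIZED VIEW enriched_clicks AS
SELECT
    c.user_id,
    c.url,
    c.event_time,
    u.segment
FROM clicks AS c
JOIN users AS u
    ON c.user_id = u.user_id;
```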
The JOIN is maintained incrementally. When a user’s segment changes in PostgreSQL, the enriched output updates automatically.
4. Deliver enriched events downstream
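A sketch of a Kafka sink for this step, again assuming RisingWave-style syntax. The sink name, topic, and broker address are hypothetical. Because the view contains a join, its output is a changelog and not a pure insert stream, which is why `force_append_only` is set here. Check how your engine version treats updates and deletes under that option before relying on it.

```sql
-- Publish the enriched view to a downstream topic as append-only JSON.
CREATE SINK enriched_clicks_sink FROM enriched_clicks
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'kafka:9092',  -- assumed broker address
    topic = 'enriched-clicks'                    -- assumed topic name
) FORMAT PLAIN ENCODE JSON (
    force_append_only = 'true'
);
```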
Key points
- The JOIN between a streaming source and a CDC table is maintained incrementally — no periodic recompute
- When a row in `users` changes in PostgreSQL, the `enriched_clicks` MV updates to reflect the new value
- For late-arriving events or joins where the reference data may not exist yet, use a `LEFT JOIN` to avoid dropping records
- Use `force_append_only = 'true'` on Kafka sinks from materialized views that contain aggregations or joins
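As an illustration of the `LEFT JOIN` point, the enrichment view could be written as below. This is a sketch with hypothetical column names, and the `'unknown'` default is an assumption for illustration. Clicks whose user has not yet arrived through CDC are kept with a placeholder segment, so they are not dropped.

```sql
-- Keep every click, even when no matching users row exists yet.
CREATE MATERIALIZED VIEW enriched_clicks AS
SELECT
    c.user_id,
    c.url,
    c.event_time,
    COALESCE(u.segment, 'unknown') AS segment
FROM clicks AS c
LEFT JOIN users AS u
    ON c.user_id = u.user_id;
```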
Next steps
- Lakehouse ingestion recipe — full CDC + Kafka → enrichment → Iceberg pipeline
- PostgreSQL CDC recipe — CDC source setup in detail