Cumulative Table Design

The pattern behind dim_all_users: full-outer-join yesterday to today, coalesce, and carry all of history in one row. This course also covers complex types (struct, array, map), the compactness-vs-usability tradeoff, temporal cardinality explosions, why run-length encoding is the reason Parquet won, and how to collapse cumulative history into an SCD Type 2 table with window functions and a hand-rolled incremental merge.

Intermediate · 4 chapters · 1h 42m · in Semantic & Metrics Layer

Course content

  1. The Data Consumer Continuum: Why Master Data Exists (free)
  2. Cumulative Tables: Full Outer Join, Coalesce, Carry History (locked)
  3. Complex Types, Cardinality Explosions, and Run-Length Encoding (locked)
  4. From Cumulative History to SCD Type 2: Streaks and Incremental Merge (locked)

Read the first chapter free

Start reading now — no account required for the free chapters.