Fact Data Modeling

Facts are the biggest data you'll ever touch: immutable events at 10-100x the volume of your dimensions. This course covers what makes a fact atomic, why raw logs aren't fact data, when denormalization is the fix (not the bug), how to deduplicate at trillion-row scale, and the blurry line where aggregated facts become dimensions.

Intermediate · 6 chapters · 2h · in Semantic & Metrics Layer

Course content

  1. What a Fact Is: Atomic, Immutable, and Enormous (Free)
  2. Raw Logs Are Not Fact Data 🔒
  3. Normalized vs Denormalized Facts: When the Join Stops Working 🔒
  4. Deduplication at Scale: From ROW_NUMBER to the Microbatch Tree 🔒
  5. The Blurry Line Between Facts and Dimensions 🔒
  6. Surviving High Volume: Retention, Sampling, and Bucketing 🔒

Read the first chapter free

Start reading now — no account required for the free chapters.