Module: OBT & Wide-Table Design | Duration: ~12 min | Lesson: 1 of 8

A principal engineer who learned data modeling in the 1990s sees a junior teammate ship a 200-column wide table and writes a polite Slack message: "This is denormalized to the point of malpractice, third normal form exists for a reason."

The junior engineer points at the dashboard query that backs the team's main report. With the 200-column wide table, the query reads 12 of the 200 columns, runs in 80ms, and costs $0.002 in compute credits. With the equivalent star schema (8 joined tables), the same query runs in 1.2 seconds and costs $0.04. Both are correct; one is 15× faster and 20× cheaper.

The senior engineer's instinct was right in 1995 and wrong in 2025. The thing that changed is not modeling philosophy, it's the engine. Columnar storage flipped a fundamental economic assumption: a column you don't read costs almost nothing. Once that's true, the cost of denormalization that haunted row-store databases simply doesn't apply. OBT became sane the day Parquet, Snowflake, BigQuery, and ClickHouse became the default.

2. Concept Explanation

The row-store assumption that no longer holds

Classical data modeling, third normal form, Kimball's star schemas, Vault, assumes the underlying engine reads rows. When you SELECT customer_name, order_total FROM ..., a row-store engine fetches the full row from disk, then projects out the columns you wanted. Adding columns to the row adds I/O cost to every query, even queries that don't touch the new columns.

Under that assumption:

A 200-column table is expensive even when queries read 5 columns. You pay for all 200 on every row scanned.
Normalization (splitting a wide entity into narrow tables) reduces the row size, which reduces I/O per scan.
Joins are cheap because each joined row is small.

Kimball and Vault both implicitly assume this. The star schema's narrow dimension and fact tables, the Vault's hub/sat split, both achieve narrow rows because that's what row-stores reward.

What columnar storage changed

A columnar engine (Parquet, Snowflake, BigQuery, ClickHouse, Druid, DuckDB) stores each column independently. A query that reads 12 of 200 columns only reads those 12 column files from disk. The other 188 columns aren't fetched, aren't decompressed, aren't projected. They cost storage, but they don't cost query I/O.

The economic flip:

Wide tables are no longer punished. A 200-column table where queries read 12 costs the same as a 12-column table, in query I/O.
Joins become more expensive relative to scans, not less. A scan of one wide table reading 12 columns is fast; the equivalent join of 8 narrow tables requires reading hash keys from all 8 plus the projected columns.
The query optimizer can push filters and aggregations deeper because it doesn't have to materialize a wide intermediate row representation.

Four properties of columnar storage matter for OBT viability:

Column pruning. Reading 12 of 200 columns reads 12 of 200 files. Disk I/O scales with columns read, not table width.
Compression-per-column. Each column is compressed with an algorithm appropriate for its data type, dictionary encoding for low-cardinality strings, run-length encoding for repetitive numerics, etc. Wide tables with many low-cardinality columns compress better than narrow tables would because each column compresses independently.
Predicate pushdown. A filter like WHERE region = 'us-west' can be evaluated on the region column alone, often using min/max statistics in the column's metadata. Other columns aren't loaded until the filtered row IDs are known.
Vectorized execution. Once columns are loaded, the engine processes them in batches (vectors) of thousands of rows at a time, using SIMD instructions. Per-row overhead is amortized away.

The combination: wide tables on columnar engines are roughly as fast to query as narrow tables, and avoid the join costs entirely. OBT exploits this.

What OBT actually means

"One Big Table" is a modeling style, not a single physical table. The principle: denormalize aggressively into a small number of wide tables, each shaped around an analytical use case. Examples:

A obt_sales table joining customer attributes, product attributes, time attributes, region attributes, and the sale itself into one row. 50-150 columns. Queryable without any join.
A obt_user_events table joining user attributes (current state) with every event ever fired by that user. Used for product analytics. 80-200 columns.
A obt_subscriptions table for SaaS revenue analytics, denormalizing plan, customer, billing, and lifecycle attributes.

Most teams have several OBTs, each serving a use-case cluster. They are not trying to model the whole business in one table, that would be unmaintainable and the columns wouldn't compose. OBT is a serving pattern, not an integration pattern. There's still a source-of-truth layer (Vault, normalized star, or staging models) that the OBT is materialized from.

OBT vs star: a one-paragraph contrast

A Kimball star schema requires joins to answer most questions; the joins are cheap in row-stores and only modestly expensive in columnar engines. An OBT eliminates joins entirely; it pays for that with bigger row width, more storage, and update amplification (Lesson 3). On columnar engines for read-heavy analytical workloads, OBT typically wins on query latency by 3-10×; on storage and refresh cost, it loses by 2-5×. Whether the trade is favorable depends on whether your workload is read-heavy or write-heavy and whether storage is your bottleneck or compute is.

Lesson 2 quantifies this with a real benchmark on our lab.

Where OBT fits in a layered architecture

A typical modern stack:

Sources → Staging → [Vault or Kimball core] → OBT serving marts → BI / dashboards

OBT is the serving layer. The core (Vault, star, or just well-curated staging models) is the source of truth. The OBT is rematerialized from the core on whatever cadence (daily, hourly) the freshness SLO demands. Consumers query only the OBT.

This is one of the most underappreciated architectural points: OBT and Kimball aren't competitors. They're layers. Kimball core for integration and history; OBT for serving. Most mature analytics shops have both, the question "OBT or star?" is usually mis-framed; the right question is "do my BI consumers query the star directly or do I materialize an OBT on top?"

When OBT is the wrong answer

OBT is the wrong tool when:

The engine is a row-store (classic OLTP Postgres for analytics, the wide-table costs are still real).
The data has heavy update churn (Lesson 3 covers update amplification, OBT punishes update-heavy workloads).
The consumer needs point-in-time historical queries (OBT typically materializes "current state"; historical OBTs explode in row count).
The workload is mostly single-row lookups (OBT's columnar advantages don't help for point reads).

For most analytical, columnar, read-heavy workloads, which is what BI and product analytics actually are, OBT is the serving-layer default.

3. Worked Example

Bring up the lab. If you haven't cloned the lab repo yet, do it now (one repo, shared across every Strata 7 course), then bring up this course's sandbox:

git clone https://github.com/petascalelabs/petascalelabs-lab-setup.git
cd petascalelabs-lab-setup/semantic-and-metrics-layer/modeling-alternatives-vault-obt/one-big-table/
./scripts/setup.sh

Verify the seed tables are reachable:

./scripts/verify.sh
# expected: "OBT lab ready: customers/orders/products loaded in DuckDB"

Whenever a lesson says "open the lab", drop into an interactive DuckDB shell with the seed CSVs mounted read-only at /seed:

docker compose exec duckdb duckdb

You are helping me bring up a local DuckDB lab on my machine.

Course: One Big Table & Wide-Table Design (TheWorldShop scenarios).
Lab artifacts (already in petascalelabs-lab-setup/semantic-and-metrics-layer/modeling-alternatives-vault-obt/one-big-table/):
  - docker-compose.yml: a single DuckDB container with the seed CSVs mounted read-only at /seed
  - data/: raw_customers.csv, raw_orders.csv, raw_products.csv (synthetic)
  - scripts/setup.sh, scripts/verify.sh, scripts/teardown.sh

My machine:
  OS: <fill in>
  RAM: <fill in GB>
  Docker version: <fill in>

Please walk me through:
1. Any OS-specific prerequisites (Rosetta, WSL2, virtualization flags) for my OS.
2. Confirming Docker is running with enough memory (>=2 GB is plenty here).
3. How to confirm ./scripts/verify.sh worked, and what to check if the container never reports healthy.
4. The teardown command when I want to reclaim disk.

Do not assume my OS, ask if it's not clear.

Let's build a small OBT for our lab and feel the columnar economics.

-- Reference dim_date, building a tiny one inline.
CREATE TABLE dim_date AS
SELECT
  d AS date_key,
  EXTRACT(YEAR  FROM d) AS year,
  EXTRACT(MONTH FROM d) AS month,
  EXTRACT(DOW   FROM d) AS day_of_week,
  EXTRACT(DAY   FROM d) AS day_of_month
FROM (
  SELECT DATE '2024-01-01' + INTERVAL (i) DAY AS d
  FROM range(0, 1000) AS t(i)
);

-- The OBT: denormalize customer + product + date attributes onto every order row.
CREATE TABLE obt_orders_v1 AS
SELECT
  o.order_id,
  o.order_date,
  o.amount,
  o.status                    AS order_status,

  c.customer_id,
  c.email                     AS customer_email,
  c.name                      AS customer_name,
  c.region                    AS customer_region,

  p.product_id,
  p.name                      AS product_name,
  p.category                  AS product_category,
  p.price                     AS product_price,

  d.year                      AS order_year,
  d.month                     AS order_month,
  d.day_of_week               AS order_day_of_week,
  d.day_of_month              AS order_day_of_month
FROM raw_orders o
JOIN raw_customers c ON c.customer_id = o.customer_id
JOIN raw_products  p ON p.product_id  = o.product_id
JOIN dim_date      d ON d.date_key    = o.order_date;

SELECT COUNT(*), COUNT(DISTINCT customer_id) FROM obt_orders_v1;
-- ~4200 rows, ~500 customers.

Now run two equivalent queries, once on the OBT, once on the star, and compare:

-- Q1: "Revenue per region for 2025".

-- Via OBT (no joins):
SELECT customer_region, SUM(amount) AS total_revenue
FROM obt_orders_v1
WHERE order_year = 2025
GROUP BY customer_region;

-- Via star (one join):
SELECT c.region, SUM(o.amount)
FROM raw_orders o
JOIN raw_customers c ON c.customer_id = o.customer_id
WHERE EXTRACT(YEAR FROM o.order_date) = 2025
GROUP BY c.region;

On DuckDB with 4,200 rows, both finish in single-digit milliseconds, the dataset is too small to feel a meaningful difference. But the shape of the difference is instructive:

The OBT query reads two columns (customer_region, amount) plus uses one for filtering (order_year). With column pruning, those are the only columns hitting disk.
The star query reads two columns from raw_orders and one from raw_customers, plus pays the cost of the join. The join's cost is small here because the tables are small, but it scales differently with size.

At 1B rows in production, the OBT version typically runs 3-10× faster on Snowflake/BigQuery, primarily because the join is gone. The storage cost is the price: obt_orders_v1 carries customer_region on every order row (4,200 copies of region values), versus raw_customers carrying it once per customer (500 rows).

Now demonstrate the column-pruning property, add 50 unused columns to the OBT and re-run the same query:

ALTER TABLE obt_orders_v1 ADD COLUMN unused_col_01 VARCHAR DEFAULT 'placeholder';
-- ... imagine 49 more such columns added.

-- Same query.
SELECT customer_region, SUM(amount) FROM obt_orders_v1 WHERE order_year = 2025 GROUP BY customer_region;

On a columnar engine (DuckDB, Parquet-backed), this query is no slower than before, the unused columns aren't read. On a row-store, every additional column would slow this query down because each row fetch costs more. The columnar engine doesn't punish you for the width.

Aha: OBT is not the rejection of normalization. It's the recognition that normalization was an optimization for an engine that's no longer the default. On a row-store, denormalization was punished by the I/O model. On a columnar engine, it's rewarded, column pruning makes the unused width free, and join elimination makes the read latency small. The argument "denormalization is bad practice" is true under an assumption (row storage) that fewer and fewer modern analytics workloads still meet.

4. Your Turn

Exercise: Decide between OBT and star for three queries.

Your analytics team has these queries against the same dataset (orders, customers, products). For each, decide whether OBT or star schema is the better serving layout, and explain in one sentence.

"Total revenue per region per month for the last 12 months", runs on a daily-refreshed dashboard, 50k page-views per month.
"For order ID 8472, show all its line items and the customer's full profile", runs on a customer support tool, looked up ad-hoc, ~200 lookups/day.
"Find all customers whose region changed between two specific dates", runs once a quarter for compliance review.

5. Real-World Application

The rediscovery of OBT in the 2017-2023 period coincided exactly with the dominance of Snowflake, BigQuery, and Redshift in enterprise analytics. Engineers who'd been trained on row-store-era Kimball discipline ran benchmarks on these engines and discovered that their carefully-normalized stars were 5-10× slower than denormalized equivalents. Many of them, especially in product analytics, quietly shifted to wide-table designs without giving the pattern a name. Around 2020, the term "One Big Table" started appearing in conference talks and blog posts; it codified what teams were already doing.

A second arc reinforced OBT's adoption: the ML feature store movement. Feature engineering pipelines naturally produce wide rows, one row per entity, many columns per feature, and the access pattern (training jobs reading 50 columns out of 500 for a million entities) is exactly the OBT workload. The realization that an ML feature store and an OBT serving table are essentially the same artifact, with different consumers, became commonplace by 2022. Lesson 5 explores this in depth.

A third pattern worth knowing: dbt's adoption of "marts" as a layer. dbt projects almost universally have a marts/ directory where the wide consumer-facing tables live, these are OBTs in everything but name. dbt's documentation has gradually shifted from Kimball-centric language toward OBT-friendly language; the underlying philosophy of "materialize the right shape for the consumer" is exactly OBT.

A cautionary counterpoint: OBT is over-applied. Teams adopt it for workloads where it doesn't fit (transactional applications, single-row lookups, complex historical analyses) and discover the costs without the benefits. The pattern is fashionable enough that some teams build wide tables reflexively without asking whether the workload matches. The discipline of knowing when not to use OBT is as important as knowing when to use it, Lesson 3's discussion of update amplification, Lesson 6's discussion of schema evolution, and Lesson 7's discussion of fan-out are the costs the fashion sometimes hides.

The underlying lesson, repeated: modeling choices are functions of the engine and the workload. Columnar engines made OBT economically reasonable; column pruning made wide tables free at read time; vectorized execution made denormalized scans fast. The argument for OBT is not philosophical, it's the consequence of those three engine properties. When the engine doesn't have those properties, OBT loses its advantage and you go back to narrow tables.

6. Recap + Bridge

Columnar storage and cheap compute made the wide, denormalized OBT viable: you only read the columns you touch, so the 'wasted' width is mostly free. Next we put it head-to-head with a star schema in a real benchmark.