Data engineering, from byte-level storage to the semantic layer.

Most courses teach tools. This one teaches the physics of data — the layers of the stack, top to bottom, and the failure modes that show up in real systems. Built for senior and aspiring data engineers who want to reason from first principles, not memorize commands.

7 strata · 1 specialization · 59 courses. Each stratum holds tracks; each track, a sequence of courses; each course, chapters you can read free to start.

Hands-on

Try the free in-browser tools

No account, nothing uploaded — they run entirely in your browser.

Parquet inspector → · SCD Playground → · All tools

Themes

◼ Stratum 1

Storage & File Formats

Model Parquet column statistics and Zstd compression ratios to cut GCS scan costs by 60–80%.

Explore→

⇄ Stratum 2

Ingestion & Transport

Understand Kafka's ISR protocol, exactly-once semantics, and CDC log tailing — so you can trace data quality failures back to their source.

Explore→

⬡ Stratum 3

Open Table Formats

How Iceberg snapshot isolation prevents silent data loss — and the exact conditions when it doesn't.

Explore→

⚡ Stratum 4

Compute Engines

Trace Flink backpressure to the network buffer at its root, and tune Spark shuffle partitions with precision.

Explore→

⚙ Stratum 5

Orchestration & Pipelines

Name the failure modes before you learn the tools. Debug Airflow scheduler internals, set freshness SLAs that page for the right reasons, and design pipelines that survive partial failures without reprocessing everything.

Explore→

◈ Stratum 6

Query Engines & OLAP

Reason about ClickHouse MergeTree merges, Trino's cost-based optimizer, and Druid segment distribution — so you can own query latency end-to-end.

Explore→

∿ Stratum 7

Semantic & Metrics Layer

Build dbt metrics that survive schema migrations without breaking downstream dashboards, and enforce data contracts before bad data reaches production.

Explore→

🔐 Specialization

PII & Data Governance

Mask PII at ingestion, enforce access controls at the table-format layer, and design right-to-erasure into the storage layer.

Explore→

Start with a free chapter

Every course opens with chapters you can read without an account. Go as deep as you like before you decide.

See pricing → · Try the Arcade