Module: Setup | Duration: ~10 min | Lesson: 0 of 10
1. What You'll Build
A local lab for working through dimensional modeling without standing up a warehouse. You'll run DuckDB inside Docker against a small, deliberately messy retail dataset (orders, customers, products, dates). DuckDB suits this course for two reasons: its SQL is close enough to standard that the modeling concepts you learn port to Snowflake, BigQuery, ClickHouse, or Postgres, and it is small enough to run on a laptop.
At the end of this lesson you should be able to:
- Open a SQL shell into DuckDB and run SELECT * FROM raw_orders LIMIT 5.
- See the four raw tables (raw_orders, raw_customers, raw_products, raw_dates) loaded from CSV.
- Be ready to model them, which is what Lessons 1–10 are about.
2. Prerequisites
- ~2 GB free RAM and 1 GB disk
- Docker Desktop ≥ 4.x (macOS/Windows) or Docker Engine + Compose (Linux)
- A shell — zsh, bash, or PowerShell
- Optional: a SQL client you like (DBeaver, TablePlus). DuckDB ships its own CLI, which is what we'll use.
No cloud account, no warehouse, no credit card.
3. Installation
macOS
- Install Docker Desktop from https://www.docker.com/products/docker-desktop, launch it, wait for the whale icon to settle.
- Make a lab directory and pull the image:
- Grab the seed CSVs (a tiny synthetic retail dataset shipped with this course):
If that URL is unreachable, generate equivalent data with the fallback script in seed/gen.py (included).
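If neither the download nor seed/gen.py is at hand, a throwaway generator along these lines can produce a file of the right shape so the rest of the setup can be exercised. This is only a sketch: the column names, the 5000-row count, and the ./seed location are assumptions, not the course schema, and the output is clean rather than deliberately messy.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for seed/gen.py: writes a synthetic raw_orders.csv.
# Column names and the default row count are assumptions, not the course schema.
set -euo pipefail

SEED_DIR="${SEED_DIR:-./seed}"   # assumed location of the seed CSVs
ROWS="${ROWS:-5000}"             # assumed row count

mkdir -p "$SEED_DIR"

# awk prints a header plus ROWS deterministic rows, so reruns give identical files.
awk -v rows="$ROWS" 'BEGIN {
  print "order_id,customer_id,product_id,order_date,quantity"
  for (i = 1; i <= rows; i++) {
    month = (i % 12) + 1
    day   = (i % 28) + 1
    printf "%d,%d,%d,2024-%02d-%02d,%d\n", i, (i % 500) + 1, (i % 50) + 1, month, day, (i % 5) + 1
  }
}' > "$SEED_DIR/raw_orders.csv"

echo "wrote $ROWS rows to $SEED_DIR/raw_orders.csv"
```

Run it from inside the lab directory. The other three tables would need similar files before the lessons can use them.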
Linux
- sudo apt-get install -y docker.io (or your distro's equivalent), then sudo systemctl enable --now docker.
- Same docker pull and curl steps as macOS.
- Add your user to the docker group so you don't need sudo: sudo usermod -aG docker $USER && newgrp docker.
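To confirm that the group change has reached your current shell, a quick check like the following may help. It is a minimal sketch; the helper name has_docker_group is made up for illustration.

```shell
#!/usr/bin/env bash
# Reports whether the current shell session already carries the docker group.
set -u

# Succeeds if the space-separated group list in $1 contains exactly "docker".
has_docker_group() {
  case " $1 " in
    *" docker "*) return 0 ;;
    *) return 1 ;;
  esac
}

# id -nG lists the group names of the current process.
if has_docker_group "$(id -nG)"; then
  echo "docker group: yes"
else
  echo "docker group: no (log out and back in, or run: newgrp docker)"
fi
```

If it reports "no" right after usermod, the membership exists but this shell predates it. A new login session picks it up.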
Windows (WSL2)
- Install WSL2 with Ubuntu, then Docker Desktop with the WSL2 backend.
- From a WSL2 shell, follow the macOS steps verbatim — paths, curl, docker all work the same.
4. Verify Your Setup
From inside ~/s7-lab, start a DuckDB shell with the seed CSVs mounted:
Expected output:
If you see 5000, the lab is ready. If you see an error about read_csv or /seed not found, your volume mount path is wrong — the most common Windows/WSL pitfall.
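Since a wrong mount path is the usual culprit, it can help to check the host directory before starting the container. The sketch below assumes the CSVs are named after their tables (raw_orders.csv and so on) and live in ./seed; adjust both if the course files differ.

```shell
#!/usr/bin/env bash
# Preflight: confirm the host directory you plan to mount at /seed holds the CSVs.
# File names are assumed to be <table>.csv; this is not taken from the course files.
set -u

# Returns 0 only if all four expected CSVs exist in directory $1.
check_seed_dir() {
  local dir="$1" missing=0 t
  for t in raw_orders raw_customers raw_products raw_dates; do
    if [ ! -f "$dir/$t.csv" ]; then
      echo "missing: $dir/$t.csv" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_seed_dir "${SEED_DIR:-./seed}"; then
  echo "seed directory looks complete"
else
  echo "fix the path before starting the container"
fi
```

On WSL2, run this from the same shell and directory you will launch Docker from, so that relative paths resolve the same way.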
Throughout the course, when a lesson asks you to "open the lab", run:
Then at the D prompt, recreate the four tables (or save them to a persistent .duckdb file — see the README in seed/).
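Retyping four CREATE TABLE statements every session gets tedious, so one option is to generate a small SQL script once and replay it. The sketch below assumes the CSVs are named <table>.csv and are visible inside the container under /seed. The script name load_raw.sql is made up for illustration.

```shell
#!/usr/bin/env bash
# Writes a SQL script that recreates the four raw tables from CSVs under /seed.
# The <table>.csv names and the /seed mount point are assumptions.
set -euo pipefail

OUT="${OUT:-./seed/load_raw.sql}"
mkdir -p "$(dirname "$OUT")"

: > "$OUT"   # start from an empty file
for t in raw_orders raw_customers raw_products raw_dates; do
  printf "CREATE OR REPLACE TABLE %s AS SELECT * FROM read_csv('/seed/%s.csv');\n" "$t" "$t" >> "$OUT"
done
# Final sanity check so the replay ends with a visible row count.
printf "SELECT count(*) AS order_rows FROM raw_orders;\n" >> "$OUT"

cat "$OUT"
```

Because the script lands in seed/, it should be reachable from the D prompt with .read /seed/load_raw.sql, provided seed/ is the directory mounted at /seed. Older DuckDB releases may need read_csv_auto in place of read_csv.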