Orchestration & Pipelines
A DAG half-succeeded at 3am: some tasks ran twice, a backfill double-counted revenue, and now you are reconciling numbers instead of sleeping.
This stratum names the failure modes before the tools: the Airflow scheduler loop and why a task ran, didn't run, or ran twice; idempotency and state; backfills that don't double-count; freshness SLAs that page for the right reasons; and the gap between a green pipeline and correct data.
What you'll learn
- Explain the Airflow scheduler loop and why a task ran, skipped, or duplicated
- Design idempotent tasks and backfills that survive partial failure
- Distinguish pipeline health from data quality — and alert on the right one
- Run Airflow in production: deploys, secrets, logging, multi-tenancy, and cost
Tracks & courses
Here's what each track gives you and the courses inside it.
Pipeline Failure Modes
Before any tool, learn the shapes of failure. The diagnostic vocabulary every data engineer earns the hard way at 3am.
Anatomy of a Broken Pipeline
Eight named failure modes every on-call data engineer will see. Learn to recognize silent success, schedule drift, partial loads, retry-induced corruption, and the deceptively green DAG.
8 ch · 1h 53m
1 free
Pipeline Quality vs Data Quality
Pipeline health and data health are independent axes. Learn to tell the difference between 'the job ran' and 'the data is right', and which one belongs on which dashboard.
6 ch · 1h 17m
1 free
Orchestration Vocabulary
Tool-agnostic shared language so Airflow, Dagster, Prefect, and Argo conversations all parse the same. The vocabulary is the durable skill; the tool is rented.
DAGs, Tasks, and the Scheduler Loop
The tool-agnostic vocabulary every orchestrator's docs assume you already know. DAG, task, operator, executor, sensor, scheduler heartbeat. Once you have these words, every orchestrator's docs read the same.
7 ch · 1h 28m
1 free
Idempotency and State
What 'safe to retry' actually means, where to keep state so retries don't lie, and why exactly-once is a marketing term. The six concepts senior engineers reach for without thinking.
6 ch · 1h 13m
1 free
Airflow Internals
Airflow is the default in 2026, and the default is also the most misunderstood. What the scheduler, executor, and metadata DB are actually doing, and which knob to turn first.
Scheduler Deep Dive
What Airflow's scheduler, executors, metadata DB, and concurrency knobs are actually doing. Every Airflow shop hits the same five performance walls. This course names them before you hit the third one.
10 ch · 2h 4m
1 free
Airflow in Production
Deploying DAGs without restarting anything, secrets that survive failures, upgrades without losing a weekend, multi-tenant survival, and the cost story nobody tells you. The seven production lessons that decide whether Airflow is a tool or a tax.
7 ch · 1h 23m
1 free
Start Orchestration & Pipelines free
The first chapters of every course are free to read — no account needed.