Module: Foundations at Scale | Duration: 20 min read | Lesson: 1 of 11
In the fundamentals course, CDC made the dual-write problem disappear: the app writes the database, the log produces the events. TheWorldShop shipped it, and for one database, one stream, one team, it was clean.
Then the company grew. Now there are 40 CDC'd tables feeding 30 consumers, events fan out across hundreds of topics, a backend dev's ALTER TABLE can break a downstream team, an order and its line items arrive out of order, a poison message wedges a consumer for everyone, and the legal team wants to know how a GDPR deletion propagates through six derived stores. The clean single pipeline became an ecosystem with emergent failure modes, none of which existed at small scale.
This course is the hard half. It opens by re-examining the dual-write hazard, because at scale, CDC reintroduces a subtler version of the very problem it solved, and recognizing that is the key to everything that follows.
2. Concept Explanation
CDC Solved One Dual Write and Created Another
The fundamentals insight: don't write the DB and publish an event; just write the DB, and derive the event from the log. One write, one truth.
But look at what CDC actually captures: row changes, not business facts. When TheWorldShop's checkout completes, the application's intent is one fact: "OrderPlaced." In the database, that one fact is several row changes: insert into orders, three inserts into order_items, an update to inventory, an update to customer_stats. CDC faithfully emits all of those as separate events across separate topics.
So the consumer that wants "OrderPlaced" now has to reassemble it from scattered row-change events that:
- arrive on different topics and partitions (no global order, fundamentals Lesson 6),
- expose the database's internal schema (so an ALTER TABLE is a breaking contract change, fundamentals Lesson 10),
- and may arrive partially (the order event before its items).
That reassembly is a new coordination problem. It's not the original dual write, but it rhymes: you're again trying to keep multiple things consistent across systems. CDC moved the hazard rather than eliminating it entirely, and at scale that residual hazard becomes the dominant source of bugs.
The Two Philosophies of Getting Events Out
This tension splits into two fundamental approaches, and most of this course is about choosing and combining them:
- Raw table CDC: capture row changes directly off the database log. Zero application changes, captures everything, but exposes internal schema and emits low-level row changes that consumers must reassemble into business facts.
- The outbox pattern (Lesson 2): the application writes business events into a dedicated outbox table in the same transaction as its data changes, and CDC captures the outbox. Now CDC emits clean, stable, business-level events the app controls, while still being atomic with the data change (one transaction, no dual write).
Raw CDC asks "what rows changed?" The outbox asks "what happened, in business terms?" The first is free but leaky; the second costs application discipline but produces a real contract.
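To make "same transaction" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and the place_order helper are assumptions made for illustration, not TheWorldShop's actual schema; Lesson 2 covers the real table design. The point is only that the data row and the outbox row commit or roll back together.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total_cents INTEGER NOT NULL
    );
    -- Hypothetical outbox layout; CDC would capture inserts into this table.
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        aggregate_id INTEGER NOT NULL,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
""")


def place_order(conn, order_id, customer_id, total_cents, fail=False):
    """Write the order row and its business event in ONE transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "INSERT INTO orders (id, customer_id, total_cents) VALUES (?, ?, ?)",
            (order_id, customer_id, total_cents),
        )
        if fail:
            raise RuntimeError("simulated crash between the two writes")
        payload = json.dumps({"orderId": order_id, "totalCents": total_cents})
        conn.execute(
            "INSERT INTO outbox (aggregate_id, event_type, payload) VALUES (?, ?, ?)",
            (order_id, "OrderPlaced", payload),
        )


place_order(conn, 1, 7, 4200)
try:
    place_order(conn, 2, 7, 999, fail=True)  # neither row should survive
except RuntimeError:
    pass

orders = conn.execute("SELECT id FROM orders").fetchall()
outbox = conn.execute("SELECT aggregate_id, event_type FROM outbox").fetchall()
print(orders, outbox)
```

The failed call leaves no order row and no outbox row, so there is never an order without its event or an event without its order. That is the property a dual write cannot give you.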
Why "At Scale" Changes Everything
A single small pipeline can paper over these issues. At scale they compound:
- Schema coupling (raw CDC): 30 consumers depend on internal table shapes; every migration is a cross-team incident risk.
- Ordering (Lesson 3): per-key ordering isn't enough when business facts span tables and topics.
- Schema evolution propagation (Lesson 4): one schema change must flow compatibly through registry, topics, and every consumer.
- Poison messages (Lesson 5): one un-processable event can wedge a partition for everyone behind it; you need DLQs and replay.
- Exactly-once into sinks (Lesson 6): at-least-once CDC plus a warehouse that double-counts is a data-quality disaster at volume.
- Erasure (Lesson 8): a GDPR delete must reach every derived copy, which raw CDC scatter makes genuinely hard.
Each of these is a lesson in this course. They share a root: CDC at scale is a distributed systems problem, not a connector config.
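As a small taste of why these are not configuration problems, take the exactly-once item. The sketch below is a toy, not a real warehouse sink: the event_id field and the RevenueSink class are hypothetical, and keeping seen IDs in memory stands in for whatever durable deduplication a real sink would need. It only shows how one redelivered event skews an aggregate unless the sink is idempotent.

```python
class RevenueSink:
    """Toy sink that sums order totals, optionally deduplicating by event id."""

    def __init__(self, idempotent):
        self.idempotent = idempotent
        self.seen = set()
        self.revenue_cents = 0

    def apply(self, event):
        if self.idempotent:
            if event["event_id"] in self.seen:
                return  # already applied; ignore the redelivery
            self.seen.add(event["event_id"])
        self.revenue_cents += event["total_cents"]


# At-least-once delivery: "e1" shows up twice, e.g. after a consumer restart.
delivered = [
    {"event_id": "e1", "total_cents": 4200},
    {"event_id": "e2", "total_cents": 1000},
    {"event_id": "e1", "total_cents": 4200},
]

naive = RevenueSink(idempotent=False)
safe = RevenueSink(idempotent=True)
for event in delivered:
    naive.apply(event)
    safe.apply(event)

print(naive.revenue_cents, safe.revenue_cents)
```

The naive sink reports 9400 cents for 5200 cents of real revenue. Multiply that by every redelivery across 30 consumers and the scale argument makes itself.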
Aha: CDC didn't eliminate the dual-write hazard, it relocated it. The fundamentals win ("just write the DB") quietly hands consumers a new job: reassembling scattered row changes into the business fact the app actually meant, across topics with no global order and a schema that's really your internal tables. At one pipeline you don't feel it. At forty, that reassembly is where the bugs live, and the outbox pattern exists precisely to hand consumers the business fact directly instead of making them rebuild it.
3. Worked Example
See the gap between "row changes" and "business fact" concretely.
The intent: one checkout = one business fact, "OrderPlaced with 3 items, total $42."
What raw CDC emits for that single transaction:
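As an illustrative sketch, the six events can be modeled as plain Python dicts. The topic names, the txId value, and the column names are assumptions invented for this example, not real connector output; only the shape (one orders insert, three order_items inserts, one inventory update, one customer_stats update) comes from the lesson.

```python
from collections import Counter

TX_ID = "tx-1001"  # hypothetical id shared by every event in the transaction

# op: "c" = insert (create), "u" = update
raw_events = [
    {"topic": "shop.public.orders", "op": "c", "txId": TX_ID,
     "after": {"id": 501, "customer_id": 7, "total_cents": 4200}},
    {"topic": "shop.public.order_items", "op": "c", "txId": TX_ID,
     "after": {"order_id": 501, "sku": "MUG-01", "qty": 1}},
    {"topic": "shop.public.order_items", "op": "c", "txId": TX_ID,
     "after": {"order_id": 501, "sku": "TEE-02", "qty": 1}},
    {"topic": "shop.public.order_items", "op": "c", "txId": TX_ID,
     "after": {"order_id": 501, "sku": "CAP-03", "qty": 1}},
    {"topic": "shop.public.inventory", "op": "u", "txId": TX_ID,
     "after": {"sku": "MUG-01", "on_hand": 99}},
    {"topic": "shop.public.customer_stats", "op": "u", "txId": TX_ID,
     "after": {"customer_id": 7, "order_count": 12}},
]

events_per_topic = Counter(event["topic"] for event in raw_events)
print(len(raw_events), "events on", len(events_per_topic), "topics")
```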
Six events, four topics, one business fact. A consumer wanting "OrderPlaced" must:
- know that these six belong together (only the shared txId says so),
- wait for all of them (no ordering guarantees the order arrives before its items),
- understand the internal table schemas (so a customer_stats refactor breaks it).
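Those three obligations translate into real consumer code. The sketch below is one hypothetical way to do the reassembly: it buffers events by txId and emits "OrderPlaced" only once the whole transaction has arrived. It assumes the consumer receives a transaction-end marker carrying the expected event count. Some CDC connectors can publish transaction metadata along these lines, but treat the on_tx_end shape, the TxAssembler class, and all field names here as assumptions.

```python
from collections import defaultdict


class TxAssembler:
    """Buffers row-change events per txId until the transaction is complete."""

    def __init__(self):
        self.buffers = defaultdict(list)
        self.expected = {}  # txId -> number of events in that transaction

    def on_tx_end(self, tx_id, event_count):
        # Hypothetical end-of-transaction marker; may arrive before some rows.
        self.expected[tx_id] = event_count
        return self._maybe_emit(tx_id)

    def on_event(self, event):
        self.buffers[event["txId"]].append(event)
        return self._maybe_emit(event["txId"])

    def _maybe_emit(self, tx_id):
        want = self.expected.get(tx_id)
        if want is None or len(self.buffers[tx_id]) < want:
            return None  # still waiting for rows or for the end marker
        rows = self.buffers.pop(tx_id)
        del self.expected[tx_id]
        # Coupling to internal tables and columns happens right here.
        order = next(r["after"] for r in rows if r["table"] == "orders")
        items = [r["after"] for r in rows if r["table"] == "order_items"]
        return {"type": "OrderPlaced", "orderId": order["id"],
                "itemCount": len(items), "totalCents": order["total_cents"]}


def row(table, after):
    return {"txId": "tx-1001", "table": table, "after": after}


assembler = TxAssembler()
results = [
    assembler.on_event(row("order_items", {"order_id": 501, "sku": "MUG-01"})),
    assembler.on_event(row("inventory", {"sku": "MUG-01", "on_hand": 99})),
    assembler.on_tx_end("tx-1001", event_count=6),
    assembler.on_event(row("order_items", {"order_id": 501, "sku": "TEE-02"})),
    assembler.on_event(row("order_items", {"order_id": 501, "sku": "CAP-03"})),
    assembler.on_event(row("customer_stats", {"customer_id": 7})),
    assembler.on_event(row("orders", {"id": 501, "total_cents": 4200})),  # arrives last
]
fact = results[-1]
print(fact)
```

Even this toy needs buffering state, a completeness rule, and knowledge of two internal tables. It also ignores timeouts, restarts, and memory limits, which a production version could not. Every consumer that wants "OrderPlaced" would carry its own copy of this logic.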
The outbox alternative (previewing Lesson 2):
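A hedged sketch of what such an event and its consumer might look like follows. The topic name, envelope fields, and payload keys are assumptions for illustration; the actual event shape is designed in Lesson 2.

```python
# Hypothetical outbox-derived event: one message carrying the whole business fact.
outbox_event = {
    "topic": "shop.events.order",
    "key": "501",
    "type": "OrderPlaced",
    "payload": {"orderId": 501, "itemCount": 3, "totalCents": 4200},
}


def handle(event):
    """Consumer logic: no buffering, no joins, no internal table names."""
    if event["type"] != "OrderPlaced":
        return None
    p = event["payload"]
    return f"Order {p['orderId']}: {p['itemCount']} items, ${p['totalCents'] / 100:.2f}"


message = handle(outbox_event)
print(message)
```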
One event, one topic, one business fact, emitted atomically with the DB write (the app inserted this into an outbox table in the same transaction). The consumer gets exactly what it needs and never sees an internal table. That's the difference scale makes.
4. Your Turn
Exercise: TheWorldShop's "cancel order" action updates orders.status='cancelled', restores inventory, and inserts a refunds row, all in one DB transaction. A new "Refunds Notifier" consumer must react to "OrderCancelled."
- List the raw CDC events this one action produces and the topics they land on.
- Explain two ways the Refunds Notifier could get this wrong if it consumes raw CDC events.
- Restate the design with an outbox OrderCancelled event. What does the app write, and in what transaction?
- Argue why the outbox version is not a dual write, even though the app now writes two things (its data + the outbox row).
- Give one scenario where raw CDC is still the right choice despite its downsides.
5. Real-World Application
The "row changes vs business events" debate is one of the central design arguments in event-driven architecture. Teams that start with raw CDC for everything eventually hit schema-coupling and reassembly pain and introduce the outbox for their important domain events, ending up with both: raw CDC for bulk replication, outbox for business facts.
Microservices and domain-driven design lean hard on the outbox precisely because exposing a service's internal tables as its public event stream violates encapsulation. The outbox lets a service publish a stable domain event while keeping its schema private.
This course mirrors a real maturity curve. Companies adopt CDC (fundamentals), scale it, get burned by ordering/schema/erasure/exactly-once, and then adopt the patterns in this course. You're learning the destination without having to suffer each incident first.
6. Recap + Bridge
CDC solved the original dual write by deriving events from the log, but at scale it relocates the hazard: it emits row changes, not business facts, forcing consumers to reassemble scattered, schema-coupled, unordered events. The outbox pattern resolves this by having the app write business events into an outbox table in the same transaction, so CDC emits clean contracts without reintroducing a dual write. The rest of this course covers the distributed-systems problems that scale creates.
Next we build the outbox pattern properly: the table design, the transactional guarantee, and the Debezium outbox event router that turns it into clean topics.