Freshness Is a Contract, Not a Metric

Module: Freshness and SLA | Duration: ~12 min | Lesson: 1 of 7


Priya opens the TheWorldShop data catalog and finds the orders_daily table. The description field says "updated daily." She builds a dashboard on it for the merchandising team. Three weeks later the merch lead pings her: "the dashboard's been stuck on Tuesday's numbers since Thursday, did you know?"

She didn't. "Updated daily" was true in the sense that the pipeline was scheduled to run daily. Every run since Wednesday night had failed, and nothing told anyone. The word "daily" felt like a promise. It was actually a hope.

What would the description have needed to say for the merch lead to never have to send that ping?


2. Concept Explanation

A metric describes; a contract obligates

A freshness metric answers "how old is this data right now?" It's a number: 14 hours, 3 minutes, 2 days. Useful, but it commits no one to anything. A dashboard can show "last updated 40 hours ago" in a tasteful gray font forever, and the number stays accurate the whole time the table rots.

A freshness contract answers a different question: "by when must this data be no older than X, and what happens if it isn't?" That sentence has three parts, and all three have to be present or it isn't a contract:

  1. A subject. Which table, which partition, which column's timestamp.
  2. A deadline. "No older than 6 hours, measured at 09:00 every business day."
  3. A consequence. Someone is paged, a downstream job is blocked, a banner appears on the dashboard. Something happens when the deadline is missed.
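
One way to make the three-part rule concrete is to refuse to store a contract that leaves a part blank. The sketch below is a minimal illustration, not a catalog API: the FreshnessContract class and its field names are hypothetical, and the free-text fields are only a stand-in for whatever structure your catalog supports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreshnessContract:
    """The three required parts of a freshness contract (hypothetical record)."""

    subject: str      # which table, partition, and timestamp column
    deadline: str     # how old is too old, and when that is measured
    consequence: str  # what happens when the deadline is missed

    def __post_init__(self) -> None:
        # Reject any record that leaves one of the three parts blank.
        for part in ("subject", "deadline", "consequence"):
            if not getattr(self, part).strip():
                raise ValueError(f"not a contract: missing {part}")


# All three parts present: accepted.
contract = FreshnessContract(
    subject="orders_daily, newest order_placed_at",
    deadline="no older than 6 hours, measured at 09:00 every business day",
    consequence="dashboard shows a stale banner",
)

# A schedule word with no consequence: rejected.
try:
    FreshnessContract("orders_daily", "updated daily", "")
    rejected = False
except ValueError:
    rejected = True
```

The validation only checks that each part exists. Whether the deadline is the right one is still a conversation with the consumer.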

"Updated daily" has a subject and a vague gesture at a deadline. It has no consequence. That's why it failed Priya. Nothing was wired to the gap.

Why "daily" is the wrong unit

"Daily" describes the schedule, not the result. A DAG scheduled @daily that has been failing for a week is still, technically, a daily DAG. The schedule is an input. Freshness is an output. Conflating them is the root of most "the data looked fine" incidents.

The fix is to phrase freshness in terms a consumer cares about, never in terms of your cron expression:

  • Bad: "runs every night at 2am."
  • Good: "as of 09:00 each business day, orders_daily contains all orders through the prior calendar day. If not, the merch dashboard shows a stale banner and the data team is notified."

The second version is testable. You can write a check that fires at 09:00, looks at the data, and decides pass or fail. The first version is just a description of intent.
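
To show what "testable" means here, the "Good" wording can be sketched as a function that returns pass or fail. This is a sketch under stated assumptions: check_at_deadline and is_business_day are hypothetical names, business days are taken to be Monday to Friday with holidays ignored, and "newest order date is at least yesterday" is used as a simple proxy for "contains all orders through the prior calendar day."

```python
from datetime import date, datetime, time, timedelta


def is_business_day(d: date) -> bool:
    # Assumption: Monday to Friday; holidays are ignored in this sketch.
    return d.weekday() < 5


def check_at_deadline(now: datetime, newest_order_date: date) -> str:
    """Return 'skip', 'pass', or 'fail' for the 09:00 business-day check."""
    if not is_business_day(now.date()) or now.time() < time(9, 0):
        return "skip"  # the contract promises nothing at this moment
    prior_day = now.date() - timedelta(days=1)
    # Proxy for completeness: the table has reached the prior calendar day.
    return "pass" if newest_order_date >= prior_day else "fail"


thursday_9am = datetime(2024, 3, 14, 9, 0)
on_time = check_at_deadline(thursday_9am, date(2024, 3, 13))   # pass
stuck = check_at_deadline(thursday_9am, date(2024, 3, 12))     # fail
weekend = check_at_deadline(datetime(2024, 3, 16, 9, 0), date(2024, 3, 12))  # skip
```

No equivalent function can be written for "runs every night at 2am" without looking at the scheduler instead of the data.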

The two clocks freshness lives between

Every freshness contract sits between two clocks:

  • Event time. When the thing actually happened. An order placed at 23:58.
  • Processing time. When your pipeline saw it and wrote it down. Maybe 02:15 the next morning.

Freshness is the gap between "the latest event time present in the table" and "now." When you say "no older than 6 hours," you mean the newest record's event time should be within 6 hours of the current wall clock, at the moment you check.

Get the clock wrong and the contract lies. If you measure freshness by "when did the pipeline last run" (processing time) instead of "what's the newest event in the data" (event time), a pipeline that runs perfectly on schedule but processes an empty file will report itself fresh. It ran. It wrote zero new rows. The freshness-by-run-time check is green. The data is a day stale.
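
The empty-file case can be reproduced in a few lines. The timestamps and the 24-hour limit below are made up for illustration; the point is only that the same table passes on one clock and fails on the other.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=24)  # illustrative limit, not taken from the lesson

now = datetime(2024, 3, 14, 9, 0)

# Processing time: the job ran on schedule at 02:15 but read an empty file.
last_successful_run = datetime(2024, 3, 14, 2, 15)

# Event time: order_placed_at values actually present in the table.
order_placed_at = [
    datetime(2024, 3, 12, 18, 30),
    datetime(2024, 3, 12, 23, 58),  # newest event; nothing arrived after it
]

run_lag = now - last_successful_run
event_lag = now - max(order_placed_at)

fresh_by_run_time = run_lag < MAX_AGE      # True: the job did run
fresh_by_event_time = event_lag < MAX_AGE  # False: the data stopped moving

print(f"run-time lag   {run_lag}  fresh={fresh_by_run_time}")
print(f"event-time lag {event_lag}  fresh={fresh_by_event_time}")
```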

A contract names its consequence up front

The part teams skip is the consequence. It feels bureaucratic to write "if this misses, notify the on-call." But the consequence is the whole point. A deadline with no consequence is a metric wearing a deadline's clothes.

Consequences come in tiers, and naming the tier is half the design:

  • Block. Downstream jobs refuse to run on stale upstream data. The freshness check is a gate.
  • Page. A human is notified now. Reserved for "someone needs to act before business hours."
  • Annotate. The dashboard shows a "data as of X, may be stale" banner. The consumer sees the truth and decides.

Most tables deserve "annotate." A few deserve "block." Very few deserve "page." Lesson 5 is entirely about not over-paging. For now the rule is: every freshness contract picks exactly one consequence tier, and it's written down next to the deadline.
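
The "exactly one" rule is easy to encode: if each table maps to a single tier, a contract cannot quietly claim two. The sketch below is hypothetical throughout (the Consequence enum, the CONTRACT_TIERS registry, and on_miss are illustrative names), and on_miss returns a description of the action rather than performing it.

```python
from enum import Enum


class Consequence(Enum):
    BLOCK = "block"        # downstream jobs refuse to run
    PAGE = "page"          # a human is notified now
    ANNOTATE = "annotate"  # the dashboard shows a stale banner


# Hypothetical registry: one table maps to exactly one tier.
CONTRACT_TIERS: dict[str, Consequence] = {
    "orders_daily": Consequence.ANNOTATE,
}


def on_miss(table: str) -> str:
    """Describe the action for a missed deadline (a real system would perform it)."""
    tier = CONTRACT_TIERS[table]  # KeyError means the table has no contract
    if tier is Consequence.BLOCK:
        return f"hold downstream jobs that read {table}"
    if tier is Consequence.PAGE:
        return f"page the on-call about {table}"
    return f"show a stale banner on dashboards that read {table}"


action = on_miss("orders_daily")
```

A table missing from the registry raises KeyError, which is a reasonable default: a table with no written consequence has no contract yet.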


3. Worked Example

Let's turn TheWorldShop's "updated daily" into a real contract.

Step 1: name the subject and the clock.

-- The freshness signal: newest event time in the table.
SELECT MAX(order_placed_at) AS newest_event
FROM   orders_daily;

We measure freshness by order_placed_at (event time), not by a pipeline run log. This is the clock the merch team actually cares about.

Step 2: write the deadline as a check.

-- Freshness contract: at 09:00, the newest order must be recent enough
-- that we know data is still flowing.
SELECT
  MAX(order_placed_at)                                 AS newest_event,
  NOW() - MAX(order_placed_at)                         AS lag,
  (NOW() - MAX(order_placed_at)) < INTERVAL '30 hours' AS is_fresh
FROM orders_daily;
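
The lesson does not say where 30 hours comes from. One plausible reading, assuming each nightly load covers the full prior calendar day, is that the threshold has to sit between the lag you see after a good load and the lag you see after a missed one. The dates below are invented to walk through that arithmetic at a 09:00 check.

```python
from datetime import datetime, timedelta

THRESHOLD = timedelta(hours=30)
check_time = datetime(2024, 3, 14, 9, 0)  # a Thursday, 09:00


def is_fresh(newest_event: datetime) -> bool:
    # Same comparison as the SQL check: lag strictly under the threshold.
    return (check_time - newest_event) < THRESHOLD


# Last night's load succeeded: newest order is from late yesterday.
loaded = is_fresh(datetime(2024, 3, 13, 23, 58))    # lag 9h02m

# Last night's load failed: newest order is from the day before yesterday.
missed = is_fresh(datetime(2024, 3, 12, 23, 58))    # lag 33h02m

# Load succeeded, but yesterday's last order was placed at 02:30.
quiet_day = is_fresh(datetime(2024, 3, 13, 2, 30))  # lag 30h30m
```

Here loaded is True and missed is False, which is the separation the check needs. quiet_day is also False even though the load ran, so a table with long gaps between orders would need a wider threshold or a second signal.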

Step 3: attach the consequence. This table feeds a dashboard, not revenue. The consequence is annotate, not page:

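# run_freshness_check, set_dashboard_banner, and notify_slack_low_priority
# stand in for your own check runner and integrations.
result = run_freshness_check()  # returns is_fresh, lag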
if not result.is_fresh:
    set_dashboard_banner(
        f"Data may be stale: last order seen {result.lag} ago"
    )
    notify_slack_low_priority("orders_daily missed freshness contract", lag=result.lag)
# Note: no page. A stale merch dashboard is a Tuesday, not a 3am.

Step 4: write the contract in the catalog, in human words.

orders_daily freshness contract. As of 09:00 each business day, the newest order_placed_at is within 30 hours of now. Measured by the query above. On miss: dashboard shows a stale banner; data team notified by a low-priority Slack message, not a page. Owner: data-merch team.

That paragraph is the thing Priya's catalog was missing. It has a subject, a deadline, a consequence, and an owner. The merch lead never has to send the "did you know" ping again, because the banner says it first.

Aha: "Updated daily" describes your cron schedule. Freshness describes your data. A pipeline that runs flawlessly every night and processes an empty file is perfectly on schedule and completely stale. Measure the newest event time in the data, never the last run time of the job.


4. Real-World Application

Most data catalog and pipeline tools (DataHub, Atlan, Collibra, dbt's source freshness config) offer a freshness feature, and many teams fill in the description field with a schedule word like "daily" or "hourly" and stop there. The teams that avoid the "did you know" ping are the ones who treat the freshness entry as a contract with a named consequence.

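dbt source freshness is a common concrete form. You declare loaded_at_field (the timestamp column dbt takes the maximum of; point it at an event-time column, not a job-run timestamp, if you want the event-time clock) and warn_after / error_after thresholds (the deadline) in the source's YAML file, for example sources.yml. Running the dbt source freshness command on a schedule then becomes the check, and error_after is the consequence: when the lag exceeds it, the command reports an error and fails, which can stop the job that invoked it. That's a contract expressed in config.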
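
The two-threshold idea is simple enough to sketch outside dbt. The function below is not dbt's implementation and does not call dbt; freshness_status is a hypothetical name, the 24-hour warn value is invented, and whether a lag exactly equal to a threshold counts as a breach is an assumption of this sketch.

```python
from datetime import datetime, timedelta


def freshness_status(
    max_loaded_at: datetime,
    now: datetime,
    warn_after: timedelta,
    error_after: timedelta,
) -> str:
    """Classify lag as 'pass', 'warn', or 'error' (illustrative only)."""
    lag = now - max_loaded_at
    if lag > error_after:
        return "error"  # the consequence tier: fail the run
    if lag > warn_after:
        return "warn"   # visible, but nothing is blocked
    return "pass"


now = datetime(2024, 3, 14, 9, 0)
warn, error = timedelta(hours=24), timedelta(hours=30)

status_recent = freshness_status(datetime(2024, 3, 13, 23, 58), now, warn, error)
status_aging = freshness_status(datetime(2024, 3, 13, 6, 0), now, warn, error)
status_stale = freshness_status(datetime(2024, 3, 12, 23, 58), now, warn, error)
```

With these inputs the three statuses are pass, warn, and error. Only the last one carries a consequence, which is why a source with only a warn threshold is still closer to a metric than a contract.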

The cultural shift is harder than the tooling. Moving a team from "the pipeline runs nightly" to "here is the deadline and here is what happens when we miss it" forces a conversation about who actually depends on the data and how badly. That conversation is the real product of writing freshness contracts. Often you discover a table everyone assumed was critical that nobody actually reads, and a "nice to have" export that secretly feeds the CFO's board deck.


5. Your Turn

Exercise: TheWorldShop has a customer_ltv table feeding three consumers: the homepage recommendation model (reads it hourly), a quarterly finance report (reads it four times a year), and an internal analyst who queries it ad hoc.

  1. Write a freshness contract for customer_ltv as the three-part sentence (subject, deadline, consequence). Pick a clock.
  2. The recommendation model and the finance report disagree on how fresh the table needs to be. How do you write one contract that serves both, or do you need more than one?
  3. The table's pipeline ran successfully every hour last week, but a bug made it write the same snapshot each time. Would your contract have caught it? If not, what would you add?

6. Recap + Bridge

Freshness is a contract, not a metric: a subject, a deadline, and a named consequence. "Updated daily" describes your schedule and commits no one. Measure the newest event time in the data, never the last run time of the job, or an empty successful run will report itself fresh. Pick exactly one consequence tier per contract: block, page, or annotate. Next lesson untangles the three words teams use interchangeably and get billed for confusing: SLA, SLO, and SLI.