The Four Primitives

Module: Measures, Metrics, Entities, Dimensions | Duration: ~13 min | Lesson: 1 of 9


Dev is reading the docs for his team's new metrics layer and keeps hitting four words that seem to mean the same thing: measure, metric, entity, dimension. The docs use them precisely, but Dev uses "measure" and "metric" interchangeably, the way most people do.

So he writes what he thinks is a metric, it compiles, and the number comes out wrong. The layer didn't error. It did exactly what he told it, which wasn't what he meant, because he'd conflated a measure (a raw aggregation) with a metric (a measure plus business intent). The bug wasn't in his SQL. It was in his vocabulary.

Every confusing thing about a metrics layer dissolves once these four words are precise. Let's make them precise.


2. Concept Explanation

A metrics layer is built on exactly four primitives. Everything else (ratio metrics, time spines, query planning) is composed from these. Get them precise and the rest of this course is assembly.

Measure: a raw aggregation

A measure is a single aggregation over a column: SUM(amount), COUNT(DISTINCT user_id), MAX(order_date). That's it. A measure has an aggregation function and an expression, and nothing else. It carries no business meaning, no filters, no intent. SUM(amount) is a measure whether "amount" means revenue, refunds, or shipping cost.

Think of a measure as the rawest reusable building block: "how to roll up this column". It is necessary but not sufficient to answer a business question, because a business question always carries more than "sum this column".
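To make "an aggregation function and an expression, and nothing else" concrete, here is a minimal sketch in Python. The Measure class, the orders table, and its rows are hypothetical and exist only for illustration; sqlite3 stands in for the warehouse.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """A raw aggregation: a function and an expression, nothing else."""
    name: str
    agg: str   # aggregation function, e.g. "SUM", "COUNT", "MAX"
    expr: str  # expression it aggregates, e.g. "amount"

    def to_sql(self) -> str:
        return f"{self.agg}({self.expr})"


# Two measures; neither says what "amount" or "user_id" means to the business.
total_amount = Measure("total_amount", "SUM", "amount")
distinct_users = Measure("distinct_users", "COUNT", "DISTINCT user_id")

# Hypothetical in-memory table so the measures have something to roll up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 20.0)],
)


def evaluate(measure: Measure, table: str):
    """Roll the measure up over every row of the table, with no filters."""
    sql = f"SELECT {measure.to_sql()} FROM {table}"
    return conn.execute(sql).fetchone()[0]


total = evaluate(total_amount, "orders")    # 35.0
users = evaluate(distinct_users, "orders")  # 2
```

Notice that nothing in Measure can hold a filter, an owner, or a description. That absence is the point.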

Metric: a measure plus business intent

A metric is a measure wrapped in everything that turns it into a quantity the business has named: filters, a default time grain, null-handling, a description, an owner. For example, revenue is the measure SUM(amount - refunds) plus "exclude test accounts, recognized on order_date, owned by finance".

This is the distinction Dev missed. The measure is the math. The metric is the math plus the agreed-upon rules that make it the company's official "revenue". Two metrics can share one measure: revenue and revenue_excluding_returns might both build on SUM(amount) with different filters. A measure is reusable plumbing; a metric is a governed, named business quantity.

The slogan: a measure is a SQL aggregation; a metric is a measure plus intent. Conflating them is the single most common beginner error, and it produces numbers that are wrong without being errors.
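The "two metrics, one measure" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular metrics layer implements it: the Measure and Metric classes, the is_test and is_return columns, and the owner name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict


@dataclass(frozen=True)
class Measure:
    """The math: an aggregation over an expression."""
    name: str
    agg: Callable[[list], float]
    expr: Callable[[Row], float]

    def compute(self, rows: list) -> float:
        return self.agg([self.expr(r) for r in rows])


@dataclass(frozen=True)
class Metric:
    """The math plus intent: filters, a description, an owner."""
    name: str
    measure: Measure
    filters: tuple  # each filter is a Callable[[Row], bool]
    description: str
    owner: str

    def compute(self, rows: list) -> float:
        kept = [r for r in rows if all(f(r) for f in self.filters)]
        return self.measure.compute(kept)


# One piece of reusable plumbing: SUM(amount).
sum_amount = Measure("sum_amount", agg=sum, expr=lambda r: r["amount"])

# Two governed metrics that share it but carry different intent.
revenue = Metric(
    "revenue", sum_amount,
    filters=(lambda r: not r["is_test"],),
    description="Sales, excluding test accounts.",
    owner="finance",
)
revenue_excluding_returns = Metric(
    "revenue_excluding_returns", sum_amount,
    filters=(lambda r: not r["is_test"], lambda r: not r["is_return"]),
    description="Sales, excluding test accounts and returned orders.",
    owner="finance",
)

rows = [
    {"amount": 100.0, "is_test": False, "is_return": False},
    {"amount": 40.0, "is_test": False, "is_return": True},
    {"amount": 999.0, "is_test": True, "is_return": False},
]

bare = sum_amount.compute(rows)                        # 1139.0
official = revenue.compute(rows)                       # 140.0
no_returns = revenue_excluding_returns.compute(rows)   # 100.0
```

All three calls succeed and return a number. Only the metric results are the ones the business agreed on; the bare measure silently includes the test account.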

Entity: a thing you join on

An entity is a business object that rows can be grouped by and joined on: a customer, an order, a product, a region. Entities are how the metrics layer knows that orders and customers can be connected (via the customer entity) so you can ask for "revenue by customer signup cohort".

Entities are the nouns of your business. They're also, secretly, your join graph: every join in your warehouse is two tables meeting on a shared entity. Naming entities is what lets the layer plan joins for you instead of making you write them. The whole of Lesson 2 is about this.
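A rough sketch of what "planning joins from entities" can look like, in Python. The Entity and SemanticModel classes, the plan_join helper, and the table and column names are hypothetical; a real layer handles multi-hop paths and ambiguity that this deliberately ignores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    name: str  # the business noun, e.g. "customer"
    type: str  # "primary" or "foreign"
    expr: str  # the column that carries it in this model


@dataclass(frozen=True)
class SemanticModel:
    name: str
    entities: tuple


def plan_join(left: SemanticModel, right: SemanticModel) -> Optional[str]:
    """Return an ON clause if left holds a foreign key to an entity
    that right declares as primary; otherwise return None."""
    primaries = {e.name: e for e in right.entities if e.type == "primary"}
    for e in left.entities:
        if e.type == "foreign" and e.name in primaries:
            target = primaries[e.name]
            return f"{left.name}.{e.expr} = {right.name}.{target.expr}"
    return None


orders = SemanticModel("orders", (
    Entity("order", "primary", "order_id"),
    Entity("customer", "foreign", "customer_id"),
    Entity("product", "foreign", "product_id"),
))
customers = SemanticModel("customers", (Entity("customer", "primary", "id"),))
products = SemanticModel("products", (Entity("product", "primary", "product_id"),))

to_customers = plan_join(orders, customers)  # "orders.customer_id = customers.id"
to_products = plan_join(orders, products)    # "orders.product_id = products.product_id"
no_path = plan_join(customers, products)     # None: no shared entity
```

The column names differ between orders and customers (customer_id versus id), but the entity name is the same, and that shared name is what the planner matches on.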

Dimension: an axis to slice and group by

A dimension is an attribute you filter or group by: region, order_date, product_category, customer_tier. "Revenue by region" uses region as a dimension to group by. "Revenue for the US" uses region as a filter: it is still a dimension, just used differently.

The subtle part: a dimension is not a column, it's a role a column plays at query time. The same created_at column is a dimension when you group revenue by signup month and is just a stored value otherwise. We'll sharpen this in Lesson 3.
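Here is a small Python sketch of one dimension playing both roles. The rows and the revenue helper are hypothetical; the only point is that region appears once as a grouping axis and once as a filter, while the stored column never changes.

```python
from collections import defaultdict

# Hypothetical order rows; "region" is just a stored column until a query uses it.
rows = [
    {"region": "US", "amount": 100.0},
    {"region": "US", "amount": 50.0},
    {"region": "EU", "amount": 70.0},
]


def revenue(rows, group_by=None, where=None):
    """Sum amount, optionally filtered by a predicate and grouped by a column."""
    kept = [r for r in rows if where is None or where(r)]
    if group_by is None:
        return sum(r["amount"] for r in kept)
    totals = defaultdict(float)
    for r in kept:
        totals[r[group_by]] += r["amount"]
    return dict(totals)


# "Revenue by region": region is the grouping axis.
by_region = revenue(rows, group_by="region")                  # {"US": 150.0, "EU": 70.0}

# "Revenue for the US": region is a filter.
us_only = revenue(rows, where=lambda r: r["region"] == "US")  # 150.0
```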

How they fit together

One sentence ties all four: a metric (a measure plus intent) is computed over rows joined along entities and sliced by dimensions.

revenue            by    region        for customers who signed up in 2026
^metric (^measure)       ^dimension       ^entity (customer) + ^dimension (signup date)

Every metric request has this shape: you pick a metric and the dimensions to slice by, and the layer uses entities to resolve the joins and applies the filters. Four primitives, one grammar.
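The grammar can be sketched as a tiny request compiler in Python. Everything here is a simplification under stated assumptions: the MetricRequest class, the METRICS and DIMENSIONS registries, the table aliases, and the is_test and signup_date columns are hypothetical, and the joins are written out by hand rather than derived from entities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRequest:
    metric: str
    group_by: tuple = ()  # dimension names to slice by
    where: tuple = ()     # (dimension name, SQL condition) pairs


# Hypothetical registry: each metric carries its measure SQL and its intent filters.
METRICS = {
    "revenue": {
        "sql": "SUM(o.amount)",
        "table": "orders o",
        "filters": ("o.is_test = 0",),
    },
}

# Hypothetical registry: each dimension knows its SQL and the join that reaches it.
DIMENSIONS = {
    "region": {"sql": "o.region", "join": None},
    "signup_year": {
        "sql": "strftime('%Y', c.signup_date)",
        "join": "JOIN customers c ON o.customer_id = c.customer_id",
    },
}


def compile_request(req: MetricRequest) -> str:
    metric = METRICS[req.metric]
    used = list(req.group_by) + [dim for dim, _ in req.where]

    # Entities resolve the joins: add each needed join once.
    joins = []
    for dim in used:
        join = DIMENSIONS[dim]["join"]
        if join and join not in joins:
            joins.append(join)

    select = [f"{DIMENSIONS[d]['sql']} AS {d}" for d in req.group_by]
    select.append(f"{metric['sql']} AS {req.metric}")

    # The metric's own intent filters always apply, then the request's filters.
    conditions = list(metric["filters"])
    conditions += [f"{DIMENSIONS[d]['sql']} {cond}" for d, cond in req.where]

    sql = f"SELECT {', '.join(select)} FROM {metric['table']}"
    if joins:
        sql += " " + " ".join(joins)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    if req.group_by:
        sql += " GROUP BY " + ", ".join(DIMENSIONS[d]["sql"] for d in req.group_by)
    return sql


# "revenue by region for customers who signed up in 2026"
request = MetricRequest(
    metric="revenue",
    group_by=("region",),
    where=(("signup_year", "= '2026'"),),
)
sql = compile_request(request)
```

The caller never writes a join or repeats the test-account filter. Both come from the definitions, which is what keeps every consumer on the same number.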

Aha: "Measure" and "metric" are not synonyms, and the entire correctness of your metrics layer rides on the difference. A measure is SUM(amount - refunds), just math. A metric is that same aggregation plus "excluding test accounts, recognized on order_date, owned by finance": the math plus the intent. The layer can't tell you when you've asked for the measure but meant the metric. It just returns a number that's wrong without being an error.


3. Worked Example

Let's classify the pieces of one business question for TheWorldShop, then write them in a metrics-layer model.

The question: "What was net revenue by product category for customers in the enterprise tier, last quarter?"

Pull it apart:

  • Metric: net_revenue (the named business quantity).
  • Measure underneath it: SUM(amount - coalesce(refund_amount, 0)).
  • Intent that turns the measure into the metric: exclude test accounts, recognize on order_date.
  • Entities involved: order (the grain of the fact), customer (to filter by tier), product (to group by category).
  • Dimensions: product_category (group by), customer_tier (filter to enterprise), order_date (filter to last quarter).

In model form:

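semantic_model:
  name: orders
  entities:                      # the join keys this model exposes
    - { name: order, type: primary, expr: order_id }
    - { name: customer, type: foreign, expr: customer_id }
    - { name: product, type: foreign, expr: product_id }
  measures:
    - name: net_amount           # the MEASURE: just the math
      agg: sum
      expr: amount - coalesce(refund_amount, 0)
  dimensions:
    - { name: order_date, type: time }   # the date net revenue is recognized on
  # product_category and customer_tier are reached through the product and
  # customer entities; their models are not shown here.

metric:
  name: net_revenue          # the METRIC
  measure: net_amount        # built on the MEASURE
  filter: "account_type != 'test'"   # intent: exclude test accounts
  description: "Net revenue: sales minus refunds, excluding test accounts."
  owner: finance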

Now the original question compiles to a request: metric net_revenue, group by product_category (a dimension reached through the product entity), filter customer_tier = 'enterprise' (a dimension reached through the customer entity) and order_date in last quarter.

Notice how the four primitives partition the question cleanly. The measure is the math, the metric adds the intent, the entities resolve the joins to customer and product, and the dimensions do the slicing. If Dev had asked for the bare measure net_amount instead of the metric net_revenue, he'd have included test accounts and gotten a wrong-but-not-erroring number. Same math, missing intent.
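To see the wrong-but-not-erroring number for yourself, here is a small runnable sketch using Python's built-in sqlite3. The tables, the rows, the fixed quarter bounds, and the choice to store account_type on customers are all assumptions made for illustration; the lesson's model does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY,
                        customer_tier TEXT, account_type TEXT);
CREATE TABLE products  (product_id INTEGER PRIMARY KEY, product_category TEXT);
CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                        product_id INTEGER, order_date TEXT,
                        amount REAL, refund_amount REAL);

INSERT INTO customers VALUES
  (1, 'enterprise', 'real'),
  (2, 'enterprise', 'test'),
  (3, 'starter',    'real');
INSERT INTO products VALUES (10, 'books'), (11, 'toys');
INSERT INTO orders VALUES
  (100, 1, 10, '2025-11-05', 200.0, 50.0),
  (101, 1, 11, '2025-12-01',  80.0, NULL),
  (102, 2, 10, '2025-11-20', 500.0, NULL),  -- test account
  (103, 3, 10, '2025-11-21',  30.0, NULL),  -- starter tier
  (104, 1, 10, '2025-07-15',  60.0, NULL);  -- outside the quarter
""")

# The measure, the joins along entities, and the dimension filters are shared.
# Only the {intent} slot differs between the bare measure and the metric.
BASE = """
SELECT p.product_category,
       SUM(o.amount - COALESCE(o.refund_amount, 0)) AS value
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products  p ON o.product_id  = p.product_id
WHERE c.customer_tier = 'enterprise'
  AND o.order_date >= '2025-10-01' AND o.order_date < '2026-01-01'
  {intent}
GROUP BY p.product_category
ORDER BY p.product_category
"""

# Bare measure net_amount: same math, no intent.
bare_measure = dict(conn.execute(BASE.format(intent="")).fetchall())

# Metric net_revenue: the same measure plus "exclude test accounts".
net_revenue = dict(
    conn.execute(BASE.format(intent="AND c.account_type != 'test'")).fetchall()
)

# bare_measure -> {"books": 650.0, "toys": 80.0}
# net_revenue  -> {"books": 150.0, "toys": 80.0}
```

Both queries run cleanly. The books figure differs by exactly the 500.0 test-account order, and nothing in the output flags which number is the official one.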


4. Your Turn

Exercise: Take this business question: "How many distinct active sellers did we have by marketplace region last month, excluding sellers flagged as fraudulent?"

  1. Identify the measure, the metric, the entities, and the dimensions.
  2. Explain what intent turns the measure into the metric here (what would be wrong if you asked for the bare measure).
  3. In one sentence, state the rule for telling a measure from a metric.

5. Real-World Application

The measure-vs-metric distinction is baked into every real metrics store, even when the names differ. dbt's MetricFlow makes you define measures inside semantic models and then metrics on top of them, exactly this two-layer split. Cube has measures inside cubes and curated views on top. LookML has measures and then derived/filtered measures. The vocabulary varies; the four primitives are universal, because they're the irreducible parts of "a named business quantity, computed over joined rows, sliced by attributes".

The reason the distinction earns its place in production is reuse and governance. Measures are the reusable plumbing: define SUM(amount) once and three metrics can build on it. Metrics are the governed surface: revenue, net_revenue, and gross_revenue are three reviewed, owned definitions that share plumbing but mean different things. Collapsing them into one concept either forces you to redefine the math for every metric (no reuse) or to expose raw measures to consumers (no governance). Keeping them separate is what lets a metrics layer be both DRY and governed.

For an engineer, the immediate payoff is debugging speed. When a number is wrong, the first question is "did you ask for the measure or the metric?" Often the answer is "I asked for the raw aggregation and lost the filters", and you've found the bug in one question instead of reading the generated SQL. The vocabulary isn't pedantry. It's a diagnostic.


6. Recap + Bridge

Four primitives, made precise: a measure is a raw aggregation (SUM(amount)); a metric is a measure plus business intent (filters, grain, null-handling, ownership); an entity is a business object you join and group on; a dimension is an attribute you slice and filter by, a role a column plays, not the column itself. A metric is computed over rows joined along entities and sliced by dimensions. Conflating measure and metric produces numbers that are wrong without erroring.

Next we go deep on the primitive that does the most invisible work: entities, which are the join graph you already have and didn't know you'd named.