Skip to content

Derived Metrics #1616

@shangyian

Description

@shangyian

Intro

A derived metric is a metric node whose query references one or more other metric nodes, rather than directly aggregating columns from a transform/source/dimension node. Enabling this will provide flexibility for users who need to create complex metrics.

Examples

Period-over-Period Metrics:

This example include week-over-week and month-over-month metrics from same base metric:

-- Base metric: default.revenue (granularity-agnostic)
SELECT SUM(revenue) FROM default.orders_fact

-- Derived metric: default.wow_revenue_change (Week-over-Week)
SELECT 
    (default.revenue - LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.week))
    / NULLIF(LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.week), 0) * 100

-- Derived metric: default.mom_revenue_change (Month-over-Month)
-- Same base metric, different ORDER BY dimension
SELECT 
    (default.revenue - LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.month))
    / NULLIF(LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.month), 0) * 100

Note: The base metric default.revenue is not tied to any time granularity. The granularity is determined by:

  1. The ORDER BY clause in the window function (specifies comparison period)
  2. The requested dimensions at query time (specifies output grain)

This means one base metric can power multiple period-over-period derived metrics (WoW, MoM, YoY, etc.).

Ratio of Two Metrics (Same Parent):

-- Base metrics from same parent
-- default.revenue:
SELECT SUM(amount) FROM default.orders_fact
-- default.orders:
SELECT COUNT(*) FROM default.orders_fact

-- Derived metric: Revenue per Order
SELECT default.revenue / default.orders 

Ratio of Two Metrics (Different Parents):

-- default.revenue from orders_fact
-- default.page_views from events_fact

-- Derived metric: Revenue per Page View
SELECT default.revenue / default.page_views

Design Notes

1. Join Dimensions = Requested Dimensions

When a derived metric references metrics from different parent nodes (different fact tables), the join is performed on the requested dimensions at query time.

Example:

  • default.revenue from orders_fact (has: date, country, product_id, customer_id, ...)
  • default.page_views from events_fact (has: date, country, session_id, page_id, ...)

When user queries:

GET /sql/?metrics=revenue_per_view&dimensions=date_dim.date,country_dim.country

Generated SQL joins on date AND country:

WITH revenue_measures AS (
  SELECT date, country, SUM(amount) as revenue
  FROM orders_fact
  GROUP BY date, country
),
pageview_measures AS (
  SELECT date, country, COUNT(*) as page_views  
  FROM events_fact
  GROUP BY date, country
)
SELECT 
  COALESCE(r.date, p.date) as date,
  COALESCE(r.country, p.country) as country,
  r.revenue / NULLIF(p.page_views, 0) as revenue_per_view
FROM revenue_measures r
FULL OUTER JOIN pageview_measures p 
  ON r.date = p.date AND r.country = p.country

2. Available Dimensions = Intersection of Base Metrics' Dimensions

A derived metric's available dimensions are the intersection of all referenced base metrics' dimensions.

If revenue supports [date, country, product_id] and page_views supports [date, country, page_id], then revenue_per_view supports [date, country] only.

Requesting a dimension not in the intersection should return an error.

3. Recursive Component Decomposition

When decomposing a derived metric, we recursively decompose referenced metrics until we reach the actual aggregation components (SUM, COUNT, etc.).

Example:

wow_revenue 
  → references weekly_revenue 
    → decomposes to SUM(revenue) 
      → component: revenue_sum_xxxxx

The derived metric's components are the leaf-level aggregations from all referenced metrics.

SQL Generation Examples

Example 1: Week-over-Week Revenue Change

Base metric: default.revenue

SELECT SUM(revenue) FROM default.orders_fact

Derived metric: default.wow_revenue_change

SELECT 
    (default.revenue - LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.week))
    / NULLIF(LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.week), 0) * 100
FROM default.revenue

Query: metrics=[wow_revenue_change]&dimensions=[date_dim.week]

Generated SQL:

WITH orders_fact_measures AS (
    SELECT 
        week,
        SUM(revenue) as revenue_sum_xxxxx
    FROM orders_fact
    GROUP BY week
)
SELECT 
    week,
    (SUM(revenue_sum_xxxxx) - LAG(SUM(revenue_sum_xxxxx), 1) OVER (ORDER BY week))
    / NULLIF(LAG(SUM(revenue_sum_xxxxx), 1) OVER (ORDER BY week), 0) * 100 AS wow_revenue_change
FROM orders_fact_measures
GROUP BY week

Example 2: Month-over-Month Revenue Change (Same Base Metric)

Base metric: default.revenue (same as above)

SELECT SUM(revenue) FROM default.orders_fact

Derived metric: default.mom_revenue_change

SELECT 
    (default.revenue - LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.month))
    / NULLIF(LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.month), 0) * 100
FROM default.revenue

Query: metrics=[mom_revenue_change]&dimensions=[date_dim.month]

Generated SQL:

WITH orders_fact_measures AS (
    SELECT 
        month,
        SUM(revenue) as revenue_sum_xxxxx
    FROM orders_fact
    GROUP BY month
)
SELECT 
    month,
    (SUM(revenue_sum_xxxxx) - LAG(SUM(revenue_sum_xxxxx), 1) OVER (ORDER BY month))
    / NULLIF(LAG(SUM(revenue_sum_xxxxx), 1) OVER (ORDER BY month), 0) * 100 AS mom_revenue_change
FROM orders_fact_measures
GROUP BY month

Note: Both wow_revenue_change and mom_revenue_change use the same base metric (default.revenue) and decompose to the same component (revenue_sum_xxxxx). The difference is in the ORDER BY dimension in the window function.

Example 3: Revenue per Page View (Cross-Fact)

Base metrics:

  • default.revenue = SELECT SUM(amount) FROM orders_fact
  • default.page_views = SELECT COUNT(*) FROM events_fact

Derived metric: default.revenue_per_view

SELECT default.revenue / default.page_views
FROM default.revenue, default.page_views

Query: metrics=[revenue_per_view]&dimensions=[date_dim.date, country_dim.country]

Generated SQL:

WITH orders_measures AS (
    SELECT 
        date,
        country,
        SUM(amount) as amount_sum_xxxxx
    FROM orders_fact
    GROUP BY date, country
),
events_measures AS (
    SELECT 
        date,
        country,
        COUNT(*) as count_xxxxx
    FROM events_fact
    GROUP BY date, country
)
SELECT 
    COALESCE(o.date, e.date) as date,
    COALESCE(o.country, e.country) as country,
    SUM(o.amount_sum_xxxxx) / NULLIF(SUM(e.count_xxxxx), 0) AS revenue_per_view
FROM orders_measures o
FULL OUTER JOIN events_measures e 
    ON o.date = e.date AND o.country = e.country
GROUP BY 1, 2

Example 4: YoY Growth Rate (Same Base Metric as WoW/MoM)

Base metric: default.revenue (same base metric as Examples 1 & 2)

SELECT SUM(revenue) FROM default.orders_fact

Derived metric: default.yoy_revenue_change

SELECT 
    (default.revenue - LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.year))
    / NULLIF(LAG(default.revenue, 1) OVER (ORDER BY default.date_dim.year), 0) * 100
FROM default.revenue

Query: metrics=[yoy_revenue_change]&dimensions=[date_dim.year]

Generated SQL:

WITH orders_measures AS (
    SELECT 
        year,
        SUM(revenue) as revenue_sum_xxxxx
    FROM orders_fact
    GROUP BY year
)
SELECT 
    year,
    (SUM(revenue_sum_xxxxx) - LAG(SUM(revenue_sum_xxxxx), 1) OVER (ORDER BY year))
    / NULLIF(LAG(SUM(revenue_sum_xxxxx), 1) OVER (ORDER BY year), 0) * 100 AS yoy_revenue_change
FROM orders_measures
GROUP BY year

Note: wow_revenue_change, mom_revenue_change, and yoy_revenue_change all share the same base metric (default.revenue). The only difference is the time dimension used in the ORDER BY clause.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions