How to - use incremental models to detect regressions in historic metrics #1615
graciegoheen
announced in
Archive
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Background
Imagine that you have expected outputs for a historic metric (total revenue) as described below:
How can you test that these historical metrics do not change in dbt?
This is an alternative solution to the snapshot-based one proposed here.
Step 1
Let's say that you have a
fct_orders
table, which has one row for each order:First, you should create a model that sums the amount for each year excluding the current one (for simplicity, we're assuming you have no costs). Let's call this
fct_revenue_summary
:Step 2
Next, create an incremental model on top of
fct_revenue_summary
which captures the historical view of revenue outputs. Let's call thisfct_revenue_summary_history
:Step 3
Finally, create a test on
fct_revenue_summary_history
to check that each year has a single source of truth fortotal_revenue
:When you run a
dbt build
, you will get an error if you ever output a newtotal_revenue
value for a historic year that differs from the original. For example, if your originalfct_revenue_summary_history
looks like this:But then, you introduce a breaking change such that dbt now calculates the
total_revenue
for 2019 as 0.8 million,fct_revenue_summary_history
will now look like this:And the uniqueness test on the year column will fail.
This allows you to detect regressions in historic metrics!
Beta Was this translation helpful? Give feedback.
All reactions