Ability to test how data changes over time via Unit Tests #10226

mwodonnell · 2024-05-25T16:32:40Z

We recently upgraded our dbt project to 1.8 so that we can take advantage of unit testing, which has been very helpful.

Since the feature is still in its early stages, it makes sense that the testing capabilities are a bit limited, and we expect the feature set to grow over time.

We ran into trouble quite quickly in two areas.

First, unit tests currently only support positive checks: you can test the expectation that something does exist, but you cannot test that something does not exist. Being able to expect, say, 0 rows in the resulting table would solve this, as you could set your input data up in a way that should result in no records being generated, and "expect" 0 records.

The second, larger issue is testing how records change with subsequent runs. We have a use case within our pipeline where a record could be created in an EDW table based on the state of the ODS layer. Later on, the data in the ODS layer could change such that the records in the EDW layer should be updated to reflect the new state of the underlying data.

We currently test this by testing the individual states of each use case: we expect the EDW layer to show a flag set to True given one set of input data, and separately test that the flag is False given a different set of input data. However, we can't test the transition itself. The tests we do have give us pretty good confidence that we should not need to worry, but being able to test this transition would get us to the 100% confidence we'd prefer to have.
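A minimal sketch of this state-by-state approach might look like the following. All names here (`edw_accounts`, `ods_accounts`, `status`, `is_active`) are hypothetical and only stand in for the real models and columns; the sketch assumes the EDW model derives its flag from a status column in the ODS layer.

```yaml
unit_tests:
  # State 1: the ODS record is open, so the EDW flag should be true.
  # Model and column names are hypothetical placeholders.
  - name: edw_flag_true_for_open_account
    model: edw_accounts
    given:
      - input: ref('ods_accounts')
        rows:
          - {account_id: 1, status: open}
    expect:
      rows:
        - {account_id: 1, is_active: true}

  # State 2: the same record after it has been closed in the ODS layer.
  - name: edw_flag_false_for_closed_account
    model: edw_accounts
    given:
      - input: ref('ods_accounts')
        rows:
          - {account_id: 1, status: closed}
    expect:
      rows:
        - {account_id: 1, is_active: false}
```

Each test checks one end state in isolation. Neither one says anything about what happens to an existing EDW row when the ODS data changes between runs, which is the gap described above.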

jtcohen6 · 2024-05-27T11:51:59Z

Maintainer

@mwodonnell Thanks for giving unit tests a go!

The second point sounds like you're testing logic at the intersection of incremental updates (in your EDW model) and capturing slowly-changing dimensions (your ODS layer). Is that fair? I don't have a perfect answer here, but two things come to mind:

- We do have a documented example for testing incremental updates
- It's possible to mock the input of a dbt snapshot (type-2 SCD) in your unit test (though it's not currently possible to unit-test whether the snapshot itself is correctly capturing changes)
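As a rough sketch of both ideas, the tests below override `is_incremental` and mock the model's current contents through `input: this`, then mock a snapshot as an ordinary `ref` input. The model, snapshot, and column names (`edw_accounts`, `ods_accounts`, `dim_accounts`, `accounts_snapshot`, `status`, `is_active`, `updated_at`) are hypothetical, and the expected rows assume the models implement the logic described in the comments.

```yaml
unit_tests:
  # Transition test: the EDW row already exists with is_active = true,
  # and the ODS record has since been closed.
  # Assumes a hypothetical incremental model that only selects ODS rows
  # newer than max(updated_at) in {{ this }}.
  - name: edw_flag_flips_when_ods_account_closes
    model: edw_accounts
    overrides:
      macros:
        # run the model's SQL in "incremental" mode
        is_incremental: true
    given:
      - input: ref('ods_accounts')
        rows:
          - {account_id: 1, status: closed, updated_at: 2024-05-02}
      - input: this
        # current contents of edw_accounts before this run
        rows:
          - {account_id: 1, is_active: true, updated_at: 2024-05-01}
    expect:
      # rows the model's SQL produces, to be inserted or merged
      rows:
        - {account_id: 1, is_active: false, updated_at: 2024-05-02}

  # Snapshot input: mock the type-2 history directly.
  # Assumes a hypothetical model that keeps only current snapshot rows
  # (dbt_valid_to is null).
  - name: dim_accounts_uses_current_snapshot_row
    model: dim_accounts
    given:
      - input: ref('accounts_snapshot')
        rows:
          - {account_id: 1, status: open, dbt_valid_from: 2024-05-01, dbt_valid_to: 2024-05-02}
          - {account_id: 1, status: closed, dbt_valid_from: 2024-05-02, dbt_valid_to: null}
    expect:
      rows:
        - {account_id: 1, status: closed}
```

In the incremental case, `expect` describes what the model's query returns for that run, not the final state of the table after the merge, so this checks the transition logic rather than the materialization itself.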

On the first point, "negative" checks - give this a go!

-- models/model_a.sql
select 1 as id

-- models/model_b.sql
select * from {{ ref('model_a') }}
where id is not null

unit_tests:
  - name: my_unit_test
    model: model_b
    given:
      - input: ref('model_a')
        rows:
          - {id: null}
    expect:
      rows: []

mwodonnell May 27, 2024
Author

This looks like exactly what we need to cover all our corner cases! We'll give this a whirl and let you know if we run into any issues. Thanks for pointing us in the right direction!
