Data Testing In Holistics

Dear Holistics Community,

We’ve been talking to many data folks, and we know the headache of dealing with unreliable and inaccurate data after metric updates. That’s why we’re experimenting with a new research project to tackle this data-accuracy challenge.
We need your feedback and ideas to help shape it.

The Problem: Updating a metric can break dependencies (and cause many headaches)

When analysts update a metric, they must ensure reliability:

  • Identifying how changes impact the output data
  • Ensuring no disruptions occur in downstream elements such as other metrics, filters, and reports
  • Preventing runtime errors such as AQL/SQL errors

Currently, these steps rely on manual effort or a cumbersome review process. Since a metric sits atop a complex chain of calculations, it’s not just about verifying your own changes; you also need to ensure they don’t disrupt others’ work. This approach is both exhausting and time-consuming.

Solution: What if you could run Data Tests on metrics in Holistics?

:bulb: Please note that this is a research prototype, and the final feature may differ from what is described here. The availability timeline is also subject to change.

High-level flow

Manual testing vs. automated data testing

Data Testing

  1. Impact Analysis: Identifying how changes impact the output data.

  2. Testing Downstream Dependencies: Ensuring no unexpected changes in downstream elements such as other metrics, filters, and reports.


  3. Test Report - Coverage & Status Overview: Gain visibility into your project’s data quality by centralizing all test results in one view.
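To make the idea above concrete, here is a minimal sketch of what a data test does conceptually: run a metric on fixed input rows, compare the result against an expected output, and emit a pass/fail record like a row in the test report. All names here (`monthly_revenue`, `run_data_test`) are hypothetical illustrations, not Holistics APIs.

```python
from collections import defaultdict

def monthly_revenue(orders):
    """Hypothetical metric: sum order amounts per month."""
    totals = defaultdict(float)
    for o in orders:
        totals[o["month"]] += o["amount"]
    return dict(totals)

def run_data_test(metric, input_rows, expected):
    """Run the metric on fixed input rows and compare against the
    expected output; return a pass/fail result for the test report."""
    actual = metric(input_rows)
    return {
        "metric": metric.__name__,
        "passed": actual == expected,
        "actual": actual,
        "expected": expected,
    }

orders = [
    {"month": "2024-06", "amount": 100.0},
    {"month": "2024-06", "amount": 50.0},
    {"month": "2024-07", "amount": 80.0},
]
result = run_data_test(
    monthly_revenue, orders,
    {"2024-06": 150.0, "2024-07": 80.0},
)
```

The same check can be repeated for each downstream metric or report that consumes `monthly_revenue`, which is what catches silent breakage after an update.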

Further Considerations for Improvement:

  • CI/CD Integration: Automatically run Data Tests by integrating with your CI/CD workflow via the API or GitHub Actions
  • AI-Assisted Test Generation: Automatically generate test cases for your metrics with AI-powered assistance
  • Dashboard Testing: Generate tests for all widgets, integration tests (filters, cross-filtering), user attributes, and more

We genuinely hope that this experiment can enhance the quality of your data. We would love to hear your feedback and ideas. Your input is invaluable as we refine and expand this initiative.

I love the direction of this! Unfortunately I don’t quite think it would work for us because a lot of our data can be retroactively updated. In other words, we regularly get late-arriving facts, so what we reported as the June numbers on July 4 might differ from the June numbers on July 5. Occasionally retroactive changes can happen months after the fact.

I think what would work better for us would be a unit-testing style (see dbt unit tests for an example), where we could feed in some fabricated data and check the output under various circumstances.
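The unit-testing style described above might look like the sketch below: fabricate a handful of fact rows, including one that arrives late, and check the metric’s output at different snapshot dates. The `june_total` function and its `as_of` parameter are hypothetical, used only to illustrate the late-arriving-facts scenario.

```python
def june_total(facts, as_of):
    """Hypothetical metric: total June amounts visible as of a date.
    ISO-formatted date strings compare correctly lexicographically."""
    return sum(
        f["amount"]
        for f in facts
        if f["month"] == "2024-06" and f["arrived"] <= as_of
    )

# Fabricated input rows; the second fact arrives late.
facts = [
    {"month": "2024-06", "amount": 100.0, "arrived": "2024-07-01"},
    {"month": "2024-06", "amount": 25.0,  "arrived": "2024-07-05"},
]

# Same metric, two snapshots: the June number changes after July 4.
on_jul_4 = june_total(facts, as_of="2024-07-04")  # 100.0
on_jul_5 = june_total(facts, as_of="2024-07-05")  # 125.0
```

Because the inputs are fabricated, these tests stay stable even when the real warehouse data is restated retroactively.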
