Development workflow//testing

Alex_H · January 11, 2022, 6:12pm

It would be really helpful if there were a feature where you could point models to a development schema when in development mode (e.g. pointing to dynamic dbt dev schemas using a users parameter), and being able to test all models for validity.

Khai_To · January 12, 2022, 9:04am

Hi @Alex_H, if I understand you correctly, basically you want to config the environment wherein dev mode, all of your models, and datasets are forced to be pointed to the dev database. Whereas, after deploying the project to the Production, all of your deployed models, and datasets will automatically be pointed to the Production database.
This is something we call environment management and we have thought of supporting it in the future. For now, I don’t have the timeline for it yet but I will +1 priority for this.

For this one, could you share a little bit more details about the use case and workflow that you’re thinking of? Also, how do you currently test your models?

Alex_H · January 12, 2022, 9:30am

That’s good to hear environement management is on the radar.

For this one, could you share a little bit more details about the use case and workflow that you’re thinking of? Also, how do you currently test your models?

Once we have defined our measures and dimensions in Holistics, it is possible that during development, a column is removed or renamed. Once the development table is ready, it would be good to be able to automatically test that all the dimensions and dimensions in a model still work. The way to do this now would be to select all the fields in a model and manually run it.

In addition, it would be really helpful if we could view dashboards using the development data (when the dev switch is turned on in AML).

Khai_To · January 17, 2022, 3:37pm

If that’s the case, we will always test the validity of your modeling setup before deploying it to production. If your models or datasets are invalid, the deployment will be failed and you will need to resolve it before continuing deploying.

For this one, the use case is valid. That’s the reason we’re working on dashboard as-code so that you can create and test your dashboard in the development area before deploying it to production. I will let you know when there is an update on this.

Alex_H · January 17, 2022, 3:40pm

If that’s the case, we will always test the validity of your modeling setup before deploying it to production. If your models or datasets are invalid, the deployment will be failed and you will need to resolve it before continuing deploying.

But in this scenario nothing would have changed on the Holistics layer - only in the underlying db…

Khai_To · January 17, 2022, 4:01pm

Ok, I see. How frequently does your underlying table structure change?

Alex_H · January 18, 2022, 6:29am

Difficult to say in the long run as we’re just setting up so they’re changing fairly frequently. My experience though is that models do continuously evolve, but slowly.

DataGeekDude · June 15, 2022, 3:19pm

I have just had the same request come from one of our analysts.

We use dbt, and have a separate GCP project where all the output of each individuals dbt runs are output too for development e.g. dbt-development, vs our production project production.

If an analyst makes a change to a dbt model locally, e.g. add a new column, they would like to be able to add this new column in AML and check what the data looks like in a simple chart. This way they can validate their change works as expected.

Abdel · June 15, 2022, 11:42pm

It would be also very helpful for us if we can have a staging environment in Holistics.

Khai_To · June 17, 2022, 3:52am

Hi @DataGeekDude and @Abdel,

Not sure if something like Preview Reporting helps?

[Upcoming] Preview Reporting before Deployment - News from Holistics / Product Releases - Holistics Community

Alex_H · June 17, 2022, 6:15am

This definitely moves us forward, thanks Khai.

The outstanding feature to complete this would be to be able to place variable blocks within the model’s query code to automatically switch between different tables between dev and prod - something like

select
*
from
– if development_environment – {{user_variable_dev_schema}}.table1
– if production_environment – production_schema.table1

DataGeekDude · June 17, 2022, 7:42am

Hi Khai,

We already use this to test out changes in the AML layer, but as @Alex_H points out above, it doesn’t allow us to switch the data source underneath.

To provide a more complete example, say we have a table called orders and I need to add a new column product category to this table.

First, I would add this table in dbt to our data warehouse. I do this by making a change locally, running the code and tests against this change. The data with the new column now lives in dbt-development.order_data.stg_orders and I can query it.

However, at the moment in Holistics, we query the production data which lives at production.order_data.stg_orders. If I want to test the impact my change in development has made to reporting, I need to deploy this to production first.

But what if the change is wrong? I would need to go back and create another PR and deploy the change over and over again until I get it right.

However, if Holistics was in development mode which was configured to point to dbt-development rather than production then we could query the new change without deploying it to production.

I would see this happening in the data source configuration within Holistics and the AML code would stay the exact same. This would mean we don’t need to change the code, as per @Alex_H example above.

We do this in dbt ourselves, where if our connection configuration is for development our output goes to dbt-development and if our connection configuration is for production, the output goes to production.

Happy to discuss further and even demo what we do in dbt if that would help.

Alex_H · June 17, 2022, 8:07am

We’re completely on the same page @DataGeekDude.

One point to bear in mind in dbt development though is that each developer could have their own dev schema, so any global dev configuration would also need to apply user variables.

Abdel · June 17, 2022, 11:13am

Hi @Khai_To,

In an ideal situation, you would just do your testing in a sandbox environment, that carries all your production reports.

That way the developer can have the freedom to apply changes and experiment.

The report preview doesn’t sound really as some major benefit to us.

The main reasons:

the preview mode is very manual, so testing needs to be manually. If you have 20 reports depending, you cannot go through them all manually.
switch of datasource to dev/staging is not supported there

Anyway, I understand the feature, and it will add value, but it is definitely not close to having a seperate dev/staging environment.

Khai_To · June 18, 2022, 4:21pm

Hi @Alex_H, @DataGeekDude, @Abdel,
Thank you so much for all the use cases and suggestions. I would love to discuss the complete workflow with you guys. I will try to setup a call with you soon to further understand it.

This could be something that we’re planning to do is dynamic modeling. Dynamic modeling in combination with Preview Reporting would definitely help you guys.

For the comment from @Abdel

Could you specify what do you mean by testing? Do you want to test:

if there are broken reports
or test how will the data of the reports be altered due to modeling changes?

Abdel · June 18, 2022, 4:41pm

Hi Khai,

When I change my datasource OR the AML, I want to be sure nothing breaks in Holistics Dashboards/Reports.

Example:

In my dev/staging datasource (assuming I can switch prod with dev/staging datasource), I remove/rename a few fields, how will I be able to make sure in development mode that nothing downstream breaks?
In AML, I remove a measure or dimension (for cleaning up my code for example), how can I be sure that that specific field isn’t used in any report?

For data testing, I didn’t pull up the topic, but it is worth mentioning, that a test feature would also be helpful to test metrics on a more automated way. Maybe unit testing? But that is a bot off topic

Abdel · June 20, 2022, 4:53am

Hi @Khai_To ,

This feature might also help here

However, I wouldn’t want to fix stuff after deployment, but want to do report validation BEFORE deployment

Khai_To · June 23, 2022, 8:52am

This totally makes sense @Abdel. The Reporting Validation is expected to be done before deployment.
The only issue of fixing the report before deployment is the possible downtime (fixing the error which is not happened yet). What we should have in the long-run is the ability to create Dashboard as-code you so can manage the whole modeling and reporting workflow in development mode and when everything is valid, you can then confidently click Deploy to Production.

Abdel · June 23, 2022, 11:07am

Hi @Khai_To

That’s exactly what I was looking forward to.
But on what term would Dashboard/Report as code be there?

Khai_To · June 28, 2022, 10:14am

This should be part of our long-term plan (not this year I believe). I will check with my team to see if we can set higher priority for it in the last quarter this year, but cannot promise for now.