Hi @Alex_H, @DataGeekDude, @Abdel,
Thank you so much for all the use cases and suggestions. I would love to discuss the complete workflow with you guys. I will try to set up a call with you soon to understand it better.
One thing we're planning that could help here is dynamic modeling. Dynamic modeling in combination with Preview Reporting would definitely help you guys.
When I change my datasource OR the AML, I want to be sure nothing breaks in Holistics Dashboards/Reports.
Example:
If I remove or rename a few fields in my dev/staging datasource (assuming I can switch prod with the dev/staging datasource), how can I make sure in development mode that nothing downstream breaks?
If I remove a measure or dimension in AML (to clean up my code, for example), how can I be sure that that specific field isn't used in any report?
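To make the second example concrete, here is a minimal Python sketch of the kind of check being asked for. It is not an existing Holistics feature or API: the function find_broken_references, the field names, and the report-to-fields mapping are all hypothetical, and it assumes you can somehow list which fields each report references.

```python
from typing import Dict, List, Set


def find_broken_references(
    model_fields: Set[str],
    report_fields: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return, per report, the referenced fields that no longer exist in the model."""
    broken: Dict[str, List[str]] = {}
    for report, fields in report_fields.items():
        missing = sorted(f for f in fields if f not in model_fields)
        if missing:
            broken[report] = missing
    return broken


# Hypothetical fields left in the model after a cleanup removed "orders.discount".
model_fields = {"orders.id", "orders.revenue", "users.country"}

# Hypothetical mapping of report name to the fields it uses.
report_fields = {
    "Revenue by country": ["orders.revenue", "users.country"],
    "Discount overview": ["orders.discount", "orders.id"],
}

broken = find_broken_references(model_fields, report_fields)
print(broken)  # {'Discount overview': ['orders.discount']}
```

An empty result would mean the removal is safe for the reports that were scanned; anything else lists exactly which reports need fixing before deployment.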
For data testing, I didn't bring up the topic, but it is worth mentioning that a test feature would also be helpful to test metrics in a more automated way. Maybe unit testing? But that is a bit off topic.
This totally makes sense @Abdel. The Reporting Validation is expected to be done before deployment.
The only issue with fixing the report before deployment is the possible downtime (fixing an error which has not happened yet). What we should have in the long run is the ability to create Dashboards as code, so you can manage the whole modeling and reporting workflow in development mode and, when everything is valid, confidently click Deploy to Production.
This should be part of our long-term plan (not this year, I believe). I will check with my team to see if we can set a higher priority for it in the last quarter of this year, but I cannot promise anything for now.
Question for @Abdel, @Alex_H and @DataGeekDude, who contributed to this feature request: how do you solve this issue right now? I don't see a viable option, and I don't want to manually change all the schema and datasource names when deploying to production.
Now, to make the data_source_name override work correctly, we will have to avoid using a schema prefix, because it can't be overridden at the dataset layer the way data_source_name can.
I would like to open this subject again: like @Abdel, it would really help us to have a staging env distinct from production.
We use an external git to version our models, but we find that:
the dataset preview feature is not enough to validate a dataset (you need to start building dashboards to do that)
it's very difficult to deal with test vs production databases/schemas within one single Holistics env
If we could have a "Prod" Holistics env connected to the master branch of our git repo and our prod data warehouse, as well as a "Test" Holistics env connected to a staging branch of the same repo and a staging data warehouse, it would very much simplify all these issues.
And I think Holistics already has all the features to support this: you only need external git support.
Hi @dacou, thanks for sharing this. Could you specify what kind of outcome you're seeking from Dataset Validation? Does the Preview Option work in this case?
Hi @Khai_To, what I mean is that in order to fully validate a dataset, we need to start building reports, dashboards and filters to confirm that it is a good fit for the intended purpose.
For now, the model and dataset preview features are useful for the first checks:
"model preview": detect formatting issues, visual check of measures
"dataset preview": check relationships + sandbox for first dataset checks
When these first checks are done, we publish the dataset in order to start building dashboards, reports and filters. At this stage, the dataset is already merged into the master branch, but some major changes can still occur to implement features whose need could not be detected before:
custom dimensions (groupings, encodings, value ranges, …)
custom models (depending on the visuals that we are trying to build)
data formatting
pre-aggregation (depending on query performance feedback)
…
For a new dataset, these first steps happen with models and datasets querying a staging database.
When it's done, we publish the new tables to our production database and switch the datasource of our models and datasets to production.
With this workflow it is not easy to have at the same time multiple versions of the same models and datasets (one production version and one+ version for dev/test).
If we had a "prod" vs "test" Holistics env, we would solve these issues:
maintain both test/prod versions of models and datasets
expose new versions of datasets to some key users for validation/iteration on the "test" env
and only when all of this is done, merge to master and publish to "prod" Holistics
The main issue with this is that dashboards built on the "test" env would have to be rebuilt from scratch when moving to the "prod" env.
One ideal solution would be to have one single Holistics env but have "branch support" included in the URL for both reporting and modeling, so that we can share a dataset or dashboard with our key users for review before merging to master: https://eu.holistics.io/{release-xxx}/dashboards/v3/…
I totally understand that our current version doesn't support this use case well.
The only (upcoming) option that you can use is our Preview Reporting, but we haven't released it yet due to several performance issues. Our team is actively working on improvements so that we can release it as soon as possible.
I think that this use case is totally valid. We're discussing internally how to best solve this, and one of the options we're thinking of is environment variables, where we allow users to customize the behavior of their project depending on where the project is running (either Production, Development, or a specific Branch).
For example, you can define the dataset
Func ds_name () {
  if (env.IS_PRODUCTION == true) {
    env.PROD_DB_NAME
  } else {
    env.DEV_DB_NAME
  }
}
Dataset name {
  label: 'Name'
  description: 'Something here'
  data_source_name: ds_name()
  // Other configurations
}
Do note that it's just a draft solution. The final one could be different.
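For readers who want to see the selection logic of the draft above in a runnable form, here is a rough Python equivalent. It only mirrors the idea: IS_PRODUCTION, PROD_DB_NAME and DEV_DB_NAME are the names used in the draft, not existing Holistics settings, and the values "prod_dw" and "staging_dw" are made up for illustration.

```python
import os
from typing import Mapping


def ds_name(env: Mapping[str, str] = os.environ) -> str:
    """Pick the data source name for the environment the project runs in.

    The variable names mirror the draft; they are assumptions, not a real API.
    """
    # Treat a missing or non-"true" IS_PRODUCTION as a non-production run.
    is_production = env.get("IS_PRODUCTION", "false").lower() == "true"
    key = "PROD_DB_NAME" if is_production else "DEV_DB_NAME"
    return env[key]


# Hypothetical environment for a production deployment.
prod_env = {
    "IS_PRODUCTION": "true",
    "PROD_DB_NAME": "prod_dw",
    "DEV_DB_NAME": "staging_dw",
}

# The dataset definition stays identical across environments;
# only the resolved data source name changes.
dataset = {
    "label": "Name",
    "description": "Something here",
    "data_source_name": ds_name(prod_env),
}
print(dataset["data_source_name"])  # prod_dw
```

The point of the design is that the same committed code can be deployed everywhere, with no manual renaming of data sources at deployment time.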
With this use case, basically, you want to share a version of your Data (dashboard) with another Data Analyst so that he/she can review it before Deploying to Production.
With the release of our upcoming Preview Reporting, you can basically Commit and Push your changes to a particular branch and ask the reviewers to checkout that branch and go to the Dashboard to validate the data.
But of course, with the branch name included in the URL, it would be much easier for you to quickly share the Dashboard with the reviewers.
I will note this down and consider supporting it in the future.
The environment variable feature is a great addition. We use a hacky workaround for now to simulate this (detailed by @Julien_OLLIVIER in this post).
Regarding the "Preview Reporting" feature:
Will we be able to save reports and dashboards across "Preview Reporting" sessions? This would be a must-have for collaboration between data analysts and end users when dataset reviews last multiple days.
Will it be available to users other than data analysts? Most of our "dataset reviewers" are product owners and business users who don't have access to the modeling view right now.
Our upcoming release for Preview Reporting won't allow you to save any changes to the Reports/Dashboards (in the Preview Session).
Also, Preview Reporting is user-based, meaning that other users won't observe the same Reporting Behavior as you do.
It will refer to your current Working Directory (instead of the commit) to reflect changes to the Reports/Dashboards. Thus, if you make changes in the Reporting while in a Preview session (e.g., replace or remove a field in a report), other users (viewing reporting items) will be affected.
However, could you share why you need to save reports and dashboards while in a Preview Session?
The Preview Reporting option is only available to users who have access to the Modeling layer. It's currently not designed for non-DA users to do Acceptance Testing.
However, if your use case is to allow non-DA users to check and validate the Dashboard before Production Deployment, we would need to consider another approach.
How about a serialization approach, where you're able to export all or specific parts of the Holistics contents and load them into another instance (a dev instance)?
That gives you many more possibilities.
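As a rough illustration of what such a serialization round trip could look like, here is a small Python sketch. Everything in it is an assumption made for the example: the content structure, the function names export_contents and load_into_instance, and the data source names are hypothetical and do not describe any existing Holistics export format or API.

```python
import json
from typing import Any, Dict


def export_contents(contents: Dict[str, Any]) -> str:
    """Serialize the selected contents to a JSON payload."""
    return json.dumps(contents, indent=2, sort_keys=True)


def load_into_instance(payload: str, data_source_map: Dict[str, str]) -> Dict[str, Any]:
    """Deserialize a payload, remapping data sources for the target instance."""
    contents = json.loads(payload)
    for dataset in contents.get("datasets", []):
        current = dataset["data_source_name"]
        # Keep the original name when no mapping is defined for it.
        dataset["data_source_name"] = data_source_map.get(current, current)
    return contents


# Hypothetical contents of a production instance.
prod_contents = {
    "datasets": [{"name": "sales", "data_source_name": "prod_dw"}],
    "dashboards": [{"title": "Sales overview", "dataset": "sales"}],
}

payload = export_contents(prod_contents)
dev_contents = load_into_instance(payload, {"prod_dw": "staging_dw"})
print(dev_contents["datasets"][0]["data_source_name"])  # staging_dw
```

Because dashboards travel with the datasets in the payload, this kind of approach would avoid rebuilding them from scratch when moving between a dev instance and production.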