How manage required column filters in Holistics


Some of the tables I’m working in are fairly large and it is reasonable to have required filters for columns such as ‘created_at’ to avoid querying a full table. I’ve come across two problems in Holistics when using required column filters in BigQuery tables:

  • When I use a Holistics Dataset based on a table with req filters the underlying query fails unless a condition on the required filter column is manually set. This is confusing for colleagues without SQL experience. A Dataset-level default filter on columns would fix this (Google Looker’s approach)

  • If the user wants to set specific conditions in the Dataset then Holistics tries to present search suggestions for the end user - this functionality fails since it’s missing required filters in the query that runs in the background. The user must switch to ‘contains’ as a work around.

At the moment all required column filters have been temporarily removed to facilitate for the end user.

What are potential work-arounds if I want to keep the required column filters in BigQuery and work around the issues described above?

Sorry for the late reply @Chris

As far as I understand:

  • You have a dataset that is created from a large BQ table; thus, it normally loads very long when a user explores data from it. Secondly, our filter suggestions also often fails because of the large data table.
  • You believe that ability to add a default filter to this dataset will help solve this case. But since we haven’t supported it yet, you want to know if there is any workaround for those issues.

Please let me know if I misunderstood anything.
If it’s your case, I suggest you to create a transform model from this BQ table and filter out data at modeling layer. Then, create a new dataset from those transform models instead of the original table models. It would help reduce the loading time when a user explores your dataset.

Would you mind checking if it can solve your issue?