Running Total for Count/Count Distinct

Thanks @Rob_B for the suggestion :pray:

Generic use case: Apply count/count distinct on a dimension, then use running total on that count of dimension_name to accumulate it over time.


Please leave a vote :small_red_triangle: for this feature so I can humbly shout at my team that “our customers really need this feature don’t you see???”

Jk, this helps us prioritize our backlog better.

Count distinct with a running total would go beyond what can be achieved within an ANSI SQL window function so this would be :smiling_face_with_three_hearts:

1 Like

Any update on the prioritization of this feature?

Applying a running sum on any aggregated column (and also dividing by the column total sum) would be of great help for my organisation.

Hi @jpean and @Chris,

Thank you for writing to us! We are still in the progress to collect more input for this feature; hence, no concrete update at the moment. Your comments are duly noted and I will be sure to raise this up again with our team.

In the meantime, if you have any additional details to share about your use cases, really looking forward to hearing more.

Daphne

Hi @daph.td ,

The possibility to combine a running sum and divide by a column total sum can be used to create a cumulative percentage. One use case: Get a quick overview of where to spend focus. A rule of thumb in operations could be to prioritise Product Groups that together make up 85% of the revenue.

In Google’s Looker this can be done in their Table calculations, which correspond to Holistics Business calculation. Looker docs for running total

1 Like

Hi @Chris,

Thanks again for sharing the details with us! Good news - I raised this topic with our team and we decided to look into it really soon. Please stay tuned for the new update.

Daphne

1 Like

Daphne has there been any update on this?

Hi @gwizard

I’m Thomas from the Holistics team. Thanks for following up on this thread. Can I clarify if you want to:

  1. Use running_total with _count_distinct?
  2. Use running_total with regular count?

Your input would be really helpful for us!

It would b great to have both, but the important one is the count_distinct

Hi Garett,

Thank you for clarifying the needs. Unfortunately we haven’t supported yet count_distinct yet since it requires the database to support hyper-log-log (so that we can approximate count_distinct to have acceptable performance), and not all databases support this function. We acknowledge this is a limitation and we’ll keep you up to date when there’s any new information.

Hi Garret and all,

Note that if you are using Holistics 4.0, running total of aggregation functions (except count distinct) are already available in the UI (Running Total | Holistics Docs (4.0)) as well as an AQL function (running_total | Holistics Docs (4.0)). The only reason why running total of count distinct is not yet supported is that it is not well supported by most SQL databases. It can be done using a self-join, but the performance is not acceptable due to the expensive (in terms of performance) nature of count distinct. To support this in an acceptable manner with respect to performance, the database must support hyper log log, as mentioned above, and many databases still don’t have this support built-in yet.