Issue with ratio between two measures

Hi everyone,

I have two tables in my dataset :

  • Customers contains my customers and their attributes (customer_id, customer type, country, …)
  • Product Sales contains one row for each sale recorded (customer_id, product category, month)
  • Both tables are joined by the customer_id field.

I have created a simple measure in order to count customers in the Product Sales table :

  • Nb customers = count(distinct customer_id)

Now I want to calculate the ratio of customers of each product category compared to the total number of customers in my base and I can’t find a way to do it.

Considering a total number of customers of 300, the end result should look like this :

Number of customers Total number of customers Ratio
Product category A 150 300 50%
Product category B 100 300 33%
Product category C 200 300 66%

And obviously, if I add a filter to select only a subset of customers (customer_country = ‘UK’ for example). I would like this table to update accordingly :sweat_smile:

Is there an Holistics champion somewhere who knows how to do this ?

1 Like

I would recommend that you create the two measures in AML:

  1. Number of Customers → count(distinct customer_id)
  2. Total number of Customers → This is a little more ticky and you have to use windowing functions in your AML to get the correct result and depending on your database you won’t be able to do a distinct count in a window function.

If you can work out 2 then you can use (1) / (2) in a business calculation in your widget.

I have done this, using a sum, in a dashboard but you have to partition your windowing function exactly for the visualisation that is being created, otherwise you get an error.

It would be great if Holistics supported this natively.

1 Like

Thanks @dacou for sharing the use case and @DataGeekDude for sharing your workaround.

If I were to break Damien’s request down, there are two main things:

  • Calculate the percentage of a subset of customers vs total
  • Apply the dashboard filter to only selected measure (in this case: num of customers), and the business calculation will change accordingly

Our team has been researching and developing a native solution for these cases, and you can expect them in early 22Q3.


Just curious: When you mentioned you wanted to filter, say, by country = ‘UK’, do you also want to filter the Total number of customers?

Also, do you have any other use cases related to subgroup calculation (e.g. other visualization types, other ways of calculations…)?

Thanks Tuan for the info and great to hear this is something being looked into.

For my usage, what I found particularly difficult was that the context of the total is important i.e. do you also want to filter the Total number of customers?. For me I did need to filter that metric by the time slicer and have this change if someone changed it but otherwise this did not need filtered. So for instance I might want to look at this month to month, but then change it quarter to quarter. My workaround you would need to change (or have two) the metric from Month to quarter partitioning.

I think being able to plot this on a bar chart would be useful, or incorporate some form of bar into the table visualisation to be able to see the percentage difference in slightly more “detail”.

Thanks both of you for your answers,

Regarding your question @tuan.nguyen : yes, I would like both measures (the number of customers that bought each product category and the total number of customers) to be filtered with country = UK.

I tried a different design approach taken from my previous Qlik Sense years. I built a dimensions table containing all possible combinations of the dimensions that apply to all my measures and generated a key for each combination with an hash function and then I use this key to join all my fact tables between them. My model looks like this in the end :

  • Dimensions (combination_id, customer type, country)
  • Customers (customer_id, combination_id)
    • Measure : Nb customers = count(distinct customer_id)
  • Product Sales (combination_id, customer_id, product_category, month)
    • Measure : Nb buying customers = count(distinct customer_id)

All tables are joined together with the combination_id field.
In my chart I can then directly calculate my ratio “Nb buying customers” / “Nb customers” with a business calculation.

And it works…almost ! :sweat_smile:

If I don’t put any filter on my report it shows my ratios correctly. But if I filter to only show the sales of january and if nothing was bought by a UK customer in January, the “Total number of customers” measure will not include the UK customers at all.

I hope my explanations were clear ! After giving it more thoughts, I think that the issue lies on how Holistics builds the SQL query :

Right now it tries to build one single query including all measures and dimensions asked by a report. And in order for this to work, I think it should build one subquery for each measure to be shown in the graph and then merge those subqueries on all common dimensions.

What do you think?
Damien

This example of subgroup calculation is maybe particularly complex because the sum of the subgroups (the number of customers who bought each product) is not equal to the total size of the group as some customers may have bought multiple products and be counted multiple times.

I’m also looking for solution to the simpler case where we want to show the number of customers of each country compared to the total number of customers :

Country Nb customers %
UK 100 ?
France 200 ?
Total 300 100%

Just my two cents on the subject of ‘category / total’ - This documentation is way above my current level of understanding and the examples are sufficiently complex that I can’t even begin to copy / paste and adapt them:

A display option in a pivot table, of ‘display as % of total’ would be a great option - like GDocs offers. At the moment, my only workable alternative is to export raw values as a csv and have GDocs do the pivot table.

I use Holistics because my technical skills are limited so there really needs to be a simple solution I think?

Hi @dacou, @DataGeekDude, and @Martin_Dell,

We are working on a new Expression to support this use case so that you don’t have to write complex SQL Models for it.

For more information, you can refer to the thread below