Compare Partitions

gwizard · August 14, 2024, 10:11pm

The period comparison tool is very helpful, but it would be nice to be able to compare partitions with more control. For example, my team runs multiple tests on a single date that get partitioned in our database. We need to compare tests both on the same date and on other dates. It would be great to have a filter where we could choose the partitions that are being compared.

Phuong_N_Holistics · August 15, 2024, 4:10am

Hi @gwizard

Thank you so much for your feedback!

The period comparison is one of the most appreciated features in Holistics. We understand that it can be made more flexible so that our users can perform more analyses like yours. We have added your request to our backlog and will share updates with you when we have a more concrete plan for improving this feature.

In the meantime, can you please share with us the following:

Please elaborate on the use case you mentioned here:

“We need to compare tests both on the same date and on other dates. It would be great to be able to have a filter where we could choose the partitions that are being compared.”

Any screenshots or illustrations of what you want the outcome to look like would be great!

What is your current workaround to achieve this task?
Besides this use case, are there any other scenarios that require a more flexible period comparison?

gwizard · August 19, 2024, 5:29pm

I can’t share screenshots currently.

“What is your current workaround to achieve this task?”
I’ve created a new dimension that adds +1 day or +2 days to the original partition date, depending on the tag associated with it, and have the dashboard set up to compare results with the previous partition in this dimension. It’s workable but not ideal: we can’t choose which two runs the results come from, and we are always limited to the run immediately before.
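A minimal sketch of this workaround in plain Python, not Holistics modeling code. The tag names, the offset of 0 for the base tag, and the metric values are all assumptions made for illustration; the post only states that +1 or +2 days are added depending on the tag.

```python
from datetime import date, timedelta

# Hypothetical tag-to-offset mapping (the thread only mentions "+1 day or +2 days").
TAG_OFFSET_DAYS = {"test rank": 0, "re-rank": 1, "back-rank": 2}

def synthetic_date(partition_date: date, tag: str) -> date:
    """Shift the partition date by a per-tag offset so each run gets its own date."""
    return partition_date + timedelta(days=TAG_OFFSET_DAYS[tag])

def pct_change(current: float, previous: float) -> float:
    """Percentage change of current relative to previous."""
    return (current - previous) * 100 / previous

# Hypothetical rows: (partition date, tag, metric value).
rows = [
    (date(2024, 8, 22), "test rank", 80.0),
    (date(2024, 8, 22), "re-rank", 100.0),
    (date(2024, 8, 22), "back-rank", 90.0),
]

# Order runs by synthetic date, then compare each run with the one before it,
# which is all that a "previous period" comparison allows.
ordered = sorted((synthetic_date(d, tag), tag, value) for d, tag, value in rows)
changes = [
    (syn_date, tag, pct_change(value, prev_value))
    for (_, _, prev_value), (syn_date, tag, value) in zip(ordered, ordered[1:])
]

for syn_date, tag, change in changes:
    print(f"{syn_date} {tag}: {change:+.1f}% vs previous run")
```

The sketch also shows the limitation: every run can only be compared with its immediate predecessor, and shifted dates can collide with real partition dates.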

This would also be useful in any scenario that involves running tests with different criteria where you need to compare the results. Datasets often need to be partitioned by more than just their date.

Phuong_N_Holistics · August 20, 2024, 11:31am

Hi @gwizard

What do you mean by “partition”? Can you please elaborate more and give some specific examples?

From what you’ve described so far, I’ve listed a few cases. Do any of the cases below match what you have in mind?

Case #1:

Let’s say your test data looks like this: Datetime | Test id | Test type | Test result
Example: 2024-08-20 10:00:00 | 1 | A | 100
You run tests multiple times a day. You partition them hourly (e.g., 12 AM-1 AM, 1 AM-2 AM, 2 AM-3 AM, and so on). You want to compare the test results to the previous hour.

Case #2:

Using the data above, you want to define the range of the period in our app. For example, a range of every 4 hours (12 AM-4 AM, 4 AM-8 AM, 8 AM-12 PM, and so on).
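As a rough illustration of such a custom range, the sketch below floors a timestamp to the start of its 4-hour bucket. The function name `bucket_start` is hypothetical and this is plain Python, not a Holistics feature.

```python
from datetime import datetime

def bucket_start(ts: datetime, hours: int = 4) -> datetime:
    """Floor a timestamp to the start of its N-hour bucket (N should divide 24)."""
    floored_hour = ts.hour - ts.hour % hours
    return ts.replace(hour=floored_hour, minute=0, second=0, microsecond=0)

# The example row from Case #1 falls into the 8 AM-12 PM range.
example = datetime(2024, 8, 20, 10, 0, 0)
print(bucket_start(example))
```

With `hours=1` the same function reproduces the hourly partitions of Case #1.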

Case #3:

You want to compare a value not only to its previous period but also to its future period. For example, compare test results to the previous 1 day and the next 1 day.

Case #4:

Can you describe an example for this part of your reply: “a scenario that involves running tests of different criteria where you need to compare the results. Datasets often need to be partitioned by more than just their date.”?

Thank you,
Phuong.

gwizard · August 26, 2024, 2:25pm

Say you’re building a software system that responds to customer queries. There are multiple design factors that go into it, and your team is building it out, changing different factors as time goes on, and running tests to see how the system responds. To understand how different changes within the system affect different aspects of the tests, we need to be able to compare each partition not only with the last one but also with others, as we may be trying many different changes within the system.

So we’d need to compare “08/24 - test rank” against “08/22 - test rank”, but also against “08/24 re-rank”, “08/22 re-rank”, and “08/22 back-rank”.
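To make the request concrete, here is a small sketch that compares one partition against several others rather than only the previous one. The metric values are made up, and a partition is assumed to be a (date, tag) pair.

```python
def pct_change(current: float, baseline: float) -> float:
    """Percentage change of current relative to baseline."""
    return (current - baseline) * 100 / baseline

# Hypothetical metric values keyed by (date, tag) partition.
results = {
    ("08/24", "test rank"): 120.0,
    ("08/22", "test rank"): 100.0,
    ("08/24", "re-rank"): 150.0,
    ("08/22", "re-rank"): 80.0,
    ("08/22", "back-rank"): 96.0,
}

target = ("08/24", "test rank")

# Compare the target partition against every other partition.
comparisons = {
    other: pct_change(results[target], value)
    for other, value in results.items()
    if other != target
}

for (day, tag), change in comparisons.items():
    print(f"{target[0]} {target[1]} vs {day} {tag}: {change:+.1f}%")
```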

Phuong_N_Holistics · August 28, 2024, 3:09am

Hi @gwizard

Thanks so much for providing us a specific example.

With the case you mentioned above, I imagine you can actually create a pivot table to display the test data, with rank_type on the column and date on the row, something like this:

|       | test-rank | re-rank | back-rank |
|-------|-----------|---------|-----------|
| 08/22 |           |         |           |
| 08/24 |           |         |           |
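A minimal sketch of that pivot layout in plain Python, with dates on the rows and rank_type on the columns. The values are invented, and combinations with no test run are left as `None`.

```python
# Hypothetical long-format rows: (date, rank_type, test result).
rows = [
    ("08/22", "test-rank", 100.0),
    ("08/22", "re-rank", 80.0),
    ("08/22", "back-rank", 96.0),
    ("08/24", "test-rank", 120.0),
    ("08/24", "re-rank", 150.0),
]

rank_types = ["test-rank", "re-rank", "back-rank"]
dates = sorted({d for d, _, _ in rows})

# Start with an empty grid, then fill in the cells that have data.
pivot = {d: {rt: None for rt in rank_types} for d in dates}
for d, rank_type, value in rows:
    pivot[d][rank_type] = value

for d in dates:
    print(d, [pivot[d][rt] for rt in rank_types])
```

Cells stay empty whenever a rank_type was not run on a given date, so the grid becomes sparse if the set of classifiers changes between dates.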

What’s your reason for using PoP (period-over-period comparison) in this case? What result do you want to look at in particular (e.g. show % change)?

Thank you,
Phuong.

gwizard · August 28, 2024, 3:09pm

Yes, we’re looking for % change across different metrics.

And we’re using partitions because the rank_type field I gave you is only an example. There are multiple other types of changes and classifications, which makes a pivot table a poor choice: the set of values is not consistent from date to date. A new rank_type (or another classifier I left out to keep the example simple) can appear on any given date, while other types can be deprecated.

Phuong_N_Holistics · August 30, 2024, 3:17am

Hi @gwizard

With the information you gave me, my understanding of your use case is that:

You run different tests on different dates, with different classifiers.
Your goal is to compare the results of those tests and see the % change among them.
For example, you have different metrics such as:
- Metric 1: test result where date is 08/24 and rank_type is “test rank”
- Metric 2: test result where date is 08/22 and rank_type is “test rank”
- Metric 3: test result where date is 08/24 and rank_type is “re-rank”
- and so on…

Besides rank_type, in some cases you also want to apply other classifiers as well.

If I generalize your use case, it would be:

You want to be able to apply different conditions on each individual field, or “partition” in your terms.
You want to be able to quickly show % change over those metrics.
It’ll look something like this:

[Image: Garrett - compare partitions]

Please help confirm or correct my understanding. We’re trying to understand the root of your problems so that we can build features that are most valuable to you.

If you can give a quick example in Excel or Google Sheets to demonstrate what you want to achieve (like my screenshot above), we’d really appreciate it!

Thank you,
Phuong.

gwizard · August 30, 2024, 8:31pm

I’m simply asking for the same tools that you have for comparing against different dates, but selectable by partition.

We have multiple metrics measured at different levels of detail (LOD) that we’re cutting up and comparing in a few different ways. Right now each partition can only be compared against the previous one in the dataset, and the user chooses their partition with two filters (Date + Tag).

Ideally we’d be able to have another set of filters here where the user can select the partition they’d like to compare against. This wouldn’t just be helpful for this scenario, but for any scenario in which multivariate tests are being performed and you would like to compare their results.
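The behavior I have in mind could be sketched roughly as follows, with one (Date, Tag) pair for the selected partition and a second pair for the partition to compare against. The function name, metric names, and values are all hypothetical; this is plain Python, not an existing Holistics API.

```python
from typing import Dict, Tuple

Partition = Tuple[str, str]  # (date, tag), mirroring the Date + Tag filters

def compare_partitions(
    data: Dict[Partition, Dict[str, float]],
    selected: Partition,
    compare_to: Partition,
) -> Dict[str, float]:
    """Return the % change per metric of `selected` relative to `compare_to`."""
    current, baseline = data[selected], data[compare_to]
    return {
        metric: (current[metric] - baseline[metric]) * 100 / baseline[metric]
        for metric in current
        if metric in baseline and baseline[metric] != 0
    }

# Hypothetical metrics per partition.
data = {
    ("08/24", "re-rank"): {"accuracy": 90.0, "latency_ms": 180.0},
    ("08/22", "back-rank"): {"accuracy": 75.0, "latency_ms": 200.0},
}

# First filter set picks the partition, second filter set picks the baseline.
change = compare_partitions(
    data, selected=("08/24", "re-rank"), compare_to=("08/22", "back-rank")
)
print(change)
```

Metrics missing from either partition, or with a zero baseline, are skipped in this sketch rather than reported.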

Phuong_N_Holistics · September 4, 2024, 1:12am

Thank you @gwizard for the explanations.

I’ve added your feature request to our backlog. We’ll keep you updated when we have a solution for it. Thank you for your patience!

Best regards,
Phuong.