Role Playing Models / Model Aliases

Hi All,

Star schema modelling is a common form of modelling used in BI reporting layers.

For instance see the following for definitions and usage:

When creating a star schema you can link the same dimension table multiple times to a fact table. This happens quite frequently, and since within Holistics each model is treated as the same table you can’t link the same dimension table multiple times to a fact table, or else you end up with a circular dependency.

At the moment you have to create a whole new table, defining the same columns and custom dimensions time and time again in order to do this. This is obviously time consuming but also extremely hard to maintain as you have to make the same change in multiple places when an update is required to the model.

A better way to do this is to have a mechanism in place to reference another model (like a database view) that we can rename and use as such. Then we can reference multiple times, but only ever have to update the base model to get any changes pushed through.

For instance this could take place in the dataset builder when you pull in a model you can give it a an alternative name, allowing you to import the same model multiple times but join as a different table. Then in the SQL that is generated you can use the alternative name as an AS rename i.e.

SELECT *
FROM
  my_fact_table
  INNER JOIN my_dim_table AS from_table ON my_fact_table.from_col = from_table.id
  INNER JOIN my_dim_table AS to_table ON my_fact_table.to_col = to_table.id

And within the explore/report creation page you can have each table listed with their alternative names as different tables you can bring in columns from.

I believe Looker achieves the same in an explore:

Again the benefits to the approach are:

  1. Single definition of a model
  2. Star schema designs are much easier
  3. Renaming a model for context, rather than using as a general table i.e. origin and destination geo models can reference the same underlying geo model

Hi Holistics team,

In our workspace we have some key shared models that we reuse in many datasets like a dim_date model for instance.

When integrating this model in a dataset, it would be nice if we could just update the model label in order to provide more context and clarity to the end user.
=> Instead of simply Date we could call it invoice date or snapshot date depending on the context.

In some dataset, we also sometimes need to add the same model multiple times. This feature would be useful for this purpose as well.

Best regards,
Damien

I guess I should have done a little bit more digging before posting this because @DataGeekDude asked the same thing and explained it a lot better than me here :sweat_smile:.

If I understood correctly, AML 2.0 will solve the first usecase (changing the label fo a model in a specific dataset) but not the second one (using multiple occurrences of the same model in a dataset).

Yes unfortunately, as I understand it, that there is no solution to the model aliasing as of yet.

Hi @dacou
As far as I understand, you want to be:

  1. Able to change the model labels in a specific dataset:
    For example, the model ecommerce_users can be called Buyer Information in dataset A, but can also be Customer Details in dataset B.

  2. Able to add the same model multiple times to a dataset

Please let me know if I misunderstood anything.
If it’s your case, unfortunately, we haven’t supported them at the moment, and even AML 2.0 won’t help solve these issues.
Instead, I believe that the role playing model can be the solution for both of them. I’ll merge your case to Craig’s post, so that everyone can upvote this request :blush:

That’s exactly my need @di.hoang :ok_hand:
What are your plans on this ? Can you share any news ?

We do have a plan to support it but for now I cannot share with you the exact timeline.
I will definitely keep you in the loop for any update.

1 Like

Added my vote to this! We have a similar problem with tables that are used in different parts of the dataset joins.

Hi @olammas, could you share more details about your current dataset/model setup and your expected setup so that we can have a sense on how our upcoming solutions could benefit you?

Hi Khai. We have a dimension table that is connected to two different event types. To ensure that the join isn’t ambiguous I have created a duplicate of the dimension for each event type and therefore join path.

Let me know if I can provide more details.

Hi @olammas,

In this case, you can leverage our with_relationships() function to define the metric that requires the disabled relationship path.

For more information, please refer to our documentation here with_relationships | Holistics Docs.

Basically, you can disable the relationship from the Dim model to 1 of your Event Table (let’s call it A).
And when you need to build metric that require the join from Dim Model to the Event Table A, you can use with_relationships() function to force enable that relationship within the context of your Metric.

sum(event_a.revenue) | 
      with_relatonships(event_a.join_key > dim_model.join_key)

If the with_relationships() function doesn’t make sense to your case, please let me know.

2 Likes

Interesting. I can explore this once we move that dataset to AQL

1 Like