Create datasets easily across multiple databases in one data warehouse without transform models

Hi team,

We have one :snowflake: Snowflake warehouse with many databases - with one database for each source of data we import into Snowflake (eg. one Salesforce, one for Google Analytics, one for Twitter, etc etc). Each database is further split by schemas and I do not want to re-architect my Snowflake DWH.

:white_check_mark: I can make multiple connections for each Database in the DWH - eg. one connection for Salesforce data, and one connection for the Google Analytics data.

:white_check_mark: I can make datasets from models in each Database separately - eg. one dataset from Salesforce models, and one dataset from Google Analytics models.

I want to create a dataset with models joined across different databases in Snowflake - eg. combine data models from both Salesforce and Google Analytics data into one dataset.

:white_check_mark: All the back end connections are already set up,
:x: but the front end doesn’t allow me to see/select data into a dataset from multiple databases in the same DWH.

Can we make functionality available in the front end so that we can make datasets across any number of databases in the same DWH using the data modelling layer? This would save significant time and hassle having to create transform models for hundreds of tables across many databases in the same DWH.

Cheers :grinning:

Thank you for raising this Guido. Yes as mentioned, this is an oversight from our side when implementing Snowflake data source support. I’ll raise this up to the team and see how we can get this sorted asap.

Sharing some workarounds here so that others can reference:

A- Create a view in the main database that references the table in the second database.

use db1;
create view public.users2_other_db as
select * from db2.public.users2;

B- (4.0 only) Use fully qualified table name in the AML definition

See line 4 below.

Model users2 {
  type: 'table'
  data_source_name: 'snowflake'
  table_name: 'db2.public.users2'
  label: 'Users'

  dimension id {
    type: 'number'
    definition: @sql {{ #SOURCE.ID }};;
  }
}

It’s similar issue in SQL Server as well - when browing the database schemas, it only shows you the database you specified in the data source. Using fully qualified names we can use any database, but would be better to be able to browse all databases the account has permission to.
This is especially problematic when using ‘Create Model from Data Table’, as you can’t select the tables from another db.
Doing it via the code view is fine for 1 or 2 tables, but when you need to add multiple tables, it does slow you down a bit.

1 Like

Yes @david-ri - we have hundreds of tables so manual work arounds don’t really work for us here.

Has this made its way onto the roadmap yet? It would be hugely useful to be able to browse all objects the account has access to, not just the objects in the ‘default’ database

Yep, +1 for me. I want to create a single model using a Snowflake data source and a PostgreSQL data source. I’d struggle with any solution that isn’t point and click

1 Like

Hi all,
I guess that we are talking about different product functionalities:

1. @Guido_Stark mentions the ability to create a dataset from different Snowflake databases (or catalogs)

2. @david-ri has similar request to Guido’s, but for SQL Server

  • We have not planned to support this ability for SQL Server yet.
  • However, our team will check this case after implementing the cross-db joint for Snowflake. I’ll let you know if there is any further update on it.

3. Different from Guido and David, @Martin_Dell wants a different function: Ability to cross-join different data warehouses
Our product is designed to centralize all data sources into one place (we called it data warehouse). So we don’t support building a single model from different data warehouses.

Instead, we recommend you create an import model from Postgresql in Snowflake data warehouse (or vice versa). Then, create a transform model to join these data models.

May I check if I understood your cases correctly?

I’m not sure if this is the same of different issue. We use Postgress. We have namespaces per applications in a master warehouse db, each with the same structure. In holistices, each name space is a datasource with distinct postgress perms. For example:

  • Datawarehouse
    • Namespace: App_A (Postgres user → holistics data source)
      • Table: user_table (Holistics SQL data model)
    • Namespace: App_B (Postgres user → holistics data source)
      • Table: user_table (Holistics SQL data model)

The schema of the user_table is the same. I’m hoping to make

  1. a dashboard for admins that combines users in all apps
  2. a user for App A and App B that has data source permissions such that when they open the dashboard they only see their users

Should I use a transform model or a data set to combine these models (Using Modeling 4.0)? When I tried to make a SQL model, i’m getting permission errors.

Thanks.
.stone

1 Like