Null safe "is not" filter condition

Problem
Most operators in Snowflake are not null safe, which causes confusion among non-technical data users when using the is not operator in “Conditions”.

Some users who are not familiar with the data may even be unaware that they are not retrieving their intended output.

Example
A bank_name column contains the following values:

  • NULL

  • bank_a

  • bank_b

  • bank_c

In most cases, users want to retrieve all rows that are not equal to bank_a or bank_b (including NULL). In order to achieve this, the WHERE condition has to be:

where coalesce(bank_name, 'NA') not in ('bank_a','bank_b')

instead of

where bank_name not in ('bank_a','bank_b')

This is because Snowflake will treat NULLs as unknown values in the 2nd case, and exclude them from the comparison (and therefore the query output).

Current Workaround
We define a custom dimension from bank_name which assigns a 'NA' to NULL values, and hide the bank_name column.

However, this is not ideal because:

  1. Reports using the bank_name dimension previously have to be updated to use new custom dimension

  2. The same behaviour will be observed for other columns, and it is not scalable to recreate custom dimensions for all.

  3. We want to keep NULL values as is for some columns

Hi @jingyu,
I do agree that the Null behavior in Holistics could be handled much better, thank you for sharing a very detailed use case and example.
I will add this to our backlog for future improvement :grin:

1 Like