Most operators in Snowflake are not null safe, which causes confusion among non-technical data users when using the is not operator in “Conditions”.
Some users who are not familiar with the data may even be unaware that they are not retrieving their intended output.
bank_name column contains the following values:
In most cases, users want to retrieve all rows that are not equal to
bank_b (including NULL). In order to achieve this, the WHERE condition has to be:
where coalesce(bank_name, 'NA') not in ('bank_a','bank_b')
where bank_name not in ('bank_a','bank_b')
This is because Snowflake will treat NULLs as unknown values in the 2nd case, and exclude them from the comparison (and therefore the query output).
We define a custom dimension from
bank_name which assigns a
'NA' to NULL values, and hide the
However, this is not ideal because:
Reports using the
bank_namedimension previously have to be updated to use new custom dimension
The same behaviour will be observed for other columns, and it is not scalable to recreate custom dimensions for all.
We want to keep NULL values as is for some columns