In BigQuery, the primary out of the box function to calculate median (PERCENTILE_CONT
) can only be used as part of a window function. This seems to work when combined with AQL functions that are aggregating or when combined SQL functions using pre-defined aggregation types (Holistics is able to turn those into window functions as well for the most part), but not when combined with SQL functions using a custom aggregation type (can’t do a window function AND aggregation in the same SELECT statement).
I’ve got a sample model below that should work for testing for anyone regardless of underlying data since it generates its own fake data (you might need to se the data_source_name). When selecting:
- Median Quantity SQL and Sum Total Price SQL - Custom Aggregation, this fails
- Median Quantity AQL and Sum Total Price SQL - Custom Aggregation, this fails
- Pretty much any other combination of measures works
Error statement: SELECT list expression references test.quantity which is neither grouped nor aggregated
I figure this is probably a limitation that’s more due to BigQuery than Holistics, so somewhat out of their hands (BigQuery would need a median function that aggregates without needing a window function). That being said, I wanted to share in case other people are running into a similar issue, or in case Holistics can address this bug somehow. We’ve got a workaround to instead use a SQL field with custom aggregation and define it as APPROX_QUANTILES(field, 100)[OFFSET(50)]
. We’re also planning to prevent any fields using the aggregation_type: median
in our planned SQL validation as part of CI/CD, though we won’t be validating AQL and can’t validate ad hoc AQL fields so it’s still possible this bug could sneak in if someone uses the median()
function in AQL.
Model test {
label: 'Test'
description: ''
data_source_name: ''
type: 'query'
query: @sql
SELECT
1 as ID,
'abc-123' as val,
5 as unit_price,
100 as quantity
UNION ALL
SELECT
2 as ID,
'abc-456' as val,
8 as price,
50 as quantity
UNION ALL
SELECT
3 as ID,
'xyz-123' as val,
10 as price,
20 as quantity
;;
dimension id {
label: 'Id'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.ID }};;
}
dimension val {
label: 'Value'
type: 'text'
hidden: false
definition: @sql {{ #SOURCE.val }};;
}
dimension unit_price {
label: 'Unit Price'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.unit_price }} ;;
}
dimension quantity {
label: 'Quantity'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.quantity }} ;;
}
dimension total_price_sql {
label: 'Total Price SQL'
type: 'number'
hidden: false
definition: @sql {{ unit_price }} * {{ quantity }} ;;
}
measure sum_total_price_sql_custom_agg {
label: 'Sum Total Price SQL - Custom Aggregation'
type: 'number'
hidden: false
definition: @sql SUM({{total_price_sql}}) ;;
}
measure sum_total_price_sql_sum_agg {
label: 'Sum Total Price SQL - Sum Aggregation'
type: 'number'
hidden: false
definition: @sql {{total_price_sql}} ;;
aggregation_type: 'sum'
}
measure sum_total_price_aql {
label: 'Sum Total Price AQL'
type: 'number'
hidden: false
definition: @aql sum(test.total_price_sql) ;;
}
measure median_quantity_sql {
label: 'Median Quantity SQL'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.quantity }} ;;
aggregation_type: 'median'
}
measure median_quantity_aql {
label: 'Median Quantity AQL'
type: 'number'
hidden: false
definition: @aql median(test.quantity) ;;
}
}