I’m struggling with a “how do I think about this?” style question.
We have a simple model of transactions, with columns event, completed (datetime), user, hours (int) where seconds are the time spent on the transaction.
I want to calculate the median(hours) per user per day for each event.
These are volunteers. We want to answer a question like “How many hours per day does a typical volunteer work?” – I expect a value in the 3-5h range.
It’s easy to show a team roster of hours worked per day, if user and date are on the grid, but I want to just get one number for a day or an event.
the raw query looks something like this to get this:
WITH daily_hours AS (
SELECT
examiner_name,
exam_event_id,
DATE(completed) AS exam_date,
SUM(stack_seconds) / 3600.0 AS total_stack_hours
FROM cfa.warehouse_etl_stack_assignment_examiner_timing
GROUP BY examiner_name, exam_event_id, exam_date
)
SELECT
exam_event_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_stack_hours) AS median_stack_hours_per_event
FROM daily_hours
GROUP BY exam_event_id;
How do I do this with AQL? in model, or dataset, or both?