Holistics Query Syntax still selecting wide table

I’m following the guidance from this:

I created a simple query model with this in the transform aml (type: ‘query’):

 query: @sql
    select
       {{ #s.exam_event_id }}
      ,  {{ #s.team_name }}
      ,  {{ #s.examiner_name }}
      ,  {{ #s.stack_assignment_id }}
      ,  {{ #s.calibration }}
      ,  {{ #s.assigned }}
      ,  {{ #s.completed }}
      ,  {{ #s.last_completed }}
      ,  {{ #s.stack_seconds }}
      ,  {{ #s.stack_interval_seconds }}
      ,  {{ #s.updated }}
      ,  {{ #s.completed_hour }}
    from
      {{ #warehouse_etl_stack_assignment_examiner_timing as s }}
    WHERE
      -- exclude stacks that were left open more than 30min
      ({{ #s.completed }} - {{ #s.assigned }} < interval '30 minutes') ;;
 
 models: [
    warehouse_etl_stack_assignment_examiner_timing
  ] 

where warehouse_etl_stack_assignment_examiner_timing is a table model.

When I run a query on this dataset, the generated sql includes the entire column list. Is that correct?

here’s the generated sql is below (I expect the CTE to only include the columns in the setup):

WITH "stack_assignment_timing" AS (
  select
     "s"."exam_event_id"
    ,  "s"."team_name"
    ,  "s"."examiner_name"
    ,  "s"."stack_assignment_id"
    ,  "s"."calibration"
    ,  "s"."assigned"
    ,  "s"."completed"
    ,  "s"."last_completed"
    ,  "s"."stack_seconds"
    ,  "s"."stack_interval_seconds"
    ,  "s"."updated"
    ,  "s"."completed_hour"
  from
    (
      SELECT
        "warehouse_etl_stack_assignment_examiner_timing"."exam_event_id" AS "exam_event_id",
        "warehouse_etl_stack_assignment_examiner_timing"."team_name" AS "team_name",
        "warehouse_etl_stack_assignment_examiner_timing"."examiner_name" AS "examiner_name",
        "warehouse_etl_stack_assignment_examiner_timing"."stack_assignment_id" AS "stack_assignment_id",
        "warehouse_etl_stack_assignment_examiner_timing"."calibration" AS "calibration",
        "warehouse_etl_stack_assignment_examiner_timing"."assigned" AS "assigned",
        "warehouse_etl_stack_assignment_examiner_timing"."completed" AS "completed",
        "warehouse_etl_stack_assignment_examiner_timing"."last_completed" AS "last_completed",
        "warehouse_etl_stack_assignment_examiner_timing"."stack_seconds" AS "stack_seconds",
        "warehouse_etl_stack_assignment_examiner_timing"."stack_interval_seconds" AS "stack_interval_seconds",
        "warehouse_etl_stack_assignment_examiner_timing"."updated" AS "updated",
        date_part('hour', "warehouse_etl_stack_assignment_examiner_timing"."completed") AS "completed_hour"
      FROM
        "cfa"."warehouse_etl_stack_assignment_examiner_timing" "warehouse_etl_stack_assignment_examiner_timing"
    ) AS "s"
  WHERE
    -- exclude stacks that were left open more than 30min
    ("s"."completed" - "s"."assigned" < interval '30 minutes')
)
SELECT
  TO_CHAR((CAST ( "stack_assignment_timing"."completed" AS timestamptz )) AT TIME ZONE 'America/New_York', 'YYYY-MM-DD HH24:MI:SS.US') AS "sat_c_96517b",
  "stack_assignment_timing"."team_name" AS "sat_tn_8bb5e9",
  COUNT("stack_assignment_timing"."stack_assignment_id") AS "c_sat_s_49aabc"
FROM
  "stack_assignment_timing"
GROUP BY
  1,
  2
ORDER BY
  3 DESC
LIMIT 10000

Hi @Matt_Stone,

This is the expected behavior, which is documented here. The CTE selects fields defined in your query model, and the main query selects the 3 fields that you have chosen for your visualization.

If this is not the expected result, you can try looking into the persistence settings as mentioned here.

Thanks. Glad it’s working as expected.

Hi all,

Thanks, Matt, for posting the question.

I’d like to add a brief note/comment here: while the current behavior is expected, we at Holistics are keeping an eye out for optimizations to automatically push down projections/predicates further into your custom SQLs to optimize the whole queries further.