Skip to content

Support aggregates and constant filters in QUALIFY #17210

@alamb

Description

@alamb

@haohuaijin added support for QUALIFY in

@Vedin has pointed out some follow on work here

Hi @haohuaijin, I accidentally worked on the same feature. The approach is the same. So, I don't want anyhow conflict with your contribution and will just wait until your PR is merged. I just want to highlight 2 cases that probably are not currently covered in your PR.

  1. Aggregate functions with QUALIFY
CREATE TABLE qt (i INT, p VARCHAR, o INT) AS VALUES
  (1, 'A', 1),
  (2, 'A', 2),
  (3, 'B', 1),
  (4, 'B', 2);

SELECT p, SUM(o) AS s
FROM qt
GROUP BY p
QUALIFY RANK() OVER (ORDER BY s DESC) = 1
ORDER BY p;
  1. Constant filter + QUALIFY
CREATE TABLE web_base_events_this_run (
  domain_sessionid VARCHAR,
  app_id VARCHAR,
  page_view_id VARCHAR,
  derived_tstamp TIMESTAMP,
  dvce_created_tstamp TIMESTAMP,
  event_id VARCHAR
) AS SELECT * FROM VALUES
  ('ds1', 'appA', NULL, '2025-01-01 10:00:00'::timestamp, '2025-01-01 10:05:00'::timestamp, 'e1'),
  ('ds1', 'appA', NULL, '2025-01-01 11:00:00'::timestamp, '2025-01-01 11:00:00'::timestamp, 'e2'),
  ('ds1', 'appA', 'pv', '2025-01-01 12:00:00'::timestamp, '2025-01-01 12:00:00'::timestamp, 'e3'),
  ('ds2', 'appB', NULL, '2025-01-01 09:00:00'::timestamp, '2025-01-01 09:10:00'::timestamp, 'e4'),
  ('ds2', 'appB', NULL, '2025-01-01 09:05:00'::timestamp, '2025-01-01 09:09:00'::timestamp, 'e5');
  
  SELECT domain_sessionid, app_id
FROM web_base_events_this_run
WHERE page_view_id IS NULL
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY domain_sessionid
  ORDER BY derived_tstamp, dvce_created_tstamp, event_id
) = 1
ORDER BY domain_sessionid;

I covered the first one by adding qualify expressions to aggr_expr_haystack:

        let aggr_expr_haystack = select_exprs
            .iter()
            .chain(having_expr_opt.iter())
            .chain(qualify_expr_opt_pre_aggr.iter());

The second one required me to change the logic in common_subexpr_eliminate.rs. You can check it out here (Embucket#34). Maybe you'll come up with a better solution.
Hope this helps.

Originally posted by @Vedin in #16933 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestsqlSQL Planner

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions