Skip to content

Commit 818e739

Browse files
authored
add extend sql for issue-11212 (#15500)
1 parent bde9803 commit 818e739

File tree

2 files changed

+34
-2
lines changed

2 files changed

+34
-2
lines changed

benchmarks/queries/clickbench/README.md

Lines changed: 32 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -93,12 +93,14 @@ LIMIT 10;
9393

9494
Results look like
9595

96+
```
9697
+-------------+---------------------+---+------+------+------+
9798
| ClientIP | WatchID | c | tmin | tmed | tmax |
9899
+-------------+---------------------+---+------+------+------+
99100
| 1611957945 | 6655575552203051303 | 2 | 0 | 0 | 0 |
100101
| -1402644643 | 8566928176839891583 | 2 | 0 | 0 | 0 |
101102
+-------------+---------------------+---+------+------+------+
103+
```
102104

103105

104106
### Q5: Response start time distribution analysis (p95)
@@ -120,13 +122,42 @@ LIMIT 10;
120122
```
121123

122124
Results look like
123-
125+
```
124126
+-------------+---------------------+---+------+------+------+
125127
| ClientIP | WatchID | c | tmin | tp95 | tmax |
126128
+-------------+---------------------+---+------+------+------+
127129
| 1611957945 | 6655575552203051303 | 2 | 0 | 0 | 0 |
128130
| -1402644643 | 8566928176839891583 | 2 | 0 | 0 | 0 |
129131
+-------------+---------------------+---+------+------+------+
132+
```
133+
134+
### Q6: How many social shares meet complex multi-stage filtering criteria?
135+
**Question**: What is the count of sharing actions from iPhone mobile users on specific social networks, within common timezones, participating in seasonal campaigns, with high screen resolutions and closely matched UTM parameters?
136+
**Important Query Properties**: Simple filter with high-selectivity, Costly string matching, A large number of filters with high overhead are positioned relatively later in the process
137+
138+
```sql
139+
SELECT COUNT(*) AS ShareCount
140+
FROM hits
141+
WHERE
142+
-- Stage 1: High-selectivity filters (fast)
143+
"IsMobile" = 1 -- Filter mobile users
144+
AND "MobilePhoneModel" LIKE 'iPhone%' -- Match iPhone models
145+
AND "SocialAction" = 'share' -- Identify social sharing actions
146+
147+
-- Stage 2: Moderate filters (cheap)
148+
AND "SocialSourceNetworkID" IN (5, 12) -- Filter specific social networks
149+
AND "ClientTimeZone" BETWEEN -5 AND 5 -- Restrict to common timezones
150+
151+
-- Stage 3: Heavy computations (expensive)
152+
AND regexp_match("Referer", '\/campaign\/(spring|summer)_promo') IS NOT NULL -- Find campaign-specific referrers
153+
AND CASE
154+
WHEN split_part(split_part("URL", 'resolution=', 2), '&', 1) ~ '^\d+$'
155+
THEN split_part(split_part("URL", 'resolution=', 2), '&', 1)::INT
156+
ELSE 0
157+
END > 1920 -- Extract and validate resolution parameter
158+
AND levenshtein("UTMSource", "UTMCampaign") < 3 -- Verify UTM parameter similarity
159+
```
160+
Result is empty,Since it has already been filtered by `"SocialAction" = 'share'`.
130161

131162

132163
## Data Notes

benchmarks/queries/clickbench/extended.sql

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,4 +3,5 @@ SELECT COUNT(DISTINCT "HitColor"), COUNT(DISTINCT "BrowserCountry"), COUNT(DISTI
33
SELECT "BrowserCountry", COUNT(DISTINCT "SocialNetwork"), COUNT(DISTINCT "HitColor"), COUNT(DISTINCT "BrowserLanguage"), COUNT(DISTINCT "SocialAction") FROM hits GROUP BY 1 ORDER BY 2 DESC LIMIT 10;
44
SELECT "SocialSourceNetworkID", "RegionID", COUNT(*), AVG("Age"), AVG("ParamPrice"), STDDEV("ParamPrice") as s, VAR("ParamPrice") FROM hits GROUP BY "SocialSourceNetworkID", "RegionID" HAVING s IS NOT NULL ORDER BY s DESC LIMIT 10;
55
SELECT "ClientIP", "WatchID", COUNT(*) c, MIN("ResponseStartTiming") tmin, MEDIAN("ResponseStartTiming") tmed, MAX("ResponseStartTiming") tmax FROM hits WHERE "JavaEnable" = 0 GROUP BY "ClientIP", "WatchID" HAVING c > 1 ORDER BY tmed DESC LIMIT 10;
6-
SELECT "ClientIP", "WatchID", COUNT(*) c, MIN("ResponseStartTiming") tmin, APPROX_PERCENTILE_CONT("ResponseStartTiming", 0.95) tp95, MAX("ResponseStartTiming") tmax FROM 'hits' WHERE "JavaEnable" = 0 GROUP BY "ClientIP", "WatchID" HAVING c > 1 ORDER BY tp95 DESC LIMIT 10;
6+
SELECT "ClientIP", "WatchID", COUNT(*) c, MIN("ResponseStartTiming") tmin, APPROX_PERCENTILE_CONT("ResponseStartTiming", 0.95) tp95, MAX("ResponseStartTiming") tmax FROM 'hits' WHERE "JavaEnable" = 0 GROUP BY "ClientIP", "WatchID" HAVING c > 1 ORDER BY tp95 DESC LIMIT 10;
7+
SELECT COUNT(*) AS ShareCount FROM hits WHERE "IsMobile" = 1 AND "MobilePhoneModel" LIKE 'iPhone%' AND "SocialAction" = 'share' AND "SocialSourceNetworkID" IN (5, 12) AND "ClientTimeZone" BETWEEN -5 AND 5 AND regexp_match("Referer", '\/campaign\/(spring|summer)_promo') IS NOT NULL AND CASE WHEN split_part(split_part("URL", 'resolution=', 2), '&', 1) ~ '^\d+$' THEN split_part(split_part("URL", 'resolution=', 2), '&', 1)::INT ELSE 0 END > 1920 AND levenshtein("UTMSource", "UTMCampaign") < 3;

0 commit comments

Comments
 (0)