Commit 11abc64
[SPARK-47094][SQL] SPJ : Dynamically rebalance number of buckets when they are not equal
### What changes were proposed in this pull request?
-- Allow SPJ between 'compatible' bucket funtions
-- Add a mechanism to define 'reducible' functions, one function whose output can be 'reduced' to another for all inputs.
### Why are the changes needed?
-- SPJ currently applies only if the partition transform expressions on both sides are identifical.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added new tests in KeyGroupedPartitioningSuite
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #45267 from szehon-ho/spj-uneven-buckets.
Authored-by: Szehon Ho <[email protected]>
Signed-off-by: Chao Sun <[email protected]>1 parent d1ace24 commit 11abc64
File tree
9 files changed
+821
-15
lines changed- sql
- catalyst/src/main
- java/org/apache/spark/sql/connector/catalog/functions
- scala/org/apache/spark/sql
- catalyst
- expressions
- plans/physical
- internal
- core/src
- main/scala/org/apache/spark/sql/execution
- datasources/v2
- exchange
- test/scala/org/apache/spark/sql/connector
- catalog/functions
9 files changed
+821
-15
lines changedLines changed: 42 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java
Lines changed: 106 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
Lines changed: 56 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | | - | |
| 20 | + | |
21 | 21 | | |
22 | 22 | | |
23 | 23 | | |
| |||
54 | 54 | | |
55 | 55 | | |
56 | 56 | | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
57 | 112 | | |
58 | 113 | | |
59 | 114 | | |
| |||
Lines changed: 49 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
| 27 | + | |
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
| |||
833 | 834 | | |
834 | 835 | | |
835 | 836 | | |
836 | | - | |
| 837 | + | |
| 838 | + | |
| 839 | + | |
| 840 | + | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
837 | 844 | | |
838 | 845 | | |
839 | 846 | | |
| 847 | + | |
| 848 | + | |
| 849 | + | |
| 850 | + | |
| 851 | + | |
| 852 | + | |
| 853 | + | |
| 854 | + | |
| 855 | + | |
| 856 | + | |
| 857 | + | |
| 858 | + | |
| 859 | + | |
| 860 | + | |
| 861 | + | |
| 862 | + | |
| 863 | + | |
| 864 | + | |
| 865 | + | |
| 866 | + | |
| 867 | + | |
| 868 | + | |
| 869 | + | |
| 870 | + | |
| 871 | + | |
| 872 | + | |
840 | 873 | | |
841 | 874 | | |
842 | 875 | | |
| |||
846 | 879 | | |
847 | 880 | | |
848 | 881 | | |
| 882 | + | |
| 883 | + | |
| 884 | + | |
| 885 | + | |
| 886 | + | |
| 887 | + | |
| 888 | + | |
| 889 | + | |
| 890 | + | |
| 891 | + | |
| 892 | + | |
| 893 | + | |
| 894 | + | |
| 895 | + | |
| 896 | + | |
849 | 897 | | |
850 | 898 | | |
851 | 899 | | |
| |||
Lines changed: 15 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1558 | 1558 | | |
1559 | 1559 | | |
1560 | 1560 | | |
| 1561 | + | |
| 1562 | + | |
| 1563 | + | |
| 1564 | + | |
| 1565 | + | |
| 1566 | + | |
| 1567 | + | |
| 1568 | + | |
| 1569 | + | |
| 1570 | + | |
| 1571 | + | |
| 1572 | + | |
1561 | 1573 | | |
1562 | 1574 | | |
1563 | 1575 | | |
| |||
5323 | 5335 | | |
5324 | 5336 | | |
5325 | 5337 | | |
| 5338 | + | |
| 5339 | + | |
| 5340 | + | |
5326 | 5341 | | |
5327 | 5342 | | |
5328 | 5343 | | |
| |||
Lines changed: 17 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
27 | | - | |
| 27 | + | |
28 | 28 | | |
29 | 29 | | |
| 30 | + | |
30 | 31 | | |
31 | 32 | | |
32 | 33 | | |
| |||
164 | 165 | | |
165 | 166 | | |
166 | 167 | | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
167 | 180 | | |
168 | 181 | | |
169 | 182 | | |
| |||
174 | 187 | | |
175 | 188 | | |
176 | 189 | | |
177 | | - | |
| 190 | + | |
178 | 191 | | |
179 | 192 | | |
180 | 193 | | |
| |||
207 | 220 | | |
208 | 221 | | |
209 | 222 | | |
210 | | - | |
| 223 | + | |
211 | 224 | | |
212 | 225 | | |
213 | 226 | | |
| |||
259 | 272 | | |
260 | 273 | | |
261 | 274 | | |
| 275 | + | |
262 | 276 | | |
263 | 277 | | |
264 | 278 | | |
| |||
0 commit comments