Commit 0a901dd
[SPARK-7231] [SPARKR] Changes to make SparkR DataFrame dplyr friendly.
Changes include:
1. Rename `sortDF` to `arrange`
2. Add new aliases `group_by`, `sample_frac`, and `summarize`
3. Add more user-friendly column addition (`mutate`) and `rename`
4. Support `mean` as an alias for `avg` in Scala, and also support `n_distinct` and `n` as in dplyr
With these changes we can pretty much run the examples described in http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html with the same syntax.
The only thing missing in SparkR is automatic resolution of column names used in an expression, i.e. something like `select(flights, delay)` works in dplyr, but right now we need `select(flights, flights$delay)` or `select(flights, "delay")`. This is a complicated change, so I'll file a new issue for it.
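As a rough illustration of the syntax described above, here is a minimal sketch. It is not taken from the patch itself. It assumes a local Spark installation with the SparkR entry points of that era (`sparkR.init`, `sparkRSQL.init`) and uses R's built-in `faithful` dataset in place of the `flights` data from the dplyr vignette. The argument order shown for `sample_frac` and the `newName = column` form for `rename` are assumptions about the SparkR signatures.

```r
library(SparkR)

# Assumption: Spark 1.4-era entry points and a local master.
sc <- sparkR.init(master = "local")
sqlContext <- sparkRSQL.init(sc)

# faithful ships with base R (columns: eruptions, waiting).
df <- createDataFrame(sqlContext, faithful)

# Columns are referenced explicitly, either as df$col or by name as a string;
# a bare column name such as select(df, eruptions) is not resolved.
by_ref  <- select(df, df$eruptions)
by_name <- select(df, "eruptions")

# dplyr-style verbs named in this commit.
sorted  <- arrange(df, df$waiting)                  # formerly sortDF
hours   <- mutate(df, waiting_hours = df$waiting / 60)
renamed <- rename(df, wait = df$waiting)            # assumed form: newName = column
sampled <- sample_frac(df, FALSE, 0.1)              # assumed order: withReplacement, fraction

# Grouped aggregation with n() and mean() used as in dplyr.
grouped <- group_by(df, df$waiting)
stats   <- summarize(grouped,
                     count = n(df$waiting),
                     avg_eruptions = mean(df$eruptions))
head(stats)

sparkR.stop()
```

Each verb returns a new DataFrame, so the calls can be nested the same way the dplyr vignette nests them.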
cc sun-rui rxin
Author: Shivaram Venkataraman <[email protected]>
Closes apache#6005 from shivaram/sparkr-df-api and squashes the following commits:
5e0716a [Shivaram Venkataraman] Fix some roxygen bugs
1254953 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into sparkr-df-api
0521149 [Shivaram Venkataraman] Changes to make SparkR DataFrame dplyr friendly. Changes include 1. Rename sortDF to arrange 2. Add new aliases `group_by` and `sample_frac`, `summarize` 3. Add more user friendly column addition (mutate), rename 4. Support mean as an alias for avg in Scala and also support n_distinct, n as in dplyr
1 parent b6c797b commit 0a901dd
File tree
8 files changed, +249 -29 lines changed
- R/pkg
- R
- inst/tests
- sql/core/src
- main/scala/org/apache/spark/sql
- test/scala/org/apache/spark/sql