[SQL] Minor: Introduce SchemaRDD#aggregate() for simple aggregations #874

aarondav · 2014-05-25T21:00:14Z

rdd.aggregate(Sum('val))

is just shorthand for

rdd.groupBy()(Sum('val))

but it seems more natural than doing a groupBy with no grouping expressions when you really just want an aggregation over all rows.

Did not add a JavaSchemaRDD or Python API, as these seem to be lacking several other methods like groupBy() already -- leaving that cleanup for future patches.
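To make the "shorthand" relationship concrete, here is a minimal, self-contained sketch of the delegation pattern in plain Scala. It is a toy model, not the Spark code from this patch: `ToyTable`, `ToyAgg`, `ToySum`, and the `Row` alias are hypothetical names invented for illustration, and rows are simplified to `Map[String, Int]`. The only point it is meant to show is that `aggregate(...)` can forward to `groupBy()(...)` with an empty grouping list.

```scala
// Toy model only: this is NOT the Spark API. Every name here is hypothetical.
object AggregateSketch {
  type Row = Map[String, Int]

  // An aggregate expression reduces a group of rows to a single value.
  sealed trait ToyAgg { def eval(rows: Seq[Row]): Int }
  final case class ToySum(column: String) extends ToyAgg {
    def eval(rows: Seq[Row]): Int = rows.map(_(column)).sum
  }

  final case class ToyTable(rows: Seq[Row]) {
    // Two parameter lists, mirroring the groupBy()(...) call shape quoted above.
    def groupBy(groupingCols: String*)(aggs: ToyAgg*): Seq[Seq[Int]] =
      if (groupingCols.isEmpty) {
        // No grouping expressions: the whole table is treated as one group.
        Seq(aggs.map(_.eval(rows)))
      } else {
        // One output row per distinct key: key values followed by aggregates.
        rows.groupBy(row => groupingCols.map(row)).toSeq.map {
          case (key, group) => key ++ aggs.map(_.eval(group))
        }
      }

    // The shorthand: an aggregation over all rows is a groupBy with no keys.
    def aggregate(aggs: ToyAgg*): Seq[Seq[Int]] = groupBy()(aggs: _*)
  }

  def main(args: Array[String]): Unit = {
    val table = ToyTable(Seq(
      Map("key" -> 1, "val" -> 10),
      Map("key" -> 1, "val" -> 20),
      Map("key" -> 2, "val" -> 30)
    ))
    // Both spellings produce the same result in this toy model.
    assert(table.aggregate(ToySum("val")) == table.groupBy()(ToySum("val")))
    println(table.aggregate(ToySum("val")))
  }
}
```

In this sketch the empty-grouping case is handled explicitly so that an aggregation over an empty table still yields a single row; whether the real implementation behaves that way is not something this thread states.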


aarondav · 2014-05-25T21:01:23Z

sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala

This example doesn't compile with the \ in there.

AmplabJenkins · 2014-05-25T21:02:58Z

Merged build triggered.

AmplabJenkins · 2014-05-25T21:03:04Z

Merged build started.

rxin · 2014-05-25T21:55:51Z

sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala

Can you say in the scaladoc that this is equivalent to groupBy()(...) ?

rxin · 2014-05-25T21:56:19Z

LGTM other than the small addition to scaladoc.

AmplabJenkins · 2014-05-25T22:16:03Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-05-25T22:16:03Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15188/

aarondav · 2014-05-25T23:57:11Z

Added comment!

AmplabJenkins · 2014-05-25T23:57:58Z

Merged build triggered.

AmplabJenkins · 2014-05-25T23:58:04Z

Merged build started.

rxin · 2014-05-25T23:58:23Z

LGTM

AmplabJenkins · 2014-05-26T01:01:59Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-05-26T01:01:59Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15195/

rxin · 2014-05-26T01:37:35Z

I've merged this into master & branch-1.0.

```scala
rdd.aggregate(Sum('val))
```
is just shorthand for
```scala
rdd.groupBy()(Sum('val))
```
but seems to be more natural than doing a groupBy with no grouping expressions when you really just want an aggregation over all rows.

Did not add a JavaSchemaRDD or Python API, as these seem to be lacking several other methods like groupBy() already -- leaving that cleanup for future patches.

Author: Aaron Davidson <[email protected]>

Closes #874 from aarondav/schemardd and squashes the following commits:

e9e68ee [Aaron Davidson] Add comment
db6afe2 [Aaron Davidson] Introduce SchemaRDD#aggregate() for simple aggregations

(cherry picked from commit c3576ff)
Signed-off-by: Reynold Xin <[email protected]>


aarondav changed the title ~~Introduce SchemaRDD#aggregate() for simple aggregations~~ [SQL] Minor: Introduce SchemaRDD#aggregate() for simple aggregations May 25, 2014


Add comment

e9e68ee

asfgit closed this in c3576ff May 26, 2014

agirish pushed a commit to HPEEzmeral/apache-spark that referenced this pull request May 5, 2022

EZSPA-270 - adopt mapr spark feature to work in non-mapr env (apache#874) - make "mapr.spark.user.secret" config optional - review all mapr-specific volumes and make them optional

b339696

wangyum pushed a commit that referenced this pull request May 26, 2023

[CARMEL-5861] Limit the max size of statement and executePlan (#874)

7a07647

udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024

EZSPA-270 - adopt mapr spark feature to work in non-mapr env (apache#874) - make "mapr.spark.user.secret" config optional - review all mapr-specific volumes and make them optional

7504dea

mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025

EZSPA-270 - adopt mapr spark feature to work in non-mapr env (apache#874) - make "mapr.spark.user.secret" config optional - review all mapr-specific volumes and make them optional

625fa5a
