Skip to content

Conversation

@yhuai
Copy link
Contributor

@yhuai yhuai commented Nov 9, 2015

https://issues.apache.org/jira/browse/SPARK-9830

This PR contains the following main changes.

  • Removing AggregateExpression1.
  • Removing Aggregate operator, which is used to evaluate AggregateExpression1.
  • Removing planner rule used to plan Aggregate.
  • Linking MultipleDistinctRewriter to analyzer.
  • Renaming AggregateExpression2 to AggregateExpression and AggregateFunction2 to AggregateFunction.
  • Updating places where we create aggregate expression. The way to create aggregate expressions is AggregateExpression(aggregateFunction, mode, isDistinct).
  • Changing vals in DeclarativeAggregates that touch children of this function to lazy vals (when we create aggregate expression in DataFrame API, children of an aggregate function can be unresolved).

@rxin
Copy link
Contributor

rxin commented Nov 9, 2015

YAY.

@yhuai
Copy link
Contributor Author

yhuai commented Nov 9, 2015

test this please

@SparkQA
Copy link

SparkQA commented Nov 9, 2015

Test build #45342 has finished for PR 9556 at commit cb3f4d3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Nov 9, 2015

Test build #45350 has finished for PR 9556 at commit cb3f4d3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment and the error message mention a new aggregation path. We should update this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shivaram Seems this default value should be 0.05 instead of 0.95, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looks like it should be 0.05 -- cc @davies
This seems to have been added back when SparkR was a separate code-base in davies/SparkR-pkg@d7b17a4

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

@yhuai
Copy link
Contributor Author

yhuai commented Nov 10, 2015

test this please

@SparkQA
Copy link

SparkQA commented Nov 10, 2015

Test build #45474 has finished for PR 9556 at commit fb64896.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 10, 2015

Test build #2027 has finished for PR 9556 at commit fb64896.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 10, 2015

Test build #45476 has finished for PR 9556 at commit fb64896.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rxin @marmbrus It is better to change the prettyString of UnresolvedFunction to get rid of that ' (e.g. 'abs). The impact of this change is that if any user rely on that ', his/her code will break.

@SparkQA
Copy link

SparkQA commented Nov 10, 2015

Test build #45492 has finished for PR 9556 at commit 16826e6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agg2 -> agg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@marmbrus
Copy link
Contributor

We should keep reviewing and address any comments in a follow-up, but I'm going to merge this now to unblock other work. Thanks!

asfgit pushed a commit that referenced this pull request Nov 10, 2015
…used to evaluate AggregateExpression1s

https://issues.apache.org/jira/browse/SPARK-9830

This PR contains the following main changes.
* Removing `AggregateExpression1`.
* Removing `Aggregate` operator, which is used to evaluate `AggregateExpression1`.
* Removing planner rule used to plan `Aggregate`.
* Linking `MultipleDistinctRewriter` to analyzer.
* Renaming `AggregateExpression2` to `AggregateExpression` and `AggregateFunction2` to `AggregateFunction`.
* Updating places where we create aggregate expression. The way to create aggregate expressions is `AggregateExpression(aggregateFunction, mode, isDistinct)`.
* Changing `val`s in `DeclarativeAggregate`s that touch children of this function to `lazy val`s (when we create aggregate expression in DataFrame API, children of an aggregate function can be unresolved).

Author: Yin Huai <[email protected]>

Closes #9556 from yhuai/removeAgg1.

(cherry picked from commit e0701c7)
Signed-off-by: Michael Armbrust <[email protected]>
@asfgit asfgit closed this in e0701c7 Nov 10, 2015
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the NullType here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we should remove it. I somehow missed it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

asfgit pushed a commit that referenced this pull request Nov 11, 2015
… and update toString of Exchange

https://issues.apache.org/jira/browse/SPARK-9830

This is the follow-up pr for #9556 to address davies' comments.

Author: Yin Huai <[email protected]>

Closes #9607 from yhuai/removeAgg1-followup.

(cherry picked from commit 3121e78)
Signed-off-by: Reynold Xin <[email protected]>
asfgit pushed a commit that referenced this pull request Nov 11, 2015
… and update toString of Exchange

https://issues.apache.org/jira/browse/SPARK-9830

This is the follow-up pr for #9556 to address davies' comments.

Author: Yin Huai <[email protected]>

Closes #9607 from yhuai/removeAgg1-followup.
kiszk pushed a commit to kiszk/spark-gpu that referenced this pull request Dec 26, 2015
… and update toString of Exchange

https://issues.apache.org/jira/browse/SPARK-9830

This is the follow-up pr for apache/spark#9556 to address davies' comments.

Author: Yin Huai <[email protected]>

Closes #9607 from yhuai/removeAgg1-followup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants