[SPARK-24935][SQL] fix Hive UDAF with two aggregation buffers #24144

cloud-fan · 2019-03-19T12:04:33Z

What changes were proposed in this pull request?

Hive UDAF knows the aggregation mode when creating the aggregation buffer, so that it can create different buffers for different inputs: the original data or the aggregation buffer. Please see an example in the sketches library.

However, the Hive UDAF adapter in Spark always creates the buffer with partial1 mode, which can only deal with one input: the original data. This PR fixes it.
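To make the description concrete, here is a minimal sketch in Java of an evaluator that picks its buffer type from the mode. Every name in it (`BufferByModeSketch`, `RowBuffer`, `MergeBuffer`, `Evaluator`) is hypothetical and only loosely modeled on the sketches library; it is not Hive's or Spark's actual API.

```java
// Illustrative sketch only: all class names below are hypothetical.
final class BufferByModeSketch {
  // Same four names as Hive's aggregation modes.
  enum Mode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

  interface Buffer {}

  // Buffer that accepts original rows.
  static final class RowBuffer implements Buffer { long sum = 0; }

  // Buffer that accepts partial aggregation results.
  static final class MergeBuffer implements Buffer { long sum = 0; }

  static final class Evaluator {
    private final Mode mode;

    Evaluator(Mode mode) { this.mode = mode; }

    // The buffer type depends on what kind of input this mode consumes.
    Buffer newBuffer() {
      boolean readsRows = mode == Mode.PARTIAL1 || mode == Mode.COMPLETE;
      return readsRows ? new RowBuffer() : new MergeBuffer();
    }
  }

  public static void main(String[] args) {
    for (Mode mode : Mode.values()) {
      Buffer buffer = new Evaluator(mode).newBuffer();
      System.out.println(mode + " -> " + buffer.getClass().getSimpleName());
    }
  }
}
```

An adapter that always asks a PARTIAL1 evaluator for the buffer would only ever get the row-oriented type here, which is the situation this PR describes.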

All credit goes to @pgandhi999, who investigated the problem, studied the Hive UDAF behavior, and wrote the tests.

close #23778

How was this patch tested?

a new test

cloud-fan · 2019-03-19T12:08:40Z

Hi @pgandhi999, I think you are right about the mismatch between the Hive UDAF and the Spark UDAF framework. Since this is a regression, and it may take a long time for you to get familiar with the Spark aggregate framework, I am taking it over and will try to get this in before 2.4.1. Please take a look, thanks!

cloud-fan · 2019-03-19T12:09:00Z

also cc @gatorsmile

pgandhi999 · 2019-03-19T13:57:12Z

@cloud-fan Yes, you are right, this fix looks better. Will review the same. Thank you.

pgandhi999 · 2019-03-19T14:54:29Z

@cloud-fan I tested your PR with the test case mentioned in JIRA and it fails with the following error:

19/03/19 14:47:35 WARN TaskSetManager: Lost task 3.0 in stage 3.0 (TID 3, gsrd259n17.red.ygrid.yahoo.com, executor 1): java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to [Ljava.lang.Object;
	at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldData(StandardStructObjectInspector.java:170)
	at org.apache.spark.sql.hive.HiveInspectors.$anonfun$unwrapperFor$43(HiveInspectors.scala:689)
	at org.apache.spark.sql.hive.HiveInspectors.$anonfun$unwrapperFor$45(HiveInspectors.scala:693)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at scala.collection.TraversableLike.map(TraversableLike.scala:237)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.hive.HiveInspectors.$anonfun$unwrapperFor$44(HiveInspectors.scala:693)
	at org.apache.spark.sql.hive.HiveUDAFFunction.eval(hiveUDFs.scala:434)
	at org.apache.spark.sql.hive.HiveUDAFFunction.eval(hiveUDFs.scala:307)
	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.eval(interfaces.scala:543)
	at org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$5(AggregationIterator.scala:232)
	at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.next(ObjectAggregationIterator.scala:86)
	at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.next(ObjectAggregationIterator.scala:33)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:256)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:852)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:852)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:121)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:428)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1341)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:431)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

pgandhi999 · 2019-03-19T15:02:14Z

sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala

IIUC you are checking whether the buffer passed into the method is null, and based on that you create a partial2 mode buffer. What if the buffer is not null but is of type partial1? Will that cause issues here?

cloud-fan · 2019-03-20

It can happen if Spark initializes a UDAF, runs update, and then runs merge. I don't think that will happen in Spark.

m44444 · 2019-04-22

Hi, I came here from the DataSketches HLL issue and found that it is still a problem in Spark 2.4.1, which was released with this change. Briefly, the test case @pgandhi999 posted in #23778 does not pass on Spark 2.4.1. However, another test case passed when I created a DataFrame from a single-object array, which means the DataFrame is not actually distributed across multiple threads. I believe the scenario @pgandhi999 described here has actually occurred, which is more worrying because, per @cloud-fan, it was not expected to happen in Spark. I post the error log below:
Caused by: java.lang.ClassCastException: com.yahoo.sketches.hive.hll.SketchState cannot be cast to com.yahoo.sketches.hive.hll.UnionState
at com.yahoo.sketches.hive.hll.SketchEvaluator.merge(SketchEvaluator.java:56)
at com.yahoo.sketches.hive.hll.DataToSketchUDAF$DataToSketchEvaluator.merge(DataToSketchUDAF.java:100)
at org.apache.spark.sql.hive.HiveUDAFFunction.merge(hiveUDFs.scala:430)
at org.apache.spark.sql.hive.HiveUDAFFunction.merge(hiveUDFs.scala:307)
at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.merge(interfaces.scala:539)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$2.apply(AggregationIterator.scala:174)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$2.apply(AggregationIterator.scala:174)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:188)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:182)
at org.apache.spark.sql.execution.aggregate.SortBasedAggregator$$anon$1.findNextSortedGroup(ObjectAggregationIterator.scala:275)
at org.apache.spark.sql.execution.aggregate.SortBasedAggregator$$anon$1.hasNext(ObjectAggregationIterator.scala:247)
at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.hasNext(ObjectAggregationIterator.scala:81)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:187)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

pgandhi999 · 2019-04-22

@m44444 Yes, the above issue has been addressed in PR #24149. Thank you for bringing it to our notice.
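The hazard discussed in this review thread can be reproduced in miniature. The sketch below assumes an adapter that starts with a null buffer and picks the buffer type on the first `update` or `merge` call; the names (`LazyBufferSketch`, `RowBuffer`, `MergeBuffer`) are hypothetical and this is not the code in hiveUDFs.scala.

```java
// Illustrative sketch only: all class and method names are hypothetical.
final class LazyBufferSketch {
  interface Buffer {}

  // Buffer for original rows (what a partial1 evaluator would create).
  static final class RowBuffer implements Buffer { long sum = 0; }

  // Buffer for partial results (what a merge-mode evaluator would create).
  static final class MergeBuffer implements Buffer { long sum = 0; }

  // A null buffer means "not created yet": the first call decides the type.
  static Buffer update(Buffer buffer, long row) {
    RowBuffer target = buffer == null ? new RowBuffer() : (RowBuffer) buffer;
    target.sum += row;
    return target;
  }

  static Buffer merge(Buffer buffer, long partial) {
    // The cast fails if an earlier update() already created a RowBuffer.
    MergeBuffer target = buffer == null ? new MergeBuffer() : (MergeBuffer) buffer;
    target.sum += partial;
    return target;
  }

  // Runs INIT -> UPDATE -> MERGE on one buffer and reports whether it failed.
  static boolean updateThenMergeFails() {
    Buffer buffer = update(null, 1L);
    try {
      merge(buffer, 2L);
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("update then merge fails: " + updateThenMergeFails());
  }
}
```

The null check alone covers the update-only and merge-only sequences; a buffer that sees update followed by merge ends in a cast failure of the same kind as the `SketchState` to `UnionState` error reported above.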

SparkQA · 2019-03-19T19:50:53Z

Test build #103673 has finished for PR 24144 at commit c45b7d4.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2019-03-20T11:35:14Z

sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala

We don't need a partial2 evaluator and a final evaluator. We just need one final evaluator.

The partial2 evaluator consumes an agg buffer and produces an agg buffer, while the final evaluator consumes an agg buffer and produces the final result. That said, the final evaluator can execute merge, and we don't need the partial2 evaluator.

SparkQA · 2019-03-20T18:47:24Z

Test build #103729 has finished for PR 24144 at commit ce5287e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

pgandhi999 · 2019-03-20T19:14:06Z

@cloud-fan PR is still failing with the same error as above after the push.

cloud-fan · 2019-03-21T20:05:08Z

weird, all tests pass at Spark side. Let me revert the removal of the partial2 evaluator and see if it works

pgandhi999 · 2019-03-21T20:40:58Z

@cloud-fan The test works now. Thank you.

pgandhi999 · 2019-03-21T20:52:16Z

Also I figured out that my machine had an issue and hence, your old commit did not get updated. I tested the code without the last commit and that works too. Sorry, my bad.

cloud-fan · 2019-03-21T22:31:05Z

@pgandhi999 no worries, thanks for your confirmation! Happy to know that my cleanup is correct :P

SparkQA · 2019-03-21T22:39:51Z

Test build #103788 has finished for PR 24144 at commit deab7ef.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-03-22T00:52:27Z

Test build #103794 has finished for PR 24144 at commit 3bf4ad8.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile

  public static enum Mode {
    /**
     * PARTIAL1: from original data to partial aggregation data:
     * iterate() and
     * terminatePartial() will be called.
     */
    PARTIAL1,
    /**
     * PARTIAL2: from partial aggregation data to partial aggregation data:
     * merge() and
     * terminatePartial() will be called.
     */
    PARTIAL2,
    /**
     * FINAL: from partial aggregation to full aggregation:
     * merge() and
     * terminate() will be called.
     */
    FINAL,
    /**
     * COMPLETE: from original data directly to full aggregation:
     * iterate() and
     * terminate() will be called.
     */
    COMPLETE
  };

Could you improve the comments and explain how these four modes are implemented?

cloud-fan · 2019-03-24T22:54:32Z

The 4 modes exactly match what Spark has, although the names are a little different. partial2 is called partial-merge in Spark.

The problem here is that a Hive UDAF knows the mode during initialization, while Spark doesn't. Technically a Hive UDAF can pick a different buffer implementation for each mode, and to fully support that we would need to refactor the Spark aggregate framework to give the mode to Spark UDAFs as well. This is overkill IMO, and this patch is a best-effort workaround. I think a Hive UDAF will only pick a different buffer implementation for different kinds of inputs (original record or agg buffer), which is the case for the sketches library.
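As a rough illustration of that correspondence, the lookup below pairs each Hive mode with the methods named in the enum comments quoted above and with the matching Spark aggregate mode name (Partial, PartialMerge, Final, Complete). The class and method names are hypothetical; this is a reading aid, not code from either project.

```java
// Illustrative sketch only: ModeTableSketch and its methods are hypothetical.
final class ModeTableSketch {
  enum HiveMode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

  // Name of the matching aggregate mode on the Spark side.
  static String sparkName(HiveMode mode) {
    switch (mode) {
      case PARTIAL1: return "Partial";
      case PARTIAL2: return "PartialMerge";
      case FINAL: return "Final";
      default: return "Complete";
    }
  }

  // Method that consumes input: iterate() for original rows, merge() for partial results.
  static String inputMethod(HiveMode mode) {
    return (mode == HiveMode.PARTIAL1 || mode == HiveMode.COMPLETE) ? "iterate" : "merge";
  }

  // Method that produces output: terminatePartial() for partial results, terminate() for the final one.
  static String outputMethod(HiveMode mode) {
    return (mode == HiveMode.PARTIAL1 || mode == HiveMode.PARTIAL2) ? "terminatePartial" : "terminate";
  }

  static String describe(HiveMode mode) {
    return mode + " (" + sparkName(mode) + "): "
        + inputMethod(mode) + "() then " + outputMethod(mode) + "()";
  }

  public static void main(String[] args) {
    for (HiveMode mode : HiveMode.values()) {
      System.out.println(describe(mode));
    }
  }
}
```

Read this way, only two kinds of input exist (original rows and partial results), which is why a buffer choice keyed on the input kind is enough for the case discussed here.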

gatorsmile

Yes. This is our best-effort support for Hive UDAF.

Thanks! Merged to master/2.4

## What changes were proposed in this pull request?

Hive UDAF knows the aggregation mode when creating the aggregation buffer, so that it can create different buffers for different inputs: the original data or the aggregation buffer. Please see an example in the [sketches library](https://github.com/DataSketches/sketches-hive/blob/7f9e76e9e03807277146291beb2c7bec40e8672b/src/main/java/com/yahoo/sketches/hive/cpc/DataToSketchUDAF.java#L107).

However, the Hive UDAF adapter in Spark always creates the buffer with partial1 mode, which can only deal with one input: the original data. This PR fixes it.

All credit goes to pgandhi999, who investigated the problem, studied the Hive UDAF behavior, and wrote the tests.

close #23778

## How was this patch tested?

a new test

Closes #24144 from cloud-fan/hive.

Lead-authored-by: pgandhi <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
(cherry picked from commit a6c207c)
Signed-off-by: gatorsmile <[email protected]>

…H in Hive UDAF adapter

## What changes were proposed in this pull request?

This is a followup of #24144.

#24144 missed one case: when hash aggregate falls back to sort aggregate, the life cycle of the UDAF is: INIT -> UPDATE -> MERGE -> FINISH. However, not all Hive UDAFs can support it.

Hive UDAF knows the aggregation mode when creating the aggregation buffer, so that it can create different buffers for different inputs: the original data or the aggregation buffer. Please see an example in the [sketches library](https://github.com/DataSketches/sketches-hive/blob/7f9e76e9e03807277146291beb2c7bec40e8672b/src/main/java/com/yahoo/sketches/hive/cpc/DataToSketchUDAF.java#L107). The buffer for UPDATE may not support MERGE.

This PR updates the Hive UDAF adapter in Spark to support INIT -> UPDATE -> MERGE -> FINISH, by turning it into INIT -> UPDATE -> FINISH + INIT -> MERGE -> FINISH.

## How was this patch tested?

a new test case

Closes #24459 from cloud-fan/hive-udaf.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 7432e7d)
Signed-off-by: Wenchen Fan <[email protected]>
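The rewrite of INIT -> UPDATE -> MERGE -> FINISH into two shorter life cycles can be sketched as follows. This is an assumption-laden illustration in Java with hypothetical names (`UpdateThenMergeSketch`, `RowBuffer`, `MergeBuffer`), not the actual adapter code: when `merge` receives a buffer that was created for updates, it finishes that buffer into a partial result and seeds a fresh merge buffer with it.

```java
// Illustrative sketch only: all class and method names are hypothetical.
final class UpdateThenMergeSketch {
  interface Buffer {}

  // Buffer for original rows; supports update but not merge.
  static final class RowBuffer implements Buffer { long sum = 0; }

  // Buffer for partial results; supports merge.
  static final class MergeBuffer implements Buffer { long sum = 0; }

  static Buffer update(Buffer buffer, long row) {
    RowBuffer target = buffer == null ? new RowBuffer() : (RowBuffer) buffer;
    target.sum += row;
    return target;
  }

  // FINISH of the update phase: turn a row buffer into a partial result.
  static long terminatePartial(RowBuffer buffer) {
    return buffer.sum;
  }

  static Buffer merge(Buffer buffer, long partial) {
    MergeBuffer target;
    if (buffer == null) {
      // Plain INIT -> MERGE.
      target = new MergeBuffer();
    } else if (buffer instanceof RowBuffer) {
      // INIT -> UPDATE -> FINISH, then a new INIT -> MERGE seeded with that result.
      target = new MergeBuffer();
      target.sum += terminatePartial((RowBuffer) buffer);
    } else {
      target = (MergeBuffer) buffer;
    }
    target.sum += partial;
    return target;
  }

  public static void main(String[] args) {
    Buffer buffer = update(null, 1L);
    buffer = update(buffer, 2L);
    buffer = merge(buffer, 10L);
    System.out.println(((MergeBuffer) buffer).sum);
  }
}
```

The design choice here is to pay one extra finish-and-reinitialize step at the moment the sequence switches from update to merge, so that each underlying buffer only ever sees the kind of input it was created for.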

## What changes were proposed in this pull request?

backport #24144 and #24459 to 2.3.

## How was this patch tested?

existing tests

Closes #24539 from cloud-fan/backport.

Lead-authored-by: pgandhi <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

pgandhi999 reviewed Mar 19, 2019

cloud-fan force-pushed the hive branch from c45b7d4 to ce5287e March 20, 2019 11:32

cloud-fan commented Mar 20, 2019

cloud-fan force-pushed the hive branch from deab7ef to ce5287e March 21, 2019 22:31

fix Hive UDAF with two aggregation buffers

3bf4ad8

cloud-fan force-pushed the hive branch from ce5287e to 3bf4ad8 March 21, 2019 22:34

pgandhi999 mentioned this pull request Mar 22, 2019

[SPARK-27207][SQL] : Ensure aggregate buffers are initialized again for So… #24149

Closed

gatorsmile reviewed Mar 24, 2019

gatorsmile approved these changes Mar 24, 2019

gatorsmile closed this in a6c207c Mar 24, 2019

cloud-fan mentioned this pull request Apr 25, 2019

[SPARK-24935][SQL][followup] support INIT -> UPDATE -> MERGE -> FINISH in Hive UDAF adapter #24459

Closed

cloud-fan mentioned this pull request May 6, 2019

[SPARK-24935][SQL][2.3] fix Hive UDAF with two aggregation buffers #24539

Closed
