[SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name should not fail if accumulator is garbage collected #17931

cloud-fan · 2017-05-10T05:35:24Z

What changes were proposed in this pull request?

After #17596, we no longer send internal accumulator names to the executor side, and always look up the accumulator name in AccumulatorContext.

This causes a regression if the accumulator has already been garbage collected. This PR fixes it by still sending the accumulator name for SQLMetrics.

How was this patch tested?

N/A

cloud-fan · 2017-05-10T05:35:49Z

cc @vanzin

rxin · 2017-05-10T05:36:50Z

What's the issue with SQL metrics?

SparkQA · 2017-05-10T05:37:34Z

Test build #76726 has started for PR 17931 at commit 5ae4f6e.

cloud-fan · 2017-05-10T06:20:39Z

We keep the SQLMetrics sent from executors as UI data, while the actual accumulator registered for a SQLMetric may be garbage collected once the SparkPlan linked to it is garbage collected.

Task context accumulators don't have this problem, as we always keep the registered accumulators in DAGScheduler.
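To make the failure mode concrete, here is a minimal Java sketch of a registry that holds only weak references, which is the general pattern described above. It is not Spark's code: `WeakRegistrySketch`, `Acc`, `register`, and `lookupName` are hypothetical names, and the metric name is illustrative. Garbage collection is simulated with `WeakReference.clear()` so the result is deterministic.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class WeakRegistrySketch {
    /** Hypothetical stand-in for an accumulator: just an id and a name. */
    static final class Acc {
        final long id;
        final String name;

        Acc(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // The registry stores weak references only, so it never keeps an accumulator alive.
    private final Map<Long, WeakReference<Acc>> originals = new ConcurrentHashMap<>();

    WeakReference<Acc> register(Acc acc) {
        WeakReference<Acc> ref = new WeakReference<>(acc);
        originals.put(acc.id, ref);
        return ref;
    }

    /** Returns the name if the original is still reachable, otherwise empty. */
    Optional<String> lookupName(long id) {
        WeakReference<Acc> ref = originals.get(id);
        if (ref == null) {
            return Optional.empty();
        }
        Acc acc = ref.get(); // null once the referent has been collected or cleared
        return acc == null ? Optional.empty() : Optional.of(acc.name);
    }

    public static String demo() {
        WeakRegistrySketch registry = new WeakRegistrySketch();
        Acc metric = new Acc(1L, "number of output rows");
        WeakReference<Acc> ref = registry.register(metric);

        String before = registry.lookupName(1L).orElse("<unknown>");
        ref.clear(); // simulate the accumulator being garbage collected
        String after = registry.lookupName(1L).orElse("<unknown>");
        return before + " | " + after;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Once the reference is cleared, a name lookup that goes only through the registry has nothing left to return, which is the situation the driver is in when the plan that owned the metric is gone.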

cloud-fan · 2017-05-10T08:08:42Z

retest this please

SparkQA · 2017-05-10T11:19:16Z

Test build #76741 has finished for PR 17931 at commit 5ae4f6e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2017-05-11T16:43:25Z

> by still sending accumulator name for SQLMetrics

Do you have an idea of how much this undoes the benefits of SPARK-12837? You're still avoiding sending the names of internal metrics, but I don't have a feel for how many accumulators a large sql query might generate.

Also, is that really necessary? The errors seen in the bug (and that I noticed in my testing) were always on the driver side.

cloud-fan · 2017-05-12T06:59:10Z

By the time we receive SQLMetrics in SQLListener with the task end event, the registered accumulator may already have been GCed. Then there is no way to retrieve the accumulator names unless we send them to the executor side, so that executors can send accumulators back to the driver side with their names.
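The round trip described here can be sketched as follows. This is a simplified Java illustration under stated assumptions, not the patch itself: `AccUpdate`, `toExecutorCopy`, and `resolveName` are hypothetical names, the registry is modeled as a plain map of still-live entries, and the metric names are illustrative. Internal accumulators are shipped without a name, while other accumulators carry theirs along.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NameRoundTripSketch {
    /** Hypothetical accumulator update as shipped between driver and executor. */
    static final class AccUpdate {
        final long id;
        final Optional<String> name;
        final long value;

        AccUpdate(long id, Optional<String> name, long value) {
            this.id = id;
            this.name = name;
            this.value = value;
        }
    }

    /** Driver side: build the copy sent to executors. Internal names are dropped. */
    static AccUpdate toExecutorCopy(long id, String name, boolean internal) {
        Optional<String> shipped = internal ? Optional.empty() : Optional.of(name);
        return new AccUpdate(id, shipped, 0L);
    }

    /** Driver side, on task end: prefer the name carried by the update itself. */
    static String resolveName(AccUpdate update, Map<Long, String> liveRegistry) {
        return update.name.orElseGet(() -> liveRegistry.getOrDefault(update.id, "<unknown>"));
    }

    public static String demo() {
        // Assumption: id 1 is an internal accumulator that is still registered,
        // while id 2 is a SQL metric whose original has already been collected.
        Map<Long, String> liveRegistry = new HashMap<>();
        liveRegistry.put(1L, "internal.metrics.executorRunTime");

        AccUpdate internal = toExecutorCopy(1L, "internal.metrics.executorRunTime", true);
        AccUpdate sqlMetric = toExecutorCopy(2L, "number of output rows", false);

        return resolveName(internal, liveRegistry) + " | " + resolveName(sqlMetric, liveRegistry);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The internal accumulator still resolves through the registry because its original is kept alive, and the SQL metric resolves from the name it carried, even though its registry entry is gone. The cost is the extra bytes for each non-internal name sent to executors.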

vanzin · 2017-05-12T15:28:09Z

I see. What about the gains from SPARK-12837? Are they still enough that the change is justified, or should we just revert it instead?

vanzin · 2017-05-12T15:53:27Z

(BTW, if keeping the code, a slightly more verbose comment in the code explaining why non-internal accumulators still need to send their names would be good.)

cloud-fan · 2017-05-15T07:55:50Z

I checked SPARK-12837 again, and we only lose a little of its benefit with this PR. I'll add more comments.

SparkQA · 2017-05-15T10:44:14Z

Test build #76937 has finished for PR 17931 at commit 2a3f773.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2017-05-15T16:19:44Z

LGTM.

vanzin · 2017-05-15T16:21:43Z

Merging to master / 2.2.

…il if accumulator is garbage collected

## What changes were proposed in this pull request?

After #17596 , we do not send internal accumulator name to executor side anymore, and always look up the accumulator name in `AccumulatorContext`.

This cause a regression if the accumulator is already garbage collected, this PR fixes this by still sending accumulator name for `SQLMetrics`.

## How was this patch tested?

N/A

Author: Wenchen Fan <[email protected]>

Closes #17931 from cloud-fan/bug.

(cherry picked from commit e1aaab1)
Signed-off-by: Marcelo Vanzin <[email protected]>

getting name should not fail if accumulator is garbage collected

5ae4f6e

cloud-fan changed the title ~~[SPARK-12837][CORE][FOLLOWUP] getting name should not fail if accumulator is garbage collected~~ [SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name should not fail if accumulator is garbage collected May 10, 2017

add comment

2a3f773

asfgit closed this in e1aaab1 May 15, 2017
