[SPARK-26700][CORE] enable fetch-big-block-to-disk by default #23625
Conversation
Test build #101577 has finished for PR 23625 at commit
// to the block data itself (in particular UploadBlock has a lot of metadata), so we leave
// extra room.
.createWithDefault(Int.MaxValue - 512)
.checkValue(_ <= Int.MaxValue - 512, "maxRemoteBlockSizeFetchToMem must be less than 2GB.")
If someone specifies '2g', this will fail, right? That might be surprising given the message. What about reusing that lower limit in the message?
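For illustration, here is a hypothetical sketch of that suggestion, assuming Spark's internal ConfigBuilder API; the MAX_FETCH_TO_MEM_LIMIT constant name is invented, not from the PR. Defining the limit once and interpolating it into the message keeps the check and the error text from drifting apart:

```scala
import org.apache.spark.internal.config.ConfigBuilder
import org.apache.spark.network.util.ByteUnit

// Hypothetical constant name; the value matches the PR's limit. Protocol
// messages share the 2GB frame with the block data (UploadBlock in particular
// carries a lot of metadata), hence the 512-byte headroom.
val MAX_FETCH_TO_MEM_LIMIT: Long = Int.MaxValue - 512

val MAX_REMOTE_BLOCK_SIZE_FETCH_TO_MEM =
  ConfigBuilder("spark.maxRemoteBlockSizeFetchToMem")
    .bytesConf(ByteUnit.BYTE)
    // Reuse the limit in the message, so that e.g. '2g' fails with the real
    // bound instead of a misleading "must be less than 2GB".
    .checkValue(_ <= MAX_FETCH_TO_MEM_LIMIT,
      s"maxRemoteBlockSizeFetchToMem cannot be larger than $MAX_FETCH_TO_MEM_LIMIT bytes.")
    .createWithDefault(MAX_FETCH_TO_MEM_LIMIT)
```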
| "in bytes. This is to avoid a giant request takes too much memory. Note this " + | ||
| "configuration will affect both shuffle fetch and block manager remote block fetch. " + | ||
| "For users who enabled external shuffle service, this feature can only work when " + | ||
| "external shuffle service is newer than Spark 2.2.") |
newer than 2.2 -> at least 2.3.0?
Test build #101614 has finished for PR 23625 at commit
retest this please
Test build #101627 has finished for PR 23625 at commit
</tr>
<tr>
  <td><code>spark.maxRemoteBlockSizeFetchToMem</code></td>
  <td>Int.MaxValue - 512</td>
just to clarify, you intentionally moved this from the shuffle section to the network section, since it affects both shuffle fetch and block manager fetches?
yea
.createWithDefault(Int.MaxValue - 512)
.checkValue(
  _ <= Int.MaxValue - 512,
  "maxRemoteBlockSizeFetchToMem must be less than (Int.MaxValue - 512) bytes.")
nit. less than or equal to?
Test build #101742 has finished for PR 23625 at commit
retest this please
"fetch-big-block-to-memory" maybe "fetch-big-block-to-disk" in title?
Test build #101750 has finished for PR 23625 at commit
thanks, merging to master!
Closes apache#23625 from cloud-fan/minor. Authored-by: Wenchen Fan <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
This is a followup of #16989.
The fetch-big-block-to-disk feature is disabled by default because it is not compatible with external shuffle services prior to Spark 2.2: the client sends a stream request to fetch block chunks, and older shuffle services cannot handle it.
After two years, Spark 2.2 has reached EOL, so it is now safe to turn this feature on by default.
How was this patch tested?
existing tests
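For context, a minimal usage sketch (not from the PR) of how a user tunes this threshold: remote blocks larger than the configured size are streamed to disk instead of being held in memory. The app name, master, and the 200m value are arbitrary examples; only the config key comes from the docs above. As the doc string warns, the client and an external shuffle service must both understand stream requests for this to take effect safely.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("fetch-to-disk-example") // arbitrary example name
  .setMaster("local[2]")               // arbitrary example master
  // Remote blocks bigger than 200m are fetched to disk, not memory; this
  // affects both shuffle fetch and block manager remote block fetch.
  .set("spark.maxRemoteBlockSizeFetchToMem", "200m")

val sc = new SparkContext(conf)
```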