
Conversation

@cloud-fan
Contributor

What changes were proposed in this pull request?

This is a follow-up of #16989

The fetch-big-block-to-disk feature is disabled by default because it is not compatible with external shuffle services from releases prior to Spark 2.2: the client sends a stream request to fetch block chunks, which older shuffle services cannot handle.

After two years, Spark 2.2 has reached end-of-life, so it is now safe to turn this feature on by default.
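With the new default, users can still tune the threshold per application. A minimal sketch of doing so (the app name here is made up; "200m" uses the byte-string syntax this config accepts):

// Hypothetical usage: lower the threshold so that remote blocks larger
// than 200 MB are streamed to disk instead of being held in memory.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("fetch-to-disk-demo")
  .config("spark.maxRemoteBlockSizeFetchToMem", "200m")
  .getOrCreate()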

How was this patch tested?

existing tests

@SparkQA

SparkQA commented Jan 23, 2019

Test build #101577 has finished for PR 23625 at commit 295d163.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// to the block data itself (in particular UploadBlock has a lot of metadata), so we leave
// extra room.
.createWithDefault(Int.MaxValue - 512)
.checkValue(_ <= Int.MaxValue - 512, "maxRemoteBlockSizeFetchToMem must be less than 2GB.")
Member

If someone specifies '2g', this will fail, right? That might be surprising given the message. What about reusing that lower limit in the message?
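To make the point concrete, here is a small illustrative calculation (not part of the patch): '2g' parses to 2^31 bytes, 513 bytes above the allowed maximum, so the value is rejected even though the message says "less than 2GB".

// Illustration only: "2g" exceeds the permitted maximum.
val twoGb = 2L * 1024 * 1024 * 1024   // 2147483648 bytes
val maxAllowed = Int.MaxValue - 512L  // 2147483135 bytes
assert(twoGb > maxAllowed)            // so setting '2g' fails the check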

"in bytes. This is to avoid a giant request takes too much memory. Note this " +
"configuration will affect both shuffle fetch and block manager remote block fetch. " +
"For users who enabled external shuffle service, this feature can only work when " +
"external shuffle service is newer than Spark 2.2.")
Contributor

@vanzin commented Jan 23, 2019

newer than 2.2 -> at least 2.3.0?

@SparkQA

SparkQA commented Jan 24, 2019

Test build #101614 has finished for PR 23625 at commit 161c4e6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Jan 24, 2019

Test build #101627 has finished for PR 23625 at commit 161c4e6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

</tr>
<tr>
<td><code>spark.maxRemoteBlockSizeFetchToMem</code></td>
<td>Int.MaxValue - 512</td>
Contributor

just to clarify, you intentionally moved this from shuffle section to network section since it affects both the shuffle fetch and block manager fetches?

Contributor Author

yea

.createWithDefault(Int.MaxValue - 512)
.checkValue(
_ <= Int.MaxValue - 512,
"maxRemoteBlockSizeFetchToMem must be less than (Int.MaxValue - 512) bytes.")
Member

nit. less than or equal to?
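One possible rewording that addresses the nit (a hypothetical suggestion, not necessarily the exact text that was merged):

.checkValue(
  _ <= Int.MaxValue - 512,
  "maxRemoteBlockSizeFetchToMem cannot be larger than (Int.MaxValue - 512) bytes.")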

@SparkQA

SparkQA commented Jan 28, 2019

Test build #101742 has finished for PR 23625 at commit 54c216e.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

retest this please

@wangshuo128
Contributor

"fetch-big-block-to-memory" maybe "fetch-big-block-to-disk" in title?

@cloud-fan cloud-fan changed the title [SPARK-26700][CORE] enable fetch-big-block-to-memory by default [SPARK-26700][CORE] enable fetch-big-block-to-disk by default Jan 28, 2019
@SparkQA

SparkQA commented Jan 28, 2019

Test build #101750 has finished for PR 23625 at commit 54c216e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor Author

thanks, merging to master!

@cloud-fan cloud-fan closed this in ed71a82 Jan 28, 2019
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
Closes apache#23625 from cloud-fan/minor.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>