[SPARK-3889] Attempt to avoid SIGBUS by not mmapping files in ConnectionManager #2742
…ionManager In general, individual shuffle blocks are frequently small, so mmapping them often creates a lot of waste. It may not be bad to mmap the larger ones, but it is pretty inconvenient to get configuration into ManagedBuffer, and besides it is unlikely to help all that much. Note that use of ManagedBuffer#nioByteBuffer() seems generally bad practice, and it would ideally never be used for data that may be large. Users of such data would ideally stream the data instead.
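For illustration only, here is a rough sketch of what "stream the data instead" could look like for a file segment, using plain JDK I/O. The class and method names (`SegmentStreamer`, `copySegment`) and the 8 KB chunk size are assumptions made up for this example; this is not code from the PR or from Spark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public final class SegmentStreamer {
    /**
     * Copies up to `length` bytes starting at `offset` from `file` to `out`
     * in fixed-size chunks, so the whole segment is never held in memory.
     * Returns the number of bytes actually copied.
     */
    static long copySegment(Path file, long offset, long length, OutputStream out)
            throws IOException {
        byte[] chunk = new byte[8192]; // hypothetical chunk size
        long copied = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // skip() may skip fewer bytes than requested, so loop until done.
            long toSkip = offset;
            while (toSkip > 0) {
                long skipped = in.skip(toSkip);
                if (skipped <= 0) {
                    throw new IOException("Could not skip to offset " + offset);
                }
                toSkip -= skipped;
            }
            while (copied < length) {
                int want = (int) Math.min(chunk.length, length - copied);
                int n = in.read(chunk, 0, want);
                if (n == -1) {
                    break; // segment extends past end of file
                }
                out.write(chunk, 0, n);
                copied += n;
            }
        }
        return copied;
    }
}
```

Memory use here is bounded by the chunk size rather than the block size, which is the property the description is after.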
|
@rxin thoughts on this? You probably have a better idea on whether this would be too much of a perf hit, and whether we should switch into "map" mode for larger blocks. |
|
Test FAILed. |
I think the old code had a config parameter for choosing between memory mapping and this ..
|
Added a non-configurable version of the memory map pathway, with the threshold you suggested (2MB, the size of a hugepage). Note that this fix will also be included in #2753. |
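A minimal sketch of the kind of threshold switch described above, assuming a hard-coded 2MB cutoff: segments below it are copied into a heap buffer, larger ones are memory mapped. The names `SegmentReader`, `readSegment`, and `MIN_MEMORY_MAP_BYTES` are hypothetical and chosen for this example; the actual change in the PR may be structured differently.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public final class SegmentReader {
    // Hypothetical constant mirroring the 2MB threshold mentioned in the comment.
    static final long MIN_MEMORY_MAP_BYTES = 2L * 1024 * 1024;

    /** Returns the bytes of file[offset, offset + length) as a ByteBuffer. */
    static ByteBuffer readSegment(Path file, long offset, long length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            if (length < MIN_MEMORY_MAP_BYTES) {
                // Small segment: copy into a heap buffer instead of mapping it.
                ByteBuffer buf = ByteBuffer.allocate((int) length);
                channel.position(offset);
                while (buf.hasRemaining()) {
                    if (channel.read(buf) == -1) {
                        throw new IOException("Reached EOF before filling buffer");
                    }
                }
                buf.flip();
                return buf;
            } else {
                // Large segment: memory map it. The mapping stays valid after
                // the channel is closed.
                return channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            }
        }
    }
}
```

Making the cutoff a constructor argument or config lookup instead of a constant would be the natural way to make it tunable.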
|
QA tests have started for PR 2742 at commit
|
|
QA tests have finished for PR 2742 at commit
|
|
Test PASSed. |
|
LGTM. Merged. Thanks! |
|
This needs to be configurable ... IIRC 1.1 had this customizable - see spark.storage.memoryMapThreshold |
|
@mridulm Could you give an example of which way you would want to shift it via config? Map more or less often? |
|
With 1.1, in experiments, we have done both: depending on whether our user code
|
|
Note: this is required since there are heap and VM limits enforced, so we
|