-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-39636][CORE][UI] Fix multiple bugs in JsonProtocol, impacting off heap StorageLevels and Task/Executor ResourceRequests #37027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| def taskResourceRequestFromJson(json: JValue): TaskResourceRequest = { | ||
| val rName = (json \ "Resource Name").extract[String] | ||
| val amount = (json \ "Amount").extract[Int] | ||
| val amount = (json \ "Amount").extract[Double] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amount is defined as a double at
spark/core/src/main/scala/org/apache/spark/resource/TaskResourceRequest.scala
Lines 35 to 38 in 2925896
| @Evolving | |
| @Since("3.1.0") | |
| class TaskResourceRequest(val resourceName: String, val amount: Double) | |
| extends Serializable { |
| def executorResourceRequestFromJson(json: JValue): ExecutorResourceRequest = { | ||
| val rName = (json \ "Resource Name").extract[String] | ||
| val amount = (json \ "Amount").extract[Int] | ||
| val amount = (json \ "Amount").extract[Long] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amount is defined as a long at
spark/core/src/main/scala/org/apache/spark/resource/ExecutorResourceRequest.scala
Lines 52 to 58 in 2925896
| @Evolving | |
| @Since("3.1.0") | |
| class ExecutorResourceRequest( | |
| val resourceName: String, | |
| val amount: Long, | |
| val discoveryScript: String = "", | |
| val vendor: String = "") extends Serializable { |
| def storageLevelToJson(storageLevel: StorageLevel): JValue = { | ||
| ("Use Disk" -> storageLevel.useDisk) ~ | ||
| ("Use Memory" -> storageLevel.useMemory) ~ | ||
| ("Use Off Heap" -> storageLevel.useOffHeap) ~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The
useOffHeapfield was added in Spark 1.0 in PR SPARK-1305: Support persisting RDD's directly to Tachyon #158 and was persisted to JsonProtocol as a "Use Tachyon" field (since that was the original meaning of that field). - It was subsequently renamed to "Use ExternalBlockStore" in [SPARK-6479][Block Manager]Create off-heap block storage API #5430.
- It was later removed in [SPARK-12667] Remove block manager's internal "external block store" API #10752 as part of removal of the legacy external block store API.
- In [SPARK-13992] Add support for off-heap caching #11805 for Spark 2.x I repurposed the then-orphaned
OFF_HEAPstorage level for use with off-heap caching, but overlooked the fact that that theuseOffHeapfield was no longer being recorded inJsonProtocol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice set of bugfixes @JoshRosen !
|
Thanks for the reviews. I'm going to merge this to the master branch (Spark 3.4.0). |
What changes were proposed in this pull request?
This PR fixes three longstanding bugs in Spark's
JsonProtocol:TaskResourceRequestloses precision foramount< 0.5. Theamountis a floating point number which is either between 0 and 0.5 or is a positive integer, but the JSON read path assumes it is an integer.ExecutorResourceRequestinteger overflows for values larger than Int.MaxValue because the write path writes longs but the read path assumes integers.useOffHeapfield isn't included in the JSON, so this StorageLevel cannot be round-tripped through JSON. This could cause the History Server to display inaccurate "off heap memory used" stats on the executors page.I discovered these bugs while working on #36885.
Why are the changes needed?
JsonProtocol should be able to roundtrip events through JSON without loss of information.
Does this PR introduce any user-facing change?
Yes: it fixes bugs that impact information shown in the History Server Web UI. The new StorageLevel JSON field will be visible to tools which process raw event log JSON.
How was this patch tested?
Updated existing unit tests to cover the changed logic.