[SPARK-24809] [SQL] Serializing LongToUnsafeRowMap in executor may result in data error #21772
Conversation
ok to test
@liutang123 can you explain why we are losing data when serializing to disk? Also, can you add a unit test?
Test build #93112 has finished for PR 21772 at commit
    writeLong(used)
    val cursorFlag = cursor - Platform.LONG_ARRAY_OFFSET
    writeLong(cursorFlag)
    val used = (cursorFlag / 8).toInt
Are you saying that when (cursor - Platform.LONG_ARRAY_OFFSET) / 8 is over the range of Int, we will have an overflow? But later you still do toInt and use the value?
Data is lost when serializing LongHashedRelation in the executor. Can you see this picture? In the executor, the cursor is 0.
Can you post the image in this PR? The web site you refer to contains too many ads.
@hvanhovell Thanks for reviewing. We are losing data because the variable cursor in the executor is not restored when deserializing, so write writes out nothing from page.
Let me clarify it. So this means that when the map is deserialized on an executor, cursor is not restored; if the map then needs to be serialized again (e.g., spilled to disk), write uses cursor and writes out nothing from page. Is this what you mean?
@viirya Yes, absolutely right. :)
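To make the exchange above concrete, here is a minimal, self-contained sketch of the failure mode. It is an illustration only, not the Spark implementation: the class and field names mirror `LongToUnsafeRowMap`'s `page`/`cursor` bookkeeping, but everything below is hypothetical.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object CursorBugSketch {
  // Stand-in for Platform.LONG_ARRAY_OFFSET; the exact value does not matter here.
  val LONG_ARRAY_OFFSET = 16L

  class TinyMap {
    var page: Array[Long] = Array(11L, 22L, 33L)
    // append() advances cursor; write() trusts it to know how much of page is used.
    var cursor: Long = LONG_ARRAY_OFFSET + page.length * 8

    def write(out: DataOutputStream): Unit = {
      val usedWords = ((cursor - LONG_ARRAY_OFFSET) / 8).toInt
      out.writeInt(usedWords)
      page.take(usedWords).foreach(w => out.writeLong(w))
    }

    def read(in: DataInputStream, restoreCursor: Boolean): Unit = {
      val usedWords = in.readInt()
      page = Array.fill(usedWords)(in.readLong())
      if (restoreCursor) {
        cursor = usedWords * 8 + LONG_ARRAY_OFFSET // the fix in this PR
      } else {
        cursor = LONG_ARRAY_OFFSET // pre-fix behavior: page looks empty to write()
      }
    }
  }

  def roundTrip(m: TinyMap, restoreCursor: Boolean): TinyMap = {
    val bytes = new ByteArrayOutputStream()
    m.write(new DataOutputStream(bytes))
    val copy = new TinyMap
    copy.read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray)), restoreCursor)
    copy
  }

  def main(args: Array[String]): Unit = {
    // Driver -> executor, then executor -> disk, without restoring cursor: data is lost.
    val broken = roundTrip(roundTrip(new TinyMap, restoreCursor = false), restoreCursor = false)
    println(broken.page.toSeq) // empty

    // Same two round trips with the cursor restored on read: data survives.
    val fixed = roundTrip(roundTrip(new TinyMap, restoreCursor = true), restoreCursor = true)
    println(fixed.page.toSeq) // 11, 22, 33
  }
}
```

Running `main` prints an empty sequence for the twice-serialized map when the cursor is not restored, and the original three values when it is.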
Test build #93267 has finished for PR 21772 at commit
Jenkins, test this please
Test build #93314 has finished for PR 21772 at commit
@viirya Hi, could you take another look at this PR when you have time?
    val usedWordsNumber = ((cursor - Platform.LONG_ARRAY_OFFSET) / 8).toInt
    writeLong(usedWordsNumber)
    writeLongArray(writeBuffer, page, usedWordsNumber)
If there is no good reason, shall we revert this change? It looks like you only renamed the variable.
    val usedWordsNumber = readLong().toInt
    // Set cursor because cursor is used in write function.
    cursor = usedWordsNumber * 8 + Platform.LONG_ARRAY_OFFSET
    page = readLongArray(readBuffer, usedWordsNumber)
Ditto. Can you just update cursor and revert the other unrelated changes?
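In other words, the only state the read path needs to put back is the cursor, and it is fully determined by the number of 8-byte words just read. A hedged one-method sketch of that invariant (the object and method below are hypothetical; `Platform.LONG_ARRAY_OFFSET` is the real constant the map uses):

```scala
import org.apache.spark.unsafe.Platform

object CursorInvariant {
  // Given how many 8-byte words of `page` were deserialized, this is the cursor
  // value write() expects, since write() emits (cursor - LONG_ARRAY_OFFSET) / 8 words.
  def restoredCursor(usedWords: Int): Long =
    usedWords.toLong * 8 + Platform.LONG_ARRAY_OFFSET
}
```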
    map.free()
  }
    test("SPARK-24809: Serializing LongHashedRelation in executor may result in data error")
Is it possible to have an end-to-end test for this?
I think this UT can cover the case I met.
An end-to-end test is too hard to construct because this case only occurs when the executor's memory is not enough to hold the block and the broadcast cache is removed by the garbage collector.
@liutang123 Thanks for this work. I'm curious whether this is an actual problem you hit in a real application, or whether you just think it is potentially problematic?
@viirya This case occurred in our cluster, and we spent a lot of time finding this bug.
    array = readLongArray(readBuffer, length)
    val pageLength = readLong().toInt
    page = readLongArray(readBuffer, pageLength)
    // Set cursor because cursor is used in write function.
Maybe: "Restore cursor variable to make this map able to be serialized again on executors"?
As you actually modify
    val value1 = new Random().nextLong()

    val key2 = 2L
    val value2 = new Random().nextLong()
Is it necessary to use Random here? Can we use two arbitrary long values?
    val resultRow = new UnsafeRow(1)
    assert(originalMap.getValue(key1, resultRow).getLong(0) === value1)
    assert(originalMap.getValue(key2, resultRow).getLong(0) === value2)
We don't need to test LongToUnsafeRowMap's normal features here. We just need to verify that the map still works after two rounds of ser/de.
    val ser = new KryoSerializer(
      (new SparkConf).set("spark.kryo.referenceTracking", "false")).newInstance()

    val mapSerializedInDriver = ser.deserialize[LongToUnsafeRowMap](ser.serialize(originalMap))
nit:
// Simulate serialize/deserialize twice on driver and executor
val firstTimeSerialized = ...
val secondTimeSerialized = ...
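Pulling the review comments together (arbitrary constants instead of `Random`, the serializer from the environment, the suggested `firstTimeSerialized`/`secondTimeSerialized` names, and asserting only after the second round trip), here is a hedged sketch of what the test body could look like. It assumes the scaffolding already present in `HashedRelationSuite`: the task memory manager `mm`, a one-`LongType`-column projection `unsafeProj`, and `sparkContext` from the shared session.

```scala
test("SPARK-24809: Serializing LongToUnsafeRowMap in executor may result in data error") {
  val originalMap = new LongToUnsafeRowMap(mm, 1)

  // Two arbitrary long constants; Random is not needed for this check.
  val (key1, value1) = (1L, 42L)
  val (key2, value2) = (2L, 43L)

  originalMap.append(key1, unsafeProj(InternalRow(value1)))
  originalMap.append(key2, unsafeProj(InternalRow(value2)))
  originalMap.optimize()

  val ser = sparkContext.env.serializer.newInstance()

  // Simulate serialize/deserialize twice on driver and executor
  val firstTimeSerialized = ser.deserialize[LongToUnsafeRowMap](ser.serialize(originalMap))
  val secondTimeSerialized =
    ser.deserialize[LongToUnsafeRowMap](ser.serialize(firstTimeSerialized))

  // Without the cursor restoration the second round trip loses the page contents.
  val resultRow = new UnsafeRow(1)
  assert(secondTimeSerialized.getValue(key1, resultRow).getLong(0) === value1)
  assert(secondTimeSerialized.getValue(key2, resultRow).getLong(0) === value2)

  originalMap.free()
  firstTimeSerialized.free()
  secondTimeSerialized.free()
}
```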
cc @cloud-fan
Test build #93473 has finished for PR 21772 at commit
Test build #93480 has finished for PR 21772 at commit
Jenkins, test this please
retest this please
Test build #93516 has finished for PR 21772 at commit
    originalMap.append(key2, unsafeProj(InternalRow(value2)))
    originalMap.optimize()

    val ser = new KryoSerializer(
We can write `sparkContext.env.serializer.newInstance()` here instead of constructing a KryoSerializer by hand.
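For context, a short before/after fragment of that suggestion (inside the test body, reusing only names already shown above; illustrative, not a definitive change):

```scala
// Before: the test builds its own Kryo serializer instance.
val ser = new KryoSerializer(
  (new SparkConf).set("spark.kryo.referenceTracking", "false")).newInstance()

// Suggested: reuse whatever serializer the running SparkEnv is configured with.
val ser2 = sparkContext.env.serializer.newInstance()
```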
Good catch! LGTM
LGTM too.
Test build #93749 has finished for PR 21772 at commit
gatorsmile left a comment
Thanks! Merged to master/2.3/2.2/2.1
When the join key is a long or an int in a broadcast join, Spark uses `LongToUnsafeRowMap` to store the key-value pairs of the table that will be broadcasted. But when `LongToUnsafeRowMap` is broadcasted to executors and is too big to hold in memory, it is stored on disk. At that point, because `write` uses the variable `cursor` to determine how many bytes of `page` in `LongToUnsafeRowMap` will be written out, and `cursor` is not restored when deserializing, the executor writes out nothing from `page` to disk.

What changes were proposed in this pull request?

Restore the cursor value when deserializing.