[SPARK-5860][CORE] JdbcRDD: overflow on large range with high number of partitions #4701
Conversation
Can one of the admins verify this patch?
maybe this would be more robust if we just increment by a fixed width every iteration?
BigDecimal is overkill, and it feels funny to be introducing floating point. i and length are Int already, so using Longs here is sufficient. I really think this is solved by simply changing i to i.toLong and removing the superfluous .toLong at the end.
@srowen No, that won't work. The problem is that length is a Long, I think, since it's the difference of two Longs, but (i * length) can still overflow a Long. In my test case, for example:
567279357766147899L * 20 = -7101156918386593636
So we need a type that can represent larger numbers.
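A quick way to see this in a REPL (a minimal sketch using the value from the test case above):
val length = 567279357766147899L
println(length * 20)         // -7101156918386593636: the product wraps past Long.MaxValue
println(BigInt(length) * 20) // 11345587155322957980: BigInt keeps the exact value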
@rxin I actually favor the current partition algorithm; it's pretty neat in a way. For example:
lowerBound = 1
upperBound = 100
numPartitions = 8
With fixed-width increments you get
[1,13],[14,26],[27,39],[40,52],[53,65],[66,78],[79,91],[92,100]
in which you always end up with one small partition at the end.
With the current algorithm you get
[1,12],[13,25],[26,37],[38,50],[51,62],[63,75],[76,87],[88,100]
which gives more evenly distributed partitions.
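To make the comparison concrete, here is a small sketch (the variable names are just for this illustration) that reproduces the second list with the proportional scheme:
val lowerBound = 1L
val upperBound = 100L
val numPartitions = 8
val length = BigInt(1) + upperBound - lowerBound
// partition i covers [lowerBound + i*length/n, lowerBound + (i+1)*length/n - 1]
val parts = (0 until numPartitions).map { i =>
  val start = lowerBound + (i * length / numPartitions).toLong
  val end = lowerBound + ((i + 1) * length / numPartitions).toLong - 1
  (start, end)
}
println(parts.mkString(","))
// (1,12),(13,25),(26,37),(38,50),(51,62),(63,75),(76,87),(88,100)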
Ah right, I see: since the bounds are used for ID-like values, it's not unrealistic to think that length is near 2^63, and it's a Long. Hm, does this overflow anyway in the final partition? If upperBound is near 2^63, then this can compute an end that doesn't fit back into a Long even if the intermediate result does. PS: why not BigInteger?
While we're fixing it, though, I tend to think @rxin is right and that this can just map to fixed intervals rather than perform this computation.
BTW, since the range is inclusive, I feel like the last partition should subtract 1 from its end? And this should never be larger than upperBound? There might be a few reasons to revise this logic.
It does, although the difference is +/-1 for all partitions but the last. Yeah, I get your point about the final partition, but isn't it possible that its end is >= upperBound? I haven't thought it through, I admit. A version with BigInteger here is definitely an improvement, so I'd support that. I suppose I'm still concerned this fails in exactly the corner case you're trying to fix: very large length.
@srowen Sorry, I don't understand how the last partition can get >= upperBound.
With this code:
val length = 1 + upperBound - lowerBound
(0 until numPartitions).map(i => {
  val start = lowerBound + ((BigInt(i) * length) / numPartitions).toLong
  val end = lowerBound + ((BigInt(i + 1) * length) / numPartitions).toLong - 1
  new JdbcPartition(i, start, end)
})
on the last iteration, i = numPartitions - 1, so:
end = lowerBound + ((numPartitions - 1 + 1) * (1 + upperBound - lowerBound)) / numPartitions - 1
    = lowerBound + (numPartitions / numPartitions) * (1 + upperBound - lowerBound) - 1
    = lowerBound + 1 + upperBound - lowerBound - 1
    = upperBound
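And a quick sanity check of that algebra at an extreme range (a sketch; the bounds here are just an example, with length computed as a BigInt to be safe):
val lowerBound = 1L
val upperBound = Long.MaxValue
val numPartitions = 20
val length = BigInt(1) + upperBound - lowerBound
val lastEnd = lowerBound + (BigInt(numPartitions) * length / numPartitions) - 1
assert(lastEnd == BigInt(upperBound)) // the final partition ends exactly at upperBound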
Yep that's sound, good explanation. This is also why it can't overflow when toLong is called. I think this is the right fix with BigInt.
I think there is a problem here where
length = 1 + upperBound - lowerBound
can overflow as well, if lowerBound = 0 and upperBound = Long.MAX_VALUE. I should fix this; it actually makes the change smaller.
Done
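For reference, a sketch of what getPartitions might look like after this change, with length promoted to BigInt up front so that neither the length itself nor the products can overflow (based on the snippet earlier in the thread; not necessarily verbatim from the final commit):
override def getPartitions: Array[Partition] = {
  // bounds are inclusive, hence the + 1 here and the - 1 on end
  val length = BigInt(1) + upperBound - lowerBound
  (0 until numPartitions).map { i =>
    // intermediate math is BigInt; the quotient always fits back into a Long
    val start = lowerBound + ((i * length) / numPartitions)
    val end = lowerBound + (((i + 1) * length) / numPartitions) - 1
    new JdbcPartition(i, start.toLong, end.toLong)
  }.toArray
}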
Were these changes accidental?
I just want to add a new test for this bug; is this not the right place to put it?
Yes but here it looks like you just indented existing code or something. The new test below looks right.
Oh, that's because I need to create a new table, and I put that and the old table creation in one single
try {
  ...
} finally {
  conn.close()
}
LGTM except can't we keep this catch block in common in the outer try block? I realize it would mean changing some of the variable names in the second block.
There was a problem when I tried to do that. The original author uses the inner catch block to avoid re-creating the table:
catch {
  case e: SQLException if e.getSQLState == "X0Y32" =>
    // table exists
}
which means the catch has to exist for each table being created. I was simply following that pattern.
An alternative would be to drop and re-create the table each time, which produces cleaner code but may slow down the test suite a little.
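A minimal sketch of that alternative (the helper name and DDL here are hypothetical, not from the actual suite):
import java.sql.{Connection, SQLException}

// Drop the table if it exists, then create it fresh, so each run starts clean.
def recreateTable(conn: Connection, name: String, ddl: String): Unit = {
  try {
    conn.prepareStatement(s"DROP TABLE $name").executeUpdate()
  } catch {
    case _: SQLException => // table did not exist yet; nothing to drop
  }
  conn.prepareStatement(ddl).executeUpdate()
}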
Oh right I get it, never mind. LGTM. I can merge in a bit if there are no further comments.
Fix an overflow bug in JdbcRDD when calculating partitions for large BIGINT IDs