[SPARK-6731] Bump version of apache commons-math3 #5380
Version 3.1.1 is two years old and the newer version includes approximate percentile statistics (among other things).
|
@punya, can you create a JIRA ticket for this and put its ID in the PR subject, like "[SPARK-12345] Bump version of apache commons-math3"? |
|
Test build #29765 has finished for PR 5380 at commit
|
|
It's not entirely that simple, since this is meant to match the version used in other dependencies like Hadoop. The Commons Math3 dependencies aren't 100% compatible across minor releases. I generally favor this update, but you would first need to run tests with a number of different Hadoop profiles, for example, to verify it still works. Give that a go first? |
|
Is there an automated way to do that? |
|
@srowen based on http://central.maven.org/maven2/org/apache/hadoop/hadoop-core/0.20.2/hadoop-core-0.20.2.pom it looks like hadoop (at least this version) doesn't depend on |
|
I looked at https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-project/2.6.0/hadoop-project-2.6.0.pom and did find that they depend on commons-math3 version 3.1.1. Do we aim for strict compatibility with the versions in the Hadoop parent POM, or do we only consider a subset of Hadoop projects that might be on the classpath with Spark? |
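The POM check described above can be scripted rather than done by eye. The following is a minimal sketch, not part of this PR: the function name `declared_version` and the trimmed `SAMPLE_POM` are hypothetical, and the sample only mirrors the 3.1.1 entry reported for the Hadoop 2.6.0 parent POM. It does not resolve Maven properties or parent inheritance, so a version written as a `${...}` placeholder would be returned verbatim.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical, heavily trimmed POM used only for illustration.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-math3</artifactId>
        <version>3.1.1</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
"""


def declared_version(pom_xml, group_id, artifact_id):
    """Return the version string a POM declares for an artifact, or None.

    Scans every <dependency> element, including those nested under
    <dependencyManagement>. Property placeholders are not expanded.
    """
    root = ET.fromstring(pom_xml)
    for dep in root.iter("{http://maven.apache.org/POM/4.0.0}dependency"):
        same_group = dep.findtext("m:groupId", namespaces=POM_NS) == group_id
        same_artifact = dep.findtext("m:artifactId", namespaces=POM_NS) == artifact_id
        if same_group and same_artifact:
            return dep.findtext("m:version", namespaces=POM_NS)
    return None


if __name__ == "__main__":
    print(declared_version(SAMPLE_POM, "org.apache.commons", "commons-math3"))
```

Running the same function over the POMs of each Hadoop release of interest would show which ones pin commons-math3 and at what version.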
|
My guess is that this is probably OK -- I have actually been using commons math 3.4.1 in a project on Hadoop 2.6 without issues, though I recall a subtle and nasty incompatibility problem from 3.2 to 3.3 (that is very unlikely to be relevant here or manifest here). Clearly the intent has been backwards compatibility: http://commons.apache.org/proper/commons-math/changes-report.html Although it's always safe to try to harmonize dependencies, it's not 100% possible. If there's a compelling reason to update, and we see that tests pass in a few permutations of Hadoop builds, then I'd support it. For example can you try |
|
Thanks for the feedback, I'll let you know how the tests go. |
|
I've tried Hadoop 1.2.1, 2.2.0 and 2.6.0 and tests passed. I am fairly sure this is an OK update. Any other experience on your end? |
|
Unfortunately, I got random flaky tests when I ran it with 2.6.0, and I didn't get a chance to determine whether the flakes were associated with the version update. My hunch is that they weren't, because they had nothing to do with commons-math3 and seemed related to timeouts. Based on your experience, I'd favor merging the PR. |
|
I think we'll want to upgrade this at some point since there are a number of bug fixes and small new features we want to use. I think I've convinced myself that this is empirically compatible, intended to be compatible, and the particular issue I got hit by is not relevant here. |
|
@mengxr Hm! That's good to know. I agree; I doubt users are affected by this -- Spark isn't -- and users should probably be packaging their own copy of |