[SPARK-19458][BUILD]load hive jars from local repo which has downloaded #16803
Conversation
Test build #72370 has finished for PR 16803 at commit

retest this please

Test build #72379 has finished for PR 16803 at commit

Test build #72380 has finished for PR 16803 at commit

Test build #72381 has finished for PR 16803 at commit

Test build #72382 has finished for PR 16803 at commit

retest this please

Test build #72383 has finished for PR 16803 at commit

Test build #72384 has finished for PR 16803 at commit

Test build #72403 has finished for PR 16803 at commit
This should be tagged as
OptionAssigner(args.ivyRepoPath, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
  sysProp = "spark.jars.ivy"),
OptionAssigner(args.repositories, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
  sysProp = "spark.jars.repositories"),
this is a new option?
Yes, it is used to store the user's repos; then we can use it when downloading the Hive jars.
We need to document it in http://spark.apache.org/docs/latest/configuration.html, like what we did for spark.jars.ivy
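For reference, a minimal sketch of setting both properties directly on a SparkConf, assuming the option names shown in the diff above; the repository path and ivy directory are placeholders, not values from the PR:

```scala
import org.apache.spark.SparkConf

// Sketch only: the OptionAssigner above maps the repositories argument to the
// "spark.jars.repositories" system property; setting it by hand looks like this.
val conf = new SparkConf()
  .set("spark.jars.ivy", "/tmp/.ivy2")                                 // existing option
  .set("spark.jars.repositories", "file:///home/user/.m2/repository")  // option proposed in this PR
```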
def isSameFile(left: String, right: String): Boolean = {
  val leftInput: FileInputStream = new FileInputStream(left)
  val leftMd5 = UTF8String.fromString(org.apache.commons.codec
why convert it to UTF8String and compare?
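As a point of comparison, a minimal sketch of the same check without the UTF8String round-trip, assuming the signature from the quoted diff; the body below is illustrative, not the PR's code:

```scala
import java.io.FileInputStream
import org.apache.commons.codec.digest.DigestUtils

// Illustrative only: DigestUtils.md5Hex already returns a plain String,
// so the two digests can be compared directly with ==.
def isSameFile(left: String, right: String): Boolean = {
  val leftInput = new FileInputStream(left)
  val rightInput = new FileInputStream(right)
  try {
    DigestUtils.md5Hex(leftInput) == DigestUtils.md5Hex(rightInput)
  } finally {
    leftInput.close()
    rightInput.close()
  }
}
```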
Test build #72425 has started for PR 16803 at commit
retest this please

Test build #72428 has finished for PR 16803 at commit
Please rename it to

Adding a new option
Yes, but in this PR, which modifies IsolatedClientLoader.forVersion, it only affects the Hive jars; maybe later we can use it elsewhere to download other jars.
Test build #72457 has finished for PR 16803 at commit
@cloud-fan @gatorsmile @yhuai I would appreciate help continuing the review of this. Thanks a lot~
I'm not an expert in this area, but do we have to introduce a new config? Can we load the Hive jars from the local Maven repo?
If we do not set ivy.jars.repos, it will use the default ${user.home}/.m2 repo, and if we set ivy.jars.path to a path where the jars have already been downloaded, it can also load from that path.
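To make that lookup order concrete, a rough sketch of checking the default local repo before falling back to a remote download; the helper name and repository layout below are hypothetical, not part of the PR:

```scala
import java.io.File
import java.nio.file.Paths

// Hypothetical helper: look for the artifact under ${user.home}/.m2/repository first;
// a None result means the caller should fall back to resolving from a remote repo.
def resolveFromLocalM2(groupId: String, artifactId: String, version: String): Option[File] = {
  val localRepo = Paths.get(System.getProperty("user.home"), ".m2", "repository")
  val jar = localRepo
    .resolve(groupId.replace('.', '/'))
    .resolve(artifactId)
    .resolve(version)
    .resolve(s"$artifactId-$version.jar")
    .toFile
  if (jar.isFile) Some(jar) else None
}
```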
Test build #72629 has started for PR 16803 at commit
@dongjoon-hyun @srowen could you help review this? Thanks very much!
retest this please

Test build #72678 has finished for PR 16803 at commit
retest this please.
ping @cloud-fan @gatorsmile @dongjoon-hyun Any thoughts on this?
Test build #78257 has finished for PR 16803 at commit
Sorry, I'm not familiar with the build stuff; cc @srowen
cc @jerryshao
@windpiger can you please rebase the code? It seems too old to review.
Should we close this PR since it has gone stale? @cloud-fan @jerryshao
I'm going to close this PR because it has gone stale; please feel free to reopen it or open another PR if anyone has more thoughts on this issue.
What changes were proposed in this pull request?
Currently, when we create a HiveClient for a specific metastore version and spark.sql.hive.metastore.jars is set to maven, Spark downloads the Hive jars from a remote repo (http://www.datanucleus.org/downloads/maven2). We should allow the user to load the Hive jars from user-defined repos (e.g. a local repo to which they have already been downloaded), and user-defined repos should take priority over the default repos (e.g. ${user.home}/.m2/repository).

A similar approach is used in SparkSubmit's processing of --packages: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L289-L298
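For illustration, a sketch of the configuration this description is about, assuming the option names used in the PR; the metastore version and repository path are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: with spark.sql.hive.metastore.jars set to "maven", the Hive client
// jars are resolved from a repository; the proposed spark.jars.repositories option
// would let the user point that resolution at a repo that already holds the jars.
val spark = SparkSession.builder()
  .enableHiveSupport()
  .config("spark.sql.hive.metastore.version", "1.2.1")                    // placeholder
  .config("spark.sql.hive.metastore.jars", "maven")
  .config("spark.jars.repositories", "file:///home/user/.m2/repository")  // placeholder
  .getOrCreate()
```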
How was this patch tested?
A unit test was added.