[SPARK-19458][BUILD]load hive jars from local repo which has downloaded #16803
Conversation
Test build #72370 has finished for PR 16803 at commit

retest this please

Test build #72379 has finished for PR 16803 at commit

Test build #72380 has finished for PR 16803 at commit

Test build #72381 has finished for PR 16803 at commit

Test build #72382 has finished for PR 16803 at commit

retest this please

Test build #72383 has finished for PR 16803 at commit

Test build #72384 has finished for PR 16803 at commit

Test build #72403 has finished for PR 16803 at commit
This should be tagged as
OptionAssigner(args.ivyRepoPath, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
  sysProp = "spark.jars.ivy"),
OptionAssigner(args.repositories, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
  sysProp = "spark.jars.repositories"),
this is a new option?
Yes, it is used to store the user's repos; then we can use it when downloading the Hive jars.
We need to document it in http://spark.apache.org/docs/latest/configuration.html, like what we did for spark.jars.ivy
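For reference, a minimal sketch of setting both properties directly on a SparkConf, assuming the option names shown in the diff above; the repository path and ivy directory are placeholders, not values from the PR:

```scala
import org.apache.spark.SparkConf

// Sketch only: the OptionAssigner above maps the repositories argument to the
// "spark.jars.repositories" system property; setting it by hand looks like this.
val conf = new SparkConf()
  .set("spark.jars.ivy", "/tmp/.ivy2")                                 // existing option
  .set("spark.jars.repositories", "file:///home/user/.m2/repository")  // option proposed in this PR
```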
def isSameFile(left: String, right: String): Boolean = {
  val leftInput: FileInputStream = new FileInputStream(left)
  val leftMd5 = UTF8String.fromString(org.apache.commons.codec
why convert it to UTF8String and compare?
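As a point of comparison, a minimal sketch of the same check without the UTF8String round-trip, assuming the signature from the quoted diff; the body below is illustrative, not the PR's code:

```scala
import java.io.FileInputStream
import org.apache.commons.codec.digest.DigestUtils

// Illustrative only: DigestUtils.md5Hex already returns a plain String,
// so the two digests can be compared directly with ==.
def isSameFile(left: String, right: String): Boolean = {
  val leftInput = new FileInputStream(left)
  val rightInput = new FileInputStream(right)
  try {
    DigestUtils.md5Hex(leftInput) == DigestUtils.md5Hex(rightInput)
  } finally {
    leftInput.close()
    rightInput.close()
  }
}
```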
Test build #72425 has started for PR 16803 at commit
retest this please

Test build #72428 has finished for PR 16803 at commit
Please rename it to

Adding a new option
Yes, but in this PR, which modifies IsolatedClientLoader.forVersion, it only affects the Hive jars; maybe later we can use it elsewhere to download other jars.
Test build #72457 has finished for PR 16803 at commit
@cloud-fan @gatorsmile @yhuai I would appreciate help continuing the review of this. Thanks a lot~
I'm not an expert in this area, but do we have to introduce a new config? Can we load the Hive jars from the local Maven repo?
If we do not set ivy.jars.repos, it will use the default ${user.home}/.m2 repo, and if we set ivy.jars.path to a path where the jars have already been downloaded, it can also load from that path.
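To make that lookup order concrete, a rough sketch of checking the default local repo before falling back to a remote download; the helper name and repository layout below are hypothetical, not part of the PR:

```scala
import java.io.File
import java.nio.file.Paths

// Hypothetical helper: look for the artifact under ${user.home}/.m2/repository first;
// a None result means the caller should fall back to resolving from a remote repo.
def resolveFromLocalM2(groupId: String, artifactId: String, version: String): Option[File] = {
  val localRepo = Paths.get(System.getProperty("user.home"), ".m2", "repository")
  val jar = localRepo
    .resolve(groupId.replace('.', '/'))
    .resolve(artifactId)
    .resolve(version)
    .resolve(s"$artifactId-$version.jar")
    .toFile
  if (jar.isFile) Some(jar) else None
}
```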
Test build #72629 has started for PR 16803 at commit
@dongjoon-hyun @srowen could you help review this? Thanks very much!
retest this please

Test build #72678 has finished for PR 16803 at commit
retest this please.
ping @cloud-fan @gatorsmile @dongjoon-hyun Any thoughts on this?
Test build #78257 has finished for PR 16803 at commit
Sorry, I'm not familiar with the build stuff; cc @srowen
cc @jerryshao
@windpiger can you please rebase the code? It seems too old to review.
Should we close this PR since it has gone stale? @cloud-fan @jerryshao
I'm going to close this PR because it has gone stale; please feel free to reopen it or open another PR if anyone has more thoughts on this issue.
What changes were proposed in this pull request?
Currently, when we create a HiveClient for a specific metastore version and spark.sql.hive.metastore.jars is set to maven, Spark downloads the Hive jars from a remote repo (http://www.datanucleus.org/downloads/maven2). We should allow the user to load the Hive jars from user-defined repos (e.g. a local repo to which they have already been downloaded), and user-defined repos should take priority over the default repos (e.g. ${user.home}/.m2/repository).

A similar approach is used in SparkSubmit's processing of --packages: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L289-L298
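For illustration, a sketch of the configuration this description is about, assuming the option names used in the PR; the metastore version and repository path are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: with spark.sql.hive.metastore.jars set to "maven", the Hive client
// jars are resolved from a repository; the proposed spark.jars.repositories option
// would let the user point that resolution at a repo that already holds the jars.
val spark = SparkSession.builder()
  .enableHiveSupport()
  .config("spark.sql.hive.metastore.version", "1.2.1")                    // placeholder
  .config("spark.sql.hive.metastore.jars", "maven")
  .config("spark.jars.repositories", "file:///home/user/.m2/repository")  // placeholder
  .getOrCreate()
```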
How was this patch tested?
A unit test was added.