Skip to content

Conversation

@windpiger
Copy link
Contributor

@windpiger windpiger commented Feb 4, 2017

What changes were proposed in this pull request?

Currently when we new a HiveClient for a specific metastore version and spark.sql.hive.metastore.jars is setted to maven, Spark will download the hive jars from remote repo(http://www.datanucleus.org/downloads/maven2).

we should allow the user to load hive jars from the user defined repos(e.g.local repo which has already downloaded), and user defined repos should take priority order than default repos(e.g. ${user.home}/.m2/repository)

the similar way which is to be referenced from SparkSubmit processing --packages
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L289-L298

How was this patch tested?

unit test added

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72370 has finished for PR 16803 at commit 5c8727d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@windpiger
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72379 has finished for PR 16803 at commit 0c76584.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72380 has finished for PR 16803 at commit 6e66793.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72381 has finished for PR 16803 at commit c790a4b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72382 has finished for PR 16803 at commit 5ab3cdc.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@windpiger
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72383 has finished for PR 16803 at commit 5570e74.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 4, 2017

Test build #72384 has finished for PR 16803 at commit 630b019.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 5, 2017

Test build #72403 has finished for PR 16803 at commit d61c077.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@windpiger windpiger changed the title [WIP][SPARK-19458][SQL]load hive jars from local repo which has downloaded [SPARK-19458][SQL]load hive jars from local repo which has downloaded Feb 5, 2017
@windpiger
Copy link
Contributor Author

cc @cloud-fan @gatorsmile @yhuai

@cloud-fan
Copy link
Contributor

This should be taged as [BUILD]

OptionAssigner(args.ivyRepoPath, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
sysProp = "spark.jars.ivy"),
OptionAssigner(args.repositories, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
sysProp = "spark.jars.repositories"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a new option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is used to store user's repos , then we can use it in download hive jars.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to document it in http://spark.apache.org/docs/latest/configuration.html, like what we did for spark.jars.ivy


def isSameFile(left: String, right: String): Boolean = {
val leftInput: FileInputStream = new FileInputStream(left)
val leftMd5 = UTF8String.fromString(org.apache.commons.codec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why convert it to UTF8String and compare?

@SparkQA
Copy link

SparkQA commented Feb 6, 2017

Test build #72425 has started for PR 16803 at commit 1bb31e5.

@windpiger
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 6, 2017

Test build #72428 has finished for PR 16803 at commit 1bb31e5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

gatorsmile commented Feb 6, 2017

Please rename it to [SPARK-19458][BUILD][SQL]load hive jars from local repo which has downloaded

@gatorsmile
Copy link
Member

Adding a new option spark.jars.repositories afffects more than loading hive jars, right?

@windpiger windpiger changed the title [SPARK-19458][SQL]load hive jars from local repo which has downloaded [SPARK-19458][BUILD][SQL]load hive jars from local repo which has downloaded Feb 6, 2017
@windpiger
Copy link
Contributor Author

yes, but in this pr which modify the IsolatedClientLoader.forVersion it only affects Hive jars, maybe later we can use it in another place to download other jars

@SparkQA
Copy link

SparkQA commented Feb 6, 2017

Test build #72457 has finished for PR 16803 at commit 9330e35.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@windpiger windpiger changed the title [SPARK-19458][BUILD][SQL]load hive jars from local repo which has downloaded [SPARK-19458][BUILD]load hive jars from local repo which has downloaded Feb 8, 2017
@windpiger
Copy link
Contributor Author

@cloud-fan @gatorsmile @yhuai I will appreciate that help to continue review this. Thanks a lot~

@cloud-fan
Copy link
Contributor

I'm not an expert about this area, but do we have to introduce a new config? can we load hive jar from local maven repo?

@windpiger
Copy link
Contributor Author

if we not set ivy.jars.repos , it will use default ${user.home}/.m2 repo, and if we set ivy.jars.path which has download, it will can alos load from this path.

@SparkQA
Copy link

SparkQA commented Feb 9, 2017

Test build #72629 has started for PR 16803 at commit 51b8f5e.

@windpiger
Copy link
Contributor Author

@dongjoon-hyun @srowen could you help to review this? thanks very much!

@windpiger
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 10, 2017

Test build #72678 has finished for PR 16803 at commit 51b8f5e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor

retest this please.

@jiangxb1987
Copy link
Contributor

ping @cloud-fan @gatorsmile @dongjoon-hyun Any thoughts on this?

@SparkQA
Copy link

SparkQA commented Jun 19, 2017

Test build #78257 has finished for PR 16803 at commit 51b8f5e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

Sorry I'm not familiar with the build stuff, cc @srowen

@jiangxb1987
Copy link
Contributor

cc @jerryshao

@jerryshao
Copy link
Contributor

@windpiger can you please rebase the code, it seems too old to review.

@jiangxb1987
Copy link
Contributor

Should we close this PR since it goes stale? @cloud-fan @jerryshao

@jiangxb1987
Copy link
Contributor

I'm going to close this PR because it goes stale, please feel free to reopen it or open another PR if anyone have more thoughts on this issue.

@asfgit asfgit closed this in ed1478c Nov 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants