@tedyu
Contributor

@tedyu tedyu commented Feb 3, 2016

See http://search-hadoop.com/m/q3RTtFoTDi2HVCrM1 for related stack trace

java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)

@SparkQA

SparkQA commented Feb 4, 2016

Test build #50703 has finished for PR 11066 at commit 6ee7d98.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu tedyu changed the title from "Protect against SessionState being null when accessing HiveClientImpl#conf" to "[SPARK-13180] Protect against SessionState being null when accessing HiveClientImpl#conf" Feb 4, 2016

  /** Returns the configuration for the current session. */
- def conf: HiveConf = SessionState.get().getConf
+ def conf: HiveConf = {
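
For readers following the diff, here is a minimal sketch of the kind of null guard the PR title describes. It is not the patch itself: `StubConf`, `StubSession`, `StubSessionState`, and `GuardedClient` are hypothetical stand-ins so the snippet runs without Hive on the classpath, and it assumes that `SessionState.get()` reads a per-thread slot that stays empty until a session is started on that thread.

```scala
// Hypothetical stand-ins for HiveConf and SessionState; not Hive's real classes.
final class StubConf(val settings: Map[String, String])
final class StubSession(val conf: StubConf)

object StubSessionState {
  // Assumption: the current session lives in a per-thread slot.
  private val current = new ThreadLocal[StubSession]

  /** Returns the session attached to the calling thread, or null if none was started. */
  def get(): StubSession = current.get()

  /** Attaches a session to the calling thread. */
  def start(session: StubSession): Unit = current.set(session)
}

object GuardedClient {
  // Unguarded form: throws NullPointerException when get() returns null.
  def unsafeConf: StubConf = StubSessionState.get().conf

  // Guarded form: start a fresh session on this thread when none is attached.
  def conf: StubConf = {
    val session = Option(StubSessionState.get()).getOrElse {
      val fresh = new StubSession(new StubConf(Map("source" -> "fallback")))
      StubSessionState.start(fresh)
      fresh
    }
    session.conf
  }
}
```

Wrapping the lookup in `Option(...)` turns the null into `None`, so the first call on a thread with no session starts one instead of throwing. Whether that is the right fix or only hides a missing initialization is the question raised later in this thread.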

@tedyu
Contributor Author

tedyu commented Feb 4, 2016

The method in HiveQl is marked private[this].
HiveClientImpl doesn't reference HiveQl.

HiveShim.scala and SparkSQLCLIDriver.scala refer to SessionState, but neither seems to be a suitable place for the common code.

Do you have a suggestion on which class to put the common code in?

Thanks

@hvanhovell
Contributor

@tedyu let's create a HiveConfUtil object in the client package, put the method in it, and use that in both places.

@SparkQA

SparkQA commented Feb 4, 2016

Test build #50749 has finished for PR 11066 at commit aff16d6.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 4, 2016

Test build #50750 has finished for PR 11066 at commit 438e2a3.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 4, 2016

Test build #50751 has finished for PR 11066 at commit 0a43c19.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 4, 2016

Test build #50754 has finished for PR 11066 at commit f20a61f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Contributor Author

tedyu commented Feb 4, 2016

This error surfaced:

[info]   2016-02-04 10:15:21.319 - stderr> Exception in thread "main" java.lang.LinkageError: loader constraint violation: when resolving method "org.apache.spark.sql.hive.client.HiveConfUtil$.conf()Lorg/apache/hadoop/hive/conf/HiveConf;"
 the class loader (instance of org/apache/spark/sql/hive/client/IsolatedClientLoader$$anon$1) of the current class, org/apache/spark/sql/hive/client/HiveClientImpl, and the class loader
(instance of sun/misc/Launcher$AppClassLoader) for resolved class, org/apache/spark/sql/hive/client/HiveConfUtil$, have different Class objects for the type /hadoop/hive/conf/HiveConf; used in the signature
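
The loader constraint violation above comes down to the JVM identifying a class by the pair (name, defining loader). As a rough illustration, the sketch below loads `scala.Option` a second time through a toy child-first loader and exposes both Class objects for comparison. `ChildFirstLoader` and `LoaderDemo` are hypothetical names; this is not how `IsolatedClientLoader` is implemented, and it assumes the Scala library is readable as class-file resources through an ordinary (non-bootstrap) class loader.

```scala
import java.io.ByteArrayOutputStream

// Toy child-first loader (hypothetical): it defines every class whose name starts
// with `prefix` itself and delegates everything else to `parent`.
class ChildFirstLoader(prefix: String, parent: ClassLoader) extends ClassLoader(parent) {

  override def loadClass(name: String, resolve: Boolean): Class[_] =
    getClassLoadingLock(name).synchronized {
      if (!name.startsWith(prefix)) super.loadClass(name, resolve)
      else Option(findLoadedClass(name)).getOrElse {
        val bytes = readClassBytes(name)
        // Same bytes, but defined by this loader, so the result is a distinct Class.
        defineClass(name, bytes, 0, bytes.length)
      }
    }

  // Reads the class file through the parent loader's resource lookup.
  private def readClassBytes(name: String): Array[Byte] = {
    val in = parent.getResourceAsStream(name.replace('.', '/') + ".class")
    if (in == null) throw new ClassNotFoundException(name)
    try {
      val out = new ByteArrayOutputStream()
      val buf = new Array[Byte](8192)
      var n = in.read(buf)
      while (n != -1) {
        out.write(buf, 0, n)
        n = in.read(buf)
      }
      out.toByteArray
    } finally in.close()
  }
}

object LoaderDemo {
  private val appLoader = classOf[Option[_]].getClassLoader
  private val isolatedLoader = new ChildFirstLoader("scala.Option", appLoader)

  // The same fully qualified name, resolved through two different loaders.
  val fromApp: Class[_] = Class.forName("scala.Option", false, appLoader)
  val fromIsolated: Class[_] = Class.forName("scala.Option", false, isolatedLoader)
}
```

The two values share a name but are different Class objects with different defining loaders. A method whose signature mentions such a type cannot be linked across the two loaders, which is consistent with the error reported for `HiveConfUtil$.conf()`.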

@tedyu
Contributor Author

tedyu commented Feb 4, 2016

I wonder if rev 1 should be used instead:

  • the utility class saves fewer than 10 lines of code
  • it introduced intricacies w.r.t. class loading

@tedyu
Contributor Author

tedyu commented Feb 5, 2016

@hvanhovell
What do you think?

@SparkQA

SparkQA commented Feb 5, 2016

Test build #50827 has finished for PR 11066 at commit 8b9e24c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Contributor Author

tedyu commented Feb 5, 2016

@davies
Can you take a look?

@davies
Contributor

davies commented Feb 5, 2016

@tedyu This seems like a workaround. Have you found the root cause? Maybe there is a place where we didn't set the SessionState for the current thread.
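
To illustrate the kind of root cause suggested here, the following sketch shows a value stored in a plain `ThreadLocal` on one thread reading back as null on another. It assumes Hive keeps the current SessionState in a per-thread slot; `ThreadLocalDemo` and its members are hypothetical names, not Spark or Hive code.

```scala
object ThreadLocalDemo {
  // Stand-in for the per-thread slot that would hold the current session.
  private val slot = new ThreadLocal[String]

  /** Sets the slot on the calling thread, then reads it from a new thread. */
  def readFromOtherThread(): Option[String] = {
    slot.set("session-of-" + Thread.currentThread().getName)
    var seen: Option[String] = Some("not-run")
    val worker = new Thread(new Runnable {
      def run(): Unit = { seen = Option(slot.get()) }
    })
    worker.start()
    worker.join() // join() makes the worker's write to `seen` visible here
    seen
  }

  /** Reads the slot on the calling thread. */
  def readFromSameThread(): Option[String] = Option(slot.get())
}
```

If the session is attached on the thread that builds the client but a query later runs on a different thread, the lookup on that second thread would see nothing, which would match the NullPointerException in the stack trace.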

@tedyu
Contributor Author

tedyu commented Feb 5, 2016

From HiveContext.scala, line 552 (the last line below), which appeared in the stack trace:

   * 1. create a new SessionState for each HiveContext
   * 2. when the Hive session is first initialized, params in HiveConf will get picked up by the
   *    SQLConf.  Additionally, any properties set by set() or a SET command inside sql() will be
   *    set in the SQLConf *as well as* in the HiveConf.
   */
  @transient
  protected[hive] lazy val hiveconf: HiveConf = {
    val c = executionHive.conf

If I read the doc correctly, the above call is supposed to create a new SessionState, which is what the PR is doing.

@davies
Contributor

davies commented Feb 5, 2016

The doc is outdated; HiveConf is created in HiveClientImpl.scala, line 107 (when the HiveClientImpl object is created).

@tedyu
Contributor Author

tedyu commented Feb 5, 2016

The user was using 1.6.0, where there was no HiveClientImpl.scala.
In ClientWrapper.scala (1.6.0), HiveConf is created at line 175.

@rxin
Contributor

rxin commented Jun 15, 2016

Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. We can also continue the discussion on the JIRA ticket.

Note: Looks like this code path has been rewritten heavily and the problem might not apply anymore.

@asfgit asfgit closed this in 1a33f2e Jun 15, 2016

5 participants