
Conversation

@wangyum (Member) commented Feb 14, 2019

What changes were proposed in this pull request?

This PR mainly contains:

  1. Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4.
  2. Resolve compatibility issues between Hive 1.2.1 and Hive 2.3.4 in the sql/hive module.

How was this patch tested?

Jenkins tests cover hadoop-2.7.
Manual test with hadoop-3:

```shell
build/sbt clean package -Phadoop-3.2 -Phive
export SPARK_PREPEND_CLASSES=true

# rm -rf metastore_db

cat <<EOF > test_hadoop3.scala
spark.range(10).write.saveAsTable("test_hadoop3")
spark.table("test_hadoop3").show
EOF

bin/spark-shell --conf spark.hadoop.hive.metastore.schema.verification=false --conf spark.hadoop.datanucleus.schema.autoCreateAll=true -i test_hadoop3.scala
```

@wangyum wangyum changed the title [SPARK-23710][SQL] Only hadoop-3.1 upgrades built-in Hive to 2.3.4 [SPARK-23710][SQL] Only upgrade hadoop-3.1's built-in Hive to 2.3.4 Feb 14, 2019
@SparkQA commented Feb 14, 2019

Test build #102343 has finished for PR 23788 at commit 0644e94.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Feb 14, 2019

Test build #102356 has finished for PR 23788 at commit 3c0c72e.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Feb 15, 2019

Test build #102374 has finished for PR 23788 at commit fc10762.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

pom.xml Outdated
</dependency>
<dependency>
<groupId>${hive.group}</groupId>
<artifactId>hive-llap-client</artifactId>
Member Author (@wangyum):

Need this dependency, otherwise:

build/sbt  "hive/testOnly *.StatisticsSuite"  -Phadoop-3.1

sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/llap/security/LlapSigner$Signable
        at java.lang.Class.getDeclaredConstructors0(Native Method)
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
        at java.lang.Class.getConstructor0(Class.java:3075)
        at java.lang.Class.getDeclaredConstructor(Class.java:2178)
        at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:79)
        at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDTF(Registry.java:208)
        at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDTF(Registry.java:201)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:500)
        at org.apache.spark.sql.hive.test.TestHiveSparkSession.<init>(TestHive.scala:521)
        at org.apache.spark.sql.hive.test.TestHiveSparkSession.<init>(TestHive.scala:181)
        at org.apache.spark.sql.hive.test.TestHiveContext.<init>(TestHive.scala:129)
        at org.apache.spark.sql.hive.test.TestHive$.<init>(TestHive.scala:53)
        at org.apache.spark.sql.hive.test.TestHive$.<clinit>(TestHive.scala)
        at org.apache.spark.sql.hive.test.TestHiveSingleton.$init$(TestHiveSingleton.scala:30)
        at org.apache.spark.sql.hive.StatisticsSuite.<init>(StatisticsSuite.scala:45)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.lang.Class.newInstance(Class.java:442)
        at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:435)
        at sbt.ForkMain$Run$2.call(ForkMain.java:296)
        at sbt.ForkMain$Run$2.call(ForkMain.java:286)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: sbt.ForkMain$ForkError: java.lang.ClassNotFoundException: org.apache.hadoop.hive.llap.security.LlapSigner$Signable
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.getDeclaredConstructors0(Native Method)
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
        at java.lang.Class.getConstructor0(Class.java:3075)
        at java.lang.Class.getDeclaredConstructor(Class.java:2178)
        at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:79)
        at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDTF(Registry.java:208)
        at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDTF(Registry.java:201)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:500)
        at org.apache.spark.sql.hive.test.TestHiveSparkSession.<init>(TestHive.scala:521)
        at org.apache.spark.sql.hive.test.TestHiveSparkSession.<init>(TestHive.scala:181)
        at org.apache.spark.sql.hive.test.TestHiveContext.<init>(TestHive.scala:129)
        at org.apache.spark.sql.hive.test.TestHive$.<init>(TestHive.scala:53)
        at org.apache.spark.sql.hive.test.TestHive$.<clinit>(TestHive.scala)
        at org.apache.spark.sql.hive.test.TestHiveSingleton.$init$(TestHiveSingleton.scala:30)
        at org.apache.spark.sql.hive.StatisticsSuite.<init>(StatisticsSuite.scala:45)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.lang.Class.newInstance(Class.java:442)
        at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:435)
        at sbt.ForkMain$Run$2.call(ForkMain.java:296)
        at sbt.ForkMain$Run$2.call(ForkMain.java:286)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
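
For anyone reproducing the fix with sbt directly, here is a hedged equivalent of the dependency being added (the coordinates are an assumption based on the Hive 2.3.4 release line; the PR itself declares it in pom.xml via `${hive.group}`):

```scala
// Hypothetical sbt-syntax equivalent of the hive-llap-client dependency the PR
// adds to pom.xml; coordinates assumed from the Hive 2.3.4 release line.
libraryDependencies += "org.apache.hive" % "hive-llap-client" % "2.3.4"
```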

@SparkQA commented Feb 20, 2019

Test build #102540 has finished for PR 23788 at commit 4527555.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Feb 28, 2019

Test build #102861 has finished for PR 23788 at commit 19df56f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member) commented Mar 5, 2019

@wangyum, you can pick my commits and PR from #21588 to make the tests pass, if needed here.

@srowen (Member) left a comment

I don't know the details well here, but it looks reasonable if it's mostly moves, dependency changes, and adding Hive 2 code paths.

@wangyum wangyum changed the title [SPARK-23710][SQL] Only upgrade hadoop-3.1's built-in Hive to 2.3.4 [SPARK-23710][SQL] Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4 Mar 15, 2019
@wangyum wangyum changed the title [SPARK-23710][SQL] Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4 [SPARK-27176][SQL] Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4 Mar 15, 2019
@SparkQA commented Mar 15, 2019

Test build #103543 has finished for PR 23788 at commit b0b132e.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member) commented:
retest this please

@SparkQA commented Mar 17, 2019

Test build #103587 has finished for PR 23788 at commit b0b132e.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

gatorsmile pushed a commit that referenced this pull request Mar 27, 2019
…le to sql/core/v1.2.1

## What changes were proposed in this pull request?
To make #23788 easy to review, this PR moves `OrcColumnVector.java`, `OrcShimUtils.scala`, `OrcFilters.scala` and `OrcFilterSuite.scala` to `sql/core/v1.2.1` and copies them to `sql/core/v2.3.4`.

## How was this patch tested?

manual tests
```shell
diff -urNa sql/core/v1.2.1 sql/core/v2.3.4
```

Closes #24119 from wangyum/SPARK-27182.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
# Conflicts:
#	dev/deps/spark-deps-hadoop-3.2
#	pom.xml
#	sql/core/pom.xml
#	sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
#	sql/core/v1.2.1/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
#	sql/core/v2.3.4/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnVector.java
#	sql/core/v2.3.4/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
#	sql/core/v2.3.4/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
builder <- buildSearchArgument(dataTypeMap, conjunction, newBuilder)
} yield builder.build()
if (HiveUtils.isHive2) {
BuiltinOrcFilters.createFilter(schema, filters).asInstanceOf[Option[SearchArgument]]
Member Author (@wangyum):
If the built-in Hive is 2.3.4, we use `org.apache.spark.sql.execution.datasources.orc.OrcFilters` to create the filter.
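
For context, a hedged sketch of the dispatch this diff introduces; only the Hive 2 arm appears in the diff, so the method signature, the `else` arm, and the `buildLegacyFilter` stub are assumptions of the sketch:

```scala
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument
import org.apache.spark.sql.execution.datasources.orc.{OrcFilters => BuiltinOrcFilters}
import org.apache.spark.sql.hive.HiveUtils
import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.StructType

// Stub standing in for the existing Hive 1.2.1 code path, elided in this sketch.
def buildLegacyFilter(schema: StructType, filters: Array[Filter]): Option[SearchArgument] = None

def createFilter(schema: StructType, filters: Array[Filter]): Option[SearchArgument] = {
  if (HiveUtils.isHive2) {
    // Built-in Hive 2.3.4: delegate to sql/core's ORC filter builder, which now
    // targets hive-storage-api, and cast to the Hive SearchArgument type.
    BuiltinOrcFilters.createFilter(schema, filters).asInstanceOf[Option[SearchArgument]]
  } else {
    // Built-in Hive 1.2.1: keep the legacy SearchArgument builder in sql/hive.
    buildLegacyFilter(schema, filters)
  }
}
```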

val parameterInfo = new SimpleGenericUDAFParameterInfo(inputInspectors, false, false)
resolver.getEvaluator(parameterInfo)
val clazz = Utils.classForName(classOf[SimpleGenericUDAFParameterInfo].getName)
if (HiveUtils.isHive2) {
Member Author (@wangyum):

Hive 2.3.x (HIVE-13453):
val parameterInfo = new SimpleGenericUDAFParameterInfo(inputInspectors, false, false, false)
Hive 1.x:
val parameterInfo = new SimpleGenericUDAFParameterInfo(inputInspectors, false, false)
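
Since only one Hive version is on the compile classpath, the call has to pick the matching constructor at runtime. A hedged sketch of doing that reflectively (the wrapper name and lookup details are assumptions of this sketch, not the PR's exact code):

```scala
import java.lang.{Boolean => JBoolean}
import org.apache.hadoop.hive.ql.udf.generic.SimpleGenericUDAFParameterInfo
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
import org.apache.spark.sql.hive.HiveUtils
import org.apache.spark.util.Utils

def makeParameterInfo(inputInspectors: Array[ObjectInspector]): SimpleGenericUDAFParameterInfo = {
  val clazz = Utils.classForName(classOf[SimpleGenericUDAFParameterInfo].getName)
  val instance = if (HiveUtils.isHive2) {
    // Hive 2.3.x: four-argument constructor (extra boolean added by HIVE-13453).
    clazz.getConstructor(classOf[Array[ObjectInspector]],
        classOf[Boolean], classOf[Boolean], classOf[Boolean])
      .newInstance(inputInspectors, JBoolean.FALSE, JBoolean.FALSE, JBoolean.FALSE)
  } else {
    // Hive 1.x: three-argument constructor.
    clazz.getConstructor(classOf[Array[ObjectInspector]], classOf[Boolean], classOf[Boolean])
      .newInstance(inputInspectors, JBoolean.FALSE, JBoolean.FALSE)
  }
  instance.asInstanceOf[SimpleGenericUDAFParameterInfo]
}
```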

def deserializePlan[UDFType](is: java.io.InputStream, clazz: Class[_]): UDFType = {
deserializeObjectByKryo(Utilities.runtimeSerializationKryo.get(), is, clazz)
.asInstanceOf[UDFType]
if (HiveUtils.isHive2) {
Member Author (@wangyum):

Hive 2.x (HIVE-12302):

import org.apache.hadoop.hive.ql.exec.SerializationUtilities
val kryo = SerializationUtilities.borrowKryo()
try {
  SerializationUtilities.deserializeObjectByKryo(kryo, is, clazz).asInstanceOf[UDFType]
} finally {
  SerializationUtilities.releaseKryo(kryo)
}

Hive 1.x:

import org.apache.hadoop.hive.ql.exec.Utilities
Utilities.deserializeObjectByKryo(Utilities.runtimeSerializationKryo.get(), is, clazz)
  .asInstanceOf[UDFType]
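
Since the two entry points (and Hive 2.3's shaded Kryo) cannot both be on the compile classpath, the shim has to reach them reflectively. A hedged sketch of the shape, not the PR's exact code; the `deserializeByKryo` helper and the reflective lookups are assumptions:

```scala
import java.io.InputStream
import org.apache.spark.sql.hive.HiveUtils
import org.apache.spark.util.Utils

// Hypothetical Spark-side helper that drives the (reflectively obtained) Kryo
// instance; elided in this sketch.
def deserializeByKryo[T](kryo: AnyRef, in: InputStream, clazz: Class[_]): T = ???

def deserializePlan[UDFType](is: InputStream, clazz: Class[_]): UDFType = {
  if (HiveUtils.isHive2) {
    // Hive 2.x (HIVE-12302): Kryo instances are pooled and must be returned.
    val serUtil = Utils.classForName("org.apache.hadoop.hive.ql.exec.SerializationUtilities")
    val kryo = serUtil.getMethod("borrowKryo").invoke(null)
    try deserializeByKryo[UDFType](kryo, is, clazz)
    finally serUtil.getMethods.find(_.getName == "releaseKryo").get.invoke(null, kryo)
  } else {
    // Hive 1.x: a thread-local Kryo hangs off Utilities.
    val utilities = Utils.classForName("org.apache.hadoop.hive.ql.exec.Utilities")
    val holder = utilities.getField("runtimeSerializationKryo").get(null)
    deserializeByKryo[UDFType](holder.getClass.getMethod("get").invoke(holder), is, clazz)
  }
}
```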

@SparkQA commented Mar 27, 2019

Test build #104009 has finished for PR 23788 at commit ce27fb3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


private val hive1Version = "1.2.1"
private val hive2Version = "2.3.4"
val isHive2: Boolean = HiveVersionInfo.getVersion.equals(hive2Version)
Member:

Should this check be a little more general, to match all 2.x versions? Should it fail on Hive 3.x?

Member Author (@wangyum) commented Mar 28, 2019:

I think it only works on Hive 2.3.x.

  1. The current code is not compatible with Hive 2.0 - Hive 2.2: https://github.com/apache/spark/pull/23788/files#diff-53f31aa4bbd9274f40547cd00cf0826dR341
  2. The current code is not compatible with Hive 3.1 (HIVE-12192):
[ERROR] /Users/yumwang/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala:451: type mismatch;
 found   : Timestamp (in org.apache.hadoop.hive.common.type) 
 required: Timestamp (in java.sql) 
[ERROR]             row.setLong(ordinal, DateTimeUtils.fromJavaTimestamp(oi.getPrimitiveJavaObject(value)))

Member:

OK, should this at least check for "isHive23" then and match on starting with "2.3."? Otherwise this may well work with 2.3.5 but will fail. What about a future 2.4?

Can we ... drop Hive 1.2.x support entirely here or in a next PR?

Member Author (@wangyum):

Yes. isHive23 is more reasonable. I will update it later.

> Can we ... drop Hive 1.2.x support entirely here or in a next PR?

Removing Hive 1.2.x support may be a bit risky. cc @gatorsmile
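
A minimal sketch of the suggested prefix check, keeping the `HiveVersionInfo` lookup from the diff above:

```scala
import org.apache.hive.common.util.HiveVersionInfo

// Matches 2.3.4 as well as future 2.3.x maintenance releases,
// without accepting Hive 2.0-2.2 or 3.x.
val isHive23: Boolean = HiveVersionInfo.getVersion.startsWith("2.3")
```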

<artifactId>commons-logging</artifactId>
</exclusion>
<!-- Hive 2.3.4 -->
<exclusion>
Member Author (@wangyum) commented Mar 28, 2019:

Exclude jetty-all; it conflicts with jetty 9.4.12.v20180830:

build/sbt clean package -Phadoop-3.2 -Phive
...
[error] /home/yumwang/opensource/spark/core/src/main/scala/org/apache/spark/SSLOptions.scala:78: value setTrustStorePath is not a member of org.eclipse.jetty.util.ssl.SslContextFactory
[error]         trustStore.foreach(file => sslContextFactory.setTrustStorePath(file.getAbsolutePath))
[error]

<artifactId>groovy-all</artifactId>
</exclusion>
<!-- Hive 2.3.4 -->
<exclusion>
Member Author (@wangyum):

Exclude log4j-slf4j-impl, otherwise:

$ build/sbt clean package -Phadoop-3.2 -Phive
$ export SPARK_PREPEND_CLASSES=true
$ bin/spark-shell
NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/logging/log4j/spi/AbstractLoggerAdapter
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.slf4j.impl.StaticLoggerBinder.<clinit>(StaticLoggerBinder.java:36)
	at org.apache.spark.internal.Logging$.org$apache$spark$internal$Logging$$isLog4j12(Logging.scala:217)
	at org.apache.spark.internal.Logging.initializeLogging(Logging.scala:122)
	at org.apache.spark.internal.Logging.initializeLogIfNecessary(Logging.scala:111)
	at org.apache.spark.internal.Logging.initializeLogIfNecessary$(Logging.scala:105)
	at org.apache.spark.deploy.SparkSubmit.initializeLogIfNecessary(SparkSubmit.scala:73)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:81)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:939)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:948)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.logging.log4j.spi.AbstractLoggerAdapter
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 22 more

<groupId>${hive.group}</groupId>
<artifactId>hive-llap-tez</artifactId>
</exclusion>
<exclusion>
Member Author (@wangyum):

Exclude calcite-druid and avatica. More details: https://issues.apache.org/jira/browse/SPARK-27054

@SparkQA commented Apr 4, 2019

Test build #104277 has finished for PR 23788 at commit 78ceb00.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum (Member Author) commented Apr 4, 2019

retest this please

@SparkQA commented Apr 4, 2019

Test build #104281 has finished for PR 23788 at commit 78ceb00.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Some(builder.startAnd().equals(attribute, value).end())
val bd = builder.startAnd()
val method = findMethod(bd.getClass, "equals", classOf[String], classOf[Object])
Some(method.invoke(bd, attribute, value.asInstanceOf[AnyRef]).asInstanceOf[Builder].end())
Member Author (@wangyum):

Cast value to AnyRef based on the following:
https://github.com/apache/spark/pull/8799/files#diff-6cac9bc2656e3782b0312dceb8c55d47R132
https://github.com/apache/hive/blob/release-1.2.1/serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java#L255

Otherwise:

[error] /Users/yumwang/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFilters.scala:180: type mismatch;
[error]  found   : Any
[error]  required: Object
[error]         Some(method.invoke(bd, attribute, value).asInstanceOf[Builder].end())
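
A minimal illustration of the mismatch, with a toy `findMethod` standing in for the shim's helper: `Method.invoke` takes `java.lang.Object` (Scala's `AnyRef`), while `Any` also covers primitives, so it does not conform without the cast:

```scala
import java.lang.reflect.Method

// Simplified stand-in for the shim's findMethod helper.
def findMethod(klass: Class[_], name: String, args: Class[_]*): Method =
  klass.getMethod(name, args: _*)

val value: Any = "world"
val concat = findMethod(classOf[String], "concat", classOf[String])
// Dropping .asInstanceOf[AnyRef] reproduces the error above: found Any, required Object.
val greeting = concat.invoke("hello ", value.asInstanceOf[AnyRef]) // "hello world"
```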

@SparkQA commented Apr 4, 2019

Test build #104294 has finished for PR 23788 at commit 2c571c7.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 4, 2019

Test build #104295 has finished for PR 23788 at commit d22e7e0.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 7, 2019

Test build #104350 has finished for PR 23788 at commit a3f7cff.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile (Member) commented:
retest this please

@SparkQA commented Apr 8, 2019

Test build #104370 has finished for PR 23788 at commit a3f7cff.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 8, 2019

Test build #104375 has finished for PR 23788 at commit 073c883.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile (Member) commented:
LGTM

So far, this change only impacts the Hadoop 3.x profile, so it is safe in general. There might still be a few issues when we run the tests with the Hadoop 3.x profile; let us resolve them if they exist once we trigger those tests.

Thanks! Merged to master.

@dongjoon-hyun (Member) commented Feb 11, 2020

Hi, @yhuai and @liancheng .
This is the PR which switched from orc:nohive to hive-storage-api in the Hive 2.3 profile.

To @wangyum and @gatorsmile:
@yhuai created the following JIRA to partially revert this PR.

import java.math.BigDecimal;

import org.apache.orc.storage.ql.exec.vector.*;
import org.apache.hadoop.hive.ql.exec.vector.*;
Member:

Here.

Member:

Yes .. we shouldn't do this..

import org.apache.hadoop.hive.ql.io.sarg.{PredicateLeaf, SearchArgument}
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument.Builder
import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.newBuilder
import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
Member:

Here.

import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
import org.apache.hadoop.hive.ql.io.sarg.{SearchArgument => OrcSearchArgument}
import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf.{Operator => OrcOperator}
import org.apache.hadoop.hive.serde2.io.{DateWritable, HiveDecimalWritable}
Member:

Here.

@yhuai (Contributor) commented Feb 11, 2020

Thank you, @dongjoon-hyun. Do you and @wangyum have any concerns about using nohive?

@yhuai (Contributor) commented Feb 11, 2020

I created #27536.

dongjoon-hyun added a commit that referenced this pull request Jan 30, 2025
…lap.scope` in root `pom.xml`

### What changes were proposed in this pull request?

This PR aims to fix `hive-llap-common` dependency to use `hive.llap.scope` in root pom for Apache Spark 3.5 and 4.0.

### Why are the changes needed?

Apache Spark is supposed to use `hive.llap.scope` for the `hive-llap-common` dependency, and the `hive` module does it correctly.
https://github.com/apache/spark/blob/a1b0f256c04e5b632075358d1e2f946e64588da6/sql/hive/pom.xml#L119-L123

Since Apache Spark 3.0.0 (SPARK-27176), the root `pom.xml` file has mistakenly been using the wrong scope, probably due to `-Phive-provided` support. This causes confusion for external systems and users. We had better fix the root `pom.xml` to use `hive.llap.scope` correctly.
- #23788

### Does this PR introduce _any_ user-facing change?

No, there is no change technically because `hive` module has been using a correct scope.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49733 from dongjoon-hyun/SPARK-51039.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun added a commit that referenced this pull request Jan 30, 2025
…lap.scope` in root `pom.xml`

(cherry picked from commit 7243de6)
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun added a commit that referenced this pull request Jan 30, 2025
…lap.scope` in root `pom.xml`

(cherry picked from commit 7243de6)
Signed-off-by: Dongjoon Hyun <[email protected]>
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
…lap.scope` in root `pom.xml`

(cherry picked from commit 0d19313)
Signed-off-by: Dongjoon Hyun <[email protected]>