
Commit 2d609bf

[SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10
### What changes were proposed in this pull request?

This PR aims to bump Spark's built-in Hive from 2.3.9 to Hive 2.3.10, with two additional changes:

- due to API breaking changes of Thrift, `libthrift` is upgraded from `0.12` to `0.16`.
- remove version management of `commons-lang:2.6`, it comes from Hive transitive deps, Hive 2.3.10 drops it in apache/hive#4892

This is the first part of #45372

### Why are the changes needed?

Bump Hive to the latest version of 2.3, prepare for upgrading Guava, and dropping vulnerable dependencies like Jackson 1.x / Jodd

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA. (wait for sunchao to complete the 2.3.10 release to make jars visible on Maven Central)

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #45372
Closes #46468 from pan3793/SPARK-47018.

Lead-authored-by: Cheng Pan <[email protected]>
Co-authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent 1138b2a commit 2d609bf


17 files changed: 61 additions, 74 deletions


connector/kafka-0-10-assembly/pom.xml

Lines changed: 0 additions & 5 deletions
@@ -54,11 +54,6 @@
       <artifactId>commons-codec</artifactId>
       <scope>provided</scope>
     </dependency>
-    <dependency>
-      <groupId>commons-lang</groupId>
-      <artifactId>commons-lang</artifactId>
-      <scope>provided</scope>
-    </dependency>
     <dependency>
       <groupId>com.google.protobuf</groupId>
       <artifactId>protobuf-java</artifactId>

connector/kinesis-asl-assembly/pom.xml

Lines changed: 0 additions & 5 deletions
@@ -54,11 +54,6 @@
       <artifactId>jackson-databind</artifactId>
       <scope>provided</scope>
     </dependency>
-    <dependency>
-      <groupId>commons-lang</groupId>
-      <artifactId>commons-lang</artifactId>
-      <scope>provided</scope>
-    </dependency>
     <dependency>
       <groupId>org.glassfish.jersey.core</groupId>
       <artifactId>jersey-client</artifactId>

dev/deps/spark-deps-hadoop-3-hive-2.3

Lines changed: 13 additions & 14 deletions
@@ -46,7 +46,6 @@ commons-compress/1.26.1//commons-compress-1.26.1.jar
 commons-crypto/1.1.0//commons-crypto-1.1.0.jar
 commons-dbcp/1.4//commons-dbcp-1.4.jar
 commons-io/2.16.1//commons-io-2.16.1.jar
-commons-lang/2.6//commons-lang-2.6.jar
 commons-lang3/3.14.0//commons-lang3-3.14.0.jar
 commons-math3/3.6.1//commons-math3-3.6.1.jar
 commons-pool/1.5.4//commons-pool-1.5.4.jar
@@ -81,19 +80,19 @@ hadoop-cloud-storage/3.4.0//hadoop-cloud-storage-3.4.0.jar
 hadoop-huaweicloud/3.4.0//hadoop-huaweicloud-3.4.0.jar
 hadoop-shaded-guava/1.2.0//hadoop-shaded-guava-1.2.0.jar
 hadoop-yarn-server-web-proxy/3.4.0//hadoop-yarn-server-web-proxy-3.4.0.jar
-hive-beeline/2.3.9//hive-beeline-2.3.9.jar
-hive-cli/2.3.9//hive-cli-2.3.9.jar
-hive-common/2.3.9//hive-common-2.3.9.jar
-hive-exec/2.3.9/core/hive-exec-2.3.9-core.jar
-hive-jdbc/2.3.9//hive-jdbc-2.3.9.jar
-hive-llap-common/2.3.9//hive-llap-common-2.3.9.jar
-hive-metastore/2.3.9//hive-metastore-2.3.9.jar
-hive-serde/2.3.9//hive-serde-2.3.9.jar
+hive-beeline/2.3.10//hive-beeline-2.3.10.jar
+hive-cli/2.3.10//hive-cli-2.3.10.jar
+hive-common/2.3.10//hive-common-2.3.10.jar
+hive-exec/2.3.10/core/hive-exec-2.3.10-core.jar
+hive-jdbc/2.3.10//hive-jdbc-2.3.10.jar
+hive-llap-common/2.3.10//hive-llap-common-2.3.10.jar
+hive-metastore/2.3.10//hive-metastore-2.3.10.jar
+hive-serde/2.3.10//hive-serde-2.3.10.jar
 hive-service-rpc/4.0.0//hive-service-rpc-4.0.0.jar
-hive-shims-0.23/2.3.9//hive-shims-0.23-2.3.9.jar
-hive-shims-common/2.3.9//hive-shims-common-2.3.9.jar
-hive-shims-scheduler/2.3.9//hive-shims-scheduler-2.3.9.jar
-hive-shims/2.3.9//hive-shims-2.3.9.jar
+hive-shims-0.23/2.3.10//hive-shims-0.23-2.3.10.jar
+hive-shims-common/2.3.10//hive-shims-common-2.3.10.jar
+hive-shims-scheduler/2.3.10//hive-shims-scheduler-2.3.10.jar
+hive-shims/2.3.10//hive-shims-2.3.10.jar
 hive-storage-api/2.8.1//hive-storage-api-2.8.1.jar
 hk2-api/3.0.3//hk2-api-3.0.3.jar
 hk2-locator/3.0.3//hk2-locator-3.0.3.jar
@@ -184,7 +183,7 @@ kubernetes-model-storageclass/6.12.1//kubernetes-model-storageclass-6.12.1.jar
 lapack/3.0.3//lapack-3.0.3.jar
 leveldbjni-all/1.8//leveldbjni-all-1.8.jar
 libfb303/0.9.3//libfb303-0.9.3.jar
-libthrift/0.12.0//libthrift-0.12.0.jar
+libthrift/0.16.0//libthrift-0.16.0.jar
 log4j-1.2-api/2.22.1//log4j-1.2-api-2.22.1.jar
 log4j-api/2.22.1//log4j-api-2.22.1.jar
 log4j-core/2.22.1//log4j-core-2.22.1.jar

docs/building-spark.md

Lines changed: 2 additions & 2 deletions
@@ -85,9 +85,9 @@ Example:

 To enable Hive integration for Spark SQL along with its JDBC server and CLI,
 add the `-Phive` and `-Phive-thriftserver` profiles to your existing build options.
-By default Spark will build with Hive 2.3.9.
+By default Spark will build with Hive 2.3.10.

-    # With Hive 2.3.9 support
+    # With Hive 2.3.10 support
     ./build/mvn -Pyarn -Phive -Phive-thriftserver -DskipTests clean package

 ## Packaging without Hadoop Dependencies for YARN

docs/sql-data-sources-hive-tables.md

Lines changed: 4 additions & 4 deletions
@@ -127,10 +127,10 @@ The following options can be used to configure the version of Hive that is used
   <thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
   <tr>
     <td><code>spark.sql.hive.metastore.version</code></td>
-    <td><code>2.3.9</code></td>
+    <td><code>2.3.10</code></td>
     <td>
       Version of the Hive metastore. Available
-      options are <code>2.0.0</code> through <code>2.3.9</code> and <code>3.0.0</code> through <code>3.1.3</code>.
+      options are <code>2.0.0</code> through <code>2.3.10</code> and <code>3.0.0</code> through <code>3.1.3</code>.
     </td>
     <td>1.4.0</td>
   </tr>
@@ -142,9 +142,9 @@ The following options can be used to configure the version of Hive that is used
       property can be one of four options:
       <ol>
         <li><code>builtin</code></li>
-        Use Hive 2.3.9, which is bundled with the Spark assembly when <code>-Phive</code> is
+        Use Hive 2.3.10, which is bundled with the Spark assembly when <code>-Phive</code> is
         enabled. When this option is chosen, <code>spark.sql.hive.metastore.version</code> must be
-        either <code>2.3.9</code> or not defined.
+        either <code>2.3.10</code> or not defined.
         <li><code>maven</code></li>
         Use Hive jars of specified version downloaded from Maven repositories. This configuration
         is not generally recommended for production deployments.
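
Note on the documented ranges: the new upper bound `2.3.10` sorts before `2.3.9` under plain string comparison, so any range check has to compare the components numerically. The sketch below illustrates that point only. `HiveMetastoreVersionRange` and its methods are hypothetical names, not Spark APIs, and Spark's own validation of `spark.sql.hive.metastore.version` may accept a narrower set of values than a simple range test.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: checks a version string against the
// two ranges quoted in the docs table (2.0.0 through 2.3.10, 3.0.0 through 3.1.3).
final class HiveMetastoreVersionRange {

  // Parse "2.3.10" into {2, 3, 10}; assumes exactly three numeric components.
  static int[] parse(String version) {
    String[] parts = version.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected major.minor.patch: " + version);
    }
    return Arrays.stream(parts).mapToInt(Integer::parseInt).toArray();
  }

  // Numeric, component-wise comparison. A plain String comparison would
  // order "2.3.10" before "2.3.9".
  static int compare(String a, String b) {
    return Arrays.compare(parse(a), parse(b));
  }

  // True if the version falls inside either documented range (inclusive).
  static boolean inDocumentedRange(String version) {
    boolean hive2 = compare(version, "2.0.0") >= 0 && compare(version, "2.3.10") <= 0;
    boolean hive3 = compare(version, "3.0.0") >= 0 && compare(version, "3.1.3") <= 0;
    return hive2 || hive3;
  }

  public static void main(String[] args) {
    for (String v : new String[] {"0.12.0", "2.3.9", "2.3.10", "2.3.11", "3.1.3"}) {
      System.out.println(v + " -> " + inDocumentedRange(v));
    }
  }
}
```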

docs/sql-migration-guide.md

Lines changed: 1 addition & 1 deletion
@@ -1068,7 +1068,7 @@ Python UDF registration is unchanged.
 Spark SQL is designed to be compatible with the Hive Metastore, SerDes and UDFs.
 Currently, Hive SerDes and UDFs are based on built-in Hive,
 and Spark SQL can be connected to different versions of Hive Metastore
-(from 0.12.0 to 2.3.9 and 3.0.0 to 3.1.3. Also see [Interacting with Different Versions of Hive Metastore](sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore)).
+(from 2.0.0 to 2.3.10 and 3.0.0 to 3.1.3. Also see [Interacting with Different Versions of Hive Metastore](sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore)).

 #### Deploying in Existing Hive Warehouses
 {:.no_toc}

pom.xml

Lines changed: 13 additions & 18 deletions
@@ -132,8 +132,8 @@
     <hive.group>org.apache.hive</hive.group>
     <hive.classifier>core</hive.classifier>
     <!-- Version used in Maven Hive dependency -->
-    <hive.version>2.3.9</hive.version>
-    <hive23.version>2.3.9</hive23.version>
+    <hive.version>2.3.10</hive.version>
+    <hive23.version>2.3.10</hive23.version>
     <!-- Version used for internal directory structure -->
     <hive.version.short>2.3</hive.version.short>
     <!-- note that this should be compatible with Kafka brokers version 0.10 and up -->
@@ -192,8 +192,6 @@
     <commons-codec.version>1.17.0</commons-codec.version>
     <commons-compress.version>1.26.1</commons-compress.version>
     <commons-io.version>2.16.1</commons-io.version>
-    <!-- org.apache.commons/commons-lang/-->
-    <commons-lang2.version>2.6</commons-lang2.version>
     <!-- org.apache.commons/commons-lang3/-->
     <commons-lang3.version>3.14.0</commons-lang3.version>
     <!-- org.apache.commons/commons-pool2/-->
@@ -206,7 +204,7 @@
     <jodd.version>3.5.2</jodd.version>
     <jsr305.version>3.0.0</jsr305.version>
     <jaxb.version>2.2.11</jaxb.version>
-    <libthrift.version>0.12.0</libthrift.version>
+    <libthrift.version>0.16.0</libthrift.version>
     <antlr4.version>4.13.1</antlr4.version>
     <jpam.version>1.1</jpam.version>
     <selenium.version>4.17.0</selenium.version>
@@ -615,11 +613,6 @@
         <artifactId>commons-text</artifactId>
         <version>1.12.0</version>
       </dependency>
-      <dependency>
-        <groupId>commons-lang</groupId>
-        <artifactId>commons-lang</artifactId>
-        <version>${commons-lang2.version}</version>
-      </dependency>
       <dependency>
         <groupId>commons-io</groupId>
         <artifactId>commons-io</artifactId>
@@ -2294,8 +2287,8 @@
             <artifactId>janino</artifactId>
           </exclusion>
           <exclusion>
-            <groupId>org.pentaho</groupId>
-            <artifactId>pentaho-aggdesigner-algorithm</artifactId>
+            <groupId>net.hydromatic</groupId>
+            <artifactId>aggdesigner-algorithm</artifactId>
           </exclusion>
           <!-- End of Hive 2.3 exclusion -->
         </exclusions>
@@ -2365,6 +2358,10 @@
             <groupId>org.codehaus.groovy</groupId>
             <artifactId>groovy-all</artifactId>
           </exclusion>
+          <exclusion>
+            <groupId>com.lmax</groupId>
+            <artifactId>disruptor</artifactId>
+          </exclusion>
         </exclusions>
       </dependency>

@@ -2805,6 +2802,10 @@
             <groupId>org.slf4j</groupId>
             <artifactId>slf4j-api</artifactId>
           </exclusion>
+          <exclusion>
+            <groupId>javax.annotation</groupId>
+            <artifactId>javax.annotation-api</artifactId>
+          </exclusion>
         </exclusions>
       </dependency>
       <dependency>
@@ -2898,12 +2899,6 @@
         <artifactId>hive-storage-api</artifactId>
         <version>${hive.storage.version}</version>
         <scope>${hive.storage.scope}</scope>
-        <exclusions>
-          <exclusion>
-            <groupId>commons-lang</groupId>
-            <artifactId>commons-lang</artifactId>
-          </exclusion>
-        </exclusions>
       </dependency>
       <dependency>
         <groupId>commons-cli</groupId>

sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/KerberosSaslHelper.java

Lines changed: 3 additions & 2 deletions
@@ -30,6 +30,7 @@
 import org.apache.thrift.TProcessorFactory;
 import org.apache.thrift.transport.TSaslClientTransport;
 import org.apache.thrift.transport.TTransport;
+import org.apache.thrift.transport.TTransportException;

 public final class KerberosSaslHelper {

@@ -68,8 +69,8 @@ public static TTransport createSubjectAssumedTransport(String principal,
         new TSaslClientTransport("GSSAPI", null, names[0], names[1], saslProps, null,
           underlyingTransport);
       return new TSubjectAssumingTransport(saslTransport);
-    } catch (SaslException se) {
-      throw new IOException("Could not instantiate SASL transport", se);
+    } catch (SaslException | TTransportException se) {
+      throw new IOException("Could not instantiate transport", se);
     }
   }

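
Note on the hunk above: the patched catch block uses a Java multi-catch to wrap two unrelated checked exceptions into one `IOException`, which suggests the `TSaslClientTransport` constructor can now also fail with `TTransportException`. The sketch below shows the same pattern in isolation. `StubTransportException` and `openTransport` are hypothetical stand-ins so that it compiles without libthrift, and nothing here is the actual Thrift or Spark code.

```java
import java.io.IOException;
import javax.security.sasl.SaslException;

final class MultiCatchSketch {

  // Hypothetical stand-in for org.apache.thrift.transport.TTransportException,
  // so the sketch compiles without libthrift on the classpath.
  static class StubTransportException extends Exception {
    StubTransportException(String message) {
      super(message);
    }
  }

  // Hypothetical stand-in for a transport constructor that can fail with
  // either checked exception.
  static String openTransport(String failure) throws SaslException, StubTransportException {
    if ("sasl".equals(failure)) {
      throw new SaslException("bad SASL config");
    }
    if ("transport".equals(failure)) {
      throw new StubTransportException("bad transport config");
    }
    return "transport-ok";
  }

  // Same shape as the patched catch block: both checked exceptions are wrapped
  // into a single IOException that keeps the original as its cause.
  static String create(String failure) throws IOException {
    try {
      return openTransport(failure);
    } catch (SaslException | StubTransportException e) {
      throw new IOException("Could not instantiate transport", e);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(create("none"));
    try {
      create("transport");
    } catch (IOException e) {
      System.out.println(e.getMessage() + " <- " + e.getCause().getClass().getSimpleName());
    }
  }
}
```

Callers that only declare `IOException` keep compiling unchanged, which is presumably why the patch wraps the new exception instead of widening the method signature.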

sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/PlainSaslHelper.java

Lines changed: 2 additions & 1 deletion
@@ -38,6 +38,7 @@
 import org.apache.thrift.transport.TSaslClientTransport;
 import org.apache.thrift.transport.TSaslServerTransport;
 import org.apache.thrift.transport.TTransport;
+import org.apache.thrift.transport.TTransportException;
 import org.apache.thrift.transport.TTransportFactory;

 public final class PlainSaslHelper {
@@ -64,7 +65,7 @@ public static TTransportFactory getPlainTransportFactory(String authTypeStr)
   }

   public static TTransport getPlainTransport(String username, String password,
-    TTransport underlyingTransport) throws SaslException {
+    TTransport underlyingTransport) throws SaslException, TTransportException {
     return new TSaslClientTransport("PLAIN", null, null, null, new HashMap<String, String>(),
       new PlainCallbackHandler(username, password), underlyingTransport);
   }

sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/TSetIpAddressProcessor.java

Lines changed: 3 additions & 2 deletions
@@ -46,11 +46,12 @@ public TSetIpAddressProcessor(Iface iface) {
   }

   @Override
-  public boolean process(final TProtocol in, final TProtocol out) throws TException {
+  public void process(final TProtocol in, final TProtocol out) throws TException {
     setIpAddress(in);
     setUserName(in);
     try {
-      return super.process(in, out);
+      super.process(in, out);
+      return;
     } finally {
       THREAD_LOCAL_USER_NAME.remove();
       THREAD_LOCAL_IP_ADDRESS.remove();
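
Note on the hunk above: the override changes from `boolean` to `void`, which implies the overridden `process` method returns nothing in libthrift 0.16. The `try`/`finally` still clears the thread-local state whether or not the call throws. The explicit `return;` is redundant in a `void` method and could be dropped without changing behavior. A minimal sketch of the pattern follows. `BaseProcessor`, `UserTrackingProcessor` and `USER_NAME` are hypothetical stand-ins, not the Thrift or Spark classes.

```java
final class VoidProcessSketch {

  // Hypothetical stand-in for the per-thread state cleared in the patched method.
  static final ThreadLocal<String> USER_NAME = new ThreadLocal<>();

  // Hypothetical stand-in for a processor base class whose process() returns void.
  static class BaseProcessor {
    void process(String request) throws Exception {
      if (request.isEmpty()) {
        throw new Exception("empty request");
      }
    }
  }

  static class UserTrackingProcessor extends BaseProcessor {
    // Records what the thread-local held while the request was in flight.
    String seenDuringProcess;

    @Override
    void process(String request) throws Exception {
      USER_NAME.set("user-of-" + request);
      try {
        seenDuringProcess = USER_NAME.get();
        super.process(request);
      } finally {
        // Runs on both normal completion and failure, so per-thread state
        // does not leak into the next request handled by this thread.
        USER_NAME.remove();
      }
    }
  }

  public static void main(String[] args) throws Exception {
    UserTrackingProcessor processor = new UserTrackingProcessor();
    processor.process("req");
    System.out.println("during: " + processor.seenDuringProcess);
    System.out.println("after: " + USER_NAME.get());
  }
}
```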
