Commit cbffc12

wangyum authored and dongjoon-hyun committed
[SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
### What changes were proposed in this pull request?

Parquet 1.12.0 New Feature
- PARQUET-41 - Add bloom filters to parquet statistics
- PARQUET-1373 - Encryption key management tools
- PARQUET-1396 - Example of using EncryptionPropertiesFactory and DecryptionPropertiesFactory
- PARQUET-1622 - Add BYTE_STREAM_SPLIT encoding
- PARQUET-1784 - Column-wise configuration
- PARQUET-1817 - Crypto Properties Factory
- PARQUET-1854 - Properties-Driven Interface to Parquet Encryption

Parquet 1.12.0 release notes: https://github.com/apache/parquet-mr/blob/apache-parquet-1.12.0/CHANGES.md

### Why are the changes needed?

- Bloom filters to improve filter performance
- ZSTD enhancement

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit test.

Closes #31649 from wangyum/SPARK-34542.

Lead-authored-by: Yuming Wang <[email protected]>
Co-authored-by: Yuming Wang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
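To make the "Bloom filters to improve filter performance" motivation concrete, here is a minimal sketch of the general idea in Python. It is a toy filter built on `hashlib`, not Parquet's actual bloom filter format or API. The class `ToyBloomFilter`, the helpers `build_filter` and `candidate_row_groups`, and the sample data are all hypothetical names introduced for illustration.

```python
import hashlib


class ToyBloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes  # kept below 256 so bytes([seed]) is valid
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        data = repr(value).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


def build_filter(values):
    """Build one filter over the values of a (hypothetical) row group."""
    bloom = ToyBloomFilter()
    for value in values:
        bloom.add(value)
    return bloom


def candidate_row_groups(filters, wanted):
    """Indexes of row groups that cannot be ruled out for `wanted`."""
    return [i for i, bloom in enumerate(filters) if bloom.might_contain(wanted)]


# Hypothetical data: three "row groups", each holding a few column values.
ROW_GROUPS = [[1, 2, 3], [40, 41, 42], [900, 901]]
FILTERS = [build_filter(group) for group in ROW_GROUPS]

# Only the groups listed here would need to be read for an equality filter.
print(candidate_row_groups(FILTERS, 42))
```

The point of the sketch is the asymmetry: a negative answer lets a reader skip a chunk of data without scanning it, while a positive answer still requires reading the data to confirm.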
1 parent 468b944 commit cbffc12

4 files changed: +15 -15 lines changed


dev/deps/spark-deps-hadoop-2.7-hive-2.3

Lines changed: 6 additions & 6 deletions
@@ -202,12 +202,12 @@ orc-shims/1.6.7//orc-shims-1.6.7.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
-parquet-column/1.11.1//parquet-column-1.11.1.jar
-parquet-common/1.11.1//parquet-common-1.11.1.jar
-parquet-encoding/1.11.1//parquet-encoding-1.11.1.jar
-parquet-format-structures/1.11.1//parquet-format-structures-1.11.1.jar
-parquet-hadoop/1.11.1//parquet-hadoop-1.11.1.jar
-parquet-jackson/1.11.1//parquet-jackson-1.11.1.jar
+parquet-column/1.12.0//parquet-column-1.12.0.jar
+parquet-common/1.12.0//parquet-common-1.12.0.jar
+parquet-encoding/1.12.0//parquet-encoding-1.12.0.jar
+parquet-format-structures/1.12.0//parquet-format-structures-1.12.0.jar
+parquet-hadoop/1.12.0//parquet-hadoop-1.12.0.jar
+parquet-jackson/1.12.0//parquet-jackson-1.12.0.jar
 protobuf-java/2.5.0//protobuf-java-2.5.0.jar
 py4j/0.10.9.2//py4j-0.10.9.2.jar
 pyrolite/4.30//pyrolite-4.30.jar

dev/deps/spark-deps-hadoop-3.2-hive-2.3

Lines changed: 6 additions & 6 deletions
@@ -173,12 +173,12 @@ orc-shims/1.6.7//orc-shims-1.6.7.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
-parquet-column/1.11.1//parquet-column-1.11.1.jar
-parquet-common/1.11.1//parquet-common-1.11.1.jar
-parquet-encoding/1.11.1//parquet-encoding-1.11.1.jar
-parquet-format-structures/1.11.1//parquet-format-structures-1.11.1.jar
-parquet-hadoop/1.11.1//parquet-hadoop-1.11.1.jar
-parquet-jackson/1.11.1//parquet-jackson-1.11.1.jar
+parquet-column/1.12.0//parquet-column-1.12.0.jar
+parquet-common/1.12.0//parquet-common-1.12.0.jar
+parquet-encoding/1.12.0//parquet-encoding-1.12.0.jar
+parquet-format-structures/1.12.0//parquet-format-structures-1.12.0.jar
+parquet-hadoop/1.12.0//parquet-hadoop-1.12.0.jar
+parquet-jackson/1.12.0//parquet-jackson-1.12.0.jar
 protobuf-java/2.5.0//protobuf-java-2.5.0.jar
 py4j/0.10.9.2//py4j-0.10.9.2.jar
 pyrolite/4.30//pyrolite-4.30.jar
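Both dependency manifests must move all six `parquet-*` artifacts together. A small check like the following sketch could confirm that no artifact was left on the old version. It assumes the line shape seen in the hunks, `artifact/version/<field>/jar`, where the third field is empty in every line shown; the function name `parquet_versions` and the `SAMPLE` text are illustrative, not part of Spark's tooling.

```python
import re
from collections import defaultdict

# Line shape inferred from the manifest lines in the diff:
#   artifact/version/<possibly empty field>/jar-file-name
LINE_RE = re.compile(r"^([^/]+)/([^/]+)/([^/]*)/(.+\.jar)$")


def parquet_versions(manifest_text):
    """Map each version string to the parquet-* artifacts pinned at it."""
    by_version = defaultdict(list)
    for line in manifest_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(1).startswith("parquet-"):
            by_version[match.group(2)].append(match.group(1))
    return dict(by_version)


# Excerpt copied from the post-upgrade side of the hunk.
SAMPLE = """\
paranamer/2.8//paranamer-2.8.jar
parquet-column/1.12.0//parquet-column-1.12.0.jar
parquet-common/1.12.0//parquet-common-1.12.0.jar
parquet-encoding/1.12.0//parquet-encoding-1.12.0.jar
parquet-format-structures/1.12.0//parquet-format-structures-1.12.0.jar
parquet-hadoop/1.12.0//parquet-hadoop-1.12.0.jar
parquet-jackson/1.12.0//parquet-jackson-1.12.0.jar
protobuf-java/2.5.0//protobuf-java-2.5.0.jar
"""

versions = parquet_versions(SAMPLE)
# More than one key would mean a mixed-version Parquet classpath.
print(versions)
```

Running the same function over a manifest that still lists a `1.11.1` jar would yield two keys, which makes a partial upgrade easy to spot.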

pom.xml

Lines changed: 2 additions & 2 deletions
@@ -136,7 +136,7 @@
 <kafka.version>2.6.0</kafka.version>
 <!-- After 10.15.1.3, the minimum required version is JDK9 -->
 <derby.version>10.14.2.0</derby.version>
-<parquet.version>1.11.1</parquet.version>
+<parquet.version>1.12.0</parquet.version>
 <orc.version>1.6.7</orc.version>
 <jetty.version>9.4.37.v20210219</jetty.version>
 <jakartaservlet.version>4.0.3</jakartaservlet.version>
@@ -2095,7 +2095,7 @@
 <groupId>${hive.group}</groupId>
 <artifactId>hive-service-rpc</artifactId>
 </exclusion>
-<!-- parquet-hadoop-bundle:1.8.1 conflict with 1.10.1 -->
+<!-- parquet-hadoop-bundle:1.8.1 conflict with 1.12.0 -->
 <exclusion>
 <groupId>org.apache.parquet</groupId>
 <artifactId>parquet-hadoop-bundle</artifactId>
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala

Lines changed: 1 addition & 1 deletion
@@ -1502,7 +1502,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
 Seq(tbl, ext_tbl).foreach { tblName =>
 sql(s"INSERT INTO $tblName VALUES (1, 'a', '2019-12-13')")
 
-val expectedSize = 651
+val expectedSize = 657
 // analyze table
 sql(s"ANALYZE TABLE $tblName COMPUTE STATISTICS NOSCAN")
 var tableStats = getTableStats(tblName)
