Commit b04330c

huaxingao authored and dongjoon-hyun committed
[SPARK-36454][SQL] Not push down partition filter to ORCScan for DSv2
### What changes were proposed in this pull request?

Do not push down partition filters to `ORCScan` for DSv2.

### Why are the changes needed?

Partition filters are only used for partition pruning and should not be pushed down to `ORCScan`. We do not push down partition filters to the ORC scan in DSv1:

```
== Physical Plan ==
*(1) Filter (isnotnull(value#19) AND NOT (value#19 = a))
+- *(1) ColumnarToRow
   +- FileScan orc [value#19,p1#20,p2#21] Batched: true, DataFilters: [isnotnull(value#19), NOT (value#19 = a)], Format: ORC, Location: InMemoryFileIndex(1 paths)[file:/private/var/folders/pt/_5f4sxy56x70dv9zpz032f0m0000gn/T/spark-c1..., PartitionFilters: [isnotnull(p1#20), isnotnull(p2#21), (p1#20 = 1), (p2#21 = 2)], PushedFilters: [IsNotNull(value), Not(EqualTo(value,a))], ReadSchema: struct<value:string>
```

We also do not push down partition filters for Parquet in DSv2 (#30652).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing test suites.

Closes #33680 from huaxingao/orc_filter.

Authored-by: Huaxin Gao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
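The idea behind the change is that predicates over partition columns are resolved at planning time by pruning partition directories, while only predicates over data columns should reach the file reader. A minimal, self-contained sketch of that split (hypothetical `Filter` model and `split` helper, not Spark's actual API):

```scala
// Illustrative sketch, not Spark source: partition-column predicates drive
// directory pruning at planning time; only data-column predicates should be
// handed to the scan for pushdown.
object FilterSplitSketch {
  // Hypothetical minimal filter model: a column name plus a predicate string.
  final case class Filter(column: String, predicate: String)

  // Partition filters (over partition columns) go to pruning; the rest are
  // data filters eligible for pushdown to the reader.
  def split(
      filters: Seq[Filter],
      partitionCols: Set[String]): (Seq[Filter], Seq[Filter]) =
    filters.partition(f => partitionCols.contains(f.column))

  def main(args: Array[String]): Unit = {
    val filters = Seq(Filter("p1", "= 1"), Filter("value", "> 2"))
    val (partitionFilters, dataFilters) = split(filters, Set("p1", "p2"))
    println(partitionFilters.map(_.column).mkString(",")) // p1
    println(dataFilters.map(_.column).mkString(","))      // value
  }
}
```

With this separation, the ORC reader never sees `p1 = 1`; it only receives the `value` predicate, matching the DSv1 plan shown above.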
1 parent 33c6d11 commit b04330c

File tree

2 files changed: +3 −2


sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcScanBuilder.scala

Lines changed: 2 additions & 1 deletion

```diff
@@ -53,7 +53,8 @@ case class OrcScanBuilder(

   override def pushFilters(filters: Array[Filter]): Array[Filter] = {
     if (sparkSession.sessionState.conf.orcFilterPushDown) {
-      val dataTypeMap = OrcFilters.getSearchableTypeMap(schema, SQLConf.get.caseSensitiveAnalysis)
+      val dataTypeMap = OrcFilters.getSearchableTypeMap(
+        readDataSchema(), SQLConf.get.caseSensitiveAnalysis)
       _pushedFilters = OrcFilters.convertibleFilters(dataTypeMap, filters).toArray
     }
     filters
```
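The fix swaps the full table `schema` for `readDataSchema()` when building the searchable type map, so partition columns (which are not stored in the ORC data files) can never be judged convertible. A simplified sketch of that effect, with hypothetical names rather than Spark's `OrcFilters` API:

```scala
// Illustrative sketch (hypothetical names, not Spark's OrcFilters API):
// building the searchable type map from the read data schema, rather than
// the full table schema, excludes partition columns from pushdown.
object SearchableTypeMapSketch {
  type Schema = Map[String, String] // column name -> type name

  // Partition columns live in directory paths, not in the ORC files, so
  // only the read data schema contributes searchable columns.
  def searchableTypeMap(readDataSchema: Schema): Schema = readDataSchema

  // A filter column is "convertible" only if it appears in the type map.
  def convertibleColumns(columns: Seq[String], typeMap: Schema): Seq[String] =
    columns.filter(typeMap.contains)

  def main(args: Array[String]): Unit = {
    // Full schema would be Map("value" -> "string", "p1" -> "int");
    // the read data schema omits the partition column p1.
    val dataSchema: Schema = Map("value" -> "string")
    val typeMap = searchableTypeMap(dataSchema)
    println(convertibleColumns(Seq("value", "p1"), typeMap).mkString(",")) // value
  }
}
```

This mirrors the `ExplainSuite` change below: after the fix, ORC's `PushedFilters` mention only `value`, matching Parquet, CSV, and JSON.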

sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala

Lines changed: 1 addition & 1 deletion

```diff
@@ -460,7 +460,7 @@ class ExplainSuite extends ExplainSuiteHelper with DisableAdaptiveExecutionSuite
       "parquet" ->
         "|PushedFilters: \\[IsNotNull\\(value\\), GreaterThan\\(value,2\\)\\]",
       "orc" ->
-        "|PushedFilters: \\[.*\\(id\\), .*\\(value\\), .*\\(id,1\\), .*\\(value,2\\)\\]",
+        "|PushedFilters: \\[IsNotNull\\(value\\), GreaterThan\\(value,2\\)\\]",
       "csv" ->
         "|PushedFilters: \\[IsNotNull\\(value\\), GreaterThan\\(value,2\\)\\]",
       "json" ->
```
