[SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands #36304
Conversation
@cloud-fan, here is one way to achieve proper group filtering in row-level operations. While I don't target 3.3 with this functionality, I believe it is still important to finish this before 3.3 is officially out. That way, we still have a chance to change the row-level API if needed before it gets released. Currently, it is fully backward compatible.

cc @rdblue @huaxingao @sunchao @viirya @dongjoon-hyun

Some tests are expected to fail due to SPARK-38977. I have another PR #36303 to fix this.
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/RowLevelOperation.java (outdated review thread, resolved)
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
I want to resume working on this PR but I need feedback on one point.

In the original implementation, @cloud-fan and I discussed supporting a separate scan builder for runtime group filtering in row-level operations. That way, we can prune columns and push down filters while looking for groups that have matches. We can't do that in the main row-level scan for group-based data sources, as non-matching records in matching groups have to be copied over. See PR #35395 for context.

The current idea is to have an optimizer rule that would check if the main scan implements SupportsRuntimeV2Filtering and, if so, inject a runtime group filter. The only challenge is ensuring the same version of the table is scanned in the main row-level scan and in the scan that searches for matching groups to rewrite. There are multiple solutions to consider.

Option 1

The first option is shown in this PR. We can add a new method to RowLevelOperation. Under this implementation, it is up to data sources to ensure the same version is scanned in both scans. It is a fairly simple approach, but it complicates the row-level API. On top of that, the new method is useless for data sources that can handle a delta of rows.

Option 2

The main row-level scan could report which table version it reads, so that Spark itself tracks it and uses the same version when searching for matching groups.

Option 3

The rule that assigns a runtime group filter has access to the original table instance that was loaded for this row-level operation. If we can somehow benefit from reusing that table instance, no new APIs are needed.

Any ideas how to make Option 3 work?

cc @cloud-fan @rdblue @huaxingao @dongjoon-hyun @sunchao @viirya
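To make Option 1 concrete, here is a rough sketch of the kind of API addition it implies. This is an illustration only: the trait, method name, and signature below are assumptions, not the interface that was ultimately merged.

```scala
import org.apache.spark.sql.connector.read.ScanBuilder
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Hypothetical mix-in for RowLevelOperation implementations (Option 1 sketch):
// a dedicated scan builder used only to find groups that contain matching rows,
// so columns can be pruned and filters pushed down during that lookup.
trait SupportsGroupFilterScan {
  def newGroupFilterScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder
}
```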
I talked with @aokolnychyi about this and I think this is a data source problem, not something Spark should track right now. The main problem is that some table sources have different versions, and that's not something we're used to handling. Data sources that don't have different versions are not affected, so option 1 is not great because it forces everyone to deal with a problem only a few sources have. Spark could use option 2 and track this itself, but that complicates the API as well and we don't know that we need it yet. If we do add version/history to Spark, then we'd probably want to add it at that point.

We've also found a reliable way for option 3 to work. The underlying table instance is the same, so the filter method just needs to check that the table instance has not been refreshed or modified when the runtime filter is applied to it. I think that option 3 is the simplest approach in terms of new Spark APIs (none!) and is the right way forward until Spark decides to model tables with multiple versions.
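A minimal sketch of that check on the connector side, assuming a source that remembers the snapshot it was loaded with. All names here are illustrative and not Spark APIs.

```scala
// Illustrative connector-side guard for Option 3: the same table instance serves both
// the main row-level scan and the group filter scan, so it only needs to detect that
// it was not refreshed or modified in between.
class VersionedTableState(loadedSnapshotId: Long) {
  def validateNotRefreshed(currentSnapshotId: Long): Unit = {
    require(currentSnapshotId == loadedSnapshotId,
      s"Table was refreshed: loaded snapshot $loadedSnapshotId, current $currentSnapshotId")
  }
}
```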
I agree, @rdblue. Given that versioned data sources can handle it internally (as the same table instance is used in both scans), I'll go ahead with option 3.
Force-pushed …row-level commands from 4f77298 to 6708f2a.
    .createWithDefault(67108864L)

  val RUNTIME_ROW_LEVEL_OPERATION_GROUP_FILTER_ENABLED =
    buildConf("spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled")
I went back and forth on the name. On one hand, we have dynamic partition pruning. On the other hand, we call it runtime filtering in DS V2. Ideas are welcome.
I also used the spark.sql.optimizer.runtime prefix, like for runtime Bloom filter joins. There are other runtime-related configs that don't use this prefix, so let me know the correct config namespace.
override def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
  // apply special dynamic filtering only for group-based row-level operations
  case GroupBasedRowLevelOperation(replaceData, cond,
      DataSourceV2ScanRelation(_, scan: SupportsRuntimeV2Filtering, _, _, _))
This is the optimizer rule that checks whether the primary row-level scan supports runtime filtering. A data source only has to implement SupportsRuntimeV2Filtering in its scan to benefit from the new functionality.
Also, the runtime group filter uses the existing framework for runtime filtering in DS V2, meaning we get all the benefits like subquery reuse, etc.
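For connector authors, a minimal sketch of what opting in could look like. The class, field, and the `part` grouping column are illustrative assumptions; the overridden methods follow the SupportsRuntimeV2Filtering contract as I understand it in Spark 3.4.

```scala
import org.apache.spark.sql.connector.expressions.{Expressions, NamedReference}
import org.apache.spark.sql.connector.expressions.filter.Predicate
import org.apache.spark.sql.connector.read.{Scan, SupportsRuntimeV2Filtering}
import org.apache.spark.sql.types.StructType

// Sketch of a scan that opts into runtime filtering. When Spark applies the runtime
// group filter, filter() is called and the source can drop groups (files/partitions)
// that contain no matching rows before the rewrite job runs.
class GroupFilterableScan(schema: StructType) extends Scan with SupportsRuntimeV2Filtering {
  private var runtimePredicates: Array[Predicate] = Array.empty

  override def readSchema(): StructType = schema

  // the grouping column(s) Spark is allowed to filter on at runtime
  override def filterAttributes(): Array[NamedReference] = Array(Expressions.column("part"))

  override def filter(predicates: Array[Predicate]): Unit = {
    runtimePredicates = predicates
  }
}
```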
// use the original table instance that was loaded for this row-level operation
// in order to leverage a regular batch scan in the group filter query
val originalTable = r.relation.table.asRowLevelOperationTable.table
val relation = r.relation.copy(table = originalTable)
We build a DataSourceV2Relation here so that the group filter scan can prune columns and push down filters.
@cloud-fan @rdblue @huaxingao @dongjoon-hyun @sunchao @viirya, I've updated this PR and it should be ready for a detailed review round whenever you have a minute.
}

class GroupBasedDeleteFromTableSuite extends DeleteFromTableSuiteBase {
Could you create GroupBasedDeleteFromTableSuite.scala?
@dongjoon-hyun, do you mean move this class into its own file? I can surely do that.
Yes~
+1
Done. I also renamed the original file as DeleteFromTableSuiteBase is the only class now.
| "discard groups that don't have to be rewritten.") | ||
| .version("3.4.0") | ||
| .booleanConf | ||
| .createWithDefault(true) |
I agree with starting with true in this case.
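For completeness, a small sketch of toggling the new flag per session, assuming a local SparkSession; the conf name matches the one added in this PR.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
// enabled by default in 3.4.0; sources that prefer to rely only on
// planning-time data skipping can turn it off
spark.conf.set("spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled", "false")
```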
+1 for AS-IS implementation (with only minor comments). Thank you, @aokolnychyi and @rdblue.
+1, this PR looks good to me.
    properties: util.Map[String, String])
  extends InMemoryTable(name, schema, partitioning, properties) with SupportsRowLevelOperations {

  var replacedPartitions: Seq[Seq[Any]] = Seq.empty
Maybe add a comment to mention this is for testing.
Added a comment above.
Merged to master for Apache Spark 3.4.0. Also, cc @cloud-fan
Thank you for reviewing, @dongjoon-hyun @viirya @huaxingao @rdblue! Also, thanks @cloud-fan for the original discussion we had around this feature.
 *
 * Note this rule only applies to group-based row-level operations.
 */
case class RowLevelOperationRuntimeGroupFiltering(optimizeSubqueries: Rule[LogicalPlan])
Why do we need to pass the rule as a parameter? Can't we call OptimizeSubqueries directly in this rule?
I also thought about this, but I think it's hard to reference OptimizeSubqueries outside Optimizer, since the former is more like an "inner class" of the latter and references the current instance of the latter in itself (i.e. Optimizer.this.execute(Subquery.fromExpression(s))).
@sunchao is correct. It wasn't easy to call OptimizeSubqueries outside Optimizer. Hence, I had to come up with this workaround.
@cloud-fan, I also considered simply adding OptimizeSubqueries to the batch with runtime partition filtering. However, SPARK-36444 specifically removed it from there.
An alternative idea could be to move OptimizeSubqueries into its own file. However, that's tricky too as it calls the optimizer: Optimizer.this.execute(Subquery.fromExpression(s)).
case r: DataSourceV2Relation if r eq relation =>
  val oldOutput = r.output
  val newOutput = oldOutput.map(_.newInstance())
  r.copy(output = newOutput) -> oldOutput.zip(newOutput)
nit:

  val newRelation = r.newInstance
  newRelation -> r.output.zip(newRelation.output)
}

// optimize subqueries to rewrite them as joins and trigger job planning
replaceData.copy(query = optimizeSubqueries(newQuery))
Does this mean we revert what we did in RewriteDeleteFromTable before?
Not really, @cloud-fan. This rule simply attaches a runtime filter to the plan that was created while rewriting the delete. We do replace the query but it is pretty much the same plan, just with an extra runtime filter.
This is much simpler than I expected, great design! Sorry for the late review.
| Batch("PartitionPruning", Once, | ||
| PartitionPruning) :+ | ||
| PartitionPruning, | ||
| RowLevelOperationRuntimeGroupFiltering(OptimizeSubqueries)) :+ |
I think another idea is to run OptimizeSubqueries in this batch:
PartitionPruning,
RowLevelOperationRuntimeGroupFiltering,
// The rule above may create subqueries, need to optimize them.
OptimizeSubqueries
This would be much cleaner but SPARK-36444 removed OptimizeSubqueries from that batch.
I will be off until next Monday. I'll address the comments then. Thanks for taking a look, @cloud-fan!
I still remember about following up on this and the other PR. Slowly getting there.
…untimeGroupFiltering

### What changes were proposed in this pull request?
This PR is to address the feedback on PR #36304 after that change was merged.

### Why are the changes needed?
These changes are needed for better code quality.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

Closes #38526 from aokolnychyi/spark-38959-follow-up.

Authored-by: aokolnychyi <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
…d optimize subqueries

### What changes were proposed in this pull request?
This is a followup to #36304 to simplify `RowLevelOperationRuntimeGroupFiltering`. It does 3 things:
1. run `OptimizeSubqueries` in the batch `PartitionPruning`, so that `RowLevelOperationRuntimeGroupFiltering` does not need to invoke it manually.
2. skip dpp subqueries in `OptimizeSubqueries`, to avoid the issue fixed by #33664
3. `RowLevelOperationRuntimeGroupFiltering` creates `InSubquery` instead of `DynamicPruningSubquery`, so that it can be optimized by `OptimizeSubqueries` later. This also avoids unnecessary planning overhead of `DynamicPruningSubquery`, as there is no join and we can only run it as a subquery.

### Why are the changes needed?
code simplification

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
existing tests

Closes #38557 from cloud-fan/help.

Lead-authored-by: Wenchen Fan <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
This PR adds runtime group filtering for group-based row-level operations.
Why are the changes needed?
These changes are needed to avoid rewriting unnecessary groups: data skipping during job planning is limited and can still report false-positive groups to rewrite.
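As an illustration (the catalog, table, and predicate below are made up; this assumes a SparkSession `spark` and a catalog backed by a group-based, copy-on-write source):

```scala
// Deleting a small slice of a large table rewrites whole groups (files/partitions).
// Planning-time stats (e.g. min/max) may still select groups with no matching rows;
// the runtime group filter first runs a cheap query for groups that actually contain
// matches and rewrites only those.
spark.sql("DELETE FROM testcat.db.orders WHERE status = 'CANCELLED'")
```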
Does this PR introduce any user-facing change?
This PR leverages existing APIs.
How was this patch tested?
This PR comes with tests.