Skip to content

Conversation

@cloud-fan
Copy link
Contributor

backport #30777 to 3.0


What changes were proposed in this pull request?

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution AlreadyOptimized is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the analyzed flag to true after optimization, and analyzer will skip processing plans whose analyzed flag is true.

Why are the changes needed?

make the code simpler and safer

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing tests.

…analyzed

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution `AlreadyOptimized` is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the `analyzed` flag to true after optimization, and analyzer will skip processing plans whose `analyzed` flag is true.

make the code simpler and safer

no

existing tests.

Closes apache#30777 from cloud-fan/ds.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit b4bea1a)
Signed-off-by: HyukjinKwon <[email protected]>
@cloud-fan cloud-fan changed the title [SPARK-28863][SQL][FOLLOWUP] Make sure optimized plan will not be re-analyzed [SPARK-28863][SQL][FOLLOWUP][3.0] Make sure optimized plan will not be re-analyzed Dec 21, 2020
@cloud-fan
Copy link
Contributor Author

cc @HyukjinKwon

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37769/

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37769/

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Test build #133170 has finished for PR 30872 at commit ccc1807.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to branch-3.0.

HyukjinKwon pushed a commit that referenced this pull request Dec 22, 2020
…e re-analyzed

backport #30777 to 3.0

----------

### What changes were proposed in this pull request?

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution `AlreadyOptimized` is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the `analyzed` flag to true after optimization, and analyzer will skip processing plans whose `analyzed` flag is true.

### Why are the changes needed?

make the code simpler and safer

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests.

Closes #30872 from cloud-fan/ds.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants