Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution AlreadyOptimized is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the analyzed flag to true after optimization, and analyzer will skip processing plans whose analyzed flag is true.

Why are the changes needed?

make the code simpler and safer

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing tests.

@SparkQA
Copy link

SparkQA commented Dec 15, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37413/

@github-actions github-actions bot added the SQL label Dec 15, 2020
@SparkQA
Copy link

SparkQA commented Dec 15, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37413/

@SparkQA
Copy link

SparkQA commented Dec 15, 2020

Test build #132812 has finished for PR 30777 at commit bcca0d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

cc @brkyvz

@maropu
Copy link
Member

maropu commented Dec 20, 2020

The change looks reasonable to me.

HyukjinKwon pushed a commit that referenced this pull request Dec 21, 2020
…analyzed

### What changes were proposed in this pull request?

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution `AlreadyOptimized` is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the `analyzed` flag to true after optimization, and analyzer will skip processing plans whose `analyzed` flag is true.

### Why are the changes needed?

make the code simpler and safer

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests.

Closes #30777 from cloud-fan/ds.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit b4bea1a)
Signed-off-by: HyukjinKwon <[email protected]>
@HyukjinKwon
Copy link
Member

Merged to master and branch-3.1.

@cloud-fan, it has a conflict with branch-3.0. Do you mind opening a backport PR?

@brkyvz
Copy link
Contributor

brkyvz commented Dec 21, 2020

late LGTM. I believe this is a much more robust solution

cloud-fan added a commit to cloud-fan/spark that referenced this pull request Dec 21, 2020
…analyzed

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution `AlreadyOptimized` is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the `analyzed` flag to true after optimization, and analyzer will skip processing plans whose `analyzed` flag is true.

make the code simpler and safer

no

existing tests.

Closes apache#30777 from cloud-fan/ds.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit b4bea1a)
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Dec 22, 2020
…e re-analyzed

backport #30777 to 3.0

----------

### What changes were proposed in this pull request?

It's a known issue that re-analyzing an optimized plan can lead to various issues. We made several attempts to avoid it from happening, but the current solution `AlreadyOptimized` is still not 100% safe, as people can inject catalyst rules to call analyzer directly.

This PR proposes a simpler and safer idea: we set the `analyzed` flag to true after optimization, and analyzer will skip processing plans whose `analyzed` flag is true.

### Why are the changes needed?

make the code simpler and safer

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests.

Closes #30872 from cloud-fan/ds.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants