[SPARK-17031][SQL] Add Scanner operator to wrap the optimized plan directly in planner
#14619
Test build #63679 has finished for PR 14619 at commit
Could you elaborate on what you are trying to do here? I am missing context here. What is the advantage of doing this?
@hvanhovell This idea was inspired by @cloud-fan. As he stated in his comment, we'd better have a wrapper node for the scan, so that the planner can match the wrapper node directly instead of resolving the whole plan using
Test build #63696 has finished for PR 14619 at commit
I'm pretty confused by this as well. Is this just collapsing Filter, Project, and an arbitrary node into a single logical node?
See the discussion here: #13893 (comment). Currently we collect the projects and filters on the scan node in the planner by
This doesn't reuse the existing ColumnPruning and PushDownPredicate rules. I'd like to add the wrapper at the very beginning, e.g. in SessionCatalog.lookupRelation.
Roger that, will change it and retest.
Test build #64533 has finished for PR 14619 at commit
Test build #64544 has finished for PR 14619 at commit
@cloud-fan I've moved the
Firstly, scanning a relation is not among the basic operators in the SQL language. When we declare a relation, we imply that it should be scanned, so it seems semantically redundant to declare a
Secondly, a wrapper node should contain the output, the predicates that can be used in partition pruning, and the relation to be scanned. But this may lead to complex situations in some cases, for example, in
Lastly, I feel that adding such an operator has caused too many changes; perhaps we should make some improvements on
After all, I'm passionate about this improvement and will try my best to contribute. Please correct me if I'm wrong, thank you!
Test build #65033 has finished for PR 14619 at commit
Test build #65044 has finished for PR 14619 at commit
What changes were proposed in this pull request?
Added a Scanner operator to wrap the optimized plan directly in the planner; it holds the project list as well as the filter predicates.
Updated the related Analyzer and Optimizer rules.
How was this patch tested?
Existing test cases.
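For readers following the discussion, the idea of a wrapper node that holds a project list and filter predicates over a relation can be illustrated with a minimal sketch. This is not the code in this PR and does not use Spark's Catalyst classes: `ScannerSketch`, `Plan`, `Relation`, `Filter`, `Project`, `Join` and `collapse` are hypothetical stand-ins, expressions are reduced to plain strings, and the fields of `Scanner` are an assumption based only on the description above.

```scala
object ScannerSketch {
  // Toy logical plan; these are NOT Spark's Catalyst types.
  sealed trait Plan
  case class Relation(name: String, columns: Seq[String]) extends Plan
  case class Filter(condition: String, child: Plan) extends Plan
  case class Project(columns: Seq[String], child: Plan) extends Plan
  case class Join(left: Plan, right: Plan) extends Plan

  // Hypothetical wrapper node: project list, filter predicates, and the
  // relation to be scanned, all held in a single logical node.
  case class Scanner(
      projectList: Seq[String],
      filters: Seq[String],
      relation: Relation) extends Plan

  // Collapse a chain of Project/Filter nodes that sits directly on top of a
  // relation into one Scanner. Returns None if the chain does not end in a
  // relation (for example, when a Join is encountered).
  def collapse(plan: Plan): Option[Scanner] = plan match {
    case r: Relation =>
      // A bare relation outputs all of its columns and has no filters.
      Some(Scanner(r.columns, Nil, r))
    case Filter(cond, child) =>
      // Filters closer to the relation come first in the collected list.
      collapse(child).map(s => s.copy(filters = s.filters :+ cond))
    case Project(cols, child) =>
      // The outermost projection decides the final output columns.
      collapse(child).map(s => s.copy(projectList = cols))
    case s: Scanner => Some(s)
    case _: Join => None
  }
}
```

With such a node in place, a planner strategy in this toy model could match `case Scanner(projectList, filters, relation)` directly, rather than walking the Project and Filter nodes above the relation each time.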