-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[Spark-20145] Fix range case insensitive bug in SQL #17487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators { | ||
| case u: UnresolvedTableValuedFunction if u.functionArgs.forall(_.resolved) => | ||
| builtinFunctions.get(u.functionName) match { | ||
| builtinFunctions.get(u.functionName.toLowerCase) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the function name is case sensitive in Hive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? There are no table valued functions in Hive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samelamin you have to make this dependent on the case sensitivity setting of the analyzer. This means you will have to turn this object into case class that takes a SQLConf as its argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok cool thanks @hvanhovell ill amend it over the weekend. Do you know how I can run a specific set of tests on SBT?
im getting garbage collection errors when I try to run using sbt test-only form http://spark.apache.org/developer-tools.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example,
build/sbt "catalyst/test-only org.apache.spark.sql.catalyst.optimizer.ColumnPruningSuite"
build/sbt -Phive "hive/test-only org.apache.spark.sql.hive.execution.SQLQuerySuite"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hvanhovell instead of creating a new case class, is there a way I can reuse the UnresolvedTableValuedFunction case class and just add in the SQLConf class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, @gatorsmile is right. We don't do case sensitive resolution for functions. So you can ignore my comment, and undo the last change you have made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So would you like me to revert the change to the case class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry about that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I can revert but how will we test it since the apply method isn't being called by the tests. They new up an unresolvedTableValuedFunction? Unless we hardcore it in unresolvedTableValuedFunction to always be case insensitive then I don't know how else to do it?
|
ok to test |
|
Test build #75426 has finished for PR 17487 at commit
|
|
based on comments from @hvanhovell I am depending on the case sensitivity setting of the analyser. That said I had to make the functionName a var to change the value to lower case which feels like a code smell to me. I am happy with suggestions to how I can improve on it |
|
Test build #75453 has finished for PR 17487 at commit
|
| * }}} | ||
| */ | ||
| case class UnresolvedTableValuedFunction(functionName: String, functionArgs: Seq[Expression]) | ||
| case class UnresolvedTableValuedFunction(conf: SQLConf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samelamin this is a moot point now, but try to avoid mutable state as much as you can. In this case it would have been better to pass the SQLConf class as a parameter to the ResolveTableValuedFunction, and deal with case-sensitivity there.
|
Ok I was planning on doing that but unfortunately all the tests don't go through the resolveTableValuedFuntion but new up an UnTableValued. I totally agree with you with regards to state. What we can do is create a method that calculates the function name inside the unresolvedTableValuedFunction which then has the dependency on the conf. How does that sound? |
|
Test build #75460 has finished for PR 17487 at commit
|
|
Maybe @gatorsmile or @srowen have an idea. How can I test the case insensitive unresolvedTableValuedFunction without using a var for the functionName. I'm open to all ideas folks! introducing a var just for the tests feels inheritely incorrect. |
|
FYI I can squash these commits to clean up the history once we agree on how best to approach the testing |
|
Test build #75462 has finished for PR 17487 at commit
|
|
Test build #75463 has finished for PR 17487 at commit
|
|
@samelamin IMHO, you do not need a unit test case. We can just add end-to-end test cases. See the code changes I made in my private branch. (BTW, it would be better if you add concise comments above the code change) To generate the new result file after you add the new queries into the |
|
@gatorsmile thanks so much for that, i spent over 2 hours getting that sql file correct manually! what was frustrating was after all that i had to revert my changes 😢 Atleast I know this now so it wasnt a complete waste! Anyways I updated the code to remove the vars How does it look now folks |
|
Test build #75480 has finished for PR 17487 at commit
|
|
Test build #75481 has finished for PR 17487 at commit
|
|
LGTM cc @hvanhovell @cloud-fan |
|
LGTM |
|
perfect! Can it be merged then? |
|
Merging in master. |
What changes were proposed in this pull request?
Range in SQL should be case insensitive
How was this patch tested?
unit test