-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-24345][SQL]Improve ParseError stop location when offending symbol is a token #21334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
a870c39
Improve ParseErrorListener range accuracy when offending symbol is a …
acc594a
Fix character to be relative to the current line
fc3341d
Final fix
882ac38
added test
fcb8970
checkstyle parens
e16ee34
Merge branch 'master' into patch-1
4603235
Update ErrorParserSuite.scala
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,14 +22,17 @@ import org.apache.spark.SparkFunSuite | |
| * Test various parser errors. | ||
| */ | ||
| class ErrorParserSuite extends SparkFunSuite { | ||
| def intercept(sql: String, line: Int, startPosition: Int, messages: String*): Unit = { | ||
| def intercept(sql: String, line: Int, startPosition: Int, stopPosition: Int, | ||
| messages: String*): Unit = { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: |
||
| val e = intercept[ParseException](CatalystSqlParser.parsePlan(sql)) | ||
|
|
||
| // Check position. | ||
| assert(e.line.isDefined) | ||
| assert(e.line.get === line) | ||
| assert(e.startPosition.isDefined) | ||
| assert(e.startPosition.get === startPosition) | ||
| assert(e.stop.startPosition.isDefined) | ||
| assert(e.stop.startPosition.get === stopPosition) | ||
|
|
||
| // Check messages. | ||
| val error = e.getMessage | ||
|
|
@@ -39,23 +42,24 @@ class ErrorParserSuite extends SparkFunSuite { | |
| } | ||
|
|
||
| test("no viable input") { | ||
| intercept("select ((r + 1) ", 1, 16, "no viable alternative at input", "----------------^^^") | ||
| intercept("select ((r + 1) ", 1, 16, 16, | ||
| "no viable alternative at input", "----------------^^^") | ||
| } | ||
|
|
||
| test("extraneous input") { | ||
| intercept("select 1 1", 1, 9, "extraneous input '1' expecting", "---------^^^") | ||
| intercept("select *\nfrom r as q t", 2, 12, "extraneous input", "------------^^^") | ||
| intercept("select 1 1", 1, 9, 10, "extraneous input '1' expecting", "---------^^^") | ||
| intercept("select *\nfrom r as q t", 2, 12, 13, "extraneous input", "------------^^^") | ||
| } | ||
|
|
||
| test("mismatched input") { | ||
| intercept("select * from r order by q from t", 1, 27, | ||
| intercept("select * from r order by q from t", 1, 27, 31, | ||
| "mismatched input", | ||
| "---------------------------^^^") | ||
| intercept("select *\nfrom r\norder by q\nfrom t", 4, 0, "mismatched input", "^^^") | ||
| intercept("select *\nfrom r\norder by q\nfrom t", 4, 0, 4, "mismatched input", "^^^") | ||
| } | ||
|
|
||
| test("semantic errors") { | ||
| intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, | ||
| intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, 11, | ||
| "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not supported", | ||
| "^^^") | ||
| } | ||
|
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like computation of
startcan be moved outside ? Only the computation of stop is different between commonToken and non common tokens ?Also, just for my understanding, can you please briefly explain the difference between the common token and other ones ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not exactly the same code, but does it have the same result? Looking OK to me but @rubenfiszel could you comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a pure code point of view, it's not equivalent since it is using the token.getCharPositionInline instead of the method arg.
It might be equivalent but that would require an invariant to hold (method getCharPositionInline == token.getCharPositionInLine) that seems unnecessary since the intent of this specific case is to leverage the informations from the CommonToken directly.
The difference between CommonToken and other types of offending symbols is that it is clear for CommonToken where is the stop.
We use this internally on our fork of spark to get nice language-server-protocol errors that are correctly delimited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rberenguel thanks for your explanation.