[SPARK-24345][SQL]Improve ParseError stop location when offending symbol is a token #21334

rubenfiszel · 2018-05-15T15:47:45Z

In the case where the offending symbol is a CommonToken, this PR increases the accuracy of the start and stop origin by leveraging the start and stop index information from CommonToken.

Fix character to be relative to the current line
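The commit message does not spell out what changed. As general background, ANTLR tokens carry both an absolute character offset into the whole input (`getStartIndex`) and a 0-based column within their line (`getCharPositionInLine`), and the two differ as soon as the input spans more than one line. The sketch below illustrates that difference with plain strings; `columnOf` is a hypothetical helper written for this example and is not part of Spark or ANTLR.

```scala
object LineRelative {
  // Hypothetical helper: converts an absolute 0-based offset into a
  // 0-based column on the line that contains it.
  // Assumes '\n' line terminators and 0 <= offset <= input.length.
  def columnOf(input: String, offset: Int): Int = {
    // Index of the last newline strictly before `offset`, or -1 on the first line.
    val lastNewline = input.lastIndexOf('\n', offset - 1)
    offset - (lastNewline + 1)
  }

  def main(args: Array[String]): Unit = {
    val sql = "SELECT *\nFROM t\nWHERE"
    val offset = sql.indexOf("WHERE")
    println(s"absolute offset = $offset, column = ${columnOf(sql, offset)}")
  }
}
```

Here `WHERE` sits at absolute offset 16 but at column 0 of the third line, so an error range built from the absolute offset would point at the wrong place.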

ash211 · 2018-05-15T20:58:06Z

Hi @rubenfiszel thanks for the contribution! Can you please take a glance through http://spark.apache.org/contributing.html to see the best way to get your change merged into Apache Spark?

I'd suggest you:

- file an issue at https://issues.apache.org/jira/projects/SPARK and put that in the PR title
- include a test that verifies the fix is working as you expect

Cheers!

rubenfiszel · 2018-05-22T12:50:32Z

@ash211 As requested, implemented tests and created the associated ticket

gatorsmile · 2019-03-31T03:00:03Z

cc @maropu @dilipbiswal

srowen

Seems reasonable to me

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala

SparkQA · 2019-03-31T16:40:05Z

Test build #4669 has finished for PR 21334 at commit e16ee34.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2019-04-01T04:05:33Z

In the PR description, could you add a simple example of a parser error whose location this PR makes more accurate?

maropu · 2019-04-01T04:06:14Z

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala

 class ErrorParserSuite extends SparkFunSuite {
-  def intercept(sql: String, line: Int, startPosition: Int, messages: String*): Unit = {
+  def intercept(sql: String, line: Int, startPosition: Int, stopPosition: Int,
+                messages: String*): Unit = {


nit:

def intercept(
    sql: String,
    line: Int,
    startPosition: Int,
    stopPosition: Int,
    messages: String*): Unit = {

dilipbiswal · 2019-04-01T08:02:08Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala

-    throw new ParseException(None, msg, position, position)
+    val (start, stop) = offendingSymbol match {
+      case token: CommonToken =>
+        val start = Origin(Some(line), Some(token.getCharPositionInLine))


Seems like the computation of start can be moved outside? Only the computation of stop differs between CommonToken and non-CommonToken symbols?

Also, just for my understanding, can you please briefly explain the difference between CommonToken and the other ones?

srowen · 2019-04-04

It's not exactly the same code, but does it have the same result? Looking OK to me but @rubenfiszel could you comment?

rubenfiszel · 2019-04-04

From a pure code point of view, it's not equivalent, since it uses token.getCharPositionInLine instead of the method argument charPositionInLine.

It might be equivalent, but that would require an invariant to hold (method argument charPositionInLine == token.getCharPositionInLine), which seems unnecessary since the intent of this specific case is to leverage the information from the CommonToken directly.

The difference between CommonToken and other types of offending symbols is that, for a CommonToken, it is clear where the stop is.

We use this internally on our fork of Spark to get nice language-server-protocol errors that are correctly delimited.

dilipbiswal · 2019-04-04

@rberenguel thanks for your explanation.
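To make the CommonToken versus other-symbol distinction from this thread concrete, here is a minimal sketch of the range computation. It is not the code in ParseDriver.scala: `Origin`, `Tok`, and `ErrorRange.errorRange` are stand-ins invented for this example so that it runs without ANTLR or Spark on the classpath. `Tok` mirrors only the three `CommonToken` accessors that matter here (`getStartIndex`, `getStopIndex`, `getCharPositionInLine`), and the sketch assumes the token does not span multiple lines. ANTLR's stop index is inclusive, hence the `+ 1`; in this sketch the resulting stop column is exclusive.

```scala
// Stand-in for Spark's Origin: an optional line and an optional column.
final case class Origin(line: Option[Int], startPosition: Option[Int])

// Hypothetical stand-in for ANTLR's CommonToken.
// startIndex/stopIndex are absolute, inclusive character offsets into the input;
// charPositionInLine is the 0-based column of the token's first character.
final case class Tok(startIndex: Int, stopIndex: Int, charPositionInLine: Int)

object ErrorRange {
  // Returns (start, stop) for a syntax error reported at `line`.
  // `charPositionInLine` plays the role of the listener's method argument.
  def errorRange(offendingSymbol: Any, line: Int, charPositionInLine: Int): (Origin, Origin) =
    offendingSymbol match {
      case t: Tok =>
        // The token knows its own extent, so stop can point just past its last character.
        val length = t.stopIndex - t.startIndex + 1
        val start = Origin(Some(line), Some(t.charPositionInLine))
        val stop = Origin(Some(line), Some(t.charPositionInLine + length))
        (start, stop)
      case _ =>
        // No extent information: fall back to a zero-width range.
        val start = Origin(Some(line), Some(charPositionInLine))
        (start, start)
    }
}
```

For a five-character token starting at column 7 on line 1, `ErrorRange.errorRange(Tok(7, 11, 7), 1, 7)` yields a start at column 7 and a stop at column 12; for any other kind of symbol, start and stop coincide.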

SparkQA · 2019-04-04T18:37:54Z

Test build #4684 has finished for PR 21334 at commit 4603235.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2019-04-04T23:21:20Z

Merged to master

Improve ParseErrorListener range accuracy when offending symbol is a …

a870c39

rubenfiszel changed the title ~~Improve ParseError stop location when offending symbol is a token~~ [minor][SQL]Improve ParseError stop location when offending symbol is a token May 15, 2018

This was referenced May 15, 2018

Improve ParseError stop location when offending symbol is a token palantir/spark#374

Closed

Improve ParseError stop location when offending symbol is a token palantir/spark#375

Merged

Fix character to be relative to the current line

acc594a

Ruben Fiszel and others added 2 commits May 16, 2018 00:58

Final fix

fc3341d

added test

882ac38

rubenfiszel changed the title ~~[minor][SQL]Improve ParseError stop location when offending symbol is a token~~ [SPARK-24345][SQL]Improve ParseError stop location when offending symbol is a token May 22, 2018

srowen reviewed Mar 31, 2019

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala (outdated)

Ruben Fiszel added 2 commits March 31, 2019 17:12

checkstyle parens

fcb8970

Merge branch 'master' into patch-1

e16ee34

Update ErrorParserSuite.scala

4603235

maropu reviewed Apr 1, 2019

dilipbiswal reviewed Apr 1, 2019

srowen approved these changes Apr 4, 2019

srowen closed this in 0e44a51 Apr 4, 2019
