Skip to content

Conversation

@rubenfiszel
Copy link

In the case where the offending symbol is a CommonToken, this PR increases the accuracy of the start and stop origin by leveraging the start and stop index information from CommonToken.

…token

In the case where the offending symbol is a CommonToken, this PR increases the accuracy of the start and stop origin by leveraging the start and stop index information from CommonToken.
@rubenfiszel rubenfiszel changed the title Improve ParseError stop location when offending symbol is a token [minor][SQL]Improve ParseError stop location when offending symbol is a token May 15, 2018
Fix character to be relative to the current line
@ash211
Copy link
Contributor

ash211 commented May 15, 2018

Hi @rubenfiszel thanks for the contribution! Can you please take a glance through http://spark.apache.org/contributing.html to see the best way to get your change merged into Apache Spark?

I'd suggest you:

Cheers!

Ruben Fiszel and others added 2 commits May 16, 2018 00:58
@rubenfiszel rubenfiszel changed the title [minor][SQL]Improve ParseError stop location when offending symbol is a token [SPARK-24345][SQL]Improve ParseError stop location when offending symbol is a token May 22, 2018
@rubenfiszel
Copy link
Author

@ash211 As requested, implemented tests and created the associated ticket

@gatorsmile
Copy link
Member

cc @maropu @dilipbiswal

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me

@SparkQA
Copy link

SparkQA commented Mar 31, 2019

Test build #4669 has finished for PR 21334 at commit e16ee34.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Apr 1, 2019

In the PR description, could you put the simple example that this pr could make more accurate in parser errors?

class ErrorParserSuite extends SparkFunSuite {
def intercept(sql: String, line: Int, startPosition: Int, messages: String*): Unit = {
def intercept(sql: String, line: Int, startPosition: Int, stopPosition: Int,
messages: String*): Unit = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

  def intercept(
      sql: String,
      line: Int,
      startPosition: Int,
      stopPosition: Int,
      messages: String*): Unit = {

throw new ParseException(None, msg, position, position)
val (start, stop) = offendingSymbol match {
case token: CommonToken =>
val start = Origin(Some(line), Some(token.getCharPositionInLine))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like computation of start can be moved outside ? Only the computation of stop is different between commonToken and non common tokens ?

Also, just for my understanding, can you please briefly explain the difference between the common token and other ones ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not exactly the same code, but does it have the same result? Looking OK to me but @rubenfiszel could you comment?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a pure code point of view, it's not equivalent since it is using the token.getCharPositionInline instead of the method arg.

It might be equivalent but that would require an invariant to hold (method getCharPositionInline == token.getCharPositionInLine) that seems unnecessary since the intent of this specific case is to leverage the informations from the CommonToken directly.

The difference between CommonToken and other types of offending symbols is that it is clear for CommonToken where is the stop.

We use this internally on our fork of spark to get nice language-server-protocol errors that are correctly delimited.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rberenguel thanks for your explanation.

@SparkQA
Copy link

SparkQA commented Apr 4, 2019

Test build #4684 has finished for PR 21334 at commit 4603235.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 4, 2019

Merged to master

@srowen srowen closed this in 0e44a51 Apr 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants