Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Apr 25, 2020

What changes were proposed in this pull request?

In the PR, I propose to fix the InSet.sql method for the cases when input collection contains values of internal Catalyst's types, for instance UTF8String. Elements of the input set hset are converted to Scala types, and wrapped by Literal to properly form SQL view of the input collection.

Why are the changes needed?

The changes fixed the bug in InSet.sql that makes wrong assumption about types of collection elements. See more details in SPARK-31563.

Does this PR introduce any user-facing change?

Highly likely, not.

How was this patch tested?

Added a test to ColumnExpressionSuite

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks plausible. This needs to go to 2.4 too right?

@SparkQA
Copy link

SparkQA commented Apr 25, 2020

Test build #121802 has finished for PR 28343 at commit d46c177.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you for splitting your existing PR, @MaxGekk .
Yes, @srowen . This is a fix for all 2.x. So, I asked to split the original PR.
Merged to master/3.0/2.4.

dongjoon-hyun pushed a commit that referenced this pull request Apr 25, 2020
…st's internal types

### What changes were proposed in this pull request?
In the PR, I propose to fix the `InSet.sql` method for the cases when input collection contains values of internal Catalyst's types, for instance `UTF8String`. Elements of the input set `hset` are converted to Scala types, and wrapped by `Literal` to properly form SQL view of the input collection.

### Why are the changes needed?
The changes fixed the bug in `InSet.sql` that makes wrong assumption about types of collection elements. See more details in SPARK-31563.

### Does this PR introduce any user-facing change?
Highly likely, not.

### How was this patch tested?
Added a test to `ColumnExpressionSuite`

Closes #28343 from MaxGekk/fix-InSet-sql.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 7d8216a)
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Apr 25, 2020
…st's internal types

In the PR, I propose to fix the `InSet.sql` method for the cases when input collection contains values of internal Catalyst's types, for instance `UTF8String`. Elements of the input set `hset` are converted to Scala types, and wrapped by `Literal` to properly form SQL view of the input collection.

The changes fixed the bug in `InSet.sql` that makes wrong assumption about types of collection elements. See more details in SPARK-31563.

Highly likely, not.

Added a test to `ColumnExpressionSuite`

Closes #28343 from MaxGekk/fix-InSet-sql.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 7d8216a)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

cc @holdenk since she is the release manager for 2.4.6.

@holdenk
Copy link
Contributor

holdenk commented Apr 27, 2020

Thanks everyone :)

val valueSQL = child.sql
val listSQL = hset.toSeq.map(Literal(_).sql).mkString(", ")
val listSQL = hset.toSeq
.map(elem => Literal(convertToScala(elem, child.dataType)).sql)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this converts the internal value to external value, and then Literal.apply converts external value to internal value.

Can we just do Literal(elem, child.dataType).sql?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan No, Literal fails on UTF8String value, see #28328 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is #28328 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the problem is Literal requires external types but elem has internal Catalyst's type

@MaxGekk MaxGekk deleted the fix-InSet-sql branch June 5, 2020 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants