[SPARK-17916][SPARK-25241][SQL][FOLLOW-UP] Fix empty string being parsed as null when nullValue is set. #22389

MaxGekk · 2018-09-11T06:45:24Z

What changes were proposed in this pull request?

In this PR, I propose a new CSV option, `emptyValue`, and an update to the SQL Migration Guide that describes how to restore the previous behavior, in which empty strings were not written at all. Since Spark 2.4, empty strings are saved as `""` to distinguish them from saved `null`s.

Closes #22234
Closes #22367
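To make the distinction above concrete, here is a minimal plain-Python sketch of the described behavior. It is a toy model, not Spark's implementation: the helper names `write_csv_row` and `read_csv_row` are hypothetical, quoting of fields that contain the separator is ignored, and the defaults (nulls written as nothing, empty strings written as `""`) are taken from the description above.

```python
def write_csv_row(values, null_value="", empty_value='""', sep=","):
    """Toy model of the writer: None -> null_value, "" -> empty_value.

    Hypothetical helper for illustration only; it does not escape
    separators or quotes inside ordinary values.
    """
    fields = []
    for v in values:
        if v is None:
            fields.append(null_value)
        elif v == "":
            fields.append(empty_value)
        else:
            fields.append(str(v))
    return sep.join(fields)


def read_csv_row(line, null_value="", sep=","):
    """Toy model of the reader: null_value -> None, a quoted empty field -> ""."""
    values = []
    for field in line.split(sep):
        if field == null_value:
            values.append(None)
        elif field == '""':
            values.append("")
        else:
            values.append(field)
    return values


row = ["a", "", None]

# Behavior described for Spark 2.4: the empty string stays distinguishable from null.
new_style = write_csv_row(row)
print(new_style)                 # a,"",
print(read_csv_row(new_style))   # ['a', '', None]

# Previous behavior: empty strings are not written at all, so they collapse into nulls.
old_style = write_csv_row(row, empty_value="")
print(old_style)                 # a,,
print(read_csv_row(old_style))   # ['a', None, None]
```

In Spark itself, the equivalent of the second call would be setting the `emptyValue` option to an empty string on the CSV writer, which is the way back to the old output that the migration guide update describes.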

How was this patch tested?

It was tested by `CSVSuite` and by the new tests added in PR #22234.

This reverts commit 48e143d.

SparkQA · 2018-09-11T07:05:02Z

Test build #95922 has finished for PR 22389 at commit 9a04d87.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2018-09-11T07:07:45Z

retest this please

SparkQA · 2018-09-11T10:53:15Z

Test build #95924 has finished for PR 22389 at commit 9a04d87.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-09-11T12:46:46Z

Merged to master and branch-2.4.

…sed as null when nullValue is set.

## What changes were proposed in this pull request?

In the PR, I propose new CSV option `emptyValue` and an update in the SQL Migration Guide which describes how to revert previous behavior when empty strings were not written at all. Since Spark 2.4, empty strings are saved as `""` to distinguish them from saved `null`s.

Closes #22234
Closes #22367

## How was this patch tested?

It was tested by `CSVSuite` and new tests added in the PR #22234

Closes #22389 from MaxGekk/csv-empty-value-master.

Lead-authored-by: Mario Molina <[email protected]>
Co-authored-by: Maxim Gekk <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
(cherry picked from commit c9cb393)
Signed-off-by: hyukjinkwon <[email protected]>

mmolimar and others added 9 commits September 11, 2018 08:41

Configurable empty values when reading/writing CSV files

458c097

Adding tests

471b8ba

Changing emptyValue order arg in streaming.py

8e91d5d

Changing emptyValue order arg in set_opts

ddbac3e

Added comments for parameters

4cb2be7

Updating the migration guide

8385c11

Changing order in args for emptyValue

a89bc67

Revert "Adding tests"

75208a4

This reverts commit 48e143d.

Addressing Hyukjin Kwon's concerns

9a04d87

MaxGekk mentioned this pull request Sep 11, 2018

[SPARK-17916][SPARK-25241][SQL][FOLLOWUP] Fix empty string being parsed as null when nullValue is set. #22367

Closed

HyukjinKwon approved these changes Sep 11, 2018

asfgit closed this in c9cb393 Sep 11, 2018

MaxGekk deleted the csv-empty-value-master branch August 17, 2019 13:33
