@sandeep-katta
Contributor

What changes were proposed in this pull request?

Dataset (fruit.csv):
fruit,color,price,quantity
apple,red,1,3
banana,yellow,2,4
orange,orange,3,5
xxx

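This PR aims to fix the incorrect count shown below. With DROPMALFORMED, the malformed row xxx should be dropped, so the expected count is 3, not 4: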

scala> spark.conf.set("spark.sql.csv.parser.columnPruning.enabled", false)
scala> spark.read.option("header", "true").option("mode", "DROPMALFORMED").csv("fruit.csv").count
res1: Long = 4

This regression was introduced by the change made for SPARK-24645.
The problem reported in SPARK-24645 is also solved by SPARK-25387.
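
As a rough illustration of the expected result, the sketch below models DROPMALFORMED in plain Scala, without Spark. It assumes a row counts as malformed when its number of fields differs from the header's; DropMalformedSketch and countWellFormed are hypothetical names, not Spark APIs, and this is not how Spark's CSV parser is implemented.

```scala
// Hypothetical model of DROPMALFORMED semantics; not Spark's actual parser code.
object DropMalformedSketch {
  // The sample file from the description, header line first.
  val lines: Seq[String] = Seq(
    "fruit,color,price,quantity",
    "apple,red,1,3",
    "banana,yellow,2,4",
    "orange,orange,3,5",
    "xxx"
  )

  // Assumption: a row is malformed when its field count differs from the header's.
  def countWellFormed(rows: Seq[String]): Long = {
    val expectedFields = rows.head.split(",", -1).length
    // Skip the header, then keep only rows with the expected number of fields.
    rows.tail.count(_.split(",", -1).length == expectedFields).toLong
  }

  def main(args: Array[String]): Unit =
    println(countWellFormed(lines)) // prints 3
}
```

Under that assumption the row xxx has one field instead of four, so only the three fruit rows are counted.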

Why are the changes needed?

The change made for SPARK-24645 caused this regression, so this PR reverts that code; the original SPARK-24645 problem remains fixed by SPARK-25387.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a unit test, and also re-tested the SPARK-24645 bug.

SPARK-24645 regression
image

@dongjoon-hyun changed the title from "[SPARK-29101][SQL] [Backport]Fix count API for csv file when DROPMALFORMED mode is selected" to "[SPARK-29101][SQL][2.4] Fix count API for csv file when DROPMALFORMED mode is selected" on Sep 19, 2019
@dongjoon-hyun
Member

ok to test

@dongjoon-hyun
Member

Thank you for backporting, @sandeep-katta .

@dongjoon-hyun
Member

cc @HyukjinKwon

@SparkQA

SparkQA commented Sep 19, 2019

Test build #110964 has finished for PR 25843 at commit c8d8ff5.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Sep 19, 2019

Test build #110966 has finished for PR 25843 at commit c8d8ff5.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Sep 19, 2019

Test build #110969 has finished for PR 25843 at commit c8d8ff5.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Sep 19, 2019

Test build #110980 has finished for PR 25843 at commit c8d8ff5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen left a comment

Looks OK. It's worth noting this is a backport of #25820

@dongjoon-hyun
Member

dongjoon-hyun left a comment

+1, LGTM. Merged to branch-2.4
Thank you, @sandeep-katta , @HyukjinKwon , @srowen !

dongjoon-hyun pushed a commit that referenced this pull request Sep 19, 2019
… mode is selected

### What changes were proposed in this pull request?
#DataSet
fruit,color,price,quantity
apple,red,1,3
banana,yellow,2,4
orange,orange,3,5
xxx

This PR aims to fix the below
```
scala> spark.conf.set("spark.sql.csv.parser.columnPruning.enabled", false)
scala> spark.read.option("header", "true").option("mode", "DROPMALFORMED").csv("fruit.csv").count
res1: Long = 4
```

This is caused by the issue [SPARK-24645](https://issues.apache.org/jira/browse/SPARK-24645).
SPARK-24645 issue can also be solved by [SPARK-25387](https://issues.apache.org/jira/browse/SPARK-25387)

### Why are the changes needed?

SPARK-24645 caused this regression, so reverted the code as it can also be solved by SPARK-25387

### Does this PR introduce any user-facing change?
No,

### How was this patch tested?
Added UT, and also tested the bug SPARK-24645

**SPARK-24645 regression**
![image](https://user-images.githubusercontent.com/35216143/65067957-4c08ff00-d9a5-11e9-8d43-a4a23a61e8b8.png)

Closes #25843 from sandeep-katta/SPARK-29101_branch2.4.

Authored-by: sandeep katta <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>