Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -246,9 +246,9 @@ private[csv] object CSVTypeCast {
} else {
castType match {
case _: ByteType => datum.toByte
case _: ShortType => datum.toShort
case _: IntegerType => datum.toInt
case _: LongType => datum.toLong
case _: ShortType => Try(datum.toShort).getOrElse(null)
case _: IntegerType => Try(datum.toInt).getOrElse(null)
case _: LongType => Try(datum.toLong).getOrElse(null)
Copy link
Member

@HyukjinKwon HyukjinKwon Dec 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I guess you meant when the actual data is empty string and nullValue is other values such as NA, they should be translated into null when the mode is PERMISSIVE in this case.

Then, this is a exact duplicate of https://issues.apache.org/jira/browse/SPARK-18699. You could resolve the current JIRA as a duplicate and then fix the title of this PR to [SPARK-18699] ... with some changes suggested there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, fixed the title. Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use Try for each type here? In the JIRA, I proposed another approach (master...maropu:SPARK-18699), and I think the fix is more natural. Anyway, ISTM we do not decide yet it is worth fixing this issue...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My patch was created before I even knew about SPARK-18699 (I did have another ticket open, then learned it was a duplicate).
I do agree that your solution may be more generic and intuitive, if others agree - why wasn't it merged into main tree yet ? (it has been over 10 days since your PR).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behavior changes are much arguable, so it basically takes much time to discuss.

case _: FloatType =>
datum match {
case options.nanValue => Float.NaN
Expand Down