[SPARK-13309][SQL] Fix type inference issue with CSV data #11194
cc @HyukjinKwon want to review this one?
Test build #2542 has finished for PR 11194 at commit
@rxin Sure.
    }

    def mergeRowTypes(first: Array[DataType], second: Array[DataType]): Array[DataType] = {
      first.zipAll(second, NullType, NullType).map { case ((a, b)) =>
(This is not part of the diff, but it might be good to change ((a, b)) to (a, b).)
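For readers following the review comment above, here is a minimal, self-contained sketch of the position-by-position merge that `mergeRowTypes` performs with `zipAll`. It is not the code from this PR: `ColType`, its case objects, and the `widen` rule are hypothetical stand-ins for Spark's `DataType` hierarchy and its real widening logic, chosen so the example runs without Spark. It also uses the single-parenthesis pattern `case (a, b)`, which matches the same tuples as `case ((a, b))`.

```scala
// Hypothetical stand-ins for Spark's DataType hierarchy (assumption for illustration).
sealed trait ColType
case object NullT extends ColType
case object IntT extends ColType
case object DoubleT extends ColType
case object StringT extends ColType

object SparseCsvMerge {
  // Simplified widening rule (an assumption, not Spark's actual logic):
  // equal types stay as they are, NullT yields to the other type,
  // Int and Double widen to Double, anything else falls back to String.
  def widen(a: ColType, b: ColType): ColType = (a, b) match {
    case (x, y) if x == y                  => x
    case (NullT, y)                        => y
    case (x, NullT)                        => x
    case (IntT, DoubleT) | (DoubleT, IntT) => DoubleT
    case _                                 => StringT
  }

  // zipAll pads the shorter array with NullT, so a row with fewer fields
  // does not truncate the merged result the way a plain zip would.
  def mergeRowTypes(first: Array[ColType], second: Array[ColType]): Array[ColType] =
    first.zipAll(second, NullT, NullT).map { case (a, b) => widen(a, b) }
}
```

Under these assumptions, `SparseCsvMerge.mergeRowTypes(Array[ColType](IntT, NullT), Array[ColType](DoubleT, StringT, IntT))` yields an array containing `DoubleT`, `StringT`, `IntT`: the missing third field of the shorter row is treated as `NullT` and gives way to `IntT`.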
Overall, it looks good to me. This was already merged in databricks/spark-csv#261 and the logic looks identical.
Actually, I have had a thought that we might have to make a class such as I think this might be better done in another PR. If you agree, I will create an issue and a PR for this after this one is merged.
Yes, could be done. I personally prefer to keep code and data separate. This way:
|
|
@HyukjinKwon @rxin Is this waiting on me? Just want to confirm if I am expected to add anything more.
Could we please merge this?
Sorry for the delay. I'm merging this in master. Thanks!
@rxin @HyukjinKwon thank you.
Fix type inference issue for sparse CSV data - https://issues.apache.org/jira/browse/SPARK-13309
Author: Rahul Tanwani <[email protected]>
Closes apache#11194 from tanwanirahul/master.