Skip to content

Conversation

@dakersnar
Copy link
Contributor

#5647 is no longer present, but while investigating that issue I discovered some bugs with WriteCsv. This PR does the following:

  • Fixes bugs where text qualifiers (quotation marks) were not properly added to the CSV output by WriteCsv in the following cases:
    • There are separators in a row entry
    • There are separators in a header name
    • There are newlines in a row entry
    • There are newlines in a header name
  • Adds test cases both confirming [DataFrame] can't handle separators in data #5647 is solved and validating that the above bugs are fixed.

This is my first contribution to this repo so style criticism appreciated.

@ghost ghost assigned dakersnar Aug 26, 2022
@codecov
Copy link

codecov bot commented Aug 26, 2022

Codecov Report

Merging #6303 (fe0adcd) into main (f117cd8) will increase coverage by 0.24%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6303      +/-   ##
==========================================
+ Coverage   68.39%   68.63%   +0.24%     
==========================================
  Files        1144     1171      +27     
  Lines      244991   247916    +2925     
  Branches    25411    25742     +331     
==========================================
+ Hits       167558   170157    +2599     
- Misses      70775    71015     +240     
- Partials     6658     6744      +86     
Flag Coverage Δ
Debug 68.63% <100.00%> (+0.24%) ⬆️
production 63.04% <100.00%> (+0.20%) ⬆️
test 89.09% <100.00%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/Microsoft.Data.Analysis/DataFrame.IO.cs 81.15% <100.00%> (+1.92%) ⬆️
...Microsoft.Data.Analysis.Tests/DataFrame.IOTests.cs 98.80% <100.00%> (+0.22%) ⬆️
...Microsoft.ML.TorchSharp/Utils/DefaultDictionary.cs 0.00% <0.00%> (-30.62%) ⬇️
...luators/Metrics/MulticlassClassificationMetrics.cs 89.47% <0.00%> (-10.53%) ⬇️
...Microsoft.ML.AutoML.Tests/AutoMLExperimentTests.cs 92.38% <0.00%> (-7.62%) ⬇️
test/Microsoft.ML.Tests/TextClassificationTests.cs 93.75% <0.00%> (-6.25%) ⬇️
test/Microsoft.ML.AutoML.Tests/AutoFitTests.cs 79.56% <0.00%> (-6.19%) ⬇️
src/Microsoft.ML.Maml/MAML.cs 24.36% <0.00%> (-2.54%) ⬇️
...eColumn.BinaryOperationImplementations.Exploded.cs 47.65% <0.00%> (-1.38%) ⬇️
test/Microsoft.ML.AutoML.Tests/DatasetUtil.cs 97.84% <0.00%> (-1.15%) ⬇️
... and 80 more

@dakersnar
Copy link
Contributor Author

@eerhardt Thanks for the feedback. Everything should be addressed now.

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks.

@dakersnar dakersnar merged commit 09bbb19 into dotnet:main Sep 27, 2022
@dakersnar dakersnar added the Microsoft.Data.Analysis All DataFrame related issues and PRs label Oct 10, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Microsoft.Data.Analysis All DataFrame related issues and PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants