Handling CSVs with only CR newlines #150

@lbordowitz

A user has uploaded a CSV that uses only carriage return (CR, or \r) characters as line terminators. The current SourceLineReader.readLineWithTerminator reads the header row successfully and then discards everything after it. We want it to return every row in the CSV.

We run something like this:

import java.nio.channels.Channels
import scala.io.Source
import com.github.tototoshi.csv.CSVReader

// get a blob from Google Cloud Platform storage
val spreadsheetSource = Source.fromInputStream(Channels.newInputStream(blob.reader()))
val reader = CSVReader.open(spreadsheetSource)
val lines = reader.all()
// lines: List[List[String]] = List(List("First Name", " Last Name", " email"))

Only the header row comes back, even though the file we're reading from has five lines. I have also tried this with Source.fromFile, and the result is the same.

I created the file from a normal CSV with LF-style line endings, and then ran this bash command:

$ tr '\n' '\r' < fnln.csv >fnln.cr.csv
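
For reference, here is a rough sketch of the behaviour I'd expect, using only the standard library. It is not the library's actual SourceLineReader; the CrLineDemo object and its readAll helper are hypothetical names, and readLineWithTerminator below only borrows the name. It returns each line with its terminator attached and treats a bare \r as a terminator alongside \n and \r\n. It ignores quoting, so it says nothing about newlines inside quoted fields.

```scala
import java.io.{PushbackReader, StringReader}

object CrLineDemo {
  // Hypothetical stand-in, not scala-csv's SourceLineReader: returns the next
  // line including its terminator ("\n", "\r\n" or a bare "\r"), or null at EOF.
  def readLineWithTerminator(in: PushbackReader): String = {
    val sb = new StringBuilder
    var done = false
    while (!done) {
      val c = in.read()
      if (c == -1) {
        done = true
      } else {
        sb.append(c.toChar)
        if (c == '\n') {
          done = true
        } else if (c == '\r') {
          // Peek one character: keep a following '\n', otherwise push it back.
          val next = in.read()
          if (next == '\n') sb.append('\n')
          else if (next != -1) in.unread(next)
          done = true
        }
      }
    }
    if (sb.isEmpty) null else sb.toString
  }

  // Convenience wrapper for trying the reader on an in-memory string.
  def readAll(text: String): List[String] = {
    val in = new PushbackReader(new StringReader(text))
    Iterator.continually(readLineWithTerminator(in)).takeWhile(_ != null).toList
  }
}
```

Calling CrLineDemo.readAll on a CR-only string should yield one entry per row, each still ending in \r.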

Side note: why can't we use Source's built-in getLines function? Is there a reason that we need the line terminator in each string?
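
To back up the side note, a quick check along these lines suggests that getLines already splits on bare CRs. The GetLinesDemo object is a hypothetical name and the data rows after the header are made up for illustration. Note that getLines drops the terminators, which may matter if quoted fields are allowed to span lines.

```scala
import scala.io.Source

object GetLinesDemo {
  // Same shape as the file in this report: five rows separated by bare CRs.
  // Only the header matches the real file; the other rows are placeholders.
  val crOnly = "First Name, Last Name, email\ra,b,c\rd,e,f\rg,h,i\rj,k,l\r"

  // getLines treats "\r", "\n" and "\r\n" as line breaks and strips them.
  val lines: List[String] = Source.fromString(crOnly).getLines().toList
}
```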
