@chinandrew
Contributor

Description

Add a cleaning step to usafacts to guard against the bug from last week, where one of the count values came through as "1,020" instead of 1020.

Changelog

  • Add a line that finds any values that were parsed as strings, removes their commas, and casts them to ints.
  • Update the test to check this behavior and add a few more cases/values of interest (e.g. cruise ship fips). Previously, the test only checked that the column headers were correct. To write the new test, I ran the existing code on main against the new data, saved the output, and then used that output as the expected result when introducing this change.
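
For readers skimming the changelog, here is a minimal sketch of the kind of cleaning step described above. It is not the code from this PR: the helper name `clean_count`, the column names, and the sample rows are all hypothetical, and it uses only the standard library rather than whatever the usafacts indicator actually uses.

```python
import csv
import io


def clean_count(value):
    """Return an int from a count that may be an int or a string like "1,020".

    Hypothetical helper for illustration; not the function added in this PR.
    """
    if isinstance(value, str):
        # Drop thousands separators and surrounding whitespace before casting.
        return int(value.replace(",", "").strip())
    return int(value)


# Made-up two-row sample; the quoted "1,020" mimics the malformed value.
SAMPLE_CSV = 'countyFIPS,count\n1001,"1,020"\n6000,7\n'

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = {row["countyFIPS"]: clean_count(row["count"]) for row in rows}
print(cleaned)
```

A regression test in the spirit of the second bullet could compare `cleaned` against a saved known-good output, so that a stray comma in a future upload fails loudly in the test rather than downstream.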

Fixes

@chinandrew
Contributor Author

chinandrew commented Nov 10, 2020

ok seems like i broke some other tests, will fix (EDIT: fixed)

Contributor

@krivard krivard left a comment

lgtm

(also <3 these checks, nice work)

@krivard krivard merged commit a471262 into main Nov 10, 2020
@krivard krivard deleted the robustify-usafacts branch November 10, 2020 21:03

Development

Successfully merging this pull request may close these issues.

Make CSV processing more robust for USAFacts (or more generally?)

3 participants