Extend geocode utility to actually support the state to state mappings #310

dshemetov · 2020-10-13T21:27:50Z

Add support for the state_x -> state_y mappings, where x and y are in {code, id, name}.
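For illustration only, here is a minimal sketch of what a state_x -> state_y conversion amounts to. It is not the GeoMapper implementation: the three-row table, the `convert_state` helper, and the choice to return `None` for unknown values are all assumptions made for this example.

```python
# Hypothetical three-row crosswalk; the real utility presumably reads a full table.
STATE_TABLE = [
    {"state_code": "06", "state_id": "ca", "state_name": "California"},
    {"state_code": "36", "state_id": "ny", "state_name": "New York"},
    {"state_code": "42", "state_id": "pa", "state_name": "Pennsylvania"},
]
STATE_FIELDS = ("state_code", "state_id", "state_name")


def convert_state(value, from_code, new_code):
    """Translate one state value between any two of the three encodings."""
    if from_code not in STATE_FIELDS or new_code not in STATE_FIELDS:
        raise ValueError(f"unsupported mapping: {from_code} -> {new_code}")
    for row in STATE_TABLE:
        if row[from_code] == value:
            return row[new_code]
    return None  # unknown value; a real implementation may handle this differently


print(convert_state("Pennsylvania", "state_name", "state_id"))
print(convert_state("ca", "state_id", "state_code"))
```

Because every encoding lives in the same row, any of the nine (x, y) pairs can be served by one lookup.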

chinandrew · 2020-10-13T21:36:43Z

_delphi_utils_python/tests/test_geomap.py

+        # state_name -> state_id
+        new_data = gmpr.add_geocode(self.zip_data, "zip", "state_name")
+        new_data2 = gmpr.add_geocode(new_data, "state_name", "state_id")
+        assert new_data2.shape == (12, 6)


should there also be an assert statement for new_data?

Nah, new_data is just the test data for state_name to state_id, but it starts in zip form

* update replace_geocode documentation to be clear about data columns
* add test cases for renaming columns in replace_geocode
* fix the state to state conversion dropped columns issue

chinandrew · 2020-10-14T21:16:03Z

not specifically related to this PR, I just missed this in the previous one: date_cols is undocumented in the replace_geocode() docstring

dshemetov · 2020-10-14T22:18:24Z

Ahhh, thanks, might as well take care of that here.

Refactor cdc_covidnet to use geo utils


chinandrew · 2020-10-20T06:12:48Z

_delphi_utils_python/tests/test_geomap.py

-        assert new_data["population"].sum() == 274963
+        assert new_data.shape == (5, 5)
        new_data = gmpr.add_population_column(self.zip_data, "zip")
-        assert new_data["population"].sum() == 274902


any reason not to keep both of these checks?

Also, if it's a 5x5 output, it may be simpler just to do a direct df comparison on values.

This is an old test I don't see anymore. Pull again?

I was just asking why it was deleted

Ah. I decided to move away from tests based on data-derived population counts. I figured the tests should catch whether the underlying logic or arithmetic breaks, not whether the data file changed. Am open to reasons for keeping though.
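A rough sketch of the two ideas in this thread, comparing the whole output directly and keeping expected values independent of the data file. Plain Python rows stand in for the DataFrame here; `add_population`, the toy crosswalk, and the zip values are hypothetical and not part of the geo utility. With real DataFrames, something like `pandas.testing.assert_frame_equal` would presumably play the role of the `==` check.

```python
def add_population(rows, crosswalk):
    """Attach a population value to each row by looking up its zip code."""
    return [dict(row, population=crosswalk[row["zip"]]) for row in rows]


# Hand-written fixture, so expected values never depend on a data file.
toy_crosswalk = {"00001": 10, "00002": 25}
toy_rows = [
    {"zip": "00001", "count": 3},
    {"zip": "00002", "count": 7},
]

result = add_population(toy_rows, toy_crosswalk)
expected = [
    {"zip": "00001", "count": 3, "population": 10},
    {"zip": "00002", "count": 7, "population": 25},
]

# Direct comparison on values: any change in the lookup logic fails here.
assert result == expected
```

A test written this way breaks when the lookup logic changes, not when the population file is refreshed.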

chinandrew · 2020-10-20T06:13:26Z

_delphi_utils_python/tests/test_geomap.py

+        # hrr -> nation
+        with pytest.raises(ValueError):    
+            new_data = gmpr.replace_geocode(self.zip_data, "zip", "hrr")
+            new_data2 = gmpr.replace_geocode(new_data, "hrr", "nation")
+
+        # hrr -> nation
+        with pytest.raises(ValueError):    
+            new_data = gmpr.replace_geocode(self.zip_data, "zip", "hrr")
+            new_data2 = gmpr.replace_geocode(new_data, "hrr", "nation")
+
+        # hrr -> nation
+        with pytest.raises(ValueError):    
+            new_data = gmpr.replace_geocode(self.zip_data, "zip", "hrr")
+            new_data2 = gmpr.replace_geocode(new_data, "hrr", "nation")
+


duplicated tests?

Oops, this is a new one 😄

chinandrew · 2020-10-21T20:55:40Z

_delphi_utils_python/tests/test_geomap.py


+        # hrr -> nation
+        with pytest.raises(ValueError):    
+            new_data = gmpr.replace_geocode(self.zip_data, "zip", "hrr")


I'm pretty sure that if the first line raises a ValueError but the second one doesn't, the test still passes, so you may want to split this into two separate with pytest.raises(...) blocks.

Wouldn't that have been caught in the zip -> hrr test some lines before that though?

Maybe I'm understanding this test wrong. I read it as testing that both

new_data = gmpr.replace_geocode(self.zip_data, "zip", "hrr")

and

new_data2 = gmpr.replace_geocode(new_data, "hrr", "nation")

raise ValueErrors. Is that right?

It's testing the second one, since we should not be mapping hrr -> nation (it's an incomplete mapping, so it's unsupported).

Ohhh got it, didn't see the second line calls new_data. Thanks.
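One way to make the intent of such a test unambiguous is to keep the setup call outside the raises block, so only the call under test can satisfy it. The sketch below uses the standard library's `unittest` `assertRaises` context manager as a stand-in for `pytest.raises`, since both pass when any statement in the block raises the expected exception. `replace_geocode_stub` and the `SUPPORTED` set are hypothetical and only mimic the "unsupported mapping raises ValueError" behavior discussed above.

```python
import unittest

SUPPORTED = {("zip", "hrr"), ("zip", "state_id")}  # hypothetical subset


def replace_geocode_stub(rows, from_code, new_code):
    """Stand-in for replace_geocode: rejects mappings it does not support."""
    if (from_code, new_code) not in SUPPORTED:
        raise ValueError(f"unsupported mapping: {from_code} -> {new_code}")
    return [{new_code: f"{new_code}_of_{row[from_code]}"} for row in rows]


check = unittest.TestCase()
zip_rows = [{"zip": "15213"}]

# Setup runs outside the block, so a failure here surfaces as a real error.
hrr_rows = replace_geocode_stub(zip_rows, "zip", "hrr")

# Only the call under test sits inside the block.
with check.assertRaises(ValueError):
    replace_geocode_stub(hrr_rows, "hrr", "nation")
```

If the zip -> hrr step ever started raising, this version would error out during setup instead of silently passing.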

chinandrew

lgtm

dshemetov · 2020-10-21T21:09:55Z

cdc_covidnet/delphi_cdc_covidnet/update_sensor.py

+                               new_code="state_id",
+                               dropna=False)
+    # To use the original column name, reassign original column and drop new one
+    hosp_df[APIConfig.STATE_COL] = hosp_df["state_id"].str.upper()


@chinandrew btw, this may be cdc_covidnet specific, but the state abbreviation in other indicators (like JHU) is assumed to be lower case. Are we sure this is what we want?

For some reason cdc_covidnet was uppercase, so I kept it consistent. If it's something we can change, I would definitely recommend we standardize, but I haven't looked into it.

@krivard thoughts?

Slacked Katie on this while discussing a similar issue. Apparently downstream ingestion will standardize everything and accepts lowercase or uppercase, so this can be removed and we can just move to lowercase for the indicator code.

Extend geocode utility to actually support the mappings:

7ff30ac

- state_x -> state_y where x,y are in {code, id, name}

chinandrew reviewed Oct 13, 2020


Add column renaming line in state conversion

3d8c331

krivard requested a review from chinandrew October 14, 2020 19:10

Geocode updates:

3b6d139

* update replace_geocode documentation to be clear about data columns
* add test cases for renaming columns in replace_geocode
* fix the state to state conversion dropped columns issue

Refactor to Geomapper function

8a1112d

chinandrew mentioned this pull request Oct 14, 2020

Refactor cdc_covidnet to use geo utils #311

Merged

chinandrew and others added 6 commits October 14, 2020 17:25

Change na behavior to be consistent with previous implementation

d6f43ae

Remove unused variable

d8c491a

Update comment on column assignment

893aafc

Merge pull request #311 from cmu-delphi/geo_refactor_cdccovidnet

0befb53

Refactor cdc_covidnet to use geo utils

Fix date_col docstring, add None option to date_col, add test for that option

e1f96f9

Remove duplicated test

5859982

chinandrew reviewed Oct 20, 2020


chinandrew reviewed Oct 21, 2020


chinandrew approved these changes Oct 21, 2020


dshemetov commented Oct 21, 2020


Remove static file

84422b6

chinandrew mentioned this pull request Oct 22, 2020

cdc_covidnet: change states to lowercase to align with geo_util default #354

Merged

krivard merged commit c9fea0a into main Oct 26, 2020

krivard deleted the geoutil_state_extension branch October 29, 2020 18:10


4 participants