Skip to content

Conversation

@chinandrew
Copy link
Contributor

Description

First part of adding HHS support to geomapper. This just adds the mapping files, does not add to the actual geomapper util

Changelog

Itemize code/test/documentation changes and files added/removed.

  • Add functions for deriving mapping files
  • Add mapping csv files

Fixes

@chinandrew chinandrew requested review from a team and nmdefries and removed request for a team December 8, 2020 22:04
Copy link
Contributor

@nmdefries nmdefries left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered leftpadding the HHS region numbers with zero so that they're all the same length? It can make string processing and sorting more convenient.

)
state_hhs = pd.read_csv(
join(OUTPUT_DIR, STATE_HHS_OUT_FILENAME),
dtype={"state_code": str, "state_id": str, "state_name": str},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the state -> HHS file should have column names state_code and hhs_region_number. Same problem on line 576.

Copy link
Contributor Author

@chinandrew chinandrew Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks


(
zip_state.merge(state_hhs, on="state_code", how="left")
.drop(columns=["state_code", "state_id", "state_name"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to reflect changed column names on line 576. Should be the same as line 556.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the zip_state file is the one introducing these columns. The other function uses the fips population file (was just copying an different fips -> * example) which only has the population column to drop

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

@chinandrew
Copy link
Contributor Author

Have you considered leftpadding the HHS region numbers with zero so that they're all the same length? It can make string processing and sorting more convenient.

I went with the existing implementation (state -> hhs) that had them without the zero padding, which also follows the existing HRR convention of not left padding. I think it's to distinguish geos that are defined with zero pad (e.g. a zip code "06001" is defined as 5 digits and "6001" wouldn't be right, same for a county FIPS like "01009") from those without (e.g. HHS Regions are just "Region 1." I don't feel super strongly either way, though if we left padded HHS we should also probably do for HRR and update any downstream documentation accordingly.

@chinandrew chinandrew requested a review from nmdefries December 9, 2020 18:08
@krivard krivard merged commit 8391b26 into main Dec 9, 2020
@krivard krivard deleted the add-hhs branch December 9, 2020 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants