covidcast acquisition rejects valid nan-coded rows #728

@krivard

Description

We're getting acquisition failures that may be the primary cause of the dramatic drop in added CHNG rows.

Sample error:

{
  "detail": [
    "Pandas(geo_id='17000', val=nan, se=nan, sample_size=nan, missing_val='4.0', missing_se='4.0', missing_sample_size='4.0')",
    "missing_val"
  ],
  "file": "/common/covidcast/receiving/chng/20210926_county_smoothed_outpatient_covid.csv",
  "event": "invalid value for row",
  "logger": "load_csv",
  "level": "warning",
  "timestamp": "2021-10-04T03:09:38.804771Z"
}

Looking at our tests, it appears that we always instantiate data frames from Python objects rather than by parsing string data, so the tests may never exercise the CSV-parsing path. pandas may be inferring the wrong data type for the missingness columns (in the sample above, missing_val comes through as the string '4.0' rather than a number), causing the error.
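
A minimal sketch of how this mismatch could arise, not taken from the acquisition code. It assumes the file is read with every column as a string (`dtype=str`), which is a guess; the CSV text is made up to resemble the row in the sample error, and `parse_missing_code` is a hypothetical helper, not an existing function.

```python
import io

import pandas as pd

# Made-up CSV text modelled on the row from the sample error.
CSV_TEXT = (
    "geo_id,val,se,sample_size,missing_val,missing_se,missing_sample_size\n"
    "17000,NA,NA,NA,4.0,4.0,4.0\n"
)

# Frame built from Python objects, the way the tests appear to do it:
# the missingness code is a real integer.
from_objects = pd.DataFrame(
    {
        "geo_id": ["17000"],
        "val": [float("nan")],
        "se": [float("nan")],
        "sample_size": [float("nan")],
        "missing_val": [4],
        "missing_se": [4],
        "missing_sample_size": [4],
    }
)

# Frame parsed from text with all columns read as strings (assumption).
# "NA" still becomes NaN, but the code arrives as the string "4.0".
from_text = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)


def parse_missing_code(value):
    """Hypothetical helper: accept 4, 4.0, "4" or "4.0" and return an int.

    Returns None for anything that is not a whole number (including NaN).
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    # is_integer() is False for NaN and infinities, so those fall through.
    return int(number) if number.is_integer() else None


if __name__ == "__main__":
    print(repr(from_objects["missing_val"].iloc[0]))
    print(repr(from_text["missing_val"].iloc[0]))
    print(parse_missing_code(from_text["missing_val"].iloc[0]))
```

If something like this is what is happening, a test that feeds the loader CSV text instead of a prebuilt frame should reproduce the warning.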
