Skip to content

Conversation

phofl
Copy link
Member

@phofl phofl commented Jan 17, 2021

Refactored them into 4 files, similar to described in the issue. If this is accepted I will try to clean this up a bit more through unifying the calls a bit

@phofl phofl added IO CSV read_csv, to_csv Refactor Internal refactoring of code labels Jan 17, 2021
@jreback jreback added this to the 1.3 milestone Jan 19, 2021
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would update pandas/io/parsers/__init__.py to import read_csv, read_fwf, read_table, and other public functions here (if any others).

you can leave .readers just import them to __init__.py (then you don't need to change the other modules that import).

otherwise this is a straight move?

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

Thanks, will do that.

Yes, just moved them around. Will go to refactoring after this

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

I moved the relevant functions up, I changed the import paths for the helper functions in the test functions

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm ping on green

@jreback
Copy link
Contributor

jreback commented Jan 19, 2021

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

Have to check, this one worked 3 days ago in the pipelines

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

Hm merged master, let's check if this persists. Can not reproduce this locally

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

Additionally this worked the commit before when I forgot a few imports. At least a place to start pinning if it persists

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

Green now

@twoertwein I think you worked with unclosed handles the last few days/weeks? Could you maybe have a look and see if this is familiar?

@phofl
Copy link
Member Author

phofl commented Jan 19, 2021

@twoertwein
Copy link
Member

@twoertwein I think you worked with unclosed handles the last few days/weeks? Could you maybe have a look and see if this is familiar?

I assume that most of the ssl warnings come from boto3 (I honestly gave up on them for now in #39047). Wasn't a test like that recently xfailed (reason s3/boto3 being updated)?

@phofl
Copy link
Member Author

phofl commented Jan 20, 2021

Yes, but this test really failed not only raised a warning,

@twoertwein
Copy link
Member

do you mean the OSError: [Errno 98] Address already in use in test_server_and_default_headers test? Is it possible that multiple instances of pytest run on the same machine but they like to use the same port?

or test_warn_if_chunks_have_mismatched_type. I think I have looked at this (or a similar) test when working on #39047. As far as I remember, there shouldn't be a need for any "ssl-stuff" in this test. The only resource that might not be closed is StringIO. Locally (on linux) I can not re-create most of the ResourceWanrings.

@phofl
Copy link
Member Author

phofl commented Jan 20, 2021

Sorry this was not clear enough by myself. I was talking about test_warn_if_chunks_have_mismatched_type. Have the same issue, can not reproduce locally.

return TextFileReader(*args, **kwds)


def _clean_na_values(na_values, keep_default_na=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth moving from here down to another module (validation.py?) in a followup, e.g. keep readers.py for the actual readers

@jreback jreback merged commit 5d65e0a into pandas-dev:master Jan 21, 2021
@jreback
Copy link
Contributor

jreback commented Jan 21, 2021

thanks @phofl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST/REF: splitting pandas/io/parsers.py into multiple files
3 participants