evanpw
Contributor

@evanpw evanpw commented Apr 1, 2015

Ideally, I would love for this to be the default, but that wouldn't be backwards-compatible in the case where the filename ends in '.gz' or '.bz2' and you want to treat it as uncompressed. That seems like it would be very rare, though.
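To make that edge case concrete, here is a small sketch using only the standard library rather than pandas itself. It writes plain text to a file whose name ends in '.gz' and shows that treating it as gzip fails, while reading it uncompressed works. The helper names (`read_as_gzip`, `read_as_plain`, `demo`) are hypothetical and exist only for this illustration.

```python
import gzip
import os
import tempfile


def read_as_gzip(path):
    # Decompress on the fly, as inference from the '.gz' suffix would do.
    with gzip.open(path, "rt") as handle:
        return handle.read()


def read_as_plain(path):
    # Read the file as-is, with no decompression.
    with open(path, "r") as handle:
        return handle.read()


def demo():
    # Hypothetical scenario: an uncompressed CSV that happens to be named '*.gz'.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "misnamed.gz")
        with open(path, "w") as handle:
            handle.write("a,b\n1,2\n")

        try:
            read_as_gzip(path)
            gzip_failed = False
        except OSError:
            # gzip rejects data that does not start with the gzip magic bytes.
            gzip_failed = True

        return gzip_failed, read_as_plain(path)
```

A caller with a file like this would need a way to opt out of inference, which is the backwards-compatibility concern above.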

@shoyer
Member

shoyer commented Apr 1, 2015

I think it would be fine even to change the default here. We are not that strict about backwards compatibility in pandas -- any users who relied on the previous behavior were basically relying on a bug.

@jreback
Contributor

jreback commented Apr 2, 2015

I agree with @shoyer here, let's just have it infer these filename endings as compression (move the release note to the API section).

@jreback jreback added the API Design and IO CSV (read_csv, to_csv) labels Apr 2, 2015
@jreback jreback added this to the 0.16.1 milestone Apr 2, 2015
@evanpw evanpw force-pushed the infer_compression branch from fe09884 to 48fd726 Compare April 9, 2015 15:51
@evanpw
Contributor Author

evanpw commented Apr 9, 2015

I've totally borked this branch with an accidental force push. I'll fix it tonight.

@evanpw evanpw force-pushed the infer_compression branch from fe09884 to 7fe1c69 Compare April 10, 2015 03:18
@evanpw
Contributor Author

evanpw commented Apr 10, 2015

Should be fixed.

compression : {'gzip', 'bz2', 'infer', None}, default 'infer'
For on-the-fly decompression of on-disk data. If 'infer', then use gzip or
bz2 if filepath_or_buffer is a string ending in '.gz' or '.bz2',
respectively, and no decompression otherwise.
Member

Can you add here what None does?
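
For reference, the rule described in the docstring above can be sketched in plain Python. This is only an illustration built on the standard library, not the code in this PR or pandas internals; `resolve_compression` and `open_text` are hypothetical names. It assumes that `None` means "do not decompress, whatever the filename looks like", and that inference applies only when the input is a string path.

```python
import bz2
import gzip

# Filename suffixes recognised when compression='infer' (per the docstring).
_SUFFIX_TO_METHOD = {".gz": "gzip", ".bz2": "bz2"}


def resolve_compression(filepath_or_buffer, compression="infer"):
    """Return 'gzip', 'bz2', or None for the given input (hypothetical helper)."""
    if compression != "infer":
        if compression not in ("gzip", "bz2", None):
            raise ValueError("Unrecognized compression: %r" % (compression,))
        # An explicit choice, including None, always wins over the filename.
        return compression

    # Inference only makes sense for string paths, not open buffers.
    if isinstance(filepath_or_buffer, str):
        for suffix, method in _SUFFIX_TO_METHOD.items():
            if filepath_or_buffer.endswith(suffix):
                return method
    return None


def open_text(path, compression="infer"):
    """Open a path in text mode, decompressing if required (hypothetical helper)."""
    method = resolve_compression(path, compression)
    if method == "gzip":
        return gzip.open(path, "rt")
    if method == "bz2":
        return bz2.open(path, "rt")
    return open(path, "r")
```

Under these assumptions, `resolve_compression("data.csv.gz")` gives `'gzip'`, while `resolve_compression("data.csv.gz", None)` gives `None`, so the file is read without decompression.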

@jreback
Contributor

jreback commented Apr 17, 2015

couple of minor comments, pls rebase, and ping when green

@evanpw evanpw force-pushed the infer_compression branch from 7fe1c69 to 6cb41c6 Compare April 17, 2015 14:03
@evanpw
Contributor Author

evanpw commented Apr 17, 2015

Docs are fixed, rebased/squashed, and tests are green.

@jreback
Contributor

jreback commented Apr 17, 2015

lgtm

@shoyer @jorisvandenbossche

@shoyer
Member

shoyer commented Apr 17, 2015

Looks great to me!

jreback added a commit that referenced this pull request Apr 18, 2015
ENH: Add option in read_csv to infer compression type from filename
@jreback jreback merged commit 529cd3d into pandas-dev:master Apr 18, 2015
@jreback
Contributor

jreback commented Apr 18, 2015

@evanpw thanks!

@evanpw
Contributor Author

evanpw commented Apr 18, 2015

Thank you!

@evanpw evanpw deleted the infer_compression branch April 18, 2015 03:41