Conversation

@jorisvandenbossche
Member

Superseded by #8208 (converting to object dtype also converts the datetime64 values to datetime.datetime, so it is no longer needed that sqlalchemy can work with pandas' Timestamp type).


Closes #7103, #7936

This adds a pandas.io.sql.Timestamp class to handle Timestamps in to_sql. It basically converts them to datetime.datetime before writing to the database.

The problem is that this makes writing slower (I tested with sqlite: for a dataframe with only a datetime64 column it is about 20% slower), while it already worked before for some drivers (like psycopg2 and MySQLdb).
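For concreteness, here is a minimal sketch of what such a coercing type could look like, built on SQLAlchemy's TypeDecorator hook. The class name mirrors the pandas.io.sql.Timestamp mentioned above, but the body is illustrative, not necessarily the PR's actual implementation:

```python
import sqlalchemy.types as sqltypes


class Timestamp(sqltypes.TypeDecorator):
    """Coerce pandas Timestamp values to datetime.datetime before binding."""

    impl = sqltypes.DateTime

    def process_bind_param(self, value, dialect):
        # pandas.Timestamp.to_pydatetime() returns a plain datetime.datetime,
        # a type every DBAPI driver knows how to bind.
        if value is not None and hasattr(value, "to_pydatetime"):
            return value.to_pydatetime()
        return value
```

The per-value process_bind_param call on every bound parameter is a plausible source of the ~20% slowdown mentioned above.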

@jorisvandenbossche jorisvandenbossche added the IO SQL to_sql, read_sql, read_sql_query label Sep 7, 2014
@jorisvandenbossche jorisvandenbossche added this to the 0.15.0 milestone Sep 7, 2014
@jreback
Contributor

jreback commented Sep 7, 2014

So just do this for certain database types then (it will probably be similar for timedelta).

@jorisvandenbossche
Member Author

It depends on the driver, not the database type. We could indeed do that, but it feels a bit clumsy. Also, I tested with psycopg2/pymysql/MySQLdb/mysql.connector, but there are a lot more drivers whose behaviour I don't know (and the behaviour may also differ between versions of the same driver).

While testing the fix for NaN values, I noticed that doing df.astype(object) (needed to get None values instead of NaN) also converts Timestamp/datetime64 values to datetime.datetime. Is this the expected behaviour? I would have expected individual Timestamp objects.
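The behaviour being asked about can be reproduced with a tiny snippet (hedged: the second result reflects the pandas version under discussion here; later versions may keep Timestamp objects instead):

```python
import pandas as pd

df = pd.DataFrame({"t": pd.to_datetime(["2014-09-07"])})

print(type(df["t"].iloc[0]))                 # pandas Timestamp
print(type(df.astype(object)["t"].iloc[0]))  # datetime.datetime, per this thread
```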

@jreback
Contributor

jreback commented Sep 7, 2014

@jorisvandenbossche that is expected. The object dtype is an ndarray of datetime objects (I guess for compat reasons).

You might want to pre-convert any datetimes/timedeltas at the start (iow, separate the frame into various 'blocks'), which you then iterate over all together. Don't try to concat them (or they will be re-coerced).
And to be honest, you can simply do this for types that need NaN -> None as well, e.g. drop down to numpy object arrays (or rec-arrays). It might be a bit of work at first, but then you can easily do what you need quickly. I actually do this for PyTables, see here (and the next method, where I create a structured/rec array), and fill with already coerced values (e.g. datetime64[ns] values have already been tz-converted and are now int64, strings are already an appropriate dtype, etc.). A sketch of this pre-conversion approach follows below.
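A hedged sketch of the pre-conversion idea described above: split the frame into per-column arrays, coerce each to an object ndarray up front (datetime64 to datetime.datetime, NaN/NaT to None), and only then zip into rows for insertion. The helper name frame_to_rows is made up for illustration, and it handles only naive datetime64 columns:

```python
import numpy as np
import pandas as pd


def frame_to_rows(df):
    """Coerce each column to an object ndarray, then zip columns into rows."""
    columns = []
    for name in df.columns:
        col = df[name]
        if np.issubdtype(col.dtype, np.datetime64):
            # Convert the whole datetime block to datetime.datetime at once.
            values = np.array(col.dt.to_pydatetime(), dtype=object)
        else:
            values = col.to_numpy(dtype=object)
        # Replace NaN/NaT with None so drivers receive SQL NULL.
        values[pd.isna(col).to_numpy()] = None
        columns.append(values)
    return list(zip(*columns))
```

Each resulting row tuple then contains only types DBAPI drivers accept natively, so the insert loop needs no per-value checks.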

@jorisvandenbossche
Member Author

Superseded by #8208
