pyspark - timestamp with microseconds, causes exception on .save() #39
Closed
Labels
bug: Something isn't working
in progress: This issue is being looked at and in progress
Description
When a column contains a timestamp with non-zero microseconds, .save() fails with the generic "com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed." exception.
Truncating the microseconds to 0 works around the problem. The output table created by .save() has a column of the "datetime" data type, so I presume the failure is related to "rounding" microseconds to satisfy the precision requirements of the "datetime" data type.
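For reference, a minimal sketch of the same workaround applied on the Spark side rather than on the Python datetime (the spark session, DataFrame, and column names just mirror the reproduction below and are only illustrative): date_trunc truncates the timestamp to whole seconds before the write, assuming the microsecond precision is indeed what triggers the failure.

from datetime import datetime
from pyspark.sql.functions import col, date_trunc, lit

# Build a row with a microsecond-precision timestamp, as in the reproduction below.
df = spark.createDataFrame([("a", 1)], ["Col1", "Col2"]) \
    .withColumn("ts", lit(datetime.now()))

# Truncating "ts" to whole seconds on the Spark side mirrors the Python-side
# workaround of batchTimestamp.replace(microsecond=0).
df_truncated = df.withColumn("ts", date_trunc("second", col("ts")))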
env:
- SQL Spark connector version 1.1
- Spark 2.4.5
- Databricks 6.4 runtime
how to reproduce:
from datetime import datetime
from pyspark.sql.functions import lit

batchTimestamp = datetime.now()
#
# uncomment to truncate microseconds; only truncation to 0 works
#batchTimestamp = batchTimestamp.replace(microsecond = 0)
print(batchTimestamp.isoformat(sep=' '))
df = spark \
    .createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["Col1", "Col2"]) \
    .withColumn('ts', lit(batchTimestamp))
df.show()
# sql_url, sql_username, sql_password hold the target SQL Server connection details
df \
    .write \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("url", sql_url) \
    .option("dbtable", 'test_table') \
    .option("user", sql_username) \
    .option("password", sql_password) \
    .save()
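An untested alternative sketch, assuming the problem is the default "datetime" column type created on overwrite: the standard Spark JDBC writer option createTableColumnTypes can request a datetime2 column instead, which stores full microsecond precision. Whether this connector honors that option is an assumption on my part.

# Same write as above, but asking for a datetime2 column for "ts" when the table
# is created (createTableColumnTypes is a standard Spark JDBC writer option;
# connector support for it is assumed, not verified).
df \
    .write \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("url", sql_url) \
    .option("dbtable", 'test_table') \
    .option("user", sql_username) \
    .option("password", sql_password) \
    .option("createTableColumnTypes", "ts datetime2") \
    .save()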