Commit 908c472

[SPARK-46396][SQL] Timestamp inference should not throw exception
### What changes were proposed in this pull request?

When `spark.sql.legacy.timeParserPolicy=LEGACY` is set, Spark uses `LegacyFastTimestampFormatter` to infer potential timestamp columns. Inference should not throw an exception. However, for the input `23012150952` it throws:

```
For input string: "23012150952"
java.lang.NumberFormatException: For input string: "23012150952"
	at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
	at java.base/java.lang.Integer.parseInt(Integer.java:668)
	at java.base/java.lang.Integer.parseInt(Integer.java:786)
	at org.apache.commons.lang3.time.FastDateParser$NumberStrategy.parse(FastDateParser.java:304)
	at org.apache.commons.lang3.time.FastDateParser.parse(FastDateParser.java:1045)
	at org.apache.commons.lang3.time.FastDateFormat.parse(FastDateFormat.java:651)
	at org.apache.spark.sql.catalyst.util.LegacyFastTimestampFormatter.parseOptional(TimestampFormatter.scala:418)
```

This PR fixes the issue.

### Why are the changes needed?

Bug fix: timestamp inference should not throw an exception.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

New test case + existing tests

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #44338 from gengliangwang/fixParseOptional.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 4a79ae9)
Signed-off-by: Gengliang Wang <[email protected]>
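
The stack trace in the commit message ends in `Integer.parseInt` being called with the whole string `23012150952`, which is larger than `Int.MaxValue` (2147483647). The following is a minimal sketch of that failure mode using only the standard library; `OverflowSketch`, `parseFieldStrict`, and `parseFieldOptional` are hypothetical names for illustration and are not part of Spark or Commons Lang.

```scala
import scala.util.control.NonFatal

object OverflowSketch {
  // Hypothetical stand-in for a numeric field parser that hands a whole
  // digit run to Integer.parseInt, as the stack trace suggests.
  def parseFieldStrict(digits: String): Int = Integer.parseInt(digits)

  // Same call, but any non-fatal failure is turned into None.
  def parseFieldOptional(digits: String): Option[Int] =
    try Some(parseFieldStrict(digits))
    catch { case NonFatal(_) => None }

  def main(args: Array[String]): Unit = {
    val input = "23012150952"
    // The value fits in a Long but exceeds Int.MaxValue (2147483647).
    println(input.toLong > Int.MaxValue.toLong) // true
    println(parseFieldOptional(input))          // None
    println(parseFieldOptional("2023"))         // Some(2023)
  }
}
```

An "optional" parser that reports failure only through its return value can still leak exceptions from lower layers, which is the gap this commit closes.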
1 parent eb1e6ad commit 908c472

2 files changed: +10 -5 lines changed

sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala

Lines changed: 8 additions & 4 deletions
@@ -414,10 +414,14 @@ class LegacyFastTimestampFormatter(
 
   override def parseOptional(s: String): Option[Long] = {
     cal.clear() // Clear the calendar because it can be re-used many times
-    if (fastDateFormat.parse(s, new ParsePosition(0), cal)) {
-      Some(extractMicros(cal))
-    } else {
-      None
+    try {
+      if (fastDateFormat.parse(s, new ParsePosition(0), cal)) {
+        Some(extractMicros(cal))
+      } else {
+        None
+      }
+    } catch {
+      case NonFatal(_) => None
     }
   }

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala

Lines changed: 2 additions & 1 deletion
@@ -502,10 +502,11 @@ class TimestampFormatterSuite extends DatetimeFormatterSuite {
 
     assert(fastFormatter.parseOptional("2023-12-31 23:59:59.9990").contains(1704067199999000L))
     assert(fastFormatter.parseOptional("abc").isEmpty)
+    assert(fastFormatter.parseOptional("23012150952").isEmpty)
 
     assert(simpleFormatter.parseOptional("2023-12-31 23:59:59.9990").contains(1704067208990000L))
     assert(simpleFormatter.parseOptional("abc").isEmpty)
-
+    assert(simpleFormatter.parseOptional("23012150952").isEmpty)
   }
 
   test("SPARK-45424: do not return optional parse results when only prefix match") {
