[FEA] String timestamp parsing to match Spark casting string to timestamp #3320
Labels: feature request, libcudf, Spark, strings
Unlike timestamp2long, casting a string to a timestamp in Spark does not use a timestamp format string. Instead, the cast uses an incremental parsing approach that accepts a large set of formats, including single-digit month, day, hour, minute, and second values.
The full list of supported timestamp formats is here: https://github.com/apache/spark/blob/a4382f7fe1c36a51c64f460c6cb91e93470e0825/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L192
We would like a custring function that supports this conversion and returns null when a given string is not parsable. We've tried using the timestamp2long function, but the lack of generic format support makes it impossible to match the Spark cast functionality in columns with heterogeneous formats.
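To make the requested behavior concrete, here is a rough CPU-side sketch in Python. It is not the Spark implementation and not a proposed custring API: the function name `lenient_parse_timestamp` is hypothetical, and the regular expression covers only a simplified subset of the formats linked above (it ignores time zone suffixes and time-only strings, among others). The point is only to show format-free parsing with one- or two-digit fields and a null result for unparsable input.

```python
import re
from datetime import datetime
from typing import Optional

# Simplified pattern (an assumption for illustration, not Spark's full grammar):
# a 4-digit year, then optional month, day, and time parts. Month, day, hour,
# minute, and second may each be one or two digits.
_LENIENT_TS = re.compile(
    r"(\d{4})"
    r"(?:-(\d{1,2})"
    r"(?:-(\d{1,2})"
    r"(?:[ T](\d{1,2}):(\d{1,2}):(\d{1,2})(?:\.(\d{1,6}))?)?"
    r")?)?"
)


def lenient_parse_timestamp(text: str) -> Optional[datetime]:
    """Hypothetical helper: parse without a format string, None if unparsable."""
    match = _LENIENT_TS.fullmatch(text.strip())
    if match is None:
        return None
    year, month, day, hour, minute, second, frac = match.groups()
    # Right-pad the fractional part so ".5" becomes 500000 microseconds.
    micros = int(frac.ljust(6, "0")) if frac else 0
    try:
        return datetime(
            int(year),
            int(month or 1),   # missing month defaults to January
            int(day or 1),     # missing day defaults to the 1st
            int(hour or 0),
            int(minute or 0),
            int(second or 0),
            micros,
        )
    except ValueError:
        # Out-of-range field such as month 13 or hour 25.
        return None


# A column with heterogeneous formats, which a single format string cannot cover.
column = ["2019", "2019-11", "2019-1-5 3:4:5", "2019-11-07T12:30:01.5", "garbage"]
parsed = [lenient_parse_timestamp(value) for value in column]
```

With a single format string, at most one of the first four entries in `column` would parse. Here each entry is handled independently, and only the last one yields `None`.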