PySpark DataFrame - Add or Subtract Milliseconds from Timestamp Column
Code description
This code snippet shows you how to add or subtract milliseconds (or microseconds) and seconds from a timestamp column in a Spark DataFrame.
It first creates a DataFrame in memory and then adds and subtracts milliseconds/seconds from the timestamp column ts
using Spark SQL interval expressions.
Output:
+---+--------------------------+--------------------------+--------------------------+--------------------------+
|id |ts                        |ts1                       |ts2                       |ts3                       |
+---+--------------------------+--------------------------+--------------------------+--------------------------+
|1  |2022-09-01 12:05:37.227916|2022-09-01 12:05:37.226916|2022-09-01 12:05:37.228916|2022-09-01 12:05:38.227916|
|2  |2022-09-01 12:05:37.227916|2022-09-01 12:05:37.226916|2022-09-01 12:05:37.228916|2022-09-01 12:05:38.227916|
|3  |2022-09-01 12:05:37.227916|2022-09-01 12:05:37.226916|2022-09-01 12:05:37.228916|2022-09-01 12:05:38.227916|
|4  |2022-09-01 12:05:37.227916|2022-09-01 12:05:37.226916|2022-09-01 12:05:37.228916|2022-09-01 12:05:38.227916|
+---+--------------------------+--------------------------+--------------------------+--------------------------+
*Note - the code assumes a SparkSession object already exists in a variable named spark.
Code snippet
from pyspark.sql.functions import *
import datetime

# Capture the current time to use as the base timestamp.
now = datetime.datetime.now()

# Create a DataFrame with ids 1..4 and a constant timestamp column ts.
df = spark.range(1, 5)
df = df.withColumn('ts', lit(now))

# Subtract 1 millisecond, add 1 millisecond, and add 1 second.
df = df.withColumn('ts1', expr("ts - interval '0.001' seconds"))
df = df.withColumn('ts2', expr("ts + interval '0.001' seconds"))
df = df.withColumn('ts3', expr("ts + interval '1' seconds"))

df.show(truncate=False)
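For comparison, here is the same millisecond/second arithmetic in plain Python with datetime.timedelta (no Spark required). Spark's interval '0.001' seconds corresponds to timedelta(milliseconds=1); the base timestamp below is taken from the sample output above.

import datetime

# Base timestamp matching the ts column in the sample output.
ts = datetime.datetime(2022, 9, 1, 12, 5, 37, 227916)

ts1 = ts - datetime.timedelta(milliseconds=1)  # like ts - interval '0.001' seconds
ts2 = ts + datetime.timedelta(milliseconds=1)  # like ts + interval '0.001' seconds
ts3 = ts + datetime.timedelta(seconds=1)       # like ts + interval '1' seconds

print(ts1)  # 2022-09-01 12:05:37.226916
print(ts2)  # 2022-09-01 12:05:37.228916
print(ts3)  # 2022-09-01 12:05:38.227916

The same idea extends to microseconds with timedelta(microseconds=1), which in Spark SQL would be an interval of '0.000001' seconds.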