Spark SQL - Date Difference in Seconds, Minutes, Hours

Raymond Tang Raymond Tang 0 10422 7.69 index 10/12/2021

Spark provides a number of functions to calculate date differences.

The following code snippets can run in Spark SQL shell or through Spark SQL APIs in PySpark, Scala, etc.

Difference in days

Spark SQL - Date and Timestamp Function

Difference in months

Use function months_betweento calculate months differences in Spark SQL.

spark.sql("""select months_between(DATE'2021-10-13', DATE'2020-03-01')""").show()

You can also run the SQL directly in Spark-SQL shell:

select months_between(DATE'2021-10-13', DATE'2020-03-01')

Difference in seconds

You can use bigintand to_timestampor unix_timestamp to convert the date to seconds.

spark.sql("""select unix_timestamp(DATE'2021-10-13')-unix_timestamp(DATE'2021-10-01') as seconds""").show()
Results:== SQL ==select bigint(to_timestamp(DATE'2021-10-13')), bigint(to_timestamp(DATE'2021-10-01')))-------------------------------------------------------------------------------------^^^>>> spark.sql("""select bigint(to_timestamp(DATE'2021-10-13'))-bigint(to_timestamp(DATE'2021-10-01')) as seconds""").show()
+-------+
|seconds|
+-------+
|1033200|
+-------+

Difference in minutes

We can convert seconds into hours by dividing it by 60.

spark.sql("""select (bigint(to_timestamp(DATE'2021-10-13'))-bigint(to_timestamp(DATE'2021-10-01')))/60 as minutes""").show()

Difference in hours

We can convert seconds into hours by dividing it by 3600.

spark.sql("""select (bigint(to_timestamp(DATE'2021-10-13'))-bigint(to_timestamp(DATE'2021-10-01')))/3600 as hours""").show()

warning If there are timezone changes between these two dates, the above functions will take care of it.

pyspark spark-sql spark-sql-function

Join the Discussion

View or add your thoughts below

Comments