Spark SQL - Date Difference in Seconds, Minutes, Hours

Raymond Raymond event 2021-10-12 visibility 9,705
more_vert

Spark provides a number of functions to calculate date differences.

The following code snippets can run in Spark SQL shell or through Spark SQL APIs in PySpark, Scala, etc. 

Difference in days

Spark SQL - Date and Timestamp Function

Difference in months

Use function months_between to calculate months differences in Spark SQL.

spark.sql("""select months_between(DATE'2021-10-13', DATE'2020-03-01')""").show()
You can also run the SQL directly in Spark-SQL shell:

select months_between(DATE'2021-10-13', DATE'2020-03-01')

Difference in seconds

You can use bigint and to_timestamp or unix_timestamp to convert the date to seconds.

spark.sql("""select unix_timestamp(DATE'2021-10-13')-unix_timestamp(DATE'2021-10-01') as seconds""").show()
Results:
== SQL ==
select bigint(to_timestamp(DATE'2021-10-13')), bigint(to_timestamp(DATE'2021-10-01')))
-------------------------------------------------------------------------------------^^^
>>> spark.sql("""select bigint(to_timestamp(DATE'2021-10-13'))-bigint(to_timestamp(DATE'2021-10-01')) as seconds""").show() +-------+ |seconds| +-------+ |1033200| +-------+

Difference in minutes

We can convert seconds into hours by dividing it by 60.

spark.sql("""select (bigint(to_timestamp(DATE'2021-10-13'))-bigint(to_timestamp(DATE'2021-10-01')))/60 as minutes""").show()

Difference in hours

We can convert seconds into hours by dividing it by 3600.

spark.sql("""select (bigint(to_timestamp(DATE'2021-10-13'))-bigint(to_timestamp(DATE'2021-10-01')))/3600 as hours""").show()
warning If there are timezone changes between these two dates, the above functions will take care of it. 
More from Kontext
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts