From Spark 3.0.0, a new function named date_part is added to extract a part from a date, timestamp or interval. This function is equivalent to extract function which was added in the same version. Before that, functions like year, weekofyear, weekday, second, quarter, month, hour, day, minute, dayofyear, dayofweek, dayofmonth can be used.
Syntax for date\_part function
date_part(field, source)
Syntax for extract function
extract(field FROM source)
Code snippet
The following are some examples using these functions.
spark-sql> select extract(YEAR from now()), year(now());
2022 2022
spark-sql> select extract(MONTHS from DATE'2022-06-15'), month(DATE'2022-06-15');
6 6
spark-sql> select date_part('DAY',now()), day(now());
15 15
spark-sql> select date_part('seconds', interval 5 hours 30 seconds 100 milliseconds 20 microseconds);
30.100020
spark-sql> select extract(hours from interval 5 hours 30 seconds 100 milliseconds 20 microseconds);
5
spark-sql> select date_part('doy',now()), dayofyear(now());
166 166
*When the above Spark SQL ran, the date is 2022-06-15.
Possible values for field parameter
For both above functions, the possible values for parameter field are listed below. The values are case-insensitive.
For date and timestamp data types
- "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field
- "YEAROFWEEK" - the ISO 8601 week-numbering year that the datetime falls in. For example, 2005-01-02 is part of the 53rd week of year 2004, so the result is 2004
- "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the datetime falls in
- "MONTH", ("MON", "MONS", "MONTHS") - the month field (1 - 12)
- "WEEK", ("W", "WEEKS") - the number of the ISO 8601 week-of-week-based-year. A week is considered to start on a Monday and week 1 is the first week with >3 days. In the ISO week-numbering system, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year.
- "DAY", ("D", "DAYS") - the day of the month field (1 - 31)
- "DAYOFWEEK",("DOW") - the day of the week for datetime as Sunday(1) to Saturday(7)
- "DAYOFWEEK_ISO",("DOW_ISO") - ISO 8601 based day of the week for datetime as Monday(1) to Sunday(7)
- "DOY" - the day of the year (1 - 365/366)
- "HOUR", ("H", "HOURS", "HR", "HRS") - The hour field (0 - 23)
- "MINUTE", ("M", "MIN", "MINS", "MINUTES") - the minutes field (0 - 59)
- "SECOND", ("S", "SEC", "SECONDS", "SECS") - the seconds field, including fractional parts
For interval data type
- "YEAR", ("Y", "YEARS", "YR", "YRS") - the total
months/ 12 - "MONTH", ("MON", "MONS", "MONTHS") - the total
months% 12 - "DAY", ("D", "DAYS") - the
dayspart of interval - "HOUR", ("H", "HOURS", "HR", "HRS") - how many hours the
microsecondscontains - "MINUTE", ("M", "MIN", "MINS", "MINUTES") - how many minutes left after taking hours from
microseconds - "SECOND", ("S", "SEC", "SECONDS", "SECS") - how many second with fractions left after taking hours and minutes from
microseconds