Spark SQL - FIRST_VALUE or LAST_VALUE
In Spark SQL, function FIRST_VALUE (FIRST) and LAST_VALUE (LAST) can be used to to find the first or the last value of given column or expression for a group of rows. If parameter `isIgnoreNull` is specified as true, they return only non-null values (unless all values are null).
Function signature
first(expr[, isIgnoreNull]) first_value(expr[, isIgnoreNull]) last(expr[, isIgnoreNull]) last_value(expr[, isIgnoreNull])
Code snippets
first_value (first)
SELECT ACCT, first_value(AMT,true) FROM VALUES (101,10.01, DATE'2021-01-01'), (101,102.01, DATE'2021-01-01'), (102,93., DATE'2021-01-01'), (103,913.1, DATE'2021-01-02'), (102,913.1, DATE'2021-01-02'), (101,900.56, DATE'2021-01-03') AS TXN(ACCT,AMT, TXN_DT) group by ACCT;
Output:
ACCT first_value(AMT) 101 10.01 103 913.10 102 93.00
last_value (last)
SELECT ACCT, last_value(AMT,true), last(AMT,true) FROM VALUES (101,10.01, DATE'2021-01-01'), (101,102.01, DATE'2021-01-01'), (102,93., DATE'2021-01-01'), (103,913.1, DATE'2021-01-02'), (102,913.1, DATE'2021-01-02'), (101,900.56, DATE'2021-01-03') AS TXN(ACCT,AMT, TXN_DT) group by ACCT;
Ouput:
ACCT last_value(AMT) last(AMT) 101 900.56 900.56 103 913.10 913.10 102 913.10 913.10
copyright
This page is subject to Site terms.
comment Comments
N
Nikesh
access_time 12 months ago
more_vert
#1731 Re: Spark SQL - FIRST_VALUE or LAST_VALUE
Thanks for the article!
Do you know the Scala equivalent of this, I used the first, but it its not picking up the first not null value, its just picking the first value.
Log in with external accounts
warning Please login first to view stats information.
Hi Nikesh,
To make it simple, you can register your DataFrame as a temporary view and then you can use Spark SQL functions to aggregate and using these functions like first_value or last_value.
person Nikesh access_time 12 months ago
Re: Spark SQL - FIRST_VALUE or LAST_VALUE
Thanks for the article!
Do you know the Scala equivalent of this, I used the first, but it its not picking up the first not null value, its just picking the first value.