Teradata: FIRST_VALUE and LAST_VALUE Funtions

Like other databases, FIRST_VALUE and LAST_VALUE are two of commonly used windowing functions in Teradata. In one of my old posts about SQL Server 2012 (Chinese language), I showed how to use FIRST_VALUE and LAST_VALUE. For this article, I will provide examples of using them in Teradata but more importantly I want to highlight one part about ordered window.

Sample table

This article utilizes a table (TestDb.test_table) created in one of my previous articles: Extract JSON Data via SQL Functions in Teradata - Kontext. The content of the table looks like the following screenshot:

Use FIRST\_VALUE function

The following statement creates windows via column categorywithout ORDER BY clause:

SEL r.*,
FIRST_VALUE(amount) OVER(PARTITION BY category) AS FV_amount
FROM TestDb.test_table r;

The output:

	id	category	amount	FV_amount
1	1	A	10	10
2	2	A	11	10
3	3	A	12	10
4	4	B	100	100
5	6	B	102	100
6	5	B	101	100
7	4	C	1,000	1,000
8	6	C	1,002	1,000
9	5	C	1,001	1,000

Use LAST\_VALUE function

Similarly, we can use LAST_VALUE to retrieve the last value of amountcolumn in each window (by column category).

SEL r.*,
LAST_VALUE(amount) OVER(PARTITION BY category) AS LV_amount
FROM TestDb.test_table r;

The results:

	id	category	amount	LV_amount
1	1	A	10	12
2	2	A	11	12
3	3	A	12	12
4	4	B	100	101
5	6	B	102	101
6	5	B	101	101
7	4	C	1,000	1,001
8	6	C	1,002	1,001
9	5	C	1,001	1,001

With ORDER BY clause

If we add ORDER BY clause to the statement, the results will be different:

SEL r.*,
LAST_VALUE(amount) OVER(PARTITION BY category ORDER BY id) AS LV_amount
FROM TestDb.test_table r;

Results:

	id	category	amount	LV_amount
1	1	A	10	10
2	2	A	11	11
3	3	A	12	12
4	4	B	100	100
5	5	B	101	101
6	6	B	102	102
7	4	C	1,000	1,000
8	5	C	1,001	1,001
9	6	C	1,002	1,002

Column LV_amount just returns different values for each row in the same window. This is the special part I want to highlight: **If ORDER BY is not specified, the entire partition is used for a window frame; if it is specified, the default rows bounding are applied:**UNBOUNDED PRECEDING AND CURRENT ROW. With this rule, it is obviously that the above statement will return different values.

With ORDER BY and window frame bound

We can use window frame bound clause to specify the range for each calculation.

SEL r.*,
LAST_VALUE(amount) OVER(PARTITION BY category ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LV_amount
FROM TestDb.test_table r;

Now the results look like the following:

	id	category	amount	LV_amount
1	1	A	10	12
2	2	A	11	12
3	3	A	12	12
4	4	B	100	102
5	5	B	101	102
6	6	B	102	102
7	4	C	1,000	1,002
8	5	C	1,001	1,002
9	6	C	1,002	1,002

Retrieve previous and next row values

With ROWS BETWEEN clause, we can easily retrieve previous and next row values.

SEL r.*,
FIRST_VALUE(amount IGNORE NULLS) OVER(PARTITION BY category ORDER BY id ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PREV_amount,
LAST_VALUE(amount) OVER(PARTITION BY category ORDER BY id ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS NEXT_amount
FROM TestDb.test_table r;

The results:

	id	category	amount	PREV_amount	NEXT_amount
1	1	A	10	?	11
2	2	A	11	10	12
3	3	A	12	11	?
4	4	B	100	?	101
5	5	B	101	100	102
6	6	B	102	101	?
7	4	C	1,000	?	1,001
8	5	C	1,001	1,000	1,002
9	6	C	1,002	1,001	?

References

Spark SQL - FIRST_VALUE or LAST_VALUE - Kontext

Select First Row in Each GROUP BY Group in Teradata - Kontext