Raymond Raymond

Spark SQL - NTILE Window Function

event 2021-01-06 visibility 2,858 comment 0 insights toc
more_vert
insights Stats
toc Table of contents

About NTILE function

Spark NTILE function divides the rows in each window to 'n' buckets ranging from 1 to at most 'n' (n is the specified parameter). 

Example

The following sample SQL uses NTILE function to divide records in each window to two buckets. 

SELECT TXN.*, NTILE(2) OVER (PARTITION BY TXN_DT ORDER BY AMT) AS NTILE_BUCKET FROM VALUES 
(101,10.01, DATE'2021-01-01'),
(101,102.01, DATE'2021-01-01'),
(102,93., DATE'2021-01-01'),
(103,913.1, DATE'2021-01-02'),
(102,913.1, DATE'2021-01-02'),
(101,900.56, DATE'2021-01-03')
AS TXN(ACCT,AMT, TXN_DT);

Result:

ACCT    AMT     TXN_DT  NTILE_BUCKET
101     10.01   2021-01-01      1
102     93.00   2021-01-01      1
101     102.01  2021-01-01      2
101     900.56  2021-01-03      1
103     913.10  2021-01-02      1
102     913.10  2021-01-02      2
infoPARTITION BY is not mandatory; if it is not specified, all the records will be moved to one single partition which can cause performance issues. 

Example table

The virtual table/data frame is cited from SQL - Construct Table using Literals.

More from Kontext
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts