Spark SQL - NTILE Window Function
About NTILE function
Spark NTILE function divides the rows in each window to 'n' buckets ranging from 1 to at most 'n' (n is the specified parameter).
Example
The following sample SQL uses NTILE function to divide records in each window to two buckets.
SELECT TXN.*, NTILE(2) OVER (PARTITION BY TXN_DT ORDER BY AMT) AS NTILE_BUCKET FROM VALUES (101,10.01, DATE'2021-01-01'), (101,102.01, DATE'2021-01-01'), (102,93., DATE'2021-01-01'), (103,913.1, DATE'2021-01-02'), (102,913.1, DATE'2021-01-02'), (101,900.56, DATE'2021-01-03') AS TXN(ACCT,AMT, TXN_DT);
Result:
ACCT AMT TXN_DT NTILE_BUCKET 101 10.01 2021-01-01 1 102 93.00 2021-01-01 1 101 102.01 2021-01-01 2 101 900.56 2021-01-03 1 103 913.10 2021-01-02 1 102 913.10 2021-01-02 2
infoPARTITION BY is not mandatory; if it is not specified, all the records will be moved to one single partition which can cause performance issues.
Example table
The virtual table/data frame is cited from SQL - Construct Table using Literals.
copyright
This page is subject to Site terms.
comment Comments
No comments yet.