Hive SQL - Union data with UNION ALL and UNION DISTINCT
insights Stats
Code snippets and tips for various programming languages/frameworks. All code examples are under MIT or Apache 2.0 license unless specified otherwise.
Like other data warehouse software, Hive provides built-in clauses to union data - UNION ALL and UNION DISTINCT (UNION). UNION ALL will include all records in both left and right tables while UNION will only return distinct records only. You can mix these two unions in one query but the UNION will overwrites any UNION ALL to its left.
Syntax
select_statement UNION [ALL | DISTINCT] select_statement [UNION [ALL | DISTINCT] select_statement) *
Code snippet
The following code snippet provides one example of using
SELECT acct, txn_date, amount from transactions UNION ALL select 104, DATE'2022-07-23', 200;
Sample result:
101 2021-01-01 10.01 101 2021-01-01 102.01 102 2021-01-01 93.00 103 2021-01-02 913.10 102 2021-01-02 913.10 101 2021-01-03 900.56 104 2022-07-23 200.00
The following is another example using UNION DISTINCT.
SELECT acct, txn_date from transactions UNION DISTINCT select 104, DATE'2022-07-23';
The result will remove duplicated records:
101 2021-01-01 101 2021-01-03 102 2021-01-01 102 2021-01-02 103 2021-01-02 104 2022-07-23
The above sample table is from Hive SQL - Analytics with GROUP BY and GROUPING SETS, Cubes, Rollups.