Multilevel partitioning in Hive

I ran into an issue while running a Sqoop import into a partitioned Hive table and found a workaround, so I'm sharing my two cents.
Let's begin.
First, create a Hive partitioned table and import the data in a single step:
sqoop import --create-hive-table \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--table t_test_emp \
--hive-import \
--columns EMPNO,ENAME,JOB \
--hive-partition-key DEPTNO \
--split-by DEPTNO \
--hive-table hr_hive.emps -m 1
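For comparison later, the single-level import above produces one Hive partition per DEPTNO value, which can be confirmed from the Hive CLI (a sketch; the partition values shown are only illustrative):

```sql
-- List the partitions Sqoop created; each distinct DEPTNO becomes one partition
SHOW PARTITIONS hr_hive.emps;
-- illustrative output: deptno=10, deptno=20, deptno=30

-- Inspect the table layout, including the partition column
DESCRIBE FORMATTED hr_hive.emps;
```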
No issues, no big deal, right? Now let's try this:

sqoop  import --create-hive-table \
--connect jdbc:oracle:thin:@localhost:1521/hr.com  \
--username xxx \
--password yyy \
--table t_test_emp2  \
--hive-import \
--columns EMPNO,ENAME,JOB \
--hive-partition-key DEPTNO,JOB \
--split-by DEPTNO \
--hive-table hr_hive.emps -m 1


This one fails. What's the difference? If you look closely, we now have two partition levels, DEPTNO and JOB, whereas the previous command used a single partition level. Sqoop's --hive-partition-key option accepts only one column, so there is no straightforward way to achieve this with a plain Hive import. Here HCatalog comes to the rescue.
HCatalog in brief:
HCatalog is a table and storage management layer for Hadoop that exposes the tabular data of the Hive metastore to other Hadoop applications (Pig, MapReduce, and so on); in short, it is built on top of the Hive metastore. With the help of HCatalog we can do the import in two steps:

 
Run the following scripts:
a) Create the multilevel partitioned table. The WHERE $CONDITIONS AND 1=2 predicate guarantees no rows are returned, so this run only creates the table definition; the --hcatalog-partition-values here are dummy placeholders.
sqoop import \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--query 'select ENAME,JOB,MGR,HIREDATE,SAL,COMM from t_test_emp where $CONDITIONS AND 1=2' \
--create-hcatalog-table \
--hcatalog-database hr_hive \
--hcatalog-table test_part3 \
--hcatalog-partition-keys DEPTNO,EMPNO \
--hcatalog-partition-values 1,1 -m 1
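Roughly, step (a) registers a table like the following in the Hive metastore (a sketch; the column types are assumptions, since Sqoop derives them from the Oracle source):

```sql
-- Approximate DDL of the table created by step (a); types are assumptions
CREATE TABLE hr_hive.test_part3 (
  ename    STRING,
  job      STRING,
  mgr      INT,
  hiredate STRING,
  sal      DOUBLE,
  comm     DOUBLE
)
PARTITIONED BY (deptno STRING, empno STRING)  -- the two partition levels
STORED AS RCFILE;  -- Sqoop's default --hcatalog-storage-stanza
```

The key point is the PARTITIONED BY clause with two columns, which --hive-partition-key alone could not express.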
b) Import the data into the previously created table, test_part3:
sqoop import \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--table t_test_emp \
--columns EMPNO,DEPTNO,ENAME,JOB \
--hcatalog-database hr_hive \
--hcatalog-table test_part3 \
--split-by DEPTNO
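Once the import finishes, the two-level partitions can be verified from the Hive CLI (a sketch; the partition values shown are only illustrative):

```sql
-- List the dynamic partitions created by the import;
-- expected shape: deptno=<dept>/empno=<emp>, one per key combination
SHOW PARTITIONS hr_hive.test_part3;

-- Query one partition to confirm the data landed correctly
SELECT ename, job FROM hr_hive.test_part3 WHERE deptno = '10' LIMIT 5;
```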
 


Last modified by Raymond, 2 months ago.