Swapnil Patil
Sqoop import with multilevel partitioning
I ran into an issue while running a Sqoop import into a partitioned table and found a workaround for it, so I am sharing my two cents.
Let's begin.
Create a partitioned Hive table and import the data at the same time:
sqoop import --create-hive-table \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xx \
--password yyy \
--table t_test_emp \
--hive-import \
--columns EMPNO,ENAME,JOB \
--hive-partition-key DEPTNO \
--split-by DEPTNO \
--hive-table hr_hive.emps -m 1
No issues, no big deal, right? Let's try this:
sqoop import --create-hive-table \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--table t_test_emp2 \
--hive-import \
--columns EMPNO,ENAME,JOB \
--hive-partition-key DEPTNO,JOB \
--split-by DEPTNO \
--hive-table hr_hive.emps -m 1
This one failed. What's the difference? If you look closely, here we have two levels of partitioning (DEPTNO,JOB), whereas the previous command had a single-level partition.
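The difference between the two commands can be caught before the job is submitted. Below is a minimal sketch of such a guard; the function names count_partition_keys and check_hive_partition_key are hypothetical helpers, not part of Sqoop, and the check only encodes the behaviour observed above (a comma-separated list passed to --hive-partition-key failed).

```shell
# Count the keys in a comma-separated partition key list.
# Argument: the value you intend to pass to --hive-partition-key.
count_partition_keys() {
  printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Return non-zero (and warn on stderr) when more than one key is given.
check_hive_partition_key() {
  if [ "$(count_partition_keys "$1")" -gt 1 ]; then
    echo "more than one partition key in '$1': --hive-partition-key failed with this in the test above" >&2
    return 1
  fi
}
```

For example, check_hive_partition_key "DEPTNO,JOB" returns a non-zero status, while check_hive_partition_key "DEPTNO" succeeds.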
There isn't a straightforward way to achieve this with --hive-partition-key. Here HCatalog comes to the rescue:
HCatalog in brief:
HCatalog is a table and storage management layer for Hadoop that exposes the tabular data of the Hive metastore to other Hadoop applications. In short, HCatalog is built on top of the Hive metastore.
With the help of HCatalog we can do it in two steps:
Run the scripts:
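a) Create the table (a multilevel partitioned table). The extra 1=2 condition makes the query return no rows, so this step only creates the table: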
sqoop import \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--query 'select ENAME,JOB,MGR,HIREDATE,SAL,COMM from t_test_emp where $CONDITIONS AND 1=2' \
--create-hcatalog-table \
--hcatalog-database hr_hive \
--hcatalog-table test_part3 \
--hcatalog-partition-keys DEPTNO,EMPNO \
--hcatalog-partition-values 1,1 -m 1
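Before loading data, it may be worth confirming that the table was created with both partition columns. A minimal sketch, assuming the hive command-line client is available on the same machine; describe_table is a hypothetical helper name, and it only wraps the standard hive -e option.

```shell
# Print the Hive definition of a table, including its partition columns.
# Arguments: database name, table name.
describe_table() {
  hive -e "USE $1; DESCRIBE FORMATTED $2;"
}
```

Calling describe_table hr_hive test_part3 should list DEPTNO and EMPNO in the partition information section if step a) worked as intended.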
b) Import data into the previously created table, i.e. test_part3
sqoop import \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--table t_test_emp \
--columns EMPNO,DEPTNO,ENAME,JOB \
--hcatalog-database hr_hive \
--hcatalog-table test_part3 \
--split-by DEPTNO
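Once the import finishes, the partitions it produced can be listed from Hive. The sketch below assumes the hive command-line client is installed; show_partitions and count_partitions are hypothetical helper names. Pass whichever database the table actually lives in.

```shell
# List the partitions Hive knows about for a table.
# Arguments: database name, table name.
show_partitions() {
  hive -e "USE $1; SHOW PARTITIONS $2;"
}

# Count those partitions, one per output line.
count_partitions() {
  show_partitions "$1" "$2" | wc -l | tr -d ' '
}
```

For the table above, the call would be show_partitions hr_hive test_part3. Each output line is one partition, with both levels shown in a single path-like entry.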
Last modified by Raymond 4 years ago