Multilevel partitioning in Hive

I came across an issue while running a Sqoop import into a partitioned Hive table and found a workaround for it, so I'm sharing my two cents.
Let's begin…
Create a Hive partitioned table and import data into it at the same time:
sqoop import --create-hive-table \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--table t_test_emp \
--hive-import \
--columns EMPNO,ENAME,JOB \
--hive-partition-key DEPTNO \
--split-by DEPTNO \
--hive-table hr_hive.emps -m 1
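For reference, the command above produces a Hive table roughly equivalent to the following DDL. This is only a sketch: the column types are assumptions (Sqoop derives them from the Oracle source), and a partition column declared via --hive-partition-key is created as a STRING.

```sql
-- Rough HiveQL equivalent of the table Sqoop creates above.
-- Column types are assumptions; Sqoop maps them from the Oracle types.
CREATE TABLE hr_hive.emps (
  empno DOUBLE,
  ename STRING,
  job   STRING
)
PARTITIONED BY (deptno STRING);
```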
No issues, no big deal, right? Now let's try this:

sqoop  import --create-hive-table \
--connect jdbc:oracle:thin:@localhost:1521/hr.com  \
--username xxx \
--password yyy \
--table t_test_emp2  \
--hive-import \
--columns EMPNO,ENAME,JOB \
--hive-partition-key DEPTNO,JOB \
--split-by DEPTNO \
--hive-table hr_hive.emps -m 1


This one fails. What's the difference? If you observe, here we have two levels of partitioning (DEPTNO, JOB), while the previous command used a single partition key. Sqoop's --hive-partition-key option accepts only a single column, so there is no straightforward way to achieve this with --hive-import. Here HCatalog comes to the rescue.
HCatalog in brief:
HCatalog is a table and storage management layer for Hadoop that exposes the tabular data of the Hive metastore to other Hadoop applications. In short, HCatalog is built on top of the Hive metastore.
With the help of HCatalog we can do it in two steps:

 
Run the scripts:
a) Create the multilevel partitioned table (the 1=2 predicate ensures no rows are fetched during table creation):
sqoop import \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--query 'select ENAME,JOB,MGR,HIREDATE,SAL,COMM from t_test_emp where $CONDITIONS AND 1=2' \
--create-hcatalog-table \
--hcatalog-database hr_hive \
--hcatalog-table test_part3 \
--hcatalog-partition-keys DEPTNO,EMPNO \
--hcatalog-partition-values 1,1 -m 1
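Step (a) creates an empty table whose layout is roughly the following HiveQL. Again a sketch: the column types are assumptions mapped from the Oracle source, and Sqoop's HCatalog integration stores newly created tables as RCFile by default.

```sql
-- Rough HiveQL equivalent of the table created in step (a).
-- Types are assumptions; note the two partition columns.
CREATE TABLE hr_hive.test_part3 (
  ename    STRING,
  job      STRING,
  mgr      DOUBLE,
  hiredate STRING,
  sal      DOUBLE,
  comm     DOUBLE
)
PARTITIONED BY (deptno STRING, empno STRING)
STORED AS RCFILE;
```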
b) Import data into the previously created table, i.e. test_part3:
 
sqoop import \
--connect jdbc:oracle:thin:@localhost:1521/hr.com \
--username xxx \
--password yyy \
--table t_test_emp \
--columns EMPNO,DEPTNO,ENAME,JOB \
--hcatalog-database hr_hive \
--hcatalog-table test_part3 \
--split-by DEPTNO
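After the import, Hive lays the data out as nested partition directories in the warehouse, one directory level per partition key. A minimal local sketch of that layout (the partition values here are hypothetical; the real data lives under the table's HDFS warehouse path):

```shell
# Simulate the two-level partition directory layout Hive uses
# for test_part3 (deptno first, then empno).
mkdir -p test_part3/deptno=10/empno=7369
mkdir -p test_part3/deptno=20/empno=7566
find test_part3 -type d | sort
```

Each leaf directory holds the data files for that (deptno, empno) combination, which is exactly what the single-key --hive-partition-key approach could not produce.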
 


Last modified by Swapnil, 1 month ago.