Secure Password Protection for Sqoop Jobs

Raymond, 2019-01-07

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

Passwords and other secrets need to be protected properly when running Sqoop jobs. Sqoop offers several ways to pass in the password used to connect to an RDBMS, and each provides a different level of security.

Options

Option 1 - clear password through --password argument

sqoop [subcommand] --username user --password pwd

This is the weakest approach: the password appears in plain text on the command line, where it can end up in shell history and be visible to other users in the process list.

Option 2 - interactive password through -P argument

sqoop [subcommand] --username user -P

The password is typed in interactively at a prompt when the job starts, so this approach cannot be used for scheduled jobs.

Option 3 - storing password in file through --password-file argument

sqoop [subcommand] --username user --password-file mypasswordfile.path

The password is still stored in plain text in a file, which is weak, though better than option 1 because access to the file can be limited with file permissions.
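
As a minimal sketch of how such a file might be prepared, the snippet below writes a placeholder password without a trailing newline (Sqoop reads the entire file content as the password, including trailing whitespace) and restricts the file to owner read-only. The temporary directory, the file name mydb.password and the secret pwd are assumptions for illustration, not values from a real setup.

```shell
# Demo location only; in practice the file would live in the job user's home
# directory or on HDFS. The path and the secret 'pwd' are placeholders.
DEMO_DIR="$(mktemp -d)"
PASSWORD_FILE="$DEMO_DIR/mydb.password"

# Write inside a subshell so the restrictive umask does not leak into the
# rest of the session. printf '%s' emits no trailing newline.
( umask 077; printf '%s' 'pwd' > "$PASSWORD_FILE" )

# Make the file read-only for its owner.
chmod 400 "$PASSWORD_FILE"

# The byte count should equal the password length, confirming no newline.
wc -c < "$PASSWORD_FILE"
```

The resulting path would then be passed to --password-file. On a cluster whose default filesystem is HDFS, a local file will likely need a file:// prefix; treat that as something to verify in your environment.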

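Option 4 - password alias through --password-alias argument

From Hadoop 2.6.0, which introduced the credential provider API, we can use the hadoop credential command to create a password alias and then use it in Sqoop or other tools.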

Store the password in a Java keystore

A Java keystore (JCEKS) file is one of the supported credential providers. The command prompts for the password to store under the alias.

# Store the password in HDFS

hadoop credential create mydatabase.password -provider jceks://hdfs/user/hue/mypwd.jceks

# Store the password locally

hadoop credential create mydatabase.password -provider jceks://file/home/user/mypwd.jceks
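
The provider URIs above follow the pattern jceks://, then the filesystem scheme (hdfs or file), then the absolute path of the keystore. The sketch below makes that structure explicit and then lists the aliases in the store as a quick check that the alias was created. The helper jceks_uri is hypothetical and exists only for this illustration; hadoop credential list is the standard subcommand, and the snippet skips it when the hadoop CLI is not installed.

```shell
# Build a provider URI of the form jceks://<filesystem scheme><absolute path>.
# jceks_uri is a hypothetical helper for this sketch, not a Hadoop command.
jceks_uri() {
  scheme="$1"   # "hdfs" or "file"
  path="$2"     # absolute path to the keystore
  printf 'jceks://%s%s\n' "$scheme" "$path"
}

PROVIDER="$(jceks_uri hdfs /user/hue/mypwd.jceks)"
echo "$PROVIDER"

# List the aliases held in the keystore (alias names, not the secrets).
if command -v hadoop >/dev/null 2>&1; then
  hadoop credential list -provider "$PROVIDER" \
    || echo "could not list aliases for $PROVIDER" >&2
else
  echo "hadoop CLI not found; skipping the listing step" >&2
fi
```

If mydatabase.password shows up in the listing, the same provider URI can be reused when running the Sqoop job.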

Use the password alias

sqoop [subcommand] \
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/hue/mypwd.jceks \
--verbose \
--username user \
--password-alias mydatabase.password \

….

With this approach, the clear-text password does not appear on the command line or in a plain-text file.
