
This page summarizes the steps required to install Apache Sqoop (v1.4.7) on Windows 10.

What is Sqoop

Sqoop is a tool for Hadoop that is designed to efficiently transfer bulk data between Hadoop (HDFS, Hive, HBase) and structured data stores such as relational databases (RDBMS).

Project site

http://sqoop.apache.org/

Prerequisites

Hadoop

In this tutorial, I am going to install Sqoop on the same machine where I configured Hadoop. Follow the link below to set up Hadoop if you have not done so already:

Install Hadoop 3.0.0 in Windows (Single Node)

* Hadoop is only required if you want to run Sqoop scripts to test the installation. The guide above also sets up the Hadoop-related environment variables.

Installation guide

The documentation for Sqoop 1.4.7 is available at the following link:

http://sqoop.apache.org/docs/1.4.7/index.html

Download binary package

Download from the following link:

http://www.apache.org/dyn/closer.lua/sqoop/1.4.7

I am downloading the file sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz.

Unzip binary package

Open Git Bash, change directory (cd) to the folder where you saved the binary package, and then extract it:

$ cd /f/DataAnalytics
$ tar -xvzf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

On my machine, the content is extracted to “F:\DataAnalytics\sqoop-1.4.7.bin__hadoop-2.6.0”.
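The extraction step can also be wrapped in a small helper that unpacks the archive into a folder of your choice. This is only a sketch: `extract_sqoop` is a hypothetical name, not something shipped with Sqoop, and the paths in the example call are the ones used in this guide.

```shell
# extract_sqoop ARCHIVE DEST (hypothetical helper): unpack a .tar.gz archive into DEST.
extract_sqoop() {
  # Create the target folder if it does not exist yet.
  mkdir -p "$2" || return 1
  # -x extract, -z gunzip, -f archive file, -C target directory
  tar -xzf "$1" -C "$2"
}

# Example call with the paths from this guide (adjust to your machine):
# extract_sqoop /f/DataAnalytics/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz /f/DataAnalytics
```

The example uses the /f/... path form because GNU tar treats an archive name containing a colon (such as F:\...) as a remote location unless told otherwise.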

Set up environment variables

Make sure the following environment variable is set:

  • SQOOP_HOME: points to the Sqoop folder extracted in the previous step.
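As a rough sketch, the variable can be set and checked from Git Bash as shown below. The helper name `is_sqoop_home` is hypothetical, and the path assumes the extraction location used in this guide. Note that `export` only affects the current Git Bash session; the Command Prompt verification later in this guide needs SQOOP_HOME defined as a Windows environment variable.

```shell
# is_sqoop_home DIR (hypothetical helper): succeed if DIR contains bin/configure-sqoop.
is_sqoop_home() {
  [ -n "$1" ] && [ -f "$1/bin/configure-sqoop" ]
}

# Assumed location from the extraction step; adjust to your own path.
export SQOOP_HOME="/f/DataAnalytics/sqoop-1.4.7.bin__hadoop-2.6.0"

if is_sqoop_home "$SQOOP_HOME"; then
  echo "SQOOP_HOME looks valid: $SQOOP_HOME"
else
  echo "SQOOP_HOME does not point to a Sqoop folder: $SQOOP_HOME"
fi
```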


Configure

Run the following command in Git Bash to configure Sqoop.

cd "$SQOOP_HOME/bin"

./configure-sqoop

You may get the following warnings, depending on whether you have installed the related frameworks on your machine.

Warning: F:\DataAnalytics\sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: F:\DataAnalytics\sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: F:\DataAnalytics\sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: F:\DataAnalytics\sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
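To see in advance which of these warnings to expect, a short check like the following can list which of the optional variables named in the warnings are set in the current Git Bash session. This is a minimal sketch; `report_optional_homes` is a hypothetical helper, and it only inspects the shell environment, not the folders themselves.

```shell
# report_optional_homes (hypothetical helper): for each optional variable named
# in the warnings, print whether it is currently set in this shell.
report_optional_homes() {
  for name in HBASE_HOME HCAT_HOME ACCUMULO_HOME ZOOKEEPER_HOME; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "$name is set to $value"
    else
      echo "$name is not set"
    fi
  done
}

report_optional_homes
```

If you only plan to move data between an RDBMS and HDFS, the variables reported as not set can be left that way, since the warnings concern HBase, HCatalog and Accumulo jobs.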

Verify installation

Run the following command in Command Prompt to verify your installation:

%SQOOP_HOME%\bin\sqoop.cmd version

The output should look similar to the following:

F:\DataAnalytics\sqoop-1.4.7.bin__hadoop-2.6.0\bin>%SQOOP_HOME%\bin\sqoop.cmd version
Warning: HBASE_HOME and HBASE_VERSION not set.
Warning: HCAT_HOME not set
Warning: HCATALOG_HOME does not exist HCatalog imports will fail.
Please set HCATALOG_HOME to the root of your HCatalog installation.
Warning: ACCUMULO_HOME not set.
Warning: ZOOKEEPER_HOME not set.
Warning: HBASE_HOME does not exist HBase imports will fail.
Please set HBASE_HOME to the root of your HBase installation.
Warning: ACCUMULO_HOME does not exist Accumulo imports will fail.
Please set ACCUMULO_HOME to the root of your Accumulo installation.
Warning: ZOOKEEPER_HOME does not exist Accumulo imports will fail.
Please set ZOOKEEPER_HOME to the root of your Zookeeper installation.
2018-04-22 23:55:56,197 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
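If you want to check the result in a script rather than by eye, one possible approach is to pick the version number out of the captured output. The sketch below assumes the output contains a line of the form "Sqoop <version>", as in the sample above; `sqoop_version_from_output` is a hypothetical helper.

```shell
# sqoop_version_from_output (hypothetical helper): read `sqoop version` output
# on stdin and print the version from the first line of the form "Sqoop <version>".
sqoop_version_from_output() {
  sed -n 's/^Sqoop \([0-9][0-9.]*\).*$/\1/p' | head -n 1
}

# Example with a few lines copied from the sample output above.
sqoop_version_from_output <<'EOF'
Warning: ZOOKEEPER_HOME not set.
2018-04-22 23:55:56,197 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
EOF
```

The INFO log line is skipped because it does not start with "Sqoop", so only the dedicated version line is matched.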

Sqoop is now installed on the same Windows machine as Hadoop.

In the next step, I will show how to use Sqoop to import data from an RDBMS into HDFS.

Last modified by Raymond 3 years ago.
