
Previously, I demonstrated how to configure Apache Hive 3.0.0 on Windows 10.

On this page, I’m going to show you how to install the latest version, Apache Hive 3.1.1, on Windows 10 using the Windows Subsystem for Linux (WSL) Ubuntu distro.

Prerequisites

Follow either of the following pages to install WSL on a system or non-system drive in Windows 10.

Please also install Hadoop 3.2.0 on your WSL following the second page.

Now let’s start to install Apache Hive 3.1.1 in WSL.

Download binary package

Select a package from the download page:

https://hive.apache.org/downloads.html

For me, the recommended location is: http://www.strategylions.com.au/mirror/hive/hive-3.1.1/apache-hive-3.1.1-bin.tar.gz

In WSL bash terminal, run the following command to download the package:

wget http://www.strategylions.com.au/mirror/hive/hive-3.1.1/apache-hive-3.1.1-bin.tar.gz
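Before unpacking, it can be worth checking that the archive was not corrupted during download. The sketch below assumes the sha256sum utility is available in your Ubuntu distro and that you have copied the expected SHA-256 digest from the download page yourself. The function name verify_sha256 is my own and is not part of any Hive tooling.

```shell
# Compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 <file> <expected-hex-digest>
# (verify_sha256 is a hypothetical helper name, not a Hive or Hadoop command.)
verify_sha256() {
    file=$1
    # Normalise the expected digest to lower case before comparing.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

You would call it as verify_sha256 apache-hive-3.1.1-bin.tar.gz followed by the published digest; a non-zero return code means the file should be downloaded again.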

Unzip binary package

If you have configured Hadoop 3.2.0 successfully, there should already be a hadoop folder in your home folder:

$ ls -lt
total 611896
drwxrwxrwx 1 tangr tangr      4096 May 16 00:32 dfs
drwxrwxrwx 1 tangr tangr      4096 May 15 23:48 hadoop
-rw-rw-rw- 1 tangr tangr  345625475 Jan 22 02:15 hadoop-3.2.0.tar.gz
-rw-rw-rw- 1 tangr tangr 280944629 Nov  1  2018 apache-hive-3.1.1-bin.tar.gz

Now unzip the Hive package into the hadoop folder using the following command:

tar -xvzf apache-hive-3.1.1-bin.tar.gz -C ~/hadoop

In the hadoop folder there are now two subfolders:

$ ls ~/hadoop
apache-hive-3.1.1-bin  hadoop-3.2.0

Set up environment variables

In the prerequisites section, we’ve already configured some environment variables like the following:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_HOME=/home/tangr/hadoop/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin

*Note: your user name may be different.

Let’s add the environment variables that Hive requires to the .bashrc file too. Open the file for editing:

vi ~/.bashrc

Add the following lines to the end of the file:

export HIVE_HOME=/home/tangr/hadoop/apache-hive-3.1.1-bin

export PATH=$HIVE_HOME/bin:$PATH

Change the user name (tangr) to your own.

Run the following command to source the variables:

source ~/.bashrc

Verify the environment variables:

echo $HIVE_HOME
/home/tangr/hadoop/apache-hive-3.1.1-bin
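The same verification can be scripted. This is only a sketch: path_contains and check_hive_env are helper names I made up for illustration, and the check merely confirms that HIVE_HOME is set and that its bin directory appears on PATH.

```shell
# Report whether a directory is listed in a PATH-style string.
# Usage: path_contains <dir> [path-string]   (defaults to $PATH)
# (path_contains and check_hive_env are hypothetical helper names.)
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Check that HIVE_HOME is set and that its bin directory is on PATH.
check_hive_env() {
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set" >&2
        return 1
    fi
    if ! path_contains "$HIVE_HOME/bin"; then
        echo "$HIVE_HOME/bin is not on PATH" >&2
        return 1
    fi
    echo "Hive environment looks fine: $HIVE_HOME"
}
```

Running check_hive_env in a new terminal is a quick way to confirm that the lines added to .bashrc are actually picked up.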

Set up Hive HDFS folders

Start your Hadoop services (if you have not done that) by running the following command:

$HADOOP_HOME/sbin/start-all.sh

In WSL, you may need to restart your ssh service if ssh doesn’t work, for example when you see an error like this:

localhost: ssh: connect to host localhost port 22: Connection refused

To restart the service, run the following command:

sudo service ssh restart

Run the following command (jps) to make sure all the services are running successfully.

$ jps
2306 NameNode
2786 SecondaryNameNode
3235 NodeManager
3577 Jps
2491 DataNode
3039 ResourceManager

As you can see, all the services are running successfully in my WSL.
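If you prefer not to read the jps listing by eye, the check can be sketched as a small function. It assumes the five daemon names shown in the output above are the ones you expect on a single-node setup; missing_daemons is a hypothetical helper name of mine.

```shell
# Print the names of expected Hadoop daemons that are missing from jps output.
# Usage: missing_daemons "<jps output>"
# (missing_daemons is a hypothetical helper; the daemon list is an assumption
#  based on the single-node listing in this article.)
missing_daemons() {
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>" per line; compare the name column exactly,
        # so that NameNode is not confused with SecondaryNameNode.
        if ! printf '%s\n' "$1" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Calling missing_daemons "$(jps)" prints nothing when all five services are up, and otherwise prints one missing daemon name per line.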

Now let’s set up the HDFS folders for Hive.

Run the following commands:

hadoop fs -mkdir /tmp

hadoop fs -mkdir -p /user/hive/warehouse

hadoop fs -chmod g+w /tmp

hadoop fs -chmod g+w /user/hive/warehouse
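Note that hadoop fs -mkdir without -p reports an error if the folder already exists, which is likely for /tmp on a cluster that has been used before. A re-runnable variant can be sketched as below. The HADOOP_CMD variable is purely my own addition so the function can be tried without a running cluster; it is not a Hadoop setting.

```shell
# Create an HDFS directory (if needed) and make it group-writable.
# HADOOP_CMD is a hypothetical override used only for dry runs;
# it defaults to the real hadoop command.
ensure_hdfs_dir() {
    dir=$1
    hadoop_cmd=${HADOOP_CMD:-hadoop}
    # -mkdir -p succeeds even when the directory already exists.
    "$hadoop_cmd" fs -mkdir -p "$dir" || return 1
    "$hadoop_cmd" fs -chmod g+w "$dir"
}

# Prepare the two folders used in this article.
setup_hive_dirs() {
    ensure_hdfs_dir /tmp && ensure_hdfs_dir /user/hive/warehouse
}
```

Running setup_hive_dirs with HADOOP_CMD=echo set beforehand only prints the hadoop arguments that would be used, which is a cheap way to review them first.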

Configure Hive metastore

Now we need to run schematool to set up the metastore for Hive.

$HIVE_HOME/bin/schematool -dbType <db type> -initSchema

The dbType argument can be any of the following values:

derby|mysql|postgres|oracle|mssql

By default, Apache Derby will be used. However, it runs as an embedded, standalone database and supports only one connection at a time.

So now you have two options:

  • Option 1 (highly recommended): Initialize using a remote database. In my scenario, I use a SQL Server database as the remote store. For more details, follow this page to set up a remote database as the datastore: Configure a SQL Server Database as Remote Hive Metastore.
  • Option 2: Initialize using Derby by running the following command:

$HIVE_HOME/bin/schematool -dbType derby -initSchema
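If you script this step, a mistyped dbType is easy to catch before schematool is started. The wrapper below is a minimal sketch: is_valid_db_type and init_metastore are hypothetical names of mine, and the accepted values are simply the ones listed in this article.

```shell
# Return 0 when the argument is one of the dbType values listed in this article.
# (is_valid_db_type and init_metastore are hypothetical helper names.)
is_valid_db_type() {
    case "$1" in
        derby|mysql|postgres|oracle|mssql) return 0 ;;
        *) return 1 ;;
    esac
}

# Validate the type first, then hand over to schematool.
init_metastore() {
    if ! is_valid_db_type "$1"; then
        echo "Unsupported dbType: $1" >&2
        return 2
    fi
    "$HIVE_HOME/bin/schematool" -dbType "$1" -initSchema
}
```

For example, init_metastore derby runs the same command as Option 2, while init_metastore sqlite stops with return code 2 without calling schematool.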

Configure Hive API authentication

Add the following section to the $HIVE_HOME/conf/hive-site.xml file:

<property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
    <description>
        Should metastore do authorization against database notification related APIs such as get_next_notification.
        If set to true, then only the superusers in proxy settings have the permission
    </description>
</property>
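A freshly unpacked Hive package may not contain a hive-site.xml at all. In that case the property needs a configuration root element around it. The following sketch writes such a minimal file only when none exists yet; create_hive_site is a hypothetical helper name, and the file content is limited to the single property shown above.

```shell
# Write a minimal hive-site.xml into the given conf directory,
# but only when no hive-site.xml exists there yet.
# Usage: create_hive_site <conf-dir>
# (create_hive_site is a hypothetical helper name.)
create_hive_site() {
    target="$1/hive-site.xml"
    if [ -e "$target" ]; then
        echo "Keeping existing $target"
        return 0
    fi
    cat > "$target" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
</configuration>
EOF
    echo "Created $target"
}
```

Calling create_hive_site "$HIVE_HOME/conf" leaves an existing file untouched, so any settings you already have (for example for a remote metastore) are not overwritten.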

Then update the Hadoop core-site.xml configuration file to add the following properties:

<property>
      <name>hadoop.proxyuser.tangr.hosts</name>
      <value>*</value>
</property>

<property>
      <name>hadoop.proxyuser.tangr.groups</name>
      <value>*</value>
</property>

Replace the user name (tangr) in both property names with your own user name.
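Because the user name is embedded in the property names themselves, it is easy to change one and miss the other. As a sketch, the two properties can be generated for any user name and then pasted into core-site.xml; proxyuser_properties is a hypothetical helper name, and it falls back to the current user reported by id -un.

```shell
# Print the two proxyuser properties for a given user name.
# Usage: proxyuser_properties [user]   (defaults to the current user)
# (proxyuser_properties is a hypothetical helper name.)
proxyuser_properties() {
    user=${1:-$(id -un)}
    cat <<EOF
<property>
  <name>hadoop.proxyuser.${user}.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.${user}.groups</name>
  <value>*</value>
</property>
EOF
}
```

Running proxyuser_properties without arguments in your WSL terminal prints the snippet for the account you are logged in with.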

Now all the configurations are done.

Start HiveServer2 service

Run the commands below to start the metastore service and then the HiveServer2 service:

$HIVE_HOME/bin/hive --service metastore &

$HIVE_HOME/bin/hive --service hiveserver2 &

Wait until you can open the HiveServer2 Web UI: http://localhost:10002/.
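The waiting can also be scripted with a generic retry loop. This is a sketch under my own assumptions: wait_until is a hypothetical helper, and the probe command in the usage note relies on curl being installed in your distro.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <max-attempts> <delay-seconds> <command> [args...]
# (wait_until is a hypothetical helper name.)
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the probe quietly; success ends the wait immediately.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        [ "$i" -ge "$attempts" ] || sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, wait_until 30 2 curl -sf http://localhost:10002/ probes the Web UI every two seconds, up to 30 times, and returns non-zero if the page never becomes reachable.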

Practices

You can follow the ‘DDL practices’ section in my previous post to test your Hive data warehouse:

Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide


I’ll continue to publish other posts about installing the latest Hadoop ecosystem tools and frameworks in WSL. You can follow this website by subscribing to the RSS feed.

Last modified by Raymond 9 months ago.

Comment from Raymond:

If hive-site.xml doesn’t exist, you can create one from the template file in the same directory (hive-default.xml.template in the standard binary package).

If the template file doesn’t exist either, you can create hive-site.xml directly. The root element of this XML file is configuration:

<configuration>

...

</configuration>


In reply to Arun, 9 months ago:
Re: Apache Hive 3.1.1 Installation on Windows 10 using Windows Subsystem for Linux

Hi Mate, we couldn't find file "$HIVE_HOME/conf/hive-site.xml" in 3.1.1 package. Alternatively tried other versions couldn't find same file there too.

Please let me know how do I get/fix it.

Many Thanks 
