Apache Hive 3.1.2 Installation on Windows 10

Comments
Raymond #372 (4 years ago)

Hi Renganathan,

Can you please try the following actions?

  1. Change the Hadoop 3.3.0 environment variables to the Cygwin-style values documented in this article. In the output you pasted earlier, it is still using Windows paths:
    export HADOOP_HOME='/cygdrive/f/big-data/hadoop-3.3.0'
    export PATH=$PATH:$HADOOP_HOME/bin
    export HIVE_HOME='/cygdrive/f/big-data/apache-hive-3.1.2-bin'

    *Remember to change the paths to your own.

  2. Make sure you can run these commands successfully with the expected output:
    ls /cygdrive/f/big-data/hadoop-3.3.0
    ls /cygdrive/f/big-data/apache-hive-3.1.2-bin

    *Remember to change the paths to your own.

  3. Examine all your Hadoop paths and configurations to make sure there is no space in any path, including the DFS paths in the Hadoop configuration files (see the sketch below).
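    A minimal way to scan for such spaces (a sketch; the paths are the examples from this article, so adjust them to your own installation):

    # Flag any <value> entries that contain a space in the Hadoop site configs:
    grep -n '<value>[^<]* [^<]*</value>' /cygdrive/f/big-data/hadoop-3.3.0/etc/hadoop/*-site.xml

    # Confirm the environment variables themselves are space-free:
    echo "$HADOOP_HOME$HIVE_HOME" | grep ' ' && echo "space found"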

If you follow all my steps in the Hadoop 3.3.0 and Hive 3.1.2 setup exactly, there should be no issues - I've tested it.

BTW, to answer one of your previous questions, the script is located at apache-hive-3.1.2-bin\bin\ext\schemaTool.sh.

-Raymond


Renganathan Mutthiah #371 (4 years ago)

I have just tried with Hadoop 3.3.0 and I am getting the same error when I use the schematool. All the Hadoop processes are running fine (they were running fine even on 2.9.1). Somewhere the Hive setup is picking up a partial user name (the second word is picked up; the first and second words are separated by a space) and a class-not-found error is thrown.

Administrator #370 (4 years ago)

Your Hadoop version is 2.9.1, while the tutorial is tested with Hadoop 3.3.0. Can you please use the same Hadoop version? Across different versions some libraries may conflict with each other, and Hive 3.1.2 works with Hadoop 3.x.y but not with Hadoop 2.x.


Renganathan Mutthiah #369 (4 years ago)

Sure, thanks for your help Raymond!

Below is the output of the classpath you requested.

$ echo $HADOOP_CLASSPATH

E:\lion\Hadoop\hadoop-2.9.1\contrib\capacity-scheduler\*.jar;E:\Lion\Hadoop\hadoop-2.9.1\etc\hadoop;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\common\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\common\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\hdfs;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\hdfs\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\hdfs\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\yarn;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\yarn\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\yarn\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\mapreduce\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\mapreduce\*:/cygdrive/e/lion/Hadoop/apache-hive-3.1.2-bin/lib/*.jar

You also mentioned the function schemaTool(). I am unable to find that function in my hive.sh script, so I am not sure where it is located.

The error I am getting after I submit the command is:

$HIVE_HOME/bin/schematool -dbType derby -initSchema

Error: Could not find or load main class Lion

And I am setting up all the environment variables in my Cygwin session before executing the schematool command.

Please let me know if you need any further details.

Thanks!


Raymond #368 (4 years ago)

Hi Renganathan,

For the environment variable setup, did you add all of them into ~/.bashrc and run the command 'source ~/.bashrc'?

export HADOOP_HOME='/cygdrive/f/big-data/hadoop-3.3.0'
export PATH=$PATH:$HADOOP_HOME/bin
export HIVE_HOME='/cygdrive/f/big-data/apache-hive-3.1.2-bin'
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_CLASSPATH=$(hadoop classpath)
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*.jar
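
Once sourced, a quick sanity check like the following should confirm everything resolves (a sketch; the expected values assume the example paths above):

source ~/.bashrc
echo $HADOOP_HOME   # expect /cygdrive/f/big-data/hadoop-3.3.0
echo $HIVE_HOME     # expect /cygdrive/f/big-data/apache-hive-3.1.2-bin
which hadoop        # should resolve to a path under $HADOOP_HOME/bin
hadoop version      # should report Hadoop 3.3.0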

Can you also run the following command in Cygwin and paste the output here?

echo $HADOOP_CLASSPATH

And can you also paste your full error details here, if that is OK?

The schema tool requires the following Java class to be present, and its JAR file needs to be on the Java classpath:

schemaTool() {
  HIVE_OPTS=''
  CLASS=org.apache.hive.beeline.HiveSchemaTool
  execHiveCmd $CLASS "$@"
}

schemaTool_help () {
  HIVE_OPTS=''
  CLASS=org.apache.hive.beeline.HiveSchemaTool
  execHiveCmd $CLASS "--help"
}
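
To confirm that class is actually present on disk, a search like this can help (a sketch; in Hive 3.1.2 the class normally ships in the hive-beeline JAR, and the unzip package must be installed in Cygwin):

# Search every Hive JAR for the HiveSchemaTool class file:
for j in $HIVE_HOME/lib/*.jar; do
  unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hive/beeline/HiveSchemaTool' && echo "found in $j"
done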


Renganathan Mutthiah #367 (4 years ago)

Hi Raymond,

Thanks for looking into this. I have checked that I followed all the steps. I do have the Derby JAR files under apache-hive-3.1.2-bin\lib. I use Cygwin to set up the environment variables and run the schematool in it.

I started debugging the shell scripts. The class-not-found error occurs in hive.ksh (located in the bin folder) at the last line --> $TORUN "$@"

It seems that TORUN resolves to schemaTool (capital 'T'). 

I am not sure how to proceed further to identify the root cause of the issue.
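
One way to proceed might be to trace the script to see the final expanded command (a sketch, assuming a bash-compatible shell; the service name follows the capitalization observed above):

bash -x $HIVE_HOME/bin/hive --service schemaTool -dbType derby -initSchema 2>&1 | tail -n 40

The -x trace prints each command as it is expanded, including the final launcher invocation, which should show where the user name leaks into the command line.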

Thanks!


Raymond #366 (4 years ago)

Hi, did you follow all the steps exactly? Since the Command Prompt scripts are not available for the latest Hive, we have to use Cygwin to run the commands, so all the steps I included in the guide are critical. It looks like the script cannot find your Derby JAR files because of the Java classpath (the Hadoop/Hive environment variables are not set up correctly). The folder where you run the schema init script also matters when using Derby, since it is a file-based database.
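
For example, both points can be checked quickly (a sketch; the paths assume the layout from the guide):

ls $HIVE_HOME/lib/derby*.jar   # the Derby JARs the classpath must pick up
pwd                            # run the schema init from here; Derby writes its files to this directory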

Considering Derby does not handle concurrent Hive connections well, I suggest using a remote metastore such as SQL Server: Configure a SQL Server Database as Remote Hive Metastore

Alternatively, to save yourself all the trouble, I highly recommend following my newly published article Apache Hive 3.1.2 Installation on Linux Guide. That article configures Hadoop 3.3.0 and Hive 3.1.2 in a WSL environment on Windows 10, and it also includes steps to install MySQL as a remote metastore for your Hive data warehouse.

Let me know if you encounter errors in WSL.

-Raymond


Renganathan Mutthiah #365 (4 years ago)

Hi, thanks for the article.

When I run the command "$HIVE_HOME/bin/schematool -dbType derby -initSchema", I am getting the error: Could not find or load main class ???. (??? is my user name.)

Can you please help me resolve it?

Raymond #356 (4 years ago)

The sequence is fine. Hadoop needs to be installed before you install Hive, as Hive utilizes HDFS as the data store in an on-premises cluster.
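
As a typical pre-flight check before initializing Hive (a sketch; the warehouse path is Hive's conventional default rather than something specific to this guide):

hdfs dfsadmin -report                          # confirm HDFS is up and has capacity
hadoop fs -mkdir -p /tmp /user/hive/warehouse  # conventional Hive directories
hadoop fs -chmod g+w /tmp /user/hive/warehouse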


Ankit Tiwari #355 (4 years ago)

Hi Raymond,

Thank you so much for your help.

I will first try to find the issue on my own. If that doesn't work, I will send an email to the enquiry mailbox.

I installed Hadoop 3.3.0 on my system and then directly tried to install Hive 3.1.2.

Could you please confirm whether that sequence is fine, or do I need to install something between Hadoop and Hive?

Thanks again :)

Regards,

Ankit

Quoting Raymond (4 years ago):

For some of my previous configurations, the logs were not printed out either.

Can you please check whether a folder named metastore_db has been created? It should be OK if the folder exists and has content.
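
A quick check from the directory where you ran schematool (a sketch; in embedded mode Derby creates the folder in the current working directory):

ls -d metastore_db      # the folder should exist here
ls metastore_db | head  # and contain Derby's data files (seg0, service.properties, ...)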

To make everything easy, I would suggest installing SQL Server Express or Developer edition and then configuring your Hive metastore to use SQL Server:

https://kontext.tech/column/hadoop/302/configure-a-sql-server-database-as-remote-hive-metastore 

If it still doesn't work, please send an email to enquiry[at]kontext.tech and I can arrange a Teams meeting with you to debug. 
