Apache Hive 3.1.2 Installation on Windows 10
Renganathan (4 years ago)
I have just tried with Hadoop 3.3.0 and I am getting the same error when I use the schematool. All the Hadoop processes are running fine (they were running fine even in 2.9.1). Somewhere the Hive setup is picking up part of my user name (the second word is picked; the first and second words are separated by a space) and a class-not-found error is thrown.
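This "partial user name" symptom is consistent with an unquoted shell variable that contains a space: the shell splits it into two words, and java then treats the second word as the main class. A minimal sketch (the profile path with a space is hypothetical):

```shell
#!/bin/sh
# Hypothetical classpath under a Windows profile whose name contains a space.
CP="/cygdrive/c/Users/Big Lion/hive/lib/*"

# Unquoted, $CP splits into two words; java would see "Lion/hive/lib/*" as
# the class to run and fail with "Could not find or load main class Lion...".
set -- java -cp $CP org.apache.hive.beeline.HiveSchemaTool
echo "unquoted: $# arguments"   # the classpath became two separate arguments

# Quoted, the classpath stays a single argument.
set -- java -cp "$CP" org.apache.hive.beeline.HiveSchemaTool
echo "quoted: $# arguments"
```

The same splitting happens inside any launcher script that expands a classpath variable without quotes, which is why a space anywhere in the installation path is risky.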
Your Hadoop version is 2.9.1, while the tutorial was tested with Hadoop 3.3.0. Can you please use the same Hadoop version? Libraries from different versions may conflict with each other, and Hive 3.1.2 works with Hadoop 3.x.y but not Hadoop 2.x.
Renganathan (4 years ago)
Sure, thanks for your help Raymond!
Below is the output of the classpath you requested.
$ echo $HADOOP_CLASSPATH
E:\lion\Hadoop\hadoop-2.9.1\contrib\capacity-scheduler\*.jar;E:\Lion\Hadoop\hadoop-2.9.1\etc\hadoop;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\common\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\common\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\hdfs;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\hdfs\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\hdfs\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\yarn;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\yarn\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\yarn\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\mapreduce\lib\*;E:\Lion\Hadoop\hadoop-2.9.1\share\hadoop\mapreduce\*:/cygdrive/e/lion/Hadoop/apache-hive-3.1.2-bin/lib/*.jar
You also mentioned the function schemaTool(). I am unable to find that function in my hive.sh script; not sure where it is located.
The error I am getting after I submit the command is:
$HIVE_HOME/bin/schematool -dbType derby -initSchema
Error: Could not find or load main class Lion
I am setting up all the environment variables in Cygwin before executing the schematool command.
Please let me know if you need any further details.
Thanks!
Raymond (4 years ago)
Hi Renganathan,
For the environment variable setup, did you add all of the following to ~/.bashrc and then run 'source ~/.bashrc'?
export HADOOP_HOME='/cygdrive/f/big-data/hadoop-3.3.0'
export PATH=$PATH:$HADOOP_HOME/bin
export HIVE_HOME='/cygdrive/f/big-data/apache-hive-3.1.2-bin'
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_CLASSPATH=$(hadoop classpath)
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*.jar
Can you also run the following command line in cygwin and paste the output here?
echo $HADOOP_CLASSPATH
And also can you paste all your detailed error here if it is ok?
The schema tool requires the following Java class to be present, and its JAR file needs to be on the Java classpath.
schemaTool() {
  HIVE_OPTS=''
  CLASS=org.apache.hive.beeline.HiveSchemaTool
  execHiveCmd $CLASS "$@"
}

schemaTool_help () {
  HIVE_OPTS=''
  CLASS=org.apache.hive.beeline.HiveSchemaTool
  execHiveCmd $CLASS "--help"
}
Renganathan (4 years ago)
Hi Raymond,
Thanks for looking into this. I believe I have followed all the steps. I do have the Derby JAR files under apache-hive-3.1.2-bin\lib. I use Cygwin to set up the environment variables and run the schematool in it.
I started debugging the shell scripts. The class-not-found error occurs in hive.ksh (located in the bin folder) at the last line: $TORUN "$@"
It seems that TORUN resolves to schemaTool (capital 'T').
I am not sure how to proceed further to identify the root cause of the issue.
Thanks!
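The dispatch described above can be sketched as a toy version (a hypothetical simplification; in Hive the actual functions live in bin/ext/*.sh and are invoked from the main launcher script):

```shell
#!/bin/sh
# Toy sketch of Hive's launcher dispatch: each service name maps to a shell
# function, and the launcher finally invokes it indirectly via $TORUN "$@".
schemaTool() {
  # The real function sets CLASS=org.apache.hive.beeline.HiveSchemaTool and
  # hands it to execHiveCmd; here we just echo what would be executed.
  echo "would exec: java org.apache.hive.beeline.HiveSchemaTool $*"
}

SERVICE=schematool   # what the user asked for on the command line
TORUN=schemaTool     # note the capital 'T': it must match the function name
$TORUN -dbType derby -initSchema
```

If $TORUN ever resolved to something that is not a defined function (for example, a stray word split off a path), the shell would try to run that word as a command, which is one way "Could not find or load main class ..." style failures surface from the java layer below.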
Raymond (4 years ago)
Hi, did you follow all the steps exactly? Since we have to use Cygwin to run the commands (the Command Prompt script version is not available for the latest Hive), all the steps I included in the guide are critical. It looks like the script cannot find your Derby JAR files due to the Java classpath (the Hadoop/Hive environment variables are not set up correctly). The folder where you run the schema init script also matters when using Derby, since it is a file-based database.
Considering Derby does not handle concurrent Hive connections well, I suggest using a remote metastore such as SQL Server: Configure a SQL Server Database as Remote Hive Metastore.
Alternatively, to save all the trouble, I highly recommend following my newly published article Apache Hive 3.1.2 Installation on Linux Guide. That article configures Hadoop 3.3.0 and Hive 3.1.2 in a WSL environment on Windows 10. It also includes steps to install MySQL as the remote metastore for your Hive data warehouse.
Let me know if you encounter errors in WSL.
-Raymond
Renganathan (4 years ago)
Hi, thanks for the article.
When I run the command "$HIVE_HOME/bin/schematool -dbType derby -initSchema", I am getting the error: Error: Could not find or load main class ??? (??? is my user name).
Can you please help me resolve it?
The sequence is fine. Hadoop needs to be installed before Hive, as Hive uses HDFS for data storage in an on-premises cluster.
Ankit (4 years ago)
Hi Raymond,
Thank you so much for your help.
I will first try to find out the issue on my own. If that doesn't work, I will send an email to the enquiry mailbox.
I installed Hadoop 3.3.0 on my system and then directly tried to install Hive 3.1.2.
Could you please confirm whether that sequence is fine, or whether I need to install something between Hadoop and Hive?
Thanks again :)
Regards,
Ankit
Raymond (4 years ago)
For some of my previous configurations, the logs were not printed out either.
Can you please check whether a folder named metastore_db has been created? It should be OK if the folder exists and has content.
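A quick check could look like the sketch below (the messages are just illustrative). Note it must be run from the same directory where schematool was invoked, since Derby creates metastore_db relative to the working directory:

```shell
#!/bin/sh
# Derby creates metastore_db in the current working directory, so run this
# from the directory where schematool was executed.
if [ -d metastore_db ] && [ -n "$(ls -A metastore_db 2>/dev/null)" ]; then
  echo "metastore_db exists and has content"
else
  echo "metastore_db missing or empty - rerun schematool from this directory"
fi
```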
To make everything easy, I would suggest installing SQL Server Express or Developer edition and then configuring your Hive metastore to use SQL Server:
https://kontext.tech/column/hadoop/302/configure-a-sql-server-database-as-remote-hive-metastore
If it still doesn't work, please send an email to enquiry[at]kontext.tech and I can arrange a Teams meeting with you to debug.
Hi Renganathan,
Can you please try the following actions?
*Remember to change the paths to your own.
ls /cygdrive/f/big-data/hadoop-3.3.0
ls /cygdrive/f/big-data/apache-hive-3.1.2-bin
Examine all your Hadoop paths and configurations to make sure there is no space in any path, including the DFS paths in the Hadoop configuration files.
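One way to spot such entries, assuming HADOOP_CLASSPATH is already exported (the warning text is just illustrative):

```shell
#!/bin/sh
# Split the classpath on both ';' and ':' (this setup mixes Windows-style and
# Cygwin-style separators) and flag any entry that contains a space.
if echo "$HADOOP_CLASSPATH" | tr ';:' '\n\n' | grep ' '; then
  echo "WARNING: the entries above contain spaces"
else
  echo "no spaces found in classpath entries"
fi
```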
If you follow all my steps for the Hadoop 3.3.0 and Hive 3.1.2 setup exactly, there should be no issues; I've tested it.
BTW, to answer one of your previous questions, the script is located at apache-hive-3.1.2-bin\bin\ext\schemaTool.sh.
-Raymond