Configure a SQL Server Database as Remote Hive Metastore

Raymond Raymond event 2019-04-14 visibility 7,798 comment 4
more_vert

In one of my previous post, I showed how to configure Apache Hive 3.0.0 in Windows 10.

Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

I didn’t configure Hive metastore thus by default Hive will use embedded mode for metastore. The metadata is stored in Apache Derby database.

However only one user can be active at a time thus won’t be suitable for scenarios that require multiple sessions.  In this post, I’m going to show you how to configure a SQL Server database as a remote metastore for HiveServer2.

Prerequisites

Please configure Hive 3.0.0 first by following the above post.

Create a metastore database in SQL Server

Run the following commands to create a database and also login and user for Hive:

create database hive_metastore;
go
create login hive with password='hive';
go 
use hive_metastore
go
create user hive for login hive;
go
alter role db_owner add member hive;
go

The above script will also add user hive into role db_owner.

Download SQL Server JDBC driver

The binary doesn’t include JDBC driver for SQL Server. Thus we need to install it ourselves.

Download SQL Server JDBC driver from Microsoft website:

Download Microsoft JDBC Driver for SQL Server

Copy the suitable JAR file (mssql-jdbc-*.jre8.jar) into $HIVE_HOME/lib folder.

For Ubuntu or other Linux users, use the following command to download:

wget https://github.com/microsoft/mssql-jdbc/releases/download/v7.2.2/mssql-jdbc-7.2.2.jre8.jar
cp mssql-jdbc-7.2.2.jre8.jar $HIVE_HOME/lib

* Replace the URL to your own download URL.

*auth folder is only required if you want to use Windows Authentication in Windows systems.

Configure Hive file hive-site.xml

Edit hive-default.xml file (or create one if not existing) in folder $HIVE_HOME/conf.

* You can create the configuration file by using template file hive-default.xml.template.

cp $HIVE_HOME/conf/hive-default.xml.template $HIVE_HOME/conf/hive-site.xml

The above command needs to run in Cygwin terminal.

We need configure the following properties accordingly.

  • javax.jdo.option.ConnectionDriverName
  • javax.jdo.option.ConnectionURL
  • javax.jdo.option.ConnectionUserName
  • javax.jdo.option.ConnectionPassword
  • hive.metastore.uris
  • hive.metastore.db.type

For my environment, they are configured as the following:

<property>
   <name>javax.jdo.option.ConnectionDriverName</name>
   <value>com.microsoft.sqlserver.jdbc.SQLServerDriver</value>
   <description>Driver class name for a JDBC metastore</description> </property>
<property>
   <name>javax.jdo.option.ConnectionURL</name>
   <value>jdbc:sqlserver://localhost:1433;DatabaseName=hive_metastore;</value>
   <description>
     JDBC connect string for a JDBC metastore.
     To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
     For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
   </description> </property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>Username to use against metastore database</description>
  </property>
<property>
   <name>javax.jdo.option.ConnectionPassword</name>
   <value>hive</value>
   <description>password to use against metastore database</description> </property>
<property>
   <name>hive.metastore.uris</name>
   <value>thrift://127.0.0.1:9083</value>
   <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description> </property>
<property>
   <name>hive.metastore.db.type</name>
   <value>mssql</value>
   <description>
     Expects one of [derby, oracle, mysql, mssql, postgres].
     Type of database used by the metastore. Information schema &amp; JDBCStorageHandler depend on it.
   </description> </property>

Initialize database structure

Run the following command in Cygwin:

$HIVE_HOME/bin/schematool -dbType mssql -initSchema

If your code runs successfully, the following output will be generated:

Metastore connection URL:        jdbc:sqlserver://localhost:1433;DatabaseName=hive_metastore;
Metastore Connection Driver :    com.microsoft.sqlserver.jdbc.SQLServerDriver
Metastore connection User:       hive
Starting metastore schema initialization to 3.0.0
Initialization script hive-schema-3.0.0.mssql.sql
Initialization script completed schemaTool completed

You can also verify the schema in database:

image

Start Hive metastore service

$HIVE_HOME/bin/hive --service metastore

And then you can use either beeline or hive CLI to interact with Hive databases:

$HIVE_HOME/bin/beeline -u jdbc:hive2:// 

Start HiveServer2 service

$HIVE_HOME/bin/hive --service hiveserver2 start

Have fun!

More from Kontext
comment Comments
SK Sathish Kumar R

Sathish Kumar access_time 3 years ago link more_vert

Hello sir,

I have this error 

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/sathish/hadoop/apache-hive-3.1.1-bin/lib/slf4j-reload4j-1.7.35.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/home/sathish/hadoop/apache-hive-3.1.1-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/sathish/hadoop/hadoop-3.2.4/share/hadoop/common/lib/slf4j-reload4j-1.7.35.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V

        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)

        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)

        at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)

        at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)

        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:448)

        at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5135)

        at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5098)        at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)

        at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

        at java.lang.reflect.Method.invoke(Method.java:498)

        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)

        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

I have even deleted those jar files and only had jar file corresponding /home/sathish/hadoop/apache-hive-3.1.1-bin/lib/log4j-slf4j-impl-2.10.0.jar to this location though i get java.lang.NoSuchMethodError 

Can you please help me?

Raymond Raymond

Raymond access_time 3 years ago link more_vert

Hi Sathish,

Welcome to Kontext! The jar file you mentioned is just the warning messages of logging methods. The actual error occurred in another package: com.google.common.base.Preconditions.checkArgument.

I believe you already find out the resolution from post: https://kontext.tech/thread/415/hive-exception-in-thread-main-javalangnosuchmethoderror-comgooglecommon

If not, let me know.

Cheers,

Raymond

SK Sathish Kumar R

Sathish Kumar access_time 3 years ago link more_vert

Thankyou sir it was helpful like I found ur solution the moment I posted my comment😂 that helped me thank you 

Raymond Raymond

Raymond access_time 3 years ago link more_vert

You are welcome. I am glad it is helpful. 

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts