
Error: Failed to load class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver

2020-12-27

Solution 1 - Install Spark with Hadoop built-in
Solution 2 - Download the missing JAR file manually

When installing vanilla Spark on Windows or Linux, you may encounter the following error when invoking the spark-sql command:

Error: Failed to load class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver

This error usually occurs when you install a Spark version without built-in Hadoop libraries (headless version), because the Spark Hive and Hive Thrift Server packages are not included in that build.
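To confirm that this is the cause, you can check whether the two Hive-related JAR files discussed in this article exist in the Spark jars folder. The following is a minimal sketch: check_hive_jars is a hypothetical helper name, and it assumes the JAR files live under $SPARK_HOME/jars unless another folder is passed in.

```shell
# Hypothetical helper: report whether a Spark jars folder contains the
# spark-hive and spark-hive-thriftserver JAR files.
# Usage: check_hive_jars [jars_dir]   (defaults to $SPARK_HOME/jars)
check_hive_jars() {
  jars_dir="${1:-$SPARK_HOME/jars}"
  rc=0
  for prefix in spark-hive-thriftserver_ spark-hive_; do
    # ls exits with a non-zero status when the pattern matches no file
    if ls "$jars_dir"/"$prefix"*.jar >/dev/null 2>&1; then
      echo "found: $prefix"
    else
      echo "missing: $prefix"
      rc=1
    fi
  done
  return "$rc"
}
```

Calling check_hive_jars without arguments inspects $SPARK_HOME/jars; a non-zero return status means at least one of the two JAR files is missing.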

There are two ways to fix this issue.

Solution 1 - Install Spark with Hadoop built-in

When downloading Spark, choose the version that has built-in Hadoop.

In the jars folder ($SPARK_HOME/jars), you should be able to find a JAR file named spark-hive-thriftserver_2.12-3.0.1.jar (the name for Spark 3.0.1 built with Scala 2.12).

You can then run the spark-sql command without errors.

Warning: Make sure the Hadoop version in the Spark binary package is consistent with your Hadoop version; otherwise you may encounter JAR file version conflicts. The dependent JAR package versions also need to be consistent with your Hive installation. When Spark is compiled with Hive enabled, there are three compile dependencies from Hive: hive-cli, hive-jdbc and hive-beeline. The default Spark 3.0.1 build is compiled against Hive 2.3.7 and can connect to Hive metastore versions up to 3.1.2.
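One way to compare versions is to read them from the JAR file names bundled with Spark. This is only a sketch: bundled_jar_versions is a hypothetical helper, and it assumes file names of the form name-version.jar (for example hadoop-common-3.2.0.jar), which may not hold for every JAR.

```shell
# Hypothetical helper: print the distinct versions of the bundled JAR files
# whose names start with the given prefix, e.g. "hadoop-common" or "hive-cli".
# Usage: bundled_jar_versions jars_dir prefix
bundled_jar_versions() {
  jars_dir="$1"
  prefix="$2"
  for f in "$jars_dir"/"$prefix"-*.jar; do
    [ -e "$f" ] || continue          # pattern matched nothing
    name=$(basename "$f" .jar)       # strip the folder and the .jar suffix
    echo "${name##*-}"               # keep the text after the last hyphen
  done | sort -u
}
```

For example, bundled_jar_versions "$SPARK_HOME/jars" hadoop-common can be compared with the output of hadoop version, and the same call with hive-cli can be compared with hive --version.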

Solution 2 - Download the missing JAR file manually

Warning: This approach is not fully verified yet.

Another approach is to download the missing packages manually. For example, for Spark 3.0.1 the missing JAR file is available on Maven Repository under org.apache.spark » spark-hive-thriftserver_2.12 » 3.0.1. If you are installing another version of Spark, download the package that matches it.
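For other versions, the download URL follows the same pattern as the Spark 3.0.1 links used in this article. The sketch below builds that URL from the artifact name, the Scala version and the Spark version, and reads both versions from the spark-core JAR file name. Both function names are hypothetical, and the sketch assumes the spark-core_<scala>-<spark>.jar naming is present in your jars folder.

```shell
# Hypothetical helper: print the Maven Central URL of a Spark artifact.
# Usage: spark_jar_url artifact scala_version spark_version
spark_jar_url() {
  artifact="$1"; scala="$2"; spark="$3"
  base="https://repo1.maven.org/maven2/org/apache/spark"
  echo "$base/${artifact}_${scala}/${spark}/${artifact}_${scala}-${spark}.jar"
}

# Hypothetical helper: print "<scala_version> <spark_version>" taken from the
# spark-core JAR file name, e.g. spark-core_2.12-3.0.1.jar.
# Usage: spark_versions [jars_dir]   (defaults to $SPARK_HOME/jars)
spark_versions() {
  jars_dir="${1:-$SPARK_HOME/jars}"
  for jar in "$jars_dir"/spark-core_*.jar; do
    [ -e "$jar" ] || continue        # pattern matched nothing
    name=$(basename "$jar" .jar)     # spark-core_2.12-3.0.1
    rest=${name#spark-core_}         # 2.12-3.0.1
    echo "${rest%%-*} ${rest#*-}"    # Scala version, then Spark version
    return 0
  done
  return 1
}

# Print the URLs of the two packages for Spark 3.0.1 built with Scala 2.12
spark_jar_url spark-hive-thriftserver 2.12 3.0.1
spark_jar_url spark-hive 2.12 3.0.1
```

Check that the printed URL exists on Maven Repository before downloading, since not every Scala and Spark combination is published.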

The following steps are for Spark 3.0.1.

Download the package:

```
wget https://repo1.maven.org/maven2/org/apache/spark/spark-hive-thriftserver_2.12/3.0.1/spark-hive-thriftserver_2.12-3.0.1.jar
```

Move the package to the $SPARK_HOME/jars folder:

```
mv spark-hive-thriftserver_2.12-3.0.1.jar "$SPARK_HOME/jars/"
```

Download another package, spark-hive_2.12-3.0.1.jar, as it is also required but missing in the headless version:
```
wget https://repo1.maven.org/maven2/org/apache/spark/spark-hive_2.12/3.0.1/spark-hive_2.12-3.0.1.jar
```

Move the package to the $SPARK_HOME/jars folder:

```
mv spark-hive_2.12-3.0.1.jar "$SPARK_HOME/jars/"
```

Now your spark-sql command should work properly.
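A quick way to confirm this is to run a trivial query through the command line. The sketch below wraps that in a hypothetical helper, smoke_test_spark_sql, and assumes spark-sql is on your PATH (for example via $SPARK_HOME/bin).

```shell
# Hypothetical helper: run a trivial query to confirm that spark-sql starts.
smoke_test_spark_sql() {
  if ! command -v spark-sql >/dev/null 2>&1; then
    echo "spark-sql not found on PATH"
    return 127
  fi
  # -e runs the given SQL statement and then exits
  spark-sql -e "SELECT 1"
}
```

If the class loading error is gone, the query result is printed instead of the "Failed to load class" message.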

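Warning: Make sure the Hive services that your Spark configuration points to are running before starting spark-sql. Note that spark-sql connects to the Hive metastore directly (for example through hive.metastore.uris in hive-site.xml) rather than through HiveServer2.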

Last modified by Raymond 5 years ago.
