Spark 3.0.1: Connect to HBase 2.4.1
Comments
#1559 Re: Spark 3.0.1: Connect to HBase 2.4.1
Hi cansın,
What is your HBase version? Also, can you share the full path to your spark-hbase connector jar file? For example, in this article I am using:
spark-shell --jars ~/hbase-connectors/spark/hbase-spark/target/hbase-spark-1.0.1-SNAPSHOT.jar
#1558 Re: Spark 3.0.1: Connect to HBase 2.4.1 (cansın, 5 months ago)
Hi Raymond, thanks for the article.
I have managed to build my own jar and start the shell with the following command:
spark-shell --jars hbase-connectors/spark/hbase-spark/target/hbase-spark-1.0.1-SNAPSHOT.jar
but when I run my imports I get the following error:
scala> import org.apache.hadoop.hbase.spark.HBaseContext
import org.apache.hadoop.hbase.spark.HBaseContext
scala> import org.apache.hadoop.hbase.HBaseConfiguration
<console>:24: error: object HBaseConfiguration is not a member of package org.apache.hadoop.hbase
import org.apache.hadoop.hbase.HBaseConfiguration
Do you have any idea what might be wrong?
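A likely cause of this particular error: `HBaseConfiguration` lives in the hbase-common jar, not in the connector jar, so passing only hbase-spark-1.0.1-SNAPSHOT.jar to `--jars` leaves the HBase client classes off the classpath. A minimal sketch of building a fuller jar list, where the `/opt/hbase` path and the 2.4.1 jar names are assumptions to adjust to your own installation:

```shell
# Sketch only: HBASE_HOME and the jar version numbers are assumptions;
# adjust them to your installation. The point is that hbase-common
# (which provides HBaseConfiguration) must be on the --jars list too.
HBASE_HOME=${HBASE_HOME:-/opt/hbase}
CONNECTOR_JAR=hbase-connectors/spark/hbase-spark/target/hbase-spark-1.0.1-SNAPSHOT.jar
JARS="$CONNECTOR_JAR,$HBASE_HOME/lib/hbase-common-2.4.1.jar,$HBASE_HOME/lib/hbase-client-2.4.1.jar"
echo spark-shell --jars "$JARS"
```

Running the echoed command then starts spark-shell with both the connector and the HBase client classes available.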
#1548 Re: Spark 3.0.1: Connect to HBase 2.4.1
Please reach out via the Contact us page and we will try to arrange a Teams session for you.
#1541 Re: Spark 3.0.1: Connect to HBase 2.4.1 (Pavan Kumar, 7 months ago)
Yes, they are all in the current directory. Can we connect, if possible?
#1540 Re: Spark 3.0.1: Connect to HBase 2.4.1 (Raymond, 7 months ago)
Are all of those jars in the current directory where you started spark-shell?
Alternatively, you can put them into the jars directory of your Spark installation.
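The "copy into Spark's jars directory" approach can be sketched as follows. The directories below are throwaway demo paths standing in for real installations; in practice SPARK_HOME and HBASE_HOME point at your actual Spark and HBase trees:

```shell
# Demo with placeholder directories; in practice SPARK_HOME and
# HBASE_HOME point at your real Spark and HBase installations.
SPARK_HOME=/tmp/demo-spark
HBASE_HOME=/tmp/demo-hbase
mkdir -p "$SPARK_HOME/jars" "$HBASE_HOME/lib"
touch "$HBASE_HOME/lib/hbase-common-2.4.1.jar" "$HBASE_HOME/lib/hbase-client-2.4.1.jar"
# The glob copies hbase-common, hbase-client, and friends in one go.
cp "$HBASE_HOME"/lib/hbase-*.jar "$SPARK_HOME/jars/"
ls "$SPARK_HOME/jars"
```

Restart spark-shell afterwards so the newly added jars are picked up; jars in this directory are on the classpath of every session, with no `--jars` flag needed.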
#1537 Re: Spark 3.0.1: Connect to HBase 2.4.1 (Pavan Kumar, 7 months ago)
I think there is no issue with the build, but I am unable to connect to HBase from Spark. I am using a Docker environment where ZooKeeper, HDFS, Spark, and HBase run in different containers on the same network.
Here are the jars I am using:
spark-shell --jars hbase-spark-protocol-shaded-1.0.0.7.2.12.0-291.jar,htrace-core4-4.2.0-incubating.jar,hbase-shaded-protobuf-3.5.1.jar,protobuf-java-2.5.0.jar,hbase-protocol-2.4.8.jar,hbase-shaded-miscellaneous-3.5.1.jar,hbase-mapreduce-2.4.8.jar,hbase-server-2.4.8.jar,hbase-client-2.4.8.jar,hbase-common-2.4.8.jar,hbase-spark-1.0.1-SNAPSHOT.jar,hadoop-common-2.8.5.jar --files hbase-site.xml
I have almost all the required jars but am still seeing the error below. I tried my best to debug the issue but could not get rid of it. Please advise me on how to resolve this, or point me to any detailed documentation about the prerequisites.
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/shaded/protobuf/generated/MasterProtos$MasterService$BlockingInterface
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
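For a NoClassDefFoundError like the one above, a generic debugging technique is to search the jars on your classpath for the missing class and see which jar, if any, actually ships it. Because zip archives store entry names as plain text, an ordinary grep over each jar is enough. A sketch, where the `/tmp/demo-jars` directory and the class name searched for are placeholders for your own jar directory and missing class:

```shell
# Search every jar in a directory for a class name. Jar entry names are
# stored as plain text inside the zip, so a binary grep can tell which
# jar carries the class. Directory and class name here are examples.
for j in /tmp/demo-jars/*.jar; do
  if grep -q 'MasterProtos' "$j" 2>/dev/null; then
    echo "found in: $j"
  fi
done
```

If no jar reports the class, the jar providing it is missing from the `--jars` list; if it turns up only under a different (shaded vs. unshaded) package path, the jar versions are mixed and a rebuild against matching versions is needed.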
#1536 Re: Spark 3.0.1: Connect to HBase 2.4.1 (Raymond, 7 months ago)
#1535 Re: Spark 3.0.1: Connect to HBase 2.4.1 (Pavan Kumar, 7 months ago)
Awesome!
That worked. Thanks again for your help, Raymond.
#1534 Re: Spark 3.0.1: Connect to HBase 2.4.1 (Raymond, 7 months ago)
Hi Pavan,
The issue you encountered is the one I mentioned in the article: an incompatible combination of HBase version and connector code.
For the HBase version, I had to use 2.2.4, as the latest hbase-connectors code was based on that version.
So please try the following command:
mvn -Dspark.version=3.1.1 -Dscala.version=2.12.10 -Dscala.binary.version=2.12 -Dhbase.version=2.2.4 -Dhadoop.profile=3.0 -Dhadoop-three.version=3.2.1 -DskipTests -Dcheckstyle.skip -U clean package
The built package should still work with HBase 2.4.7.
Regards,
Raymond
Re: Spark 3.0.1: Connect to HBase 2.4.1 (Pavan Kumar, 7 months ago)
Thanks for pointing that out, @Raymond. My Hadoop, Spark, Scala, and HBase versions are 3.2.1, 3.1.1, 2.12, and 2.4.7 respectively.
Maven build:
mvn -Dspark.version=3.1.1 -Dscala.version=2.12.10 -Dscala.binary.version=2.12 -Dhbase.version=2.4.7 -Dhadoop.profile=3.0 -Dhadoop-three.version=3.2.1 -DskipTests -Dcheckstyle.skip -U clean package
I upgraded Maven and that issue is resolved, but now I am seeing a compilation error:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project hbase-kafka-proxy: Compilation failure
[ERROR] /home/ec2-user/git/spark-hbase/hbase-connectors/kafka/hbase-kafka-proxy/src/main/java/org/apache/hadoop/hbase/kafka/KafkaTableForBridge.java:[53,8] org.apache.hadoop.hbase.kafka.KafkaTableForBridge is not abstract and does not override abstract method getRegionLocator() in org.apache.hadoop.hbase.client.Table
I would be grateful if you could point me to what I need to learn to resolve such issues.
Thank you so much for your help.
#1561 Re: Spark 3.0.1: Connect to HBase 2.4.1
Just following up on this one, as I didn't hear back from you. Have you resolved the problem?