Read Data from Hive in Spark 1.x and 2.x


Spark 2.x

From Spark 2.0, you can use the Spark session builder to enable Hive support directly.

The following example (Python) shows how to implement it.

from pyspark.sql import SparkSession

appName = "PySpark Hive Example"
master = "local"

# Create a Spark session with Hive support enabled.
spark = SparkSession.builder \
    .appName(appName) \
    .master(master) \
    .enableHiveSupport() \
    .getOrCreate()

# Read data using SQL (here: list the Hive databases)
df = spark.sql("show databases")
df.show()
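
Once a session with Hive support exists, reading an actual table is just another SQL statement. The following is a minimal sketch, not part of the original example: read_hive_table is a hypothetical helper, and the database and table names passed to it are placeholders you would replace with your own. It only relies on the object having a sql() method, so it accepts the spark session above as well as a Spark 1.x HiveContext.

```python
def read_hive_table(spark, database, table, limit=None):
    """Return a DataFrame with the rows of database.table.

    Hypothetical helper: `spark` is any object exposing .sql(),
    such as a SparkSession or a HiveContext.
    """
    # Backticks quote the identifiers in Spark SQL.
    query = "SELECT * FROM `{}`.`{}`".format(database, table)
    if limit is not None:
        # int() guards against anything other than a number ending up in the SQL text.
        query += " LIMIT {}".format(int(limit))
    return spark.sql(query)
```

For example, read_hive_table(spark, "default", "my_table", limit=10).show() would print the first ten rows, assuming a table named my_table exists in the default database.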

Spark 1.x

In Spark 1.x, you need to use HiveContext to connect to Hive and work with data in Hive databases.

To initialize a HiveContext, you first need to create a SparkContext.

from pyspark import SparkContext, SparkConf, HiveContext

appName = "PySpark Hive Example"
master = "local"
conf = SparkConf().setAppName(appName).setMaster(master)
sc = SparkContext(conf=conf)

# Construct a HiveContext object
sqlContext = HiveContext(sc)

# Read data using SQL (here: list the Hive databases)
df = sqlContext.sql("show databases")
df.show()
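
If the same script has to run on both generations of Spark, the choice between the two entry points comes down to the major version number. The sketch below illustrates that rule only; hive_entry_point is a hypothetical helper, and the version strings are sample inputs rather than values taken from a real cluster.

```python
def hive_entry_point(spark_version):
    """Return the name of the Hive entry point class for a Spark version string.

    Hypothetical helper: Spark 2.0 and later use SparkSession,
    Spark 1.x uses HiveContext.
    """
    major = int(spark_version.split(".")[0])
    return "SparkSession" if major >= 2 else "HiveContext"


print(hive_entry_point("1.6.3"))  # prints HiveContext
print(hive_entry_point("2.4.0"))  # prints SparkSession
```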

Last modified by Raymond 2 years ago.