Read Data from Hive in Spark 1.x and 2.x
Spark 2.x
From Spark 2.0 onwards, you can use the SparkSession builder to enable Hive support directly.
The following Python example shows how to do it.
```python
from pyspark.sql import SparkSession

appName = "PySpark Hive Example"
master = "local"

# Create a Spark session with Hive support enabled.
spark = SparkSession.builder \
    .appName(appName) \
    .master(master) \
    .enableHiveSupport() \
    .getOrCreate()

# Read data using SQL
df = spark.sql("show databases")
df.show()
```
Spark 1.x
In Spark 1.x, you need to use a HiveContext to connect to Hive and manipulate data in Hive databases.
To initialize a HiveContext, you first need to create a SparkContext.
```python
from pyspark import SparkContext, SparkConf
from pyspark.sql import HiveContext

appName = "PySpark Hive Example"
master = "local"

conf = SparkConf().setAppName(appName).setMaster(master)
sc = SparkContext(conf=conf)

# Construct a HiveContext object from the SparkContext
sqlContext = HiveContext(sc)

# Read data using SQL
df = sqlContext.sql("show databases")
df.show()
```
Last modified by Raymond 5 years ago