Scala: Read JSON file as Spark DataFrame


The article Scala: Parse JSON String as Spark DataFrame shows how to convert an in-memory JSON string to a Spark DataFrame. This article shows how to read directly from a JSON file, which is even simpler.

Read from local JSON file

The following code snippet reads from a local JSON file named test.json.

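The file contains a single JSON array that spans several lines. By default Spark expects one JSON object per line, so the multiLine option is set to true when reading it. The content of the JSON file is: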

[{"ID":1,"ATTR1":"ABC"},
{"ID":2,"ATTR1":"DEF"},
{"ID":3,"ATTR1":"GHI"}]

Code snippet

scala> spark.read.format("json").option("multiLine","true").load("file:///F:\\big-data/test.json").show()
+-----+---+
|ATTR1| ID|
+-----+---+
|  ABC|  1|
|  DEF|  2|
|  GHI|  3|
+-----+---+
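
In the output above the columns appear as ATTR1, ID because Spark infers the schema and sorts the inferred fields by name. If a fixed column order and fixed types are preferred, an explicit schema can be supplied. The following is a minimal sketch rather than part of the original snippet: it writes the same sample content to a temporary file (an assumption, so that it does not depend on the F: drive), and the names session, samplePath and schema are illustrative. It is meant to be pasted into spark-shell using :paste.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Reuses the active session in spark-shell; otherwise starts a local one.
val session = SparkSession.builder().master("local[*]").appName("read-json").getOrCreate()

// Write the sample JSON array to a temporary file (assumption: stands in for test.json).
val sampleJson = """[{"ID":1,"ATTR1":"ABC"},
{"ID":2,"ATTR1":"DEF"},
{"ID":3,"ATTR1":"GHI"}]"""
val samplePath = Files.createTempFile("test", ".json")
Files.write(samplePath, sampleJson.getBytes(StandardCharsets.UTF_8))

// Explicit schema: skips inference and fixes column order and types.
val schema = StructType(Seq(
  StructField("ID", IntegerType, nullable = true),
  StructField("ATTR1", StringType, nullable = true)
))

val df = session.read
  .schema(schema)
  .option("multiLine", "true") // the file is one JSON array spread over several lines
  .json(samplePath.toUri.toString)

df.printSchema()
df.show()
```

With the schema supplied, the columns come back in the declared order (ID, ATTR1) and ID is read as an integer instead of the inferred long type.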

Read from HDFS JSON file

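The following code snippet reads from a path in HDFS (/big-data/test.json). Because the path has no scheme, Spark resolves it against the default file system (fs.defaultFS), which is assumed here to be HDFS: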

scala> spark.read.format("json").option("multiLine","true").load("/big-data/test.json").show()
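
If the default file system is not HDFS, or the file lives on a different cluster, the path can be fully qualified with the hdfs:// scheme. The sketch below only illustrates that idea: the NameNode address hdfs://localhost:9000 is a placeholder assumption that must be replaced with the real one, and readJsonArray is a hypothetical helper name, not a Spark API.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical NameNode address; replace with your cluster's fs.defaultFS value.
val nameNode = "hdfs://localhost:9000"
val hdfsPath = s"$nameNode/big-data/test.json"

// Hypothetical helper: reads a multi-line JSON array file from the given path.
def readJsonArray(spark: SparkSession, path: String): DataFrame =
  spark.read.format("json").option("multiLine", "true").load(path)

// Usage in spark-shell, where `spark` is already defined:
// readJsonArray(spark, hdfsPath).show()
```

The same helper also accepts a file:/// URI, so local and HDFS reads differ only in the path that is passed in.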

Last modified by Raymond 2 years ago.
