Scala: Read JSON file as Spark DataFrame
The article Scala: Parse JSON String as Spark DataFrame shows how to convert an in-memory JSON string to a Spark DataFrame. This article shows how to read directly from a JSON file, which is even simpler.
Read from local JSON file
The following code snippet reads from a local JSON file named test.json.
The content of the JSON file is:
[{"ID":1,"ATTR1":"ABC"}, {"ID":2,"ATTR1":"DEF"}, {"ID":3,"ATTR1":"GHI"}]
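Code snippet (with multiLine set to true, Spark parses each file as a single JSON document that may span multiple lines, instead of treating every line as a separate record):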
scala> spark.read.format("json").option("multiLine","true").load("file:///F:\\big-data/test.json").show()
+-----+---+
|ATTR1| ID|
+-----+---+
|  ABC|  1|
|  DEF|  2|
|  GHI|  3|
+-----+---+
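Outside spark-shell there is no predefined spark session, so the same read has to be wrapped in a small application. The following is a minimal sketch rather than the article's original code: the object name ReadLocalJson, the helper readJson, the local[*] master, and the temporary file standing in for test.json are all assumptions made for illustration. It also passes an explicit schema, which skips schema inference and fixes the column order (inference sorts field names alphabetically, which is why ATTR1 appears before ID in the output above).

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object ReadLocalJson {
  // Explicit schema, assumed from the sample data: ID is numeric, ATTR1 is text.
  val schema: StructType = StructType(Seq(
    StructField("ID", LongType),
    StructField("ATTR1", StringType)
  ))

  // Reads a JSON file whose top-level value is an array of objects.
  def readJson(spark: SparkSession, path: String): DataFrame =
    spark.read
      .schema(schema)
      .option("multiLine", "true")
      .json(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ReadLocalJson") // hypothetical application name
      .master("local[*]")       // run locally; drop this when using spark-submit
      .getOrCreate()

    // Write the sample content to a temporary file that stands in for test.json.
    val sample = """[{"ID":1,"ATTR1":"ABC"}, {"ID":2,"ATTR1":"DEF"}, {"ID":3,"ATTR1":"GHI"}]"""
    val file = Files.createTempFile("test", ".json")
    Files.write(file, sample.getBytes(StandardCharsets.UTF_8))

    // toUri yields a file:// URI, matching the local-path style used above.
    readJson(spark, file.toUri.toString).show()

    spark.stop()
  }
}
```

With the explicit schema, the columns come back in the declared order (ID, then ATTR1) instead of the alphabetical order produced by inference.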
Read from HDFS JSON file
The following code snippet reads from a path in HDFS (/big-data/test.json):
scala> spark.read.format("json").option("multiLine","true").load("/big-data/test.json").show()
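The path in the snippet above has no scheme, so Spark resolves it against the default file system configured for the cluster (the Hadoop fs.defaultFS setting). When that default is not HDFS, or the file lives on another cluster, the path can be written as a full hdfs:// URI instead. A minimal sketch, assuming a hypothetical NameNode at localhost:9000 (replace it with your own address); the object name ReadHdfsJson and the hdfsUri helper are illustrative as well:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadHdfsJson {
  // Hypothetical NameNode address; use the value of fs.defaultFS for your cluster.
  val nameNode = "hdfs://localhost:9000"

  // Builds a fully qualified HDFS URI from an absolute HDFS path.
  def hdfsUri(path: String): String = s"$nameNode$path"

  // Same read as in the article, expressed with the json(...) shorthand.
  def readJson(spark: SparkSession, uri: String): DataFrame =
    spark.read.option("multiLine", "true").json(uri)

  def main(args: Array[String]): Unit = {
    // Master and other settings are expected to come from spark-submit.
    val spark = SparkSession.builder().appName("ReadHdfsJson").getOrCreate()

    readJson(spark, hdfsUri("/big-data/test.json")).show()

    spark.stop()
  }
}
```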
Last modified by Raymond 5 years ago