Write and read parquet files in Scala / Spark
Raymond
Code Snippets & Tips
Code snippets and tips for various programming languages/frameworks. All code examples are under MIT or Apache 2.0 license unless specified otherwise.
Parquet is a columnar storage format published by Apache and commonly used in the Hadoop ecosystem. APIs for reading and writing Parquet files exist for many programming languages.
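To make "columnar" concrete, here is a minimal plain-Scala sketch of the difference between a row-oriented and a column-oriented layout. The `Sale` and `SaleColumns` types and the sample values are hypothetical and exist only for illustration. This is a conceptual picture, not Parquet's actual on-disk format, which additionally organizes data into row groups and column chunks with encoding and compression.

```scala
// Hypothetical record type, used only to illustrate the idea.
final case class Sale(product: String, quantity: Int, price: Double)

// Row-oriented layout: each element holds all fields of one record.
val rows = Vector(
  Sale("apple", 3, 0.5),
  Sale("pear", 5, 0.75),
  Sale("plum", 2, 1.25)
)

// Column-oriented layout: one sequence per field, values stored together.
final case class SaleColumns(
  product: Vector[String],
  quantity: Vector[Int],
  price: Vector[Double]
)

val columns = SaleColumns(
  rows.map(_.product),
  rows.map(_.quantity),
  rows.map(_.price)
)

// Aggregating one column only touches that column's values.
val totalQuantity = columns.quantity.sum
```

In a columnar layout, a query that needs only `quantity` can skip the other columns entirely, which is the main reason the format suits analytical workloads.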
You can easily use Spark to read or write Parquet files.
Code snippet
import org.apache.spark.sql.SparkSession

val appName = "Scala Parquet Example"
val master = "local"

/* Create a Spark session (Hive support is not enabled and is not needed here). */
val spark = SparkSession.builder.appName(appName).master(master).getOrCreate()

/* Read the source CSV file, treating its first line as the header. */
val df = spark.read.format("csv").option("header", "true").load("Sales.csv")

/* Write the DataFrame as Parquet. */
df.write.parquet("Sales.parquet")

/* Read the Parquet data back and print its first rows. */
val df2 = spark.read.parquet("Sales.parquet")
df2.show()
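The snippet above writes with Spark's default save mode, which raises an error when the target path already exists, so running it a second time fails. The following is a minimal sketch of one way around that, using `SaveMode.Overwrite`. It assumes Spark SQL is on the classpath and is written in script style (for example, for `spark-shell`). The in-memory sample data, its column names `product` and `quantity`, and the temporary output path are hypothetical stand-ins for `Sales.csv` and `Sales.parquet`.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder
  .appName("Scala Parquet Overwrite Sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the contents of Sales.csv.
val sales = Seq(("apple", 3), ("pear", 5), ("plum", 2)).toDF("product", "quantity")

// Hypothetical output location inside a fresh temporary directory.
val outputPath = Files.createTempDirectory("parquet-sketch")
  .resolve("Sales.parquet")
  .toString

// First write creates the Parquet output.
sales.write.mode(SaveMode.Overwrite).parquet(outputPath)

// A second write to the same path replaces the existing output
// instead of failing, because the save mode is Overwrite.
sales.write.mode(SaveMode.Overwrite).parquet(outputPath)

// Parquet stores the schema with the data, so column types survive the round trip.
val roundTrip = spark.read.parquet(outputPath)
roundTrip.printSchema()
roundTrip.show()
```

Overwrite deletes whatever is already at the target path, so it suits repeatable jobs but should be used with care on shared locations. `SaveMode.Append` and `SaveMode.Ignore` are the other alternatives to the default.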
Last modified by Raymond 6 years ago