


Parquet is a columnar storage format published by Apache. It is commonly used in the Hadoop ecosystem, and many programming language APIs have been implemented to support writing and reading Parquet files.

You can also use PySpark to read and write Parquet files.

Code snippet

from pyspark.sql import SparkSession

appName = "PySpark Parquet Example"
master = "local"

# Create (or reuse) a local Spark session
spark = SparkSession.builder.appName(appName).master(master).getOrCreate()

# Read the source CSV file, treating the first row as the header
df = spark.read.format("csv").option("header", "true").load("Sales.csv")

# Write the DataFrame out in Parquet format
df.write.parquet("Sales.parquet")

# Read the Parquet output back and show its contents
df2 = spark.read.parquet("Sales.parquet")
df2.show()
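Continuing from the snippet above, the following sketch shows a few commonly used Parquet write options (save mode, compression codec, and partitioning). The "Region" column and the output path are assumptions for illustration and may not match the actual columns in Sales.csv.

# Sketch of common Parquet write options; the "Region" column and the
# output path are assumed for illustration only.
df.write.mode("overwrite") \
    .option("compression", "snappy") \
    .partitionBy("Region") \
    .parquet("Sales_partitioned.parquet")

# Reading back only selected columns benefits from Parquet's columnar layout
df3 = spark.read.parquet("Sales_partitioned.parquet").select("Region")
df3.show()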