PySpark - collect() method

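The collect() method retrieves all rows of a PySpark DataFrame and returns them to the driver program as a Python list of Row objects, one Row per record. Because the entire result is loaded into the driver's memory, collect() is best suited to small DataFrames.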

Syntax:

dataframe.collect()

Example Code:

from pyspark.sql import SparkSession
from pyspark.sql.types import StringType, IntegerType, StructType, StructField, FloatType

# Create (or reuse) a Spark session
spark = SparkSession.builder.appName('kontexttech').getOrCreate()

# Sample records: (rollno, name, fee)
values = [(1, "Gottumukkala Sravan Kumar", 4500.00), (2, "Bobby", 93445.000), (3, "Gnanesh", 88900.000)]

# Explicit schema for the three columns
schema = StructType([
    StructField("rollno", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("fee", FloatType(), True),
])

data = spark.createDataFrame(values, schema)

# Return every row to the driver as a list of Row objects
data.collect()

Output:

[Row(rollno=1, name='Gottumukkala Sravan Kumar', fee=4500.0),
 Row(rollno=2, name='Bobby', fee=93445.0),
 Row(rollno=3, name='Gnanesh', fee=88900.0)]

Last modified by Gottumukkala Sravan Kumar.
