PySpark - collect() method

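The collect() method retrieves all the rows of a PySpark DataFrame and returns them to the driver program as a Python list of Row objects, one Row per record. Because every row is loaded into the driver's memory, it is best suited to small DataFrames or results that have already been filtered.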



Example Code:

import pyspark
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType, IntegerType, StructType, StructField, FloatType

# Create (or reuse) a Spark session
spark = SparkSession.builder.appName('kontexttech').getOrCreate()

# Sample rows: (rollno, name, fee)
values = [(1, "Gottumukkala Sravan Kumar", 4500.00), (2, "Bobby", 93445.000), (3, "Gnanesh", 88900.000)]

# Explicit schema for the three columns
schema = StructType([
    StructField("rollno", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("fee", FloatType(), True),
])

data = spark.createDataFrame(values, schema)

# Return every row to the driver as a list of Row objects
# (a notebook or REPL displays the returned list)
data.collect()



Output:

[Row(rollno=1, name='Gottumukkala Sravan Kumar', fee=4500.0),
 Row(rollno=2, name='Bobby', fee=93445.0),
 Row(rollno=3, name='Gnanesh', fee=88900.0)]

