Spark SQL - Average (AVG) Calculation

visibility 15 comment 0 access_time 22 days ago languageEnglish

Spark provides flexible APIs to perform average calculation against a data set. You can either use Spark SQL or fluent APIs to implement it.

Spark SQL - AVG

The follow code snippet shows you how to use AVG directly via Spark SQL. You can run the query against Hive databases or directly in a Spark-SQL shell.

from pyspark.sql import SparkSession
from pyspark.sql.functions import avg

appName = "PySpark Average (AVG) Example"
master = "local"

# Create Spark session with Hive supported.
spark = SparkSession.builder \
    .appName(appName) \
    .master(master) \
    .getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

# GroupBY 
df = spark.sql("""SELECT TXN_DT, AVG(AMT) FROM VALUES 
(101,10.01, DATE'2021-01-01'),
(101,102.01, DATE'2021-01-01'),
(102,93., DATE'2021-01-01'),
(103,913.1, DATE'2021-01-02'),
(102,913.1, DATE'2021-01-02'),
(101,900.56, DATE'2021-01-03')
AS TXN(ACCT,AMT,TXN_DT) GROUP BY TXN_DT""")

df.show()

Result:

+----------+----------+
|    TXN_DT|  avg(AMT)|
+----------+----------+
|2021-01-01| 68.340000|
|2021-01-03|900.560000|
|2021-01-02|913.100000|
+----------+----------+

Use avg API

The above example can also be changed to use avg API directly. This is useful is you already have an dataframe and if you don't want to use Spark SQL:

# AVG
df = spark.sql("""SELECT ACCT,AMT,TXN_DT FROM VALUES 
(101,10.01, DATE'2021-01-01'),
(101,102.01, DATE'2021-01-01'),
(102,93., DATE'2021-01-01'),
(103,913.1, DATE'2021-01-02'),
(102,913.1, DATE'2021-01-02'),
(101,900.56, DATE'2021-01-03')
AS TXN(ACCT,AMT,TXN_DT) """)

df.groupBy("TXN_DT").agg(avg("AMT").alias("AVG_AMT")).show()

The result will be the same.

+----------+----------+
|    TXN_DT|   AVG_AMT|
+----------+----------+
|2021-01-01| 68.340000|
|2021-01-03|900.560000|
|2021-01-02|913.100000|
+----------+----------+

Remember to import avg function:

from pyspark.sql.functions import avg
copyright This page is subject to Site terms.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts

More from Kontext
visibility 889
thumb_up 0
access_time 10 months ago
visibility 1,884
thumb_up 0
access_time 11 months ago