Use expr() Function in PySpark DataFrame


Kontext, 9/2/2022

Code description

The Spark SQL function expr() evaluates a SQL expression string and returns the result as a column (pyspark.sql.column.Column). Any operator or function that is valid in Spark SQL can therefore also be used in DataFrame operations.

This code snippet provides an example of using the expr() function directly with a DataFrame. It also shows how to derive the same column without using this function.

  • The code snippet assumes a SparkSession object already exists as 'spark'.
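Because expr() accepts any valid Spark SQL fragment, SQL constructs such as CASE WHEN also work inline. A minimal sketch (a local SparkSession is created here only to keep the example self-contained; the article otherwise assumes one already exists as 'spark'):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import expr

    # Local session for self-containment (the article assumes 'spark' already exists).
    spark = SparkSession.builder.master("local[1]").appName("expr-demo").getOrCreate()

    df = spark.range(1, 4)
    # Evaluate a SQL CASE WHEN expression via expr().
    df = df.withColumn("parity", expr("CASE WHEN id % 2 = 0 THEN 'even' ELSE 'odd' END"))
    df.show()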

Output:

    +---+-----+-----+
    | id|id_v1|id_v2|
    +---+-----+-----+
    |  1|   11|   11|
    |  2|   12|   12|
    |  3|   13|   13|
    |  4|   14|   14|
    |  5|   15|   15|
    |  6|   16|   16|
    |  7|   17|   17|
    |  8|   18|   18|
    |  9|   19|   19|
    +---+-----+-----+  
    

Code snippet

    from pyspark.sql.functions import expr

    # Create a DataFrame with a single column 'id' containing 1..9.
    df = spark.range(1, 10)
    # Derive a column by evaluating a SQL expression string.
    df = df.withColumn('id_v1', expr("id+10"))
    # Derive the same column using the Column API directly.
    df = df.withColumn('id_v2', df.id + 10)
    df.show()
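Since expr() returns a Column, it is not limited to withColumn: it can alias the derived column inside the expression string itself, and it can serve as a filter predicate. A hedged sketch (again creating a local session for self-containment; the column name id_plus_ten is illustrative, not from the original snippet):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import expr

    # Local session for self-containment (the article assumes 'spark' already exists).
    spark = SparkSession.builder.master("local[1]").appName("expr-select").getOrCreate()

    df = spark.range(1, 10)
    # Alias the derived column inside the SQL expression string itself.
    df2 = df.select("id", expr("id + 10 AS id_plus_ten"))
    # expr() also works as a filter predicate.
    df3 = df2.filter(expr("id_plus_ten > 15"))
    df3.show()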
