spark-sql-function

46 items tagged with "spark-sql-function"

46 Articles

Articles

PySpark DataFrame - Convert JSON Column to Row using json_tuple

PySpark SQL functions json_tuple can be used to convert DataFrame JSON string columns to tuples (new rows in the DataFrame). Syntax of this function looks like the following: `` pyspark.sql.functions.json_tuple(col, *fields) ` The first parameter is the JSON string column name in the DataFrame and the second is the filed name list to extract. If you need to extract complex JSON documents like JSON arrays, you can follow this article - PySpark: Convert JSON String Column to Array of Object (StructType) in DataFrame. Output ` StructType([StructField('id', LongType(), True), StructField('c0', StringType(), True), StructField('c1', StringType(), True), StructField('c2', StringType(), True)]) +---+---+------+----------+ | id| c0| c1| c2| +---+---+------+----------+ | 1| 1|10.201|2021-01-01| | 2| 2|20.201|2022-01-01| +---+---+------+----------+ ``

spark-sql-function

Articles

PySpark DataFrame - Convert JSON Column to Row using json_tuple

PySpark DataFrame - Extract JSON Value using get_json_object Function

Replace Values via regexp_replace Function in PySpark DataFrame

Spark SQL - window Function

Spark SQL - session_window Function

Spark SQL - Left and Right Padding (lpad and rpad) Functions

Spark SQL - Check if String Contains a String

Spark SQL - isnull and isnotnull Functions

Spark SQL - Concatenate w/o Separator (concat_ws and concat)

Spark SQL - Create Map from Arrays via map_from_arrays Function

Spark Hash Functions Introduction - MD5 and SHA

Spark SQL - Get Next Monday, Tuesday, Wednesday, Thursday, etc.

Spark SQL - Make Date, Timestamp and Intervals

Spark SQL - Get Current Timezone

Spark SQL - Date and Timestamp Truncate Functions

Spark SQL - Extract Day, Month, Year and other Part from Date or Timestamp

Spark SQL - Add Day, Month and Year to Date

Spark SQL - Return JSON Array Length (json_array_length)

Spark SQL - Return JSON Object Keys (json_object_keys)

Spark SQL - Conversion between UTC and Timestamp with Time Zone

Spark SQL - Date/Timestamp Conversation from/to UNIX Date/Timestamp

Spark SQL - Convert Date/Timestamp to String via date_format Function

Spark SQL - Convert Delimited String to Map using str_to_map Function

Spark SQL - element_at Function

Spark SQL - flatten Function

Spark SQL - PERCENT_RANK Window Function

Spark SQL - Date Difference in Seconds, Minutes, Hours

Spark "ROW_ID"

Spark SQL - PIVOT Clause

Spark SQL - Calculate Covariance

Spark SQL - Standard Deviation Calculation

Spark SQL - FIRST_VALUE or LAST_VALUE

Spark SQL - Array Functions

Spark SQL - Map Functions

Spark SQL - Convert Object to JSON String

Spark SQL - Extract Value from JSON String

Spark SQL - Convert JSON String to Map

Spark SQL - Convert String to Timestamp

Spark SQL - UNIX timestamp functions

Spark SQL - Date and Timestamp Function

Spark SQL - LEAD Window Function

Spark SQL - LAG Window Function

Spark SQL - NTILE Window Function

Spark SQL - DENSE_RANK Window Function

Spark SQL - RANK Window Function

Spark SQL - ROW_NUMBER Window Functions