Spark SQL Functions

Articles tagged with spark-sql-function.

Like other SQL engines, Spark SQL also supports the PIVOT clause. PIVOT is typically used to calculate aggregated values for each value in a column; the calculated values are included as columns in the result set. PIVOT ( { aggregate_expression [ AS aggregate_expression_alias ] } [ , ... ] FOR ...
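A minimal sketch of the clause, assuming a hypothetical `sales` table with `year`, `quarter`, and `amount` columns:

```sql
-- Pivot quarterly amounts into one column per quarter
SELECT *
FROM (SELECT year, quarter, amount FROM sales)
PIVOT (
  SUM(amount)
  FOR quarter IN (1 AS Q1, 2 AS Q2, 3 AS Q3, 4 AS Q4)
);
```

Each value listed in the IN clause becomes a column (Q1 through Q4) holding the aggregated amount for that quarter.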


ROW_NUMBER in Spark assigns a unique sequential number (starting from 1) to each record based on the ordering of rows within each window partition. It is commonly used to deduplicate data. The following sample SQL uses the ROW_NUMBER function without a PARTITION BY clause: SELECT TXN.*, ROW_NUMBER() OVER ...
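A deduplication sketch with a PARTITION BY clause, assuming a hypothetical `TXN` table with `ACCT` and `TXN_DT` columns:

```sql
-- Number each account's transactions, newest first
SELECT t.*,
       ROW_NUMBER() OVER (PARTITION BY ACCT ORDER BY TXN_DT DESC) AS rn
FROM TXN t;
-- Deduplicate by keeping only rows where rn = 1 in an outer query
```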


Similar to Convert String to Date using Spark SQL, you can convert a timestamp string to the Spark SQL timestamp data type. Function to_timestamp(timestamp_str[, fmt]) parses the `timestamp_str` expression with the `fmt` expression to a timestamp data type in Spark. Example ...
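A short illustration of both forms of the call:

```sql
-- Default pattern: yyyy-MM-dd HH:mm:ss
SELECT to_timestamp('2021-01-09 17:34:59');
-- Custom pattern supplied via the fmt argument
SELECT to_timestamp('09/01/2021 17:34:59', 'dd/MM/yyyy HH:mm:ss');
```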


Function unix_timestamp() returns the UNIX timestamp of the current time. You can also specify an input timestamp value. Example: spark-sql> select unix_timestamp(); unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss) 1610174099 spark-sql> select unix_timestamp(current_timestamp ...
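A sketch of the common call shapes:

```sql
-- Seconds since 1970-01-01 00:00:00 UTC for the current time
SELECT unix_timestamp();
-- Same, for an explicit timestamp string (default pattern yyyy-MM-dd HH:mm:ss)
SELECT unix_timestamp('2021-01-09 17:34:59');
-- With an explicit pattern for the input string
SELECT unix_timestamp('09/01/2021', 'dd/MM/yyyy');
```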


JSON string values can be extracted using built-in Spark functions such as get_json_object or json_tuple. The get_json_object function takes two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark ...
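A quick sketch of both functions on literal JSON strings:

```sql
-- Extract a nested value with a JSONPath-style expression
SELECT get_json_object('{"a": {"b": 1}}', '$.a.b');
-- Extract several top-level fields in one call
SELECT json_tuple('{"a": 1, "b": 2}', 'a', 'b');
```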


In Spark SQL, MapType is designed for key-value pairs, similar to the dictionary type in many other programming languages. This article summarizes the commonly used map functions in Spark SQL. Function map is used to create a map. Example: spark-sql> select ...
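A sketch of creating and querying a map literal:

```sql
-- Build a map from alternating keys and values
SELECT map('a', 1, 'b', 2);
-- Look up a value by key
SELECT map('a', 1, 'b', 2)['b'];
-- Related helper: list the keys
SELECT map_keys(map('a', 1, 'b', 2));
```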


In Spark SQL, function std, stddev, or stddev_samp can be used to calculate the sample standard deviation of the values in a group. std(expr) stddev(expr) stddev_samp(expr) The first two functions are aliases of the stddev_samp function. SELECT ACCT ...
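A grouped-aggregation sketch, assuming a hypothetical `TXN` table with `ACCT` and `AMT` columns:

```sql
-- Sample standard deviation of transaction amounts per account
SELECT ACCT,
       stddev_samp(AMT) AS amt_stddev
FROM TXN
GROUP BY ACCT;
```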


RANK in Spark calculates the rank of a value in a group of values. It returns one plus the number of rows preceding or equal to the current row in the ordering of a partition. The returned values are not sequential when there are ties. The following sample SQL uses the RANK function without a PARTITION BY ...
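A sketch of the gap behavior, again assuming a hypothetical `TXN` table with an `AMT` column:

```sql
-- Ties share a rank and leave gaps, e.g. 1, 2, 2, 4, ...
SELECT t.*,
       RANK() OVER (ORDER BY AMT DESC) AS rnk
FROM TXN t;
```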


DENSE_RANK is similar to Spark SQL - RANK Window Function. It calculates the rank of a value in a group of values, returning one plus the number of distinct values ranked before the current row in the ordering of a partition. The returned values are sequential in each window thus no ...
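The contrast with RANK in a sketch, using the same hypothetical `TXN` table with an `AMT` column:

```sql
-- Ties share a rank but leave no gaps, e.g. 1, 2, 2, 3, ...
SELECT t.*,
       DENSE_RANK() OVER (ORDER BY AMT DESC) AS dense_rnk
FROM TXN t;
```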


The Spark LEAD function provides access to a row at a given offset that follows the current row in a window. This analytic function can be used in a SELECT statement to compare values in the current row with values in a following row. It is the counterpart of Spark SQL - LAG Window Function.
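A sketch of comparing each row with the next one, assuming a hypothetical `TXN` table with `ACCT`, `AMT`, and `TXN_DT` columns:

```sql
-- Pair each transaction with the next amount in date order per account
SELECT t.*,
       LEAD(AMT, 1) OVER (PARTITION BY ACCT ORDER BY TXN_DT) AS next_amt
FROM TXN t;
-- With no default argument, LEAD returns NULL for the last row of each partition
```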
