Spark SQL

Articles tagged with spark-sql.

Like other SQL engines, Spark also supports the PIVOT clause. PIVOT is usually used to calculate aggregated values for each value in a column, and the calculated values are included as columns in the result set. PIVOT ( { aggregate_expression [ AS aggregate_expression_alias ] } [ , ... ] FOR ...
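
For instance, a minimal sketch using made-up inline data (the sales relation and its year, quarter, and amount columns are illustrative):

-- Pivot quarterly amounts into one column per quarter value.
SELECT * FROM (
  SELECT year, quarter, amount
  FROM VALUES (2023, 1, 100), (2023, 2, 200), (2023, 1, 50)
       AS sales(year, quarter, amount)
)
PIVOT (
  SUM(amount)
  FOR quarter IN (1 AS q1, 2 AS q2)
);
-- year  q1   q2
-- 2023  150  200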

Spark SQL provides functions to calculate the covariance of a set of number pairs. There are two functions: covar_pop(expr1, expr2) and covar_samp(expr1, expr2). The first calculates population covariance while the second calculates sample covariance. Example: SELECT ...
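
For example, a quick sketch over inline values (the x and y columns are illustrative):

-- Population vs. sample covariance of three number pairs.
SELECT covar_pop(x, y)  AS pop_cov,
       covar_samp(x, y) AS samp_cov
FROM VALUES (1.0, 2.0), (2.0, 4.0), (3.0, 5.0) AS t(x, y);
-- 1.0   1.5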

In Spark SQL, function std, stddev, or stddev_samp can be used to calculate the sample standard deviation of the values in a group. std(expr) stddev(expr) stddev_samp(expr) The first two functions are aliases of the stddev_samp function. SELECT ACCT ...
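
A minimal sketch with a made-up acct grouping column (all names and values are illustrative):

-- All three names compute the same sample standard deviation per group.
SELECT acct,
       std(amount)         AS v1,
       stddev(amount)      AS v2,
       stddev_samp(amount) AS v3
FROM VALUES ('A', 10.0), ('A', 20.0), ('B', 5.0), ('B', 15.0)
     AS t(acct, amount)
GROUP BY acct;
-- both groups return roughly 7.0711 in every column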

In Spark SQL, functions FIRST_VALUE (FIRST) and LAST_VALUE (LAST) can be used to find the first or the last value of a given column or expression for a group of rows. If parameter `isIgnoreNull` is specified as true, they return only non-null values (unless all values are null). first(expr[ ...
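
A sketch with made-up data; note that without an explicit ordering, first and last can be non-deterministic on real, shuffled data:

-- Pass isIgnoreNull = true to skip the leading NULL.
SELECT acct,
       first(amount, true) AS first_non_null,
       last(amount, true)  AS last_non_null
FROM VALUES ('A', CAST(NULL AS INT)), ('A', 10), ('A', 20)
     AS t(acct, amount)
GROUP BY acct;
-- A   10   20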

Unlike traditional RDBMSes, Spark SQL supports complex types such as array and map. There are a number of built-in functions to operate efficiently on array values. ArrayType columns can be created directly using the array or array_repeat function. The latter repeats one element multiple times ...
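
For example (values are illustrative):

-- array builds a literal array; array_repeat repeats one element n times.
SELECT array(1, 2, 3)                    AS a,
       array_repeat('x', 3)              AS repeated,
       array_contains(array(1, 2, 3), 2) AS has_two,
       size(array(1, 2, 3))              AS n;
-- [1,2,3]   ["x","x","x"]   true   3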

In Spark SQL, MapType is designed for key-value pairs, similar to the dictionary type in many other programming languages. This article summarizes the commonly used map functions in Spark SQL. Function map is used to create a map. Example: spark-sql> select ...
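
A small sketch of creating and querying a map (keys and values are illustrative):

-- map builds key/value pairs; m['b'] looks a value up by key.
SELECT map('a', 1, 'b', 2)           AS m,
       map_keys(map('a', 1, 'b', 2)) AS ks,
       map('a', 1, 'b', 2)['b']      AS b_val;
-- {"a":1,"b":2}   ["a","b"]   2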

The article Scala: Parse JSON String as Spark DataFrame shows how to convert a JSON string to a Spark DataFrame; this article shows the other way around: converting complex columns to a JSON string using the to_json function. Function 'to_json(expr[, options])' returns a JSON string with a ...
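
For instance (the struct field names are illustrative):

-- Serialize a struct and an array of structs into JSON strings.
SELECT to_json(named_struct('id', 1, 'name', 'alice')) AS obj_json,
       to_json(array(named_struct('id', 1), named_struct('id', 2))) AS arr_json;
-- {"id":1,"name":"alice"}   [{"id":1},{"id":2}]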

JSON string values can be extracted using built-in Spark functions like get_json_object or json_tuple. The get_json_object function has two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark ...
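
A quick sketch against a literal JSON string (the document structure is illustrative):

-- get_json_object extracts one value via a JSONPath-like expression;
-- json_tuple pulls several top-level fields at once.
SELECT get_json_object('{"a":1,"b":{"c":2}}', '$.b.c');
-- 2
SELECT json_tuple('{"a":1,"b":"str"}', 'a', 'b');
-- 1   str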

Spark SQL function from_json(jsonStr, schema[, options]) parses the given JSON string into a struct value with the given schema. Parameter options controls how the JSON is parsed; it accepts the same options as the json data source in Spark DataFrame reader APIs. The following code ...
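
For example (the field names are illustrative; timestampFormat is one of the standard JSON data source options):

-- Parse a JSON string into a struct using a DDL-formatted schema string.
SELECT from_json('{"a":1, "b":"x"}', 'a INT, b STRING') AS s;
-- {"a":1,"b":"x"}

-- Options are passed as a map, like DataFrame reader options.
SELECT from_json('{"time":"26/08/2015"}', 'time TIMESTAMP',
                 map('timestampFormat', 'dd/MM/yyyy')) AS s;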

Similar to Convert String to Date using Spark SQL, you can convert a timestamp string to the Spark SQL timestamp data type. Function to_timestamp(timestamp_str[, fmt]) parses the `timestamp_str` expression with the `fmt` expression to a timestamp data type in Spark. Example ...
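
A quick sketch (the custom pattern is illustrative):

-- Without fmt, an ISO-like 'yyyy-MM-dd HH:mm:ss' string is expected;
-- pass fmt to parse a custom pattern.
SELECT to_timestamp('2021-01-09 17:34:59') AS ts1,
       to_timestamp('09/01/2021 17:34', 'dd/MM/yyyy HH:mm') AS ts2;
-- 2021-01-09 17:34:59   2021-01-09 17:34:00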
