Spark SQL - Extract Value from JSON String

access_time 6 days ago visibility9 comment 0

JSON string values can be extracted using built-in Spark functions like get_json_object or json_tuple. 

Function get_json_object

Values can be extracted using get_json_object function. The function has two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark DataFrame or Hive table; the second is the JSON path.

Example:

SELECT get_json_object(r.JSON, '$.Attr_INT') AS Attr_INT,
get_json_object(r.JSON, '$.ATTR_DOUBLE') AS ATTR_DOUBLE,
get_json_object(r.JSON, '$.ATTR_DATE') AS ATTR_DATE
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;

Output:

Attr_INT        ATTR_DOUBLE     ATTR_DATE
1       10.201  2021-01-01

Function json_tuple

This function json_tuple(json_txt, p1, p2, ..., pn)  returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.

Example:

SELECT json_tuple(r.JSON, 'Attr_INT', 'ATTR_DOUBLE', 'ATTR_DATE')
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;
Output:
c0      c1      c2
1       10.201  2021-01-01

Function from_json

Values can also be extracted directly using function from_json where JSON string are converted to object first and then are directly referenced in SELECT statement. Refer to article Spark SQL - Convert JSON String to Map for example. 

copyright This page is subject to Site terms.
Like this article?
Share on

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts

Follow Kontext

Get our latest updates on LinkedIn or Twitter.

Want to publish your article on Kontext?

Learn more

More from Kontext

visibility 7
thumb_up 0
access_time 9 days ago

Spark LEAD function provides access to a row at a given offset that follows the current row in a window. This analytic function can be used in a SELECT statement to compare values in the current row with values in a following row. This function is like  Spark SQL - LAG Window Function .

visibility 10
thumb_up 0
access_time 9 days ago

RANK in Spark calculates the rank of a value in a group of values. It returns one plus the number of rows proceeding or equals to the current row in the ordering of a partition. The returned values are not sequential.   The following sample SQL uses RANK function without PARTITION BY ...

visibility 9
thumb_up 0
access_time 6 days ago

JSON string values can be extracted using built-in Spark functions like get_json_object or json_tuple.  Values can be extracted using get_json_object function. The function has two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark ...