Spark SQL - Extract Value from JSON String

visibility 2,331 comment 0 access_time 10m languageEnglish

JSON string values can be extracted using built-in Spark functions like get_json_object or json_tuple. 

Function get_json_object

Values can be extracted using get_json_object function. The function has two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark DataFrame or Hive table; the second is the JSON path.

Example:

SELECT get_json_object(r.JSON, '$.Attr_INT') AS Attr_INT,
get_json_object(r.JSON, '$.ATTR_DOUBLE') AS ATTR_DOUBLE,
get_json_object(r.JSON, '$.ATTR_DATE') AS ATTR_DATE
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;

Output:

Attr_INT        ATTR_DOUBLE     ATTR_DATE
1       10.201  2021-01-01

Function json_tuple

This function json_tuple(json_txt, p1, p2, ..., pn)  returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.

Example:

SELECT json_tuple(r.JSON, 'Attr_INT', 'ATTR_DOUBLE', 'ATTR_DATE')
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;
Output:
c0      c1      c2
1       10.201  2021-01-01

Function from_json

Values can also be extracted directly using function from_json where JSON string are converted to object first and then are directly referenced in SELECT statement. Refer to article Spark SQL - Convert JSON String to Map for example. 

copyright This page is subject to Site terms.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts

Tags
More from Kontext