Spark SQL - Extract Value from JSON String

visibility 5,627 access_time 2 years ago languageEnglish

JSON string values can be extracted using built-in Spark functions like get_json_object or json_tuple. 

Function get_json_object

Values can be extracted using get_json_object function. The function has two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark DataFrame or Hive table; the second is the JSON path.


SELECT get_json_object(r.JSON, '$.Attr_INT') AS Attr_INT,
get_json_object(r.JSON, '$.ATTR_DOUBLE') AS ATTR_DOUBLE,
get_json_object(r.JSON, '$.ATTR_DATE') AS ATTR_DATE
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;


1       10.201  2021-01-01

Function json_tuple

This function json_tuple(json_txt, p1, p2, ..., pn)  returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.


SELECT json_tuple(r.JSON, 'Attr_INT', 'ATTR_DOUBLE', 'ATTR_DATE')
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;
c0      c1      c2
1       10.201  2021-01-01

Function from_json

Values can also be extracted directly using function from_json where JSON string are converted to object first and then are directly referenced in SELECT statement. Refer to article Spark SQL - Convert JSON String to Map for example. 

copyright This page is subject to Site terms.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts

timeline Stats
Page index 11.30
More from Kontext
Spark SQL - Calculate Covariance
visibility 358
thumb_up 0
access_time 2 years ago
Spark SQL - PERCENT_RANK Window Function
visibility 243
thumb_up 0
access_time 8 months ago
Spark SQL - Group By
visibility 173
thumb_up 0
access_time 9 months ago