Spark SQL - Extract Value from JSON String

Raymond Raymond visibility 16,961 event 2021-01-09 access_time 3 years ago language English

JSON string values can be extracted using built-in Spark functions like get_json_object or json_tuple. 

Function get_json_object

Values can be extracted using get_json_object function. The function has two parameters: json_txt and path. The first is the JSON text itself, for example a string column in your Spark DataFrame or Hive table; the second is the JSON path.


SELECT get_json_object(r.JSON, '$.Attr_INT') AS Attr_INT,
get_json_object(r.JSON, '$.ATTR_DOUBLE') AS ATTR_DOUBLE,
get_json_object(r.JSON, '$.ATTR_DATE') AS ATTR_DATE
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;


1       10.201  2021-01-01

Function json_tuple

This function json_tuple(json_txt, p1, p2, ..., pn)  returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.


SELECT json_tuple(r.JSON, 'Attr_INT', 'ATTR_DOUBLE', 'ATTR_DATE')
from (SELECT '{"Attr_INT":1, "ATTR_DOUBLE":10.201, "ATTR_DATE": "2021-01-01"}' as JSON) r;
c0      c1      c2
1       10.201  2021-01-01

Function from_json

Values can also be extracted directly using function from_json where JSON string are converted to object first and then are directly referenced in SELECT statement. Refer to article Spark SQL - Convert JSON String to Map for example. 

More from Kontext
copyright This page is subject to Site terms.
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts