Kontext Kontext / Code Snippets & Tips

Spark SQL - Check if String Contains a String

event 2022-07-09 visibility 11,224 comment 0 insights toc
more_vert
insights Stats

Spark SQL functions contains and instr can be used to check if a string contains a string. It can also be used to filter data.

Use contains function

The syntax of this function is defined as:

contains(left, right) - This function returns a boolean. Retuns True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left or right must be of STRING or BINARY type.

Code snippet

The following code snippet shows how to use this function.

spark-sql> SELECT t.col, contains(t.col,'text')
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Context!')
         > AS t(col);
Hello Kontext!  true
Hello Context!  true

Both records return true as 'text' exists in both values.

Use instr function 

Function instr(str, substr) returns the (1-based) index of the first occurrence of substr in str. Thus we could check the the returned value to decide whether the substring exists in the string.

Code snippet

spark-sql> SELECT t.col, case when instr(t.col,'text')>0 then true else false end
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Contt!')
         > AS t(col);
Hello Kontext!  true
Hello Contt!    false

Filter data

We can directly use these two functions to filter data to return rows that contains certain texts.

spark-sql> SELECT t.col
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Contt!')
         > AS t(col)
         > WHERE instr(t.col,'text')>0;
Hello Kontext!

spark-sql> SELECT t.col
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Contt!')
         > AS t(col)
         > WHERE contains(t.col,'text');
Hello Kontext!

Both SQL statements only return the first record as it has 'text' in column 'col'. 

infoThese functions can also be directly used in Spark DataFrame APIs. For example,
df.where(contains('col', 'text') == False).show()
More from Kontext
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts