Spark SQL - Check if String Contains a String

Kontext Kontext 0 13308 12.25 index 7/9/2022

Spark SQL functions contains and instr can be used to check if a string contains a string. It can also be used to filter data.

Use contains function

The syntax of this function is defined as:

contains(left, right) - This function returns a boolean. Retuns True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left or right must be of STRING or BINARY type.

Code snippet

The following code snippet shows how to use this function.

spark-sql> SELECT t.col, contains(t.col,'text')
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Context!')
         > AS t(col);
Hello Kontext!  true
Hello Context!  true

Both records return true as 'text' exists in both values.

Use instr function

Function instr(str, substr) returns the (1-based) index of the first occurrence of substr in str. Thus we could check the the returned value to decide whether the substring exists in the string.

Code snippet

spark-sql> SELECT t.col, case when instr(t.col,'text')>0 then true else false end
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Contt!')
         > AS t(col);
Hello Kontext!  true
Hello Contt!    false

Filter data

We can directly use these two functions to filter data to return rows that contains certain texts.

spark-sql> SELECT t.col
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Contt!')
         > AS t(col)
         > WHERE instr(t.col,'text')>0;
Hello Kontext!

spark-sql> SELECT t.col
         > FROM VALUES
         > ('Hello Kontext!'),
         > ('Hello Contt!')
         > AS t(col)
         > WHERE contains(t.col,'text');
Hello Kontext!

Both SQL statements only return the first record as it has 'text' in column 'col'.

infoThese functions can also be directly used in Spark DataFrame APIs. For example,

df.where(contains('col', 'text') == False).show()
spark-sql-function

Join the Discussion

View or add your thoughts below

Comments