Teradata SUBSTRING / SUBSTR and REGEXP_SUBSTR Functions

access_time 7 months ago visibility213 comment 0

Extract sub string from a string is a common operation in data analytics. In Teradata, function SUBSTRING (SUBSTR) and REGEXP_SUBSTR are provided to achieve that. SUBSTR is used to extract string from a specified location while REGEXP_SUBSTR is used to extract string using regular expressions.

Code snippets

SUBSTR/SUBSTRING

-- returns World!
select	substring('Hello World!' from 7);
-- returns World
select	substring('Hello World!' from 7 for 5);

-- returns World!
select	substr('Hello World!', 7);
-- returns World
select	substr('Hello World!',7,5);

REGEXP_SUBSTR

-- returns 1234
select	REGEXP_SUBSTR('Hello 1234 World 5678 ! Kontext.tech 1010 !!!','[\d]+');
-- returns 5678
select	REGEXP_SUBSTR('Hello 1234 World 5678 ! Kontext.tech 1010 !!!','[\d]+',1,2);
-- return null
select	REGEXP_SUBSTR('Hello 1234 World 5678 ! Kontext.tech 1010 !!!','[\d]+',1,4);

-- returns Hello
select	REGEXP_SUBSTR('Hello World! HeLLO World!!!','Hello',1,1,'i');
-- returns HeLLO
select	REGEXP_SUBSTR('Hello World! HeLLO World!!!','Hello',1,2,'i');
-- returns null
select	REGEXP_SUBSTR('Hello World! HeLLO World!!!','Hello',1,2,'c');

In the above examples, occurrence argument can be used to return different values if there are multiple matches. The last argument can be used to specify match arguments. It can be the following values:

  • 'i' = case-insensitive matching
  • 'c' = case sensitive matching
  • 'n' = the period character (match any character) can match the newline character.
  • 'm' = multiple line match
  • 'l' = returns NULL if source string size is more than maximum allowed size (currently 16 MB). This can be used to avoid errors when match with big string.
  • 'x' = ignore whitespace.

The argument can contain more than one character. For example:

select	REGEXP_SUBSTR('Hello World! HeLLO World!!!','Hello',1,2,'im');

By default, the match is case sensitive and a period does not match the newline character and source string is treated as a single line.

info Last modified by Administrator at 22 days ago copyright This page is subject to Site terms.
Like this article?
Share on

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts

Kontext Column

Created for everyone to publish data, programming and cloud related articles.
Follow three steps to create your columns.


Learn more arrow_forward

More from Kontext

local_offer teradata local_offer teradata-utilities

visibility 15
thumb_up 0
access_time 15 days ago

BTEQ is a Teradata utility tool that can be used to run Teradata SQL statements incl. DDL, DML, etc. It can also be used to import data from text file into Teradata databases. It also works with XML and JSON files too. Like TPT and FASTLOAD, it can run in both batch and interactive modes. This ...

local_offer teradata local_offer SQL

visibility 708
thumb_up 0
access_time 11 months ago

This code snippet shows how to calculate time differences.

local_offer hive local_offer SQL local_offer hive-sql-ddl

visibility 25
thumb_up 1
access_time 2 months ago

This page shows how to create Hive tables with storage file format as Parquet, Orc and Avro via Hive SQL (HQL). The following examples show you how to create managed tables and similar syntax can be applied to create external tables if Parquet, Orc or Avro format already exist in HDFS. Example ...

About column

Code snippets for various programming languages/frameworks.

rss_feed Subscribe RSS