Skip to main content

REGEXP_EXTRACT_OR_NULL

Description

Extract the first substring that matches the target regular expression pattern from the text string, and extract a specific group from it based on the expression group index.

  • Character set matching requires the use of Unicode standard character types. For example, to match Chinese characters, use \p{Han}.

Syntax

REGEXP_EXTRACT_OR_NULL(<str>, <pattern>, <pos>)

Parameters

ParameterDescription
<str>String, a text string that needs to be matched with regular expressions.
<pattern>String, target pattern.
<pos>Integer, the index of the expression group to extract, counting starts from 1.

Return Value

Return a string type, with the result being the part that matches <pattern>.

  • If the input <pos> is 0, return the entire first matching substring.
  • If the input <pos> is invalid (negative or exceeds the number of expression groups), return NULL.
  • If the regular expression fails to match, return NULL.

Example

SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 1);
+---------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 1) |
+---------------------------------------------------------------------------+
| b |
+---------------------------------------------------------------------------+
SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 0);
+---------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 0) |
+---------------------------------------------------------------------------+
| bCd |
+---------------------------------------------------------------------------+
SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 5);
+---------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 5) |
+---------------------------------------------------------------------------+
| NULL |
+---------------------------------------------------------------------------+
SELECT REGEXP_EXTRACT_OR_NULL('AbCdE', '([[:lower:]]+)C([[:upper:]]+)', 1);
+---------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('AbCdE', '([[:lower:]]+)C([[:upper:]]+)', 1) |
+---------------------------------------------------------------------+
| NULL |
+---------------------------------------------------------------------+
select REGEXP_EXTRACT_OR_NULL('这是一段中文 This is a passage in English 1234567', '(\\p{Han}+)(.+)', 2);
+---------------------------------------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('这是一段中文 This is a passage in English 1234567', '(\\p{Han}+)(.+)', 2) |
+---------------------------------------------------------------------------------------------------------+
| This is a passage in English 1234567 |
+---------------------------------------------------------------------------------------------------------+