REGEXP_EXTRACT_OR_NULL
Description
Extract the first substring that matches the target regular expression pattern from the text string, and extract a specific group from it based on the expression group index.
- Character set matching requires the use of Unicode standard character types. For example, to match Chinese characters, use
\p{Han}
.
Syntax
REGEXP_EXTRACT_OR_NULL(<str>, <pattern>, <pos>)
Parameters
Parameter | Description |
---|---|
<str> | String, a text string that needs to be matched with regular expressions. |
<pattern> | String, target pattern. |
<pos> | Integer, the index of the expression group to extract, counting starts from 1. |
Return Value
Return a string type, with the result being the part that matches <pattern>
.
- If the input
<pos>
is 0, return the entire first matching substring. - If the input
<pos>
is invalid (negative or exceeds the number of expression groups), return NULL. - If the regular expression fails to match, return NULL.
Example
SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 1);
+---------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 1) |
+---------------------------------------------------------------------------+
| b |
+---------------------------------------------------------------------------+
SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 0);
+---------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 0) |
+---------------------------------------------------------------------------+
| bCd |
+---------------------------------------------------------------------------+
SELECT REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 5);
+---------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('123AbCdExCx', '([[:lower:]]+)C([[:lower:]]+)', 5) |
+---------------------------------------------------------------------------+
| NULL |
+---------------------------------------------------------------------------+
SELECT REGEXP_EXTRACT_OR_NULL('AbCdE', '([[:lower:]]+)C([[:upper:]]+)', 1);
+---------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('AbCdE', '([[:lower:]]+)C([[:upper:]]+)', 1) |
+---------------------------------------------------------------------+
| NULL |
+---------------------------------------------------------------------+
select REGEXP_EXTRACT_OR_NULL('这是一段中文 This is a passage in English 1234567', '(\\p{Han}+)(.+)', 2);
+---------------------------------------------------------------------------------------------------------+
| REGEXP_EXTRACT_OR_NULL('这是一段中文 This is a passage in English 1234567', '(\\p{Han}+)(.+)', 2) |
+---------------------------------------------------------------------------------------------------------+
| This is a passage in English 1234567 |
+---------------------------------------------------------------------------------------------------------+