Skip to main content

REGEXP_COUNT

Description​

This is a function to count the number of characters in a string that match a given regular expression pattern. The input consists of a user-provided string and a regular expression pattern. The return value is n the total count of matching characters; if no matches are found, it returns 0.

  1. 'str' paratemer is 'string' type,it is the string of usr want to match by a regexp expression.

  2. 'pattern' paratemer is 'string' type, it is the string of a regexp regular which will be used to match the string;

  3. return value is 'int' type,it represent the number of the characters which be matched successfully;

Syntax​

REGEXP_COUNT(<str>, <pattern>)

Parameters​

ParameterDescription
<str>The parameter is 'string' type,it is the dest value which matched by the regexp expression
<pattern>The parameter is 'string' type, it is a regexp expression and it is used to match the string which meet the regular of the pattern

Return Value​

  • Returns number of matches for a regular expression 'pattern' within a 'str',it is 'int',if no character can be matched, return 0;
    1. if pattern is NULL or str is NULL or both are NULL,return NULL;
    1. if pattern is not allowed regexp regular , it will throw error,it`s a wrong for this action;

Examples​

Test the string region matching against an expression containing escape characters and return the result​

SELECT regexp_count('a.b:c;d', '[\\\\.:;]');
+--------------------------------------+
| regexp_count('a.b:c;d', '[\\\\.:;]') |
+--------------------------------------+
| 3 |
+--------------------------------------+

Test the string matching result of the regular expression for the ordinary character ':'.​

SELECT regexp_count('a.b:c;d', ':');
+------------------------------+
| regexp_count('a.b:c;d', ':') |
+------------------------------+
| 1 |
+------------------------------+

Test the return result when matching a string against a regular expression containing two square brackets.​

SELECT regexp_count('Hello, World!', '[[:punct:]]');
+----------------------------------------------+
| regexp_count('Hello, World!', '[[:punct:]]') |
+----------------------------------------------+
| 2 |
+----------------------------------------------+

Test pattern is NULL case​

SELECT regexp_count("abc",NULL);
+------------------------+
| regexp_count("abc",NULL) |
+------------------------+
| NULL |
+------------------------+

Test str is NULL case​

SELECT regexp_count(NULL,"abc");
+------------------------+
| regexp_count(NULL,"abc") |
+------------------------+
| NULL |
+------------------------+

Test both are NULL​

SELECT regexp_count(NULL,NULL);
+------------------------+
| regexp_count(NULL,NULL) |
+------------------------+
| NULL |
+------------------------+

Test the return result of inserting certain variable values and then retrieving the variables from the stored rows for matching.​


CREATE TABLE test_table_for_regexp_count (
id INT,
text_data VARCHAR(500),
pattern VARCHAR(100)
) PROPERTIES ("replication_num"="1");

INSERT INTO test_table_for_regexp_count VALUES
(1, 'HelloWorld', '[A-Z][a-z]+'),
(2, 'apple123', '[a-z]{5}[0-9]'),
(3, 'aabbcc', '(aa|bb|cc)'),
(4, '123-456-7890', '[0-9][0-9][0-9]'),
(5, 'test,data', ','),
(6, 'a1b2c3', '[a-z][0-9]'),
(7, 'book keeper', 'oo|ee'),
(8, 'ababab', '(ab)(ab)(ab)'),
(9, 'aabbcc', '(aa|bb|cc)'),
(10, 'apple,banana', '[aeiou][a-z]+');

SELECT id, regexp_count(text_data, pattern) as count_result FROM test_table_for_regexp_count ORDER BY id;

+------+--------------+
| id | count_result |
+------+--------------+
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
| 4 | 3 |
| 5 | 1 |
| 6 | 3 |
| 7 | 2 |
| 8 | 1 |
| 9 | 3 |
| 10 | 2 |
+------+--------------+

Test the return result of inserting certain variable values, retrieving the variables from stored rows for matching, with the regular expression being a constant.​

CREATE TABLE test_table_for_regexp_count (
id INT,
text_data VARCHAR(500),
pattern VARCHAR(100)
) PROPERTIES ("replication_num"="1");

INSERT INTO test_table_for_regexp_count VALUES
(1, 'HelloWorld', '[A-Z][a-z]+'),
(2, 'apple123', '[a-z]{5}[0-9]'),
(3, 'aabbcc', '(aa|bb|cc)'),
(4, '123-456-7890', '[0-9][0-9][0-9]'),
(5, 'test,data', ','),
(6, 'a1b2c3', '[a-z][0-9]'),
(7, 'book keeper', 'oo|ee'),
(8, 'ababab', '(ab)(ab)(ab)'),
(9, 'aabbcc', '(aa|bb|cc)'),
(10, 'apple,banana', '[aeiou][a-z]+');

SELECT id, regexp_count(text_data, 'e') as count_e FROM test_table_for_regexp_count WHERE text_data IS NOT NULL ORDER BY id;
+------+---------+
| id | count_e |
+------+---------+
| 1 | 1 |
| 2 | 1 |
| 3 | 0 |
| 4 | 0 |
| 5 | 1 |
| 6 | 0 |
| 7 | 3 |
| 8 | 0 |
| 9 | 0 |
| 10 | 1 |
+------+---------+