SOUNDEX
Description
The SOUNDEX function computes the Soundex encoding of a string. Soundex is a phonetic algorithm that encodes English words into codes representing their pronunciation, so words with similar pronunciation will have the same encoding.
Encoding rule: Returns a 4-character code consisting of one uppercase letter followed by three digits (e.g., S530).
Syntax
SOUNDEX(<expr>)
Parameters
| Parameter | Description |
|---|---|
<expr> | The string to compute Soundex encoding for (only supports ASCII characters). Type: VARCHAR |
Return Value
Returns VARCHAR(4) type, representing the Soundex encoding of the string.
Special cases:
- If the argument is NULL, returns NULL
- If the string is empty or contains no letters, returns an empty string
- Only processes ASCII letters, ignoring other characters
- Non-ASCII characters will cause the function to throw an error
Examples
- Basic usage: word encoding
SELECT soundex('Doris');
+------------------+
| soundex('Doris') |
+------------------+
| D620 |
+------------------+
- Words with similar pronunciation have the same encoding
SELECT soundex('Smith'), soundex('Smyth');
+------------------+------------------+
| soundex('Smith') | soundex('Smyth') |
+------------------+------------------+
| S530 | S530 |
+------------------+------------------+
- Empty string processing
SELECT soundex('');
+-------------+
| soundex('') |
+-------------+
| |
+-------------+
- NULL value handling
SELECT soundex(NULL);
+---------------+
| soundex(NULL) |
+---------------+
| NULL |
+---------------+
- Empty string returns empty string
SELECT soundex('');
+-------------+
| soundex('') |
+-------------+
| |
+-------------+
- Non-letter characters only return empty string
SELECT soundex('123@*%');
+-------------------+
| soundex('123@*%') |
+-------------------+
| |
+-------------------+
- Ignoring non-letter characters
SELECT soundex('R@b-e123rt'), soundex('Robert');
+-----------------------+-------------------+
| soundex('R@b-e123rt') | soundex('Robert') |
+-----------------------+-------------------+
| R163 | R163 |
+-----------------------+-------------------+
- Non-ASCII characters only error example
SELECT soundex('你好');
ERROR 1105 (HY000): errCode = 2, detailMessage = Not Supported: Not Supported: soundex only supports ASCII, but got: 你
SELECT soundex('Apache Doris 你好');
+--------------------------------+
| soundex('Apache Doris 你好') |
+--------------------------------+
| A123 |
+--------------------------------+
Keywords
SOUNDEX