BITMAP_HASH
Description
Computes a 32-bit hash of an input value of any type and returns a Bitmap containing that hash value.
Syntax
BITMAP_HASH(<expr>)
Parameters
Parameter | Description |
---|---|
<expr> | Any value or field expression |
Return Value
Returns a Bitmap containing the 32-bit hash value of the parameter <expr>.
::: note
The hash algorithm used is MurmurHash3.
MurmurHash3 is a high-performance, low-collision hashing algorithm whose output is close to a random distribution and passes chi-squared distribution tests. Note that the computed hash values may differ across hardware platforms and seed values.
For more details on the performance of this algorithm, see the SMHasher benchmark.
:::
Examples
To compute the MurmurHash3 hash of a value, you can use:
select bitmap_to_array(bitmap_hash('hello'))[1];
The result will be:
+-------------------------------------------------------------+
| %element_extract%(bitmap_to_array(bitmap_hash('hello')), 1) |
+-------------------------------------------------------------+
| 1321743225 |
+-------------------------------------------------------------+
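As a quick sanity check, the bitmap returned for a single input should hold exactly one element. The sketch below uses only functions already shown on this page and assumes a scalar, non-NULL argument:

```sql
-- bitmap_hash() wraps one 32-bit hash in a Bitmap,
-- so counting its elements should yield 1 for a non-NULL input.
select bitmap_count(bitmap_hash('hello'));
```

Given the return value described above, this query should return 1.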
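To count the distinct values in a column using bitmaps, which can be more efficient than count distinct in some scenarios. Because different values can share the same 32-bit hash, the result is approximate and may undercount: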
select bitmap_count(bitmap_union(bitmap_hash(`word`))) from `words`;
The result will be:
+-------------------------------------------------+
| bitmap_count(bitmap_union(bitmap_hash(`word`))) |
+-------------------------------------------------+
| 33263478 |
+-------------------------------------------------+
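To see how far the bitmap-based figure drifts from the exact one on your own data, one option is to compute both in a single query. This is a sketch that reuses the `words` table and `word` column from the example above. The aliases `exact_cnt` and `bitmap_cnt` are illustrative names, not part of any API:

```sql
-- Exact distinct count next to the hash-based bitmap count.
-- exact_cnt and bitmap_cnt are hypothetical aliases chosen for this example.
select
    count(distinct `word`) as exact_cnt,
    bitmap_count(bitmap_union(bitmap_hash(`word`))) as bitmap_cnt
from `words`;
```

For non-NULL values, `bitmap_cnt` should not exceed `exact_cnt`. The difference between the two is the number of distinct values that shared a hash with another value.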