APPROX_COUNT_DISTINCT

Description

The APPROX_COUNT_DISTINCT function is implemented with the HyperLogLog algorithm, which estimates the cardinality (the number of distinct values) of a column using a fixed amount of memory. The algorithm relies on an assumption about the distribution of trailing zero bits in the hashed values, so its accuracy depends on the data distribution. With the fixed bucket size used by Doris, the relative standard error of the algorithm is 0.8125%. For a more detailed analysis, see the related paper.
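As a rough check on the 0.8125% figure, the standard HyperLogLog error bound can be evaluated directly. The bucket count m = 16384 used here is an assumption inferred from the stated error, not a value given on this page:

$$
\sigma \approx \frac{1.04}{\sqrt{m}} = \frac{1.04}{\sqrt{16384}} = \frac{1.04}{128} = 0.008125 = 0.8125\%
$$

Under that assumption, an estimate of 17721 would typically fall within about 144 of the true distinct count (one standard error).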

Syntax

APPROX_COUNT_DISTINCT(<expr>)

Parameters

Parameter	Description
`<expr>`	The expression whose distinct values are to be counted

Return Value

Returns a value of type BIGINT.

Example

select approx_count_distinct(query_id) from log_statis group by datetime;

+-----------------------------------+
| approx_count_distinct(`query_id`) |
+-----------------------------------+
| 17721                             |
+-----------------------------------+
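To get a feel for the estimation error on real data, the approximate result can be compared with an exact count. The following is a minimal sketch that reuses the `log_statis` table and `query_id` column from the example above; the aliases `approx_cnt` and `exact_cnt` are illustrative names only.

```sql
-- Compare the HyperLogLog estimate with the exact distinct count per group.
-- The exact count is usually slower and uses more memory on large tables.
SELECT
    `datetime`,
    APPROX_COUNT_DISTINCT(query_id) AS approx_cnt,
    COUNT(DISTINCT query_id)        AS exact_cnt
FROM log_statis
GROUP BY `datetime`;
```

The two columns should generally be close, with the gap depending on the data distribution as described in the Description section.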
