Skip to main content

CORR_WELFORD

Description

Calculate the Pearson coefficient of two random variables using the Welford algorithm, which can effectively reduce calculation errors.

Syntax

CORR_WELFORD(<expr1>, <expr2>)

Parameters

ParameterDescription
<expr1>Double expression (column)
<expr2>Double expression (column)

Return Value

The return value is of type DOUBLE, the covariance of expr1 and expr2, except the product of the standard deviation of expr1 and expr2, special case:

  • If the standard deviation of expr1 or expr2 is 0, 0 will be returned.
  • If a column of expr1 or expr2 is NULL, the row data will not be counted in the final result.

Example

select * from test_corr;
+------+------+------+
| id | k1 | k2 |
+------+------+------+
| 1 | 20 | 22 |
| 1 | 10 | 20 |
| 2 | 36 | 21 |
| 2 | 30 | 22 |
| 2 | 25 | 20 |
| 3 | 25 | NULL |
| 4 | 25 | 21 |
| 4 | 25 | 22 |
| 4 | 25 | 20 |
+------+------+------+
select id,corr_welford(k1,k2) from test_corr group by id;
+------+---------------------+
| id | corr_welford(k1,k2) |
+------+---------------------+
| 2 | 0.4539206495016017 |
| 4 | 0 |
| 3 | NULL |
| 1 | 1 |
+------+---------------------+