Correlation Tests
Test and estimate correlations between variables.
Linear Correlation
Pearson Correlation
Measures the strength of the linear relationship between two variables and tests its significance.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
| options | MAP | No | - | Configuration options |
Output
| Field | Type | Description |
|---|---|---|
| r | DOUBLE | Pearson correlation (-1 to 1) |
| p_value | DOUBLE | p-value |
| t_statistic | DOUBLE | t-statistic |
| ci_lower | DOUBLE | Confidence interval lower bound |
| ci_upper | DOUBLE | Confidence interval upper bound |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_pearson_agg(x, y) as result
FROM data;
Interpretation (rule of thumb; the sign of r gives the direction of the relationship):
| Absolute value of r | Strength |
|---|---|
| 0.0 to 0.3 | Weak |
| 0.3 to 0.7 | Moderate |
| 0.7 to 1.0 | Strong |
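To make the r and t_statistic output fields concrete, here is a minimal sketch in plain Python (not SQL) of the textbook formulas. It is an illustration only: the function name `pearson_with_t` is hypothetical, and nothing here is taken from the extension's implementation. The p-value and confidence interval are left out because the source does not say how the extension computes them.

```python
import math


def pearson_with_t(xs, ys):
    """Return (r, t_statistic, n) for paired samples xs and ys.

    Hypothetical helper for illustration; not the extension's code.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 pairs")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of cross-products and squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")

    r = sxy / math.sqrt(sxx * syy)

    # Under the null hypothesis of zero correlation, t has n - 2 degrees
    # of freedom. A perfect correlation gives an infinite statistic.
    if abs(r) >= 1.0:
        t = math.copysign(math.inf, r)
    else:
        t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, t, n


r, t, n = pearson_with_t([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r={r:.4f} t={t:.4f} n={n}")
```

Because t depends on both r and n, the same r can be significant in a large sample and not in a small one, so the strength table above should be read together with the p-value.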
Rank Correlations
Spearman Rank Correlation
Measures the strength of a monotonic relationship using ranks; less sensitive to outliers than Pearson.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
| options | MAP | No | - | Configuration options |
Output
| Field | Type | Description |
|---|---|---|
| rho | DOUBLE | Spearman's rho |
| p_value | DOUBLE | p-value |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_spearman_agg(x, y) as result
FROM data;
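Spearman's rho is the Pearson correlation of the ranks. The following plain Python sketch shows that idea, assuming ties receive the average of their rank positions (the usual convention; the source does not state how the extension handles them). The helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length, non-constant samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Monotonic but far from linear, with one extreme value in ys.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 1000]
print("pearson :", round(pearson(xs, ys), 4))
print("spearman:", round(spearman(xs, ys), 4))
```

The extreme value pulls Pearson's r well below 1, while the ranks are in perfect agreement, which is why Spearman is the usual choice when outliers are present.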
Kendall's Tau
Rank correlation based on concordant and discordant pairs (handles ties well).
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
| options | MAP | No | - | Configuration options |
Output
| Field | Type | Description |
|---|---|---|
| tau | DOUBLE | Kendall's tau |
| p_value | DOUBLE | p-value |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_kendall_agg(x, y) as result
FROM data;
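As an illustration of how ties enter the calculation, the sketch below computes the tau-b variant in plain Python with a simple O(n²) pair scan. Which variant the extension returns is not stated in the source, so tau-b is an assumption here, and `kendall_tau_b` is a hypothetical name.

```python
import math


def kendall_tau_b(xs, ys):
    """Kendall's tau-b via an O(n^2) scan over all pairs of observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")

    concordant = discordant = 0
    tied_x_only = tied_y_only = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to nothing
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1

    # Each factor counts the pairs that are NOT tied on one of the variables.
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denom == 0:
        raise ValueError("tau is undefined when a variable is constant")
    return (concordant - discordant) / denom


tau = kendall_tau_b([1, 2, 3, 4, 5], [1, 2, 2, 4, 3])
print(round(tau, 4))
```

The tie correction only shrinks the denominator for pairs that carry no ordering information, so heavily tied ordinal data does not artificially deflate the coefficient.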
Nonlinear Correlation
Distance Correlation
Detects both linear and nonlinear dependence between two variables.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
Output
| Field | Type | Description |
|---|---|---|
| dcor | DOUBLE | Distance correlation (0 to 1) |
| dcov | DOUBLE | Distance covariance |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_distance_cor_agg(x, y) as result
FROM data;
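Key property: at the population level (and assuming finite first moments), distance correlation is 0 if and only if X and Y are independent. Pearson's r, by contrast, measures only linear association and can be 0 even when the variables are strongly dependent.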
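The contrast with Pearson can be seen on a small example. The plain Python sketch below uses the standard biased sample estimator built from double-centered pairwise distance matrices. It is illustrative only: the estimator the extension uses is not stated in the source, and the function names are hypothetical.

```python
import math


def _double_centered(values):
    """Pairwise |a - b| distances with row, column and grand means removed."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(xs, ys):
    """Return (dcor, dcov) for two equal-length one-dimensional samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    a = _double_centered(xs)
    b = _double_centered(ys)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / n**2

    dcov2 = max(mean_product(a, b), 0.0)  # clamp tiny negative rounding error
    dvar_x2 = mean_product(a, a)
    dvar_y2 = mean_product(b, b)
    denom = math.sqrt(dvar_x2 * dvar_y2)
    dcor = math.sqrt(dcov2 / denom) if denom > 0 else 0.0
    return dcor, math.sqrt(dcov2)


def pearson(xs, ys):
    """Plain Pearson correlation, included only for comparison."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# y is fully determined by x, but the relationship is not linear.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
dcor, dcov = distance_correlation(xs, ys)
print("pearson:", pearson(xs, ys))
print("dcor   :", round(dcor, 4))
```

For this symmetric parabola Pearson's r is exactly 0, while the distance correlation is clearly positive. Building the full distance matrices costs O(n²) time and memory, which is worth keeping in mind for large samples.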
Reliability Measures
Intraclass Correlation (ICC)
Measures reliability (agreement) among raters who score the same subjects.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| value | DOUBLE | Yes | - | Rating value |
| rater_id | INTEGER | Yes | - | Rater identifier |
| subject_id | INTEGER | Yes | - | Subject identifier |
| options | MAP | No | - | Model configuration |
Options MAP:
| Option | Type | Default | Description |
|---|---|---|---|
| model | VARCHAR | two_way_random | ICC model: one_way, two_way_random, or two_way_mixed |
Example
SELECT anofox_stats_icc_agg(
value,
rater_id,
subject_id,
MAP {'model': 'two_way_random'}
) as result
FROM rating_data;
Interpretation:
| ICC value | Reliability |
|---|---|
| < 0.5 | Poor |
| 0.5 - 0.75 | Moderate |
| 0.75 - 0.9 | Good |
| > 0.9 | Excellent |
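For readers who want to see where an ICC value comes from, the plain Python sketch below computes the single-rater, absolute-agreement form of the two-way random model, often written ICC(2,1), from ANOVA mean squares. Whether the extension's two_way_random option returns this exact form is an assumption, the function name is hypothetical, and the ratings are toy data. Rows use the same (value, rater_id, subject_id) layout as the parameters above.

```python
def icc_two_way_random(rows):
    """ICC(2,1) from (value, rater_id, subject_id) rows.

    Assumes a complete design: every rater scores every subject exactly once.
    """
    cell = {}
    for value, rater, subject in rows:
        cell[(subject, rater)] = float(value)

    subjects = sorted({s for s, _ in cell})
    raters = sorted({r for _, r in cell})
    n, k = len(subjects), len(raters)
    if n < 2 or k < 2 or len(cell) != n * k:
        raise ValueError("need a complete design with >= 2 subjects and raters")

    grand = sum(cell.values()) / (n * k)
    subject_mean = {s: sum(cell[(s, r)] for r in raters) / k for s in subjects}
    rater_mean = {r: sum(cell[(s, r)] for s in subjects) / n for r in raters}

    # Two-way ANOVA decomposition without replication.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_mean.values())
    ss_raters = n * sum((m - grand) ** 2 for m in rater_mean.values())
    ss_total = sum((v - grand) ** 2 for v in cell.values())
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Toy ratings: 6 subjects (rows) scored by 4 raters (columns).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
rows = [(value, rater, subject)
        for subject, row in enumerate(ratings, start=1)
        for rater, value in enumerate(row, start=1)]
print(round(icc_two_way_random(rows), 4))
```

In this toy data the raters rank the subjects similarly but use very different score levels, so an absolute-agreement ICC lands in the "Poor" band of the table above even though the raters are fairly consistent with one another.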
Choosing a Correlation Method
| Scenario | Recommended |
|---|---|
| Linear relationship, normal data | Pearson |
| Outliers present | Spearman |
| Ordinal data | Spearman or Kendall |
| Many ties | Kendall |
| Nonlinear relationship | Distance Correlation |
| Rater agreement | ICC |