
Regularized Models

Penalty-based regression for multicollinearity and high-dimensional data: Ridge and Elastic Net.


Ridge - L2 Regularization

Handle multicollinearity by shrinking coefficients toward zero.

Variants

  • Scalar Fit: anofox_stats_ridge_fit(y, x, alpha, [options]) -> STRUCT
  • Aggregate Fit: anofox_stats_ridge_fit_agg(y, x, alpha, [options]) -> STRUCT
  • Window Predict: anofox_stats_ridge_fit_predict(y, x, [options]) OVER (...) -> STRUCT
  • Batch Predict: anofox_stats_ridge_predict_agg(y, x, [options]) -> LIST(STRUCT)

Parameters

Parameter | Type                              | Required | Default | Description
y         | LIST(DOUBLE) / DOUBLE             | Yes      | -       | Target values
x         | LIST(LIST(DOUBLE)) / LIST(DOUBLE) | Yes      | -       | Predictor matrix
alpha     | DOUBLE                            | Yes      | -       | Regularization strength (0.01-10.0)
options   | MAP                               | No       | -       | Configuration options
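
For orientation, the textbook ridge objective that alpha controls can be written as below. This is the standard formulation, not a statement about the extension's internals: whether the loss is scaled by the sample size and whether the intercept is left unpenalized are assumptions to verify. The closed form on the second line applies to the no-intercept (or centered) case.

```latex
\hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \alpha \lVert \beta \rVert_2^2
\qquad\Longrightarrow\qquad
\hat{\beta} = (X^\top X + \alpha I)^{-1} X^\top y
```

With alpha = 0 this reduces to ordinary least squares; larger alpha pulls the coefficients closer to zero.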

Options MAP:

Option            | Type    | Default | Description
fit_intercept     | BOOLEAN | true    | Include intercept term
compute_inference | BOOLEAN | false   | Compute inference statistics
confidence_level  | DOUBLE  | 0.95    | Confidence level

Example

SELECT
    region,
    (model).r_squared AS fit,
    (model).coefficients[2] AS price_elasticity
FROM (
    SELECT
        region,
        anofox_stats_ridge_fit_agg(
            sales,
            [price, promotion],
            0.5,
            MAP {'compute_inference': 'true'}
        ) AS model
    FROM regional_sales
    GROUP BY region
);
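
To see how alpha affects the fit, the same model can be refit at several penalty strengths and the coefficient lists compared side by side. This is a minimal sketch that reuses the regional_sales table and the coefficients field from the example above; the alias names are illustrative, and the optional options argument is omitted.

```sql
-- Sketch: refit the same ridge model per region at three penalty strengths.
-- Table, columns, and the coefficients field are taken from the example above;
-- the column aliases are hypothetical names chosen for illustration.
SELECT
    region,
    (weak).coefficients   AS coef_alpha_0_01,
    (medium).coefficients AS coef_alpha_1,
    (strong).coefficients AS coef_alpha_10
FROM (
    SELECT
        region,
        anofox_stats_ridge_fit_agg(sales, [price, promotion], 0.01) AS weak,
        anofox_stats_ridge_fit_agg(sales, [price, promotion], 1.0)  AS medium,
        anofox_stats_ridge_fit_agg(sales, [price, promotion], 10.0) AS strong
    FROM regional_sales
    GROUP BY region
);
```

Coefficients should move toward zero as alpha grows; if they barely change between the weakest and strongest penalty, multicollinearity is probably not the main problem.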

When to use Ridge:

  • Variance inflation factor (VIF) > 5 for any predictor
  • More predictors than observations
  • Coefficients unstable across samples
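
The VIF criterion can be checked before fitting. A minimal sketch for the two-predictor case, assuming the regional_sales table from the example above: with exactly two predictors, both share the same variance inflation factor, 1 / (1 - r^2), where r is their Pearson correlation. It uses only DuckDB's built-in corr aggregate.

```sql
-- Sketch: VIF per region for the two predictors price and promotion.
-- With two predictors, VIF = 1 / (1 - r^2) for both of them.
SELECT
    region,
    r AS price_promotion_corr,
    1.0 / NULLIF(1.0 - r * r, 0) AS vif   -- NULL when the predictors are perfectly collinear
FROM (
    SELECT
        region,
        corr(price, promotion) AS r
    FROM regional_sales
    GROUP BY region
);
```

With more than two predictors, each VIF comes from regressing that predictor on all the others and taking 1 / (1 - R^2), so a pairwise correlation is no longer sufficient.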

Elastic Net - Combined L1+L2

Feature selection with regularization for high-dimensional data.

Variants

  • Scalar Fit: anofox_stats_elasticnet_fit(y, x, alpha, l1_ratio, [options]) -> STRUCT
  • Aggregate Fit: anofox_stats_elasticnet_fit_agg(y, x, alpha, l1_ratio, [options]) -> STRUCT
  • Window Predict: anofox_stats_elasticnet_fit_predict(y, x, [options]) OVER (...) -> STRUCT
  • Batch Predict: anofox_stats_elasticnet_predict_agg(y, x, [options]) -> LIST(STRUCT)

Parameters

Parameter | Type                              | Required | Default | Description
y         | LIST(DOUBLE) / DOUBLE             | Yes      | -       | Target values
x         | LIST(LIST(DOUBLE)) / LIST(DOUBLE) | Yes      | -       | Predictor matrix
alpha     | DOUBLE                            | Yes      | -       | Overall regularization (0.01-10.0)
l1_ratio  | DOUBLE                            | Yes      | -       | L1/L2 balance (0=Ridge, 1=Lasso)
options   | MAP                               | No       | -       | Configuration options
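
The way alpha and l1_ratio interact is easiest to read from the objective. The form below follows the convention used by scikit-learn and glmnet; it is an assumption that this extension uses the same scaling, so treat it as a guide to the roles of the two parameters and not as a specification.

```latex
\hat{\beta} = \arg\min_{\beta} \;
\frac{1}{2n} \lVert y - X\beta \rVert_2^2
+ \alpha \left( \rho \lVert \beta \rVert_1 + \frac{1 - \rho}{2} \lVert \beta \rVert_2^2 \right),
\qquad \rho = \texttt{l1\_ratio}
```

Setting l1_ratio to 0 leaves only the L2 term (Ridge), and setting it to 1 leaves only the L1 term (Lasso), matching the parameter description above.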

Options MAP:

Option         | Type    | Default | Description
fit_intercept  | BOOLEAN | true    | Include intercept term
max_iterations | INTEGER | 1000    | Convergence limit
tolerance      | DOUBLE  | 1e-4    | Convergence threshold

Example

SELECT anofox_stats_elasticnet_fit_agg(
    y,
    [x1, x2, x3, x4, x5],
    0.5,  -- alpha: regularization strength
    0.7   -- l1_ratio: 70% L1, 30% L2
) AS model
FROM high_dim_data;
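
To see the effect of l1_ratio on sparsity, the model can be fit twice at the same alpha, once weighted toward L2 and once toward L1. This sketch reuses the high_dim_data table from the example above. It assumes the Elastic Net result struct exposes a coefficients field as the Ridge result does, which this page does not confirm; the alias names are illustrative.

```sql
-- Sketch: same alpha, two different L1/L2 balances.
-- Assumption: the returned STRUCT has a coefficients field, as in the Ridge example.
SELECT
    (mostly_l2).coefficients AS coef_l1_ratio_0_1,
    (mostly_l1).coefficients AS coef_l1_ratio_0_9
FROM (
    SELECT
        anofox_stats_elasticnet_fit_agg(y, [x1, x2, x3, x4, x5], 0.5, 0.1) AS mostly_l2,
        anofox_stats_elasticnet_fit_agg(y, [x1, x2, x3, x4, x5], 0.5, 0.9) AS mostly_l1
    FROM high_dim_data
);
```

The fit weighted toward L1 would be expected to set more coefficients exactly to zero, which is the feature-selection behavior described in this section.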

When to use Elastic Net:

  • High-dimensional data (many predictors)
  • Feature selection needed (sparse solutions)
  • Correlated predictors where variable selection is still needed