| 4 | |
| 5 | |
| 6 | class RidgeRegression: |
| 7 | def __init__(self, alpha=1, fit_intercept=True): |
| 8 | r""" |
| 9 | A ridge regression model fit in closed form via the regularized normal |
| 10 | equations. |
| 11 | |
| 12 | Notes |
| 13 | ----- |
| 14 | Ridge regression is a biased estimator for linear models which adds a |
| 15 | penalty proportional to the squared L2-norm of the model |
| 16 | coefficients to the standard squared-error loss: |
| 17 | |
| 18 | .. math:: |
| 19 | |
| 20 | \mathcal{L}_{Ridge} = (\mathbf{y} - \mathbf{X} \beta)^\top |
| 21 | (\mathbf{y} - \mathbf{X} \beta) + \alpha ||\beta||_2^2 |
| 22 | |
| 23 | where :math:`\alpha` is a weight controlling the severity of the |
| 24 | penalty. |
| 25 | |
| 26 | Given data matrix **X** and target vector **y**, the coefficients |
| 27 | :math:`\beta` that minimize this loss are: |
| 28 | |
| 29 | .. math:: |
| 30 | |
| 31 | \hat{\beta} = |
| 32 | \left(\mathbf{X}^\top \mathbf{X} + \alpha \mathbf{I} \right)^{-1} |
| 33 | \mathbf{X}^\top \mathbf{y} |
| 34 | |
| 35 | This estimate for :math:`\beta` also corresponds to the MAP estimate |
| 36 | under a zero-mean multivariate Gaussian prior on the model |
| 37 | coefficients, provided that the data matrix **X** has been |
| 38 | standardized and the target values **y** centered at 0: |
| 39 | |
| 40 | .. math:: |
| 41 | |
| 42 | \beta \sim \mathcal{N}\left(\mathbf{0}, \frac{1}{2M} \mathbf{I}\right) |
| 43 | |
| 44 | Parameters |
| 45 | ---------- |
| 46 | alpha : float |
| 47 | L2 regularization coefficient. Larger values correspond to a larger |
| 48 | penalty on the L2 norm of the model coefficients. Default is 1. |
| 49 | fit_intercept : bool |
| 50 | Whether to fit an additional intercept term. Default is True. |
| 51 | |
| 52 | Attributes |
| 53 | ---------- |
| 54 | beta : :py:class:`ndarray <numpy.ndarray>` of shape `(M, K)` or None |
| 55 | Fitted model coefficients. |
| 56 | """ |
| 57 | self.beta = None |
| 58 | self.alpha = alpha |
| 59 | self.fit_intercept = fit_intercept |
| 60 | |
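The closed-form estimate described in the docstring above can be sketched in a few lines of NumPy. This is a minimal illustration, not necessarily how the class's own `fit` method is written: the helper name `ridge_closed_form` is hypothetical, and the sketch assumes that the intercept column is penalized together with the other coefficients.

```python
import numpy as np


def ridge_closed_form(X, y, alpha=1.0, fit_intercept=True):
    """Hypothetical helper: solve (X^T X + alpha * I) beta = X^T y for beta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    if fit_intercept:
        # Prepend a column of ones. Assumption: the intercept is penalized
        # like every other coefficient, which keeps the sketch short.
        X = np.c_[np.ones(X.shape[0]), X]

    A = X.T @ X + alpha * np.eye(X.shape[1])

    # Solving the linear system avoids forming an explicit matrix inverse.
    return np.linalg.solve(A, X.T @ y)


# Noise-free data generated from y = 2 + 3 * x1 - x2
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 2.0 + X_demo @ np.array([3.0, -1.0])

beta_ols = ridge_closed_form(X_demo, y_demo, alpha=0.0)     # no penalty
beta_ridge = ridge_closed_form(X_demo, y_demo, alpha=10.0)  # shrunk coefficients
```

With `alpha=0.0` the system reduces to ordinary least squares and the generating coefficients are recovered on this noise-free data. Increasing `alpha` shrinks the norm of the solution toward zero.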
| 61 | def fit(self, X, y): |
| 62 | """ |
| 63 | Fit the regression coefficients via the regularized normal equations. |