        self.fit_intercept = fit_intercept

    def fit(self, X, y, lr=0.01, tol=1e-7, max_iter=1e7):
        """
        Fit the regression coefficients via gradient descent on the negative
        log likelihood.

        Parameters
        ----------
        X : :py:class:`ndarray <numpy.ndarray>` of shape `(N, M)`
            A dataset consisting of `N` examples, each of dimension `M`.
        y : :py:class:`ndarray <numpy.ndarray>` of shape `(N,)`
            The binary targets for each of the `N` examples in `X`.
        lr : float
            The gradient descent learning rate. Default is 0.01.
        tol : float
            The convergence tolerance. Fitting stops once the loss decreases
            by less than `tol` between iterations. Default is 1e-7.
        max_iter : float
            The maximum number of iterations to run the gradient descent
            solver. Cast to `int` internally. Default is 1e7.
        """
        # convert X to a design matrix if we're fitting an intercept
        if self.fit_intercept:
            X = np.c_[np.ones(X.shape[0]), X]

        l_prev = np.inf
        self.beta = np.random.rand(X.shape[1])
        for _ in range(int(max_iter)):
            y_pred = _sigmoid(X @ self.beta)
            loss = self._NLL(X, y, y_pred)
            # stop once the loss improves by less than `tol`
            if l_prev - loss < tol:
                return
            l_prev = loss
            self.beta -= lr * self._NLL_grad(X, y, y_pred)

    def _NLL(self, X, y, y_pred):
        r"""
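The update loop in `fit` can be tried outside the class with a small standalone sketch. The bodies of `_NLL` and `_NLL_grad` are not shown in this listing, so the `nll` and `nll_grad` helpers below are assumptions: they use the unpenalized mean negative log likelihood and its gradient, which may differ from what the class actually computes. The names `sigmoid`, `fit_logistic`, `seed`, and the toy data are likewise hypothetical and exist only for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, mapping real scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def nll(y, y_pred, eps=1e-12):
    """Mean negative log likelihood (assumed form, no penalty term)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # keep log() finite
    return -np.mean(y * np.log(y_pred) + (1.0 - y) * np.log(1.0 - y_pred))


def nll_grad(X, y, y_pred):
    """Gradient of the mean NLL above with respect to the coefficients."""
    return X.T @ (y_pred - y) / X.shape[0]


def fit_logistic(X, y, lr=0.01, tol=1e-7, max_iter=10_000, seed=0):
    """Gradient descent using the same stopping rule as `fit`."""
    X = np.c_[np.ones(X.shape[0]), X]  # prepend an intercept column
    beta = np.random.default_rng(seed).random(X.shape[1])
    l_prev = np.inf
    for _ in range(int(max_iter)):
        y_pred = sigmoid(X @ beta)
        loss = nll(y, y_pred)
        if l_prev - loss < tol:  # loss no longer improves by at least tol
            break
        l_prev = loss
        beta -= lr * nll_grad(X, y, y_pred)
    return beta, loss


# Toy data: the label is 1 whenever the single feature is positive.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 1))
y_demo = (X_demo[:, 0] > 0).astype(float)
beta_demo, loss_demo = fit_logistic(X_demo, y_demo, lr=0.1)
```

Unlike `fit`, this sketch returns the coefficients and final loss instead of storing them on an instance, and it caps `max_iter` at a smaller value so the toy run finishes quickly.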