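The `backward` method in this listing delegates the per-input gradient computation to `_bwd`, whose body is not shown here. As a rough, standalone sketch of what such a pass might compute, the example below assumes a plain affine layer `Y = X @ W + b` with an identity activation. The names `affine_bwd` and `backward`, and the `gradients` dict passed in as an argument, are hypothetical stand-ins for illustration, not the library's actual implementation.

```python
import numpy as np


def affine_bwd(dLdy, X, W):
    """Gradients for Y = X @ W + b (assumption: identity activation)."""
    dX = dLdy @ W.T                       # shape (n_ex, n_in)
    dW = X.T @ dLdy                       # shape (n_in, n_out)
    db = dLdy.sum(axis=0, keepdims=True)  # shape (1, n_out)
    return dX, dW, db


def backward(dLdy, X, W, gradients, retain_grads=True):
    """Mirror the list handling and gradient accumulation of the listing."""
    if not isinstance(dLdy, list):
        dLdy = [dLdy]
    if not isinstance(X, list):
        X = [X]

    dX = []
    for dy, x in zip(dLdy, X):
        dx, dw, db = affine_bwd(dy, x, W)
        dX.append(dx)

        # Accumulate (rather than overwrite) so repeated calls add up
        if retain_grads:
            gradients["W"] += dw
            gradients["b"] += db

    return dX[0] if len(X) == 1 else dX


# Hypothetical toy dimensions and random data
rng = np.random.default_rng(0)
n_ex, n_in, n_out = 4, 3, 2
W = rng.normal(size=(n_in, n_out))
X = rng.normal(size=(n_ex, n_in))
dLdy = rng.normal(size=(n_ex, n_out))
gradients = {"W": np.zeros_like(W), "b": np.zeros((1, n_out))}

dX = backward(dLdy, X, W, gradients)
```

Because the parameter gradients are accumulated with `+=`, calling `backward` twice without resetting `gradients` doubles the stored values. With `retain_grads=False` the input gradient is still returned, but `gradients` is left untouched.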
        return Y, Z

    def backward(self, dLdy, retain_grads=True):
        """
        Backprop from layer outputs to inputs.

        Parameters
        ----------
        dLdy : :py:class:`ndarray <numpy.ndarray>` of shape `(n_ex, n_out)` or list of arrays
            The gradient(s) of the loss wrt. the layer output(s).
        retain_grads : bool
            Whether to include the intermediate parameter gradients computed
            during the backward pass in the final parameter update. Default is
            True.

        Returns
        -------
        dLdX : :py:class:`ndarray <numpy.ndarray>` of shape `(n_ex, n_in)` or list of arrays
            The gradient of the loss wrt. the layer input(s) `X`.
        """  # noqa: E501
        assert self.trainable, "Layer is frozen"
        if not isinstance(dLdy, list):
            dLdy = [dLdy]

        dX = []
        X = self.X
        for dy, x in zip(dLdy, X):
            dx, dw, db = self._bwd(dy, x)
            dX.append(dx)

            if retain_grads:
                self.gradients["W"] += dw
                self.gradients["b"] += db

        return dX[0] if len(X) == 1 else dX

    def _bwd(self, dLdy, X):
        """Actual computation of gradient of the loss wrt. X, W, and b"""