Backprop from layer outputs to inputs.

Parameters
----------
dLdy : :py:class:`ndarray <numpy.ndarray>` of shape `(n_ex, n_out)` or list of arrays
    The gradient(s) of the loss wrt. the layer output(s).
retain_grads : bool
    Whether to in
(self, dLdy, retain_grads=True)
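The calling convention above (one gradient array or a list of them in, matching input gradient(s) out) can be sketched in isolation. This is a minimal illustration, not the library's implementation: `TinySoftmax` is a hypothetical stand-in class, the input caching in `forward` is assumed, and the body of `_bwd` is assumed to be the standard softmax vector-Jacobian product `y * (g - sum(g * y))`. `retain_grads` is accepted but unused here because the sketch has no parameters to update.

```python
import numpy as np


class TinySoftmax:
    """Hypothetical stand-in for the layer; not the library's class."""

    def __init__(self, dim=-1):
        self.dim = dim
        self.trainable = True
        self.X = []  # assumed: forward() caches one input per call

    def forward(self, X):
        self.X.append(X)
        # Subtract the max before exponentiating for numerical stability.
        e_X = np.exp(X - X.max(axis=self.dim, keepdims=True))
        return e_X / e_X.sum(axis=self.dim, keepdims=True)

    def backward(self, dLdy, retain_grads=True):
        assert self.trainable, "Layer is frozen"
        # Accept a single array by wrapping it in a one-element list.
        if not isinstance(dLdy, list):
            dLdy = [dLdy]
        # Pair each output gradient with the matching cached input.
        dX = [self._bwd(dy, x) for dy, x in zip(dLdy, self.X)]
        return dX[0] if len(self.X) == 1 else dX

    def _bwd(self, dLdy, X):
        # Assumed softmax vector-Jacobian product: y * (g - sum(g * y)).
        e_X = np.exp(X - X.max(axis=self.dim, keepdims=True))
        y = e_X / e_X.sum(axis=self.dim, keepdims=True)
        return y * (dLdy - (dLdy * y).sum(axis=self.dim, keepdims=True))


layer = TinySoftmax()
X = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
y = layer.forward(X)
dLdy = np.array([[1.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
dLdX = layer.backward(dLdy)  # single array in, single array out
```

Because softmax outputs sum to one along `dim`, each row of `dLdX` sums to zero, and a constant upstream gradient (the second row of `dLdy`) yields a zero input gradient. Both make quick sanity checks for a backward pass of this kind.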
        return e_X / e_X.sum(axis=self.dim, keepdims=True)

    def backward(self, dLdy, retain_grads=True):
        """
        Backprop from layer outputs to inputs.

        Parameters
        ----------
        dLdy : :py:class:`ndarray <numpy.ndarray>` of shape `(n_ex, n_out)` or list of arrays
            The gradient(s) of the loss wrt. the layer output(s).
        retain_grads : bool
            Whether to include the intermediate parameter gradients computed
            during the backward pass in the final parameter update. Default is
            True.

        Returns
        -------
        dLdX : :py:class:`ndarray <numpy.ndarray>` of shape `(n_ex, n_in)`
            The gradient of the loss wrt. the layer input `X`.
        """  # noqa: E501
        assert self.trainable, "Layer is frozen"
        # Accept a single gradient array by wrapping it in a list.
        if not isinstance(dLdy, list):
            dLdy = [dLdy]

        dX = []
        X = self.X
        # Pair each output gradient with the corresponding entry of self.X.
        for dy, x in zip(dLdy, X):
            dx = self._bwd(dy, x)
            dX.append(dx)

        # Return a bare array when self.X holds one input, otherwise the list.
        return dX[0] if len(X) == 1 else dX

    def _bwd(self, dLdy, X):
        """