hub / github.com/ddbourgin/numpy-ml / _bwd

Method _bwd

numpy_ml/neural_nets/layers/layers.py:2330–2349

Actual computation of the gradient of the loss with respect to the input X. The Jacobian, J, of the softmax for input x = [x1, ..., xn] is:

    J[i, j] =  softmax(x_i) * (1 - softmax(x_j))    if i = j
    J[i, j] = -softmax(x_i) * softmax(x_j)          if i != j
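The piecewise definition above collapses to the matrix form diag(s) - s sᵀ, where s = softmax(x). The following is a minimal sketch, not the library's code: `softmax`, `softmax_jacobian`, and `numerical_jacobian` are hypothetical helper names introduced here, and the check assumes a single 1-D input vector. It builds J from the formula and compares it with a central-difference estimate.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax of a 1-D vector (hypothetical helper)."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def softmax_jacobian(x):
    """J[i, j] = s_i * (1 - s_j) if i == j, else -s_i * s_j."""
    s = softmax(x).reshape(-1, 1)
    return np.diagflat(s) - s @ s.T


def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference estimate of d f_i / d x_j."""
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return J


x = np.array([0.5, -1.0, 2.0])
J = softmax_jacobian(x)

# The analytic Jacobian should agree with the finite-difference estimate.
assert np.allclose(J, numerical_jacobian(softmax, x), atol=1e-7)
```

Because the softmax outputs sum to one, every row and column of J sums to zero, which gives a cheap sanity check on any implementation.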

(self, dLdy, X)

Source from the content-addressed store, hash-verified

        return dX[0] if len(X) == 1 else dX

    def _bwd(self, dLdy, X):
        """
        Actual computation of the gradient of the loss wrt. the input X.

        The Jacobian, J, of the softmax for input x = [x1, ..., xn] is:
            J[i, j] =
                softmax(x_i)  * (1 - softmax(x_j))  if i = j
                -softmax(x_i) * softmax(x_j)        if i != j
        where
            x_n is input example n (ie., the n'th row in X)
        """
        dX = []
        for dy, x in zip(dLdy, X):
            dxi = []
            for dyi, xi in zip(*np.atleast_2d(dy, x)):
                yi = self._fwd(xi.reshape(1, -1)).reshape(-1, 1)
                dyidxi = np.diagflat(yi) - yi @ yi.T  # jacobian wrt. input sample xi
                dxi.append(dyi @ dyidxi)
            dX.append(dxi)
        return np.array(dX).reshape(*X.shape)


class SparseEvolution(LayerBase):
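The listing builds an n × n Jacobian for every row and then multiplies the upstream gradient by it. The same product can be written without materializing J, since dy @ (diag(y) - y yᵀ) = y * (dy - dy·y). The sketch below is an alternative formulation for illustration, not the library's implementation. It assumes a single 2-D input array and a standard row-wise softmax in place of `self._fwd`, which is not shown in the listing. `softmax`, `bwd_loop`, and `bwd_vectorized` are hypothetical names.

```python
import numpy as np


def softmax(X):
    """Row-wise, numerically stable softmax (assumed stand-in for _fwd)."""
    e = np.exp(X - X.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def bwd_loop(dLdy, X):
    """Per-row Jacobian product, mirroring the structure of the listing."""
    dX = []
    for dy, x in zip(dLdy, X):
        y = softmax(x.reshape(1, -1)).reshape(-1, 1)
        J = np.diagflat(y) - y @ y.T  # Jacobian for this row
        dX.append(dy @ J)
    return np.array(dX).reshape(*X.shape)


def bwd_vectorized(dLdy, X):
    """Same product without forming J: dX = Y * (dLdy - sum(dLdy * Y))."""
    Y = softmax(X)
    return Y * (dLdy - (dLdy * Y).sum(axis=-1, keepdims=True))


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))
dLdy = rng.normal(size=(4, 5))

# Both formulations should produce the same input gradient.
assert np.allclose(bwd_loop(dLdy, X), bwd_vectorized(dLdy, X))
```

The vectorized form uses O(n) memory per row instead of O(n²), at the cost of no longer exposing the Jacobian itself.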

Callers 1

backward · Method · 0.95

Calls 1

_fwd · Method · 0.95

Tested by

no test coverage detected