hub / github.com/ddbourgin/numpy-ml / _bwd

Method _bwd

numpy_ml/neural_nets/layers/layers.py:2330–2349

Actual computation of the gradient of the loss with respect to the input X.

The Jacobian, J, of the softmax for input x = [x1, ..., xn] is:

J[i, j] = softmax(x_i) * (1 - softmax(x_j))   if i = j
J[i, j] = -softmax(x_i) * softmax(x_j)        if i != j
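As an illustration of the Jacobian described above, here is a minimal sketch for a single input vector. It is not the library's code: the helper names `softmax` and `softmax_jacobian` are hypothetical and introduced only for this example. It builds J with the same `diagflat` minus outer-product construction the method uses.

```python
import numpy as np


def softmax(x):
    # Hypothetical helper (not from numpy-ml): stable softmax of a 1-D vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def softmax_jacobian(x):
    # J[i, j] = s_i * (1 - s_j) if i == j, else -s_i * s_j,
    # written compactly as diag(s) - s s^T.
    s = softmax(x).reshape(-1, 1)
    return np.diagflat(s) - s @ s.T


x = np.array([1.0, 2.0, 3.0])
s = softmax(x)
J = softmax_jacobian(x)
```

Because the softmax outputs sum to one, every column of J sums to zero, and J is symmetric. Both properties make quick sanity checks on an implementation.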

(self, dLdy, X)

Source from the content-addressed store, hash-verified

2328	        return dX[0] if len(X) == 1 else dX
2329
2330	    def _bwd(self, dLdy, X):
2331	        """
2332	        Actual computation of the gradient of the loss wrt. the input X.
2333
2334	        The Jacobian, J, of the softmax for input x = [x1, ..., xn] is:
2335	            J[i, j] =
2336	                softmax(x_i) * (1 - softmax(x_j)) if i = j
2337	                -softmax(x_i) * softmax(x_j) if i != j
2338	            where
2339	                x_n is input example n (ie., the n'th row in X)
2340	        """
2341	        dX = []
2342	        for dy, x in zip(dLdy, X):
2343	            dxi = []
2344	            for dyi, xi in zip(*np.atleast_2d(dy, x)):
2345	                yi = self._fwd(xi.reshape(1, -1)).reshape(-1, 1)
2346	                dyidxi = np.diagflat(yi) - yi @ yi.T  # jacobian wrt. input sample xi
2347	                dxi.append(dyi @ dyidxi)
2348	            dX.append(dxi)
2349	        return np.array(dX).reshape(*X.shape)
2350
2351
2352	class SparseEvolution(LayerBase):
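The listing builds a full n × n Jacobian for every row and then multiplies it by the upstream gradient. Expanding that product gives dX[j] = y[j] * (dLdy[j] - sum_i dLdy[i] * y[i]), so the same result can be computed without forming J. The sketch below compares the two approaches. It is an illustration only, under two assumptions: the input is a single 2-D array, and the layer's forward pass is a plain row-wise softmax. The names `softmax`, `bwd_loop`, and `bwd_vectorized` are hypothetical and do not appear in numpy-ml.

```python
import numpy as np


def softmax(X):
    # Hypothetical stand-in for the layer's _fwd: row-wise stable softmax.
    e = np.exp(X - X.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def bwd_loop(dLdy, X):
    # Per-row Jacobian approach, following the structure of the listing
    # for a single 2-D input (assumption).
    dX = []
    for dyi, xi in zip(dLdy, X):
        yi = softmax(xi.reshape(1, -1)).reshape(-1, 1)
        J = np.diagflat(yi) - yi @ yi.T  # Jacobian for this row
        dX.append(dyi @ J)
    return np.array(dX).reshape(*X.shape)


def bwd_vectorized(dLdy, X):
    # Same vector-Jacobian product, expanded algebraically so that the
    # n x n Jacobian is never materialized.
    Y = softmax(X)
    return Y * (dLdy - np.sum(dLdy * Y, axis=-1, keepdims=True))


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))
dLdy = rng.normal(size=(4, 5))

dX_loop = bwd_loop(dLdy, X)
dX_vec = bwd_vectorized(dLdy, X)
```

The loop form costs O(n²) memory per row, while the expanded form needs only elementwise operations and one row-wise sum. The two should agree to floating-point tolerance, which `np.allclose(dX_loop, dX_vec)` can confirm.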

Callers 1

backward · Method · 0.95

Calls 1

_fwd · Method · 0.95

Tested by

no test coverage detected