hub / github.com/ddbourgin/numpy-ml / LayerNorm2D

Class LayerNorm2D

numpy_ml/neural_nets/layers/layers.py:1444–1631

class LayerNorm2D(LayerBase):
    def __init__(self, epsilon=1e-5, optimizer=None):
        """
        A layer normalization layer for 2D inputs with an additional channel
        dimension.

        Notes
        -----
        In contrast to :class:`BatchNorm2D`, the LayerNorm layer calculates the
        mean and variance across features rather than examples in the batch
        ensuring that the mean and variance estimates are independent of batch
        size and permitting straightforward application in RNNs.

        Equations [train & test]::

            Y = scaler * norm(X) + intercept
            norm(X) = (X - mean(X)) / sqrt(var(X) + epsilon)

        Also in contrast to :class:`BatchNorm2D`, `scaler` and `intercept` are applied
        elementwise to ``norm(X)``.

        Parameters
        ----------
        epsilon : float
            A small smoothing constant to use during computation of ``norm(X)``
            to avoid divide-by-zero errors. Default is 1e-5.
        optimizer : str, :doc:`Optimizer <numpy_ml.neural_nets.optimizers>` object, or None
            The optimization strategy to use when performing gradient updates
            within the :meth:`update` method. If None, use the :class:`SGD
            <numpy_ml.neural_nets.optimizers.SGD>` optimizer with
            default parameters. Default is None.

        Attributes
        ----------
        X : list
            Running list of inputs to the :meth:`forward <numpy_ml.neural_nets.LayerBase.forward>` method since the last call to :meth:`update <numpy_ml.neural_nets.LayerBase.update>`. Only updated if the `retain_derived` argument was set to True.
        gradients : dict
            Dictionary of loss gradients with regard to the layer parameters
        parameters : dict
            Dictionary of layer parameters
        hyperparameters : dict
            Dictionary of layer hyperparameters
        derived_variables : dict
            Dictionary of any intermediate values computed during
            forward/backward propagation.
        """  # noqa: E501
        super().__init__(optimizer)

        self.in_ch = None
        self.out_ch = None
        self.epsilon = epsilon
        self.parameters = {"scaler": None, "intercept": None}
        self.is_initialized = False

    def _init_params(self, X_shape):
        n_ex, in_rows, in_cols, in_ch = X_shape

        scaler = np.random.rand(in_rows, in_cols, in_ch)
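
The equations in the docstring above can be sketched as a standalone NumPy function. This is an illustrative sketch, not the library's `forward` method: the helper name `layer_norm_2d_forward` is hypothetical, and it assumes that "across features" means the statistics are taken over the row, column, and channel axes of each example. The `scaler` shape follows the `(in_rows, in_cols, in_ch)` shape visible in `_init_params`. The excerpt stops before `intercept` is initialized, so giving it the same shape is an assumption, and both parameters are set to ones and zeros here only to keep the example easy to check.

```python
import numpy as np


def layer_norm_2d_forward(X, scaler, intercept, epsilon=1e-5):
    """Hypothetical helper illustrating Y = scaler * norm(X) + intercept.

    X has shape (n_ex, in_rows, in_cols, in_ch); scaler and intercept are
    assumed to have shape (in_rows, in_cols, in_ch).
    """
    # Assumption: one mean and one variance per example, computed over
    # all of that example's features (rows, columns, channels).
    mean = X.mean(axis=(1, 2, 3), keepdims=True)
    var = X.var(axis=(1, 2, 3), keepdims=True)

    # norm(X) = (X - mean(X)) / sqrt(var(X) + epsilon)
    X_norm = (X - mean) / np.sqrt(var + epsilon)

    # scaler and intercept are applied elementwise; broadcasting repeats
    # them along the leading example axis.
    return scaler * X_norm + intercept


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5, 5, 3))   # 4 examples, 5x5 spatial, 3 channels
scaler = np.ones(X.shape[1:])       # illustrative values, not the library's init
intercept = np.zeros(X.shape[1:])

Y = layer_norm_2d_forward(X, scaler, intercept)

# Normalizing the first example on its own gives the same result as
# normalizing it as part of the batch.
Y_single = layer_norm_2d_forward(X[:1], scaler, intercept)
```

Because no statistic is shared across examples, `Y[:1]` and `Y_single` agree. This is the batch-size independence that the docstring contrasts with `BatchNorm2D`.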

Callers 1

test_LayerNorm2DFunction · 0.90

Calls

no outgoing calls

Tested by 1

test_LayerNorm2DFunction · 0.72