class LayerNorm2D(LayerBase):
    def __init__(self, epsilon=1e-5, optimizer=None):
        """
        A layer normalization layer for 2D inputs with an additional channel
        dimension.

        Notes
        -----
        In contrast to :class:`BatchNorm2D`, the LayerNorm layer calculates the
        mean and variance across *features* rather than examples in the batch,
        ensuring that the mean and variance estimates are independent of batch
        size and permitting straightforward application in RNNs.

        Equations [train & test]::

            Y = scaler * norm(X) + intercept
            norm(X) = (X - mean(X)) / sqrt(var(X) + epsilon)

        Also in contrast to :class:`BatchNorm2D`, `scaler` and `intercept` are applied
        *elementwise* to ``norm(X)``.

        Parameters
        ----------
        epsilon : float
            A small smoothing constant to use during computation of ``norm(X)``
            to avoid divide-by-zero errors. Default is 1e-5.
        optimizer : str, :doc:`Optimizer <numpy_ml.neural_nets.optimizers>` object, or None
            The optimization strategy to use when performing gradient updates
            within the :meth:`update` method. If None, use the :class:`SGD
            <numpy_ml.neural_nets.optimizers.SGD>` optimizer with
            default parameters. Default is None.

        Attributes
        ----------
        X : list
            Running list of inputs to the :meth:`forward <numpy_ml.neural_nets.LayerBase.forward>` method since the last call to :meth:`update <numpy_ml.neural_nets.LayerBase.update>`. Only updated if the `retain_derived` argument was set to True.
        gradients : dict
            Dictionary of loss gradients with regard to the layer parameters.
        parameters : dict
            Dictionary of layer parameters.
        hyperparameters : dict
            Dictionary of layer hyperparameters.
        derived_variables : dict
            Dictionary of any intermediate values computed during
            forward/backward propagation.
        """  # noqa: E501
        super().__init__(optimizer)

        self.in_ch = None
        self.out_ch = None
        self.epsilon = epsilon
        self.parameters = {"scaler": None, "intercept": None}
        self.is_initialized = False

    def _init_params(self, X_shape):
        n_ex, in_rows, in_cols, in_ch = X_shape

        scaler = np.random.rand(in_rows, in_cols, in_ch)
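
The equations in the docstring above can be illustrated with a small standalone NumPy sketch. It is not the class's own `forward` method, which is not shown in this listing. The function name `layer_norm_2d_forward` is hypothetical. The sketch assumes the mean and variance are taken per example over the row, column, and channel axes, and that `scaler` and `intercept` have shape `(in_rows, in_cols, in_ch)`, matching the shape allocated in `_init_params`.

```python
import numpy as np


def layer_norm_2d_forward(X, scaler, intercept, epsilon=1e-5):
    """Hypothetical helper illustrating Y = scaler * norm(X) + intercept."""
    # Assumption: statistics are computed separately for each example,
    # over rows, columns, and channels (axes 1, 2, 3).
    mean = X.mean(axis=(1, 2, 3), keepdims=True)
    var = X.var(axis=(1, 2, 3), keepdims=True)

    # norm(X) = (X - mean(X)) / sqrt(var(X) + epsilon)
    X_norm = (X - mean) / np.sqrt(var + epsilon)

    # scaler and intercept have shape (in_rows, in_cols, in_ch); they are
    # applied elementwise and broadcast across the batch dimension.
    return scaler * X_norm + intercept


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5, 5, 3))   # (n_ex, in_rows, in_cols, in_ch)
scaler = np.ones((5, 5, 3))         # identity scale, for illustration only
intercept = np.zeros((5, 5, 3))     # zero shift, for illustration only
Y = layer_norm_2d_forward(X, scaler, intercept)
```

With an identity `scaler` and a zero `intercept`, each example in `Y` should have approximately zero mean and unit variance regardless of how many examples are in the batch, which is the batch-size independence the docstring describes.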