MCPcopy Index your code
hub / github.com/tensorpack/tensorpack / BatchNorm

Function BatchNorm

tensorpack/models/batch_norm.py:130–393  ·  view source on GitHub ↗

A more powerful version of `tf.layers.batch_normalization`. It differs from the offical one in the following aspects: 1. Accepts an alternative ``data_format`` option when ``axis`` is None. For 2D input, this argument will be ignored. 2. Default value for ``momentum`` and ``epsilon

(inputs, axis=None, *, training=None, momentum=0.9, epsilon=1e-5,
              center=True, scale=True,
              beta_initializer=tf.zeros_initializer(),
              gamma_initializer=tf.ones_initializer(),
              virtual_batch_size=None,
              data_format='channels_last',
              ema_update='default',
              sync_statistics=None)

Source from the content-addressed store, hash-verified

128 })
129@disable_autograph()
130def BatchNorm(inputs, axis=None, *, training=None, momentum=0.9, epsilon=1e-5,
131 center=True, scale=True,
132 beta_initializer=tf.zeros_initializer(),
133 gamma_initializer=tf.ones_initializer(),
134 virtual_batch_size=None,
135 data_format='channels_last',
136 ema_update='default',
137 sync_statistics=None):
138 """
139 A more powerful version of `tf.layers.batch_normalization`. It differs from
140 the offical one in the following aspects:
141
142 1. Accepts an alternative ``data_format`` option when ``axis`` is None. For 2D input, this argument will be ignored.
143 2. Default value for ``momentum`` and ``epsilon`` is different.
144 3. Default value for ``training`` is automatically obtained from tensorpack's ``TowerContext``.
145 User-provided value can overwrite this behavior.
146 4. Support the ``ema_update`` option, which covers broader use cases than the standard EMA update.
147 5. Support the ``sync_statistics`` option, which implements "SyncBN" and is very useful in small-batch models.
148 6. Better support of the ``virtual_batch_size`` option that does not have the bugs in ``tf.layers``.
149
150 Args:
151 training (bool): if True, use per-batch statistics to normalize. Otherwise, use stored EMA
152 to normalize. By default, it is equal to `get_current_tower_context().is_training`.
153 This is not a good argument name, but it is what the Tensorflow layer uses.
154 virtual_batch_size (int): implement "Ghost BatchNorm" that normalizes
155 the data with a smaller batch size than the input. Only effective when training is True.
156 The value has to be a divisor of the actual batch size.
157
158 It does not use the buggy TensorFlow implementation which has the
159 problems of (1) wrong behavior at inference; (2) create variables with unnecessary size=1 dimensions.
160 Corresponding TF issue: https://github.com/tensorflow/tensorflow/issues/23050
161 ema_update (str): Only effective when ``training=True``. It has the following options:
162
163 * "default": same as "collection". Because this is the default behavior in TensorFlow.
164 * "skip": do not update EMA. This can be useful when you reuse a batch norm layer in several places
165 but do not want them to all update your EMA.
166 * "collection": Add EMA update ops to collection `tf.GraphKeys.UPDATE_OPS` in the first training tower.
167 The ops in the collection will be run automatically by the callback :class:`RunUpdateOps`, along with
168 your training iterations. This can waste compute if your training iterations do not always depend
169 on the BatchNorm layer.
170 * "internal": EMA is updated in the first training tower inside this layer itself by control dependencies.
171 In standard scenarios, it has similar speed to "collection". But it supports more scenarios:
172
173 1. BatchNorm is used inside dynamic control flow.
174 The collection-based update does not support dynamic control flows.
175 2. BatchNorm layer is sometimes unused (e.g., in GANs you have two networks to train alternatively).
176 Putting all update ops into a single collection will waste a lot of compute.
177 3. Other part of the model relies on the "updated" EMA. The collection-based method does not update
178 EMA immediately.
179 4. It has less chance to cause TensorFlow bugs in a graph with complicated control flow.
180
181 Therefore this option is preferred over TensorFlow default.
182 Corresponding TF issue: https://github.com/tensorflow/tensorflow/issues/14699
183 sync_statistics (str or None): one of None, "nccl", or "horovod". It determines how to compute the
184 "per-batch statistics" when ``training==True``.
185
186 * None: it uses statistics of the input tensor to normalize during training.
187 This is the standard way BatchNorm was implemented in most frameworks.

Callers 9

get_bnFunction · 0.90
get_bnFunction · 0.90
BNReLUFunction · 0.70
BNLReLUFunction · 0.50
BNLReLUFunction · 0.50
convnormreluFunction · 0.50
shufflenet_unitFunction · 0.50
shufflenet_unit_v2Function · 0.50
resblockMethod · 0.50

Calls 12

get_data_formatFunction · 0.85
log_onceFunction · 0.85
backup_collectionFunction · 0.85
rename_get_variableFunction · 0.85
restore_collectionFunction · 0.85
VariableHolderClass · 0.85
get_sync_bn_mean_varFunction · 0.85
internal_update_bn_emaFunction · 0.85
shapeMethod · 0.80
get_bn_variablesFunction · 0.70
applyMethod · 0.45

Tested by

no test coverage detected