Mostly equivalent to `tf.layers.batch_normalization`, but differs in the following ways: 1. Accepts `data_format` rather than `axis`. For 2D input, this argument is ignored. 2. The default values of `momentum` and `epsilon` are different. 3. The default value of `training` is automatically obtained from `TowerContext`. 4. Supports the `internal_update` option.
(inputs, training=None, momentum=0.9, epsilon=1e-5,
center=True, scale=True,
gamma_initializer=tf.ones_initializer(),
data_format='channels_last',
internal_update=False)
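To make the role of the `momentum=0.9` and `epsilon=1e-5` defaults concrete, here is a minimal NumPy sketch of per-channel batch normalization on a 2D input. It is not the layer's implementation: the helper name `batch_norm_sketch` is hypothetical, the EMA update rule (`ema = momentum * ema + (1 - momentum) * batch_stat`) is an assumed convention, and the sketch uses the population variance, which may differ from what a fused kernel feeds into the moving variance.

```python
import numpy as np


def batch_norm_sketch(x, gamma, beta, ema_mean, ema_var,
                      training, momentum=0.9, epsilon=1e-5):
    """Normalize x of shape (N, C) per channel.

    Returns (y, new_ema_mean, new_ema_var). Hypothetical helper for
    illustration only; not part of any library.
    """
    if training:
        # Use statistics of the current batch.
        mean = x.mean(axis=0)
        var = x.var(axis=0)  # population (biased) variance: an assumption
        # Assumed EMA convention: keep `momentum` of the old value.
        ema_mean = momentum * ema_mean + (1.0 - momentum) * mean
        ema_var = momentum * ema_var + (1.0 - momentum) * var
    else:
        # Use the moving averages; leave them untouched.
        mean, var = ema_mean, ema_var
    # epsilon guards against division by zero for near-constant channels.
    y = (x - mean) / np.sqrt(var + epsilon) * gamma + beta
    return y, ema_mean, ema_var


x = np.array([[1.0, 10.0], [3.0, 30.0]])
gamma, beta = np.ones(2), np.zeros(2)  # one-inited scale, zero-inited bias
y, m, v = batch_norm_sketch(x, gamma, beta, np.zeros(2), np.ones(2), training=True)
```

With one-initialized `gamma` and zero-initialized `beta`, the output is just the normalized input, which matches the ``x * gamma + beta`` description of the transform.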
        'use_local_stat': 'training'
    })
def BatchNorm(inputs, training=None, momentum=0.9, epsilon=1e-5,
              center=True, scale=True,
              gamma_initializer=tf.ones_initializer(),
              data_format='channels_last',
              internal_update=False):
    """
    Mostly equivalent to `tf.layers.batch_normalization`, but differs in
    the following ways:
    1. Accepts `data_format` rather than `axis`. For 2D input, this argument is ignored.
    2. The default values of `momentum` and `epsilon` are different.
    3. The default value of `training` is automatically obtained from `TowerContext`.
    4. Supports the `internal_update` option.
    Args:
        internal_update (bool): if False, add EMA update ops to
            `tf.GraphKeys.UPDATE_OPS`. If True, update EMA inside the layer
            by control dependencies.
    Variable Names:
    * ``beta``: the bias term. Will be zero-inited by default.
    * ``gamma``: the scale term. Will be one-inited by default. Input will be transformed by ``x * gamma + beta``.
    * ``mean/EMA``: the moving average of mean.
    * ``variance/EMA``: the moving average of variance.
    Note:
        1. About multi-GPU training: moving averages across GPUs are not aggregated.
           Batch statistics are computed independently. This is consistent with most frameworks.
        2. Combinations of ``training`` and ``ctx.is_training``:
            * ``training == ctx.is_training``: standard BN. EMAs are
              maintained during training and used during inference. This is
              the default.
            * ``training and not ctx.is_training``: still use batch statistics in inference.
            * ``not training and ctx.is_training``: use EMA to normalize in
              training. This is useful when you load a pre-trained BN and
              don't want to fine-tune the EMA. EMA will not be updated in
              this case.
    """
    data_format = get_data_format(data_format, keras_mode=False)
    shape = inputs.get_shape().as_list()
    ndims = len(shape)
    assert ndims in [2, 4]
    if ndims == 2:
        data_format = 'NHWC'
    if data_format == 'NCHW':
        n_out = shape[1]
    else:
        n_out = shape[-1]  # channel
    assert n_out is not None, "Input to BatchNorm cannot have unknown channels!"
    beta, gamma, moving_mean, moving_var = get_bn_variables(n_out, scale, center, gamma_initializer)

    ctx = get_current_tower_context()
    use_local_stat = training
    if use_local_stat is None:
        use_local_stat = ctx.is_training
    use_local_stat = bool(use_local_stat)

    if use_local_stat:
        if ndims == 2:
            inputs = tf.reshape(inputs, [-1, 1, 1, n_out])  # fused_bn only takes 4D input
            # fused_bn has error using NCHW? (see #190)
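The combinations of ``training`` and ``ctx.is_training`` described in the docstring can be sketched as a small decision function. This is an illustration under stated assumptions, not the layer's code: `FakeTowerContext` and `resolve_bn_mode` are hypothetical names, and the rule that the EMA is updated only when batch statistics are used in a training context is inferred from the docstring (it states this explicitly only for the ``not training and ctx.is_training`` case).

```python
from collections import namedtuple

# Hypothetical stand-in for the tower context; only `is_training` is modeled.
FakeTowerContext = namedtuple("FakeTowerContext", ["is_training"])


def resolve_bn_mode(training, ctx):
    """Return (use_local_stat, update_ema) for a given `training` argument."""
    # `training=None` falls back to the context, as in the listing above.
    use_local_stat = ctx.is_training if training is None else bool(training)
    # Assumption: the EMA is updated only when batch statistics are used
    # and the context itself is in training mode.
    update_ema = use_local_stat and ctx.is_training
    return use_local_stat, update_ema


train_ctx = FakeTowerContext(is_training=True)
infer_ctx = FakeTowerContext(is_training=False)

for training, ctx in [(None, train_ctx), (None, infer_ctx),
                      (True, infer_ctx), (False, train_ctx)]:
    print(training, ctx.is_training, resolve_bn_mode(training, ctx))
```

The last case, `(False, train_ctx)`, corresponds to loading a pre-trained BN and freezing its statistics: normalization uses the EMA and the EMA is left unchanged.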
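The shape handling in the listing (channel count chosen by `data_format`, and 2D input reshaped to 4D because the fused kernel only takes 4D input) can be mirrored on NumPy arrays. The function name `channels_and_4d` is hypothetical, and unlike the listing, which reshapes only when batch statistics are used, this sketch reshapes every 2D input for simplicity.

```python
import numpy as np


def channels_and_4d(x, data_format="NHWC"):
    """Return (n_out, x4d), following the shape logic of the listing."""
    ndims = x.ndim
    assert ndims in (2, 4), "only 2D and 4D inputs are handled"
    if ndims == 2:
        data_format = "NHWC"  # data_format is ignored for 2D input
    # Channels sit on axis 1 for NCHW, on the last axis otherwise.
    n_out = x.shape[1] if data_format == "NCHW" else x.shape[-1]
    if ndims == 2:
        # (N, C) -> (N, 1, 1, C) so that a 4D-only kernel can consume it.
        x = x.reshape(-1, 1, 1, n_out)
    return n_out, x


n_out, x4d = channels_and_4d(np.zeros((8, 16)))
print(n_out, x4d.shape)
```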