tf.contrib.layers.batch_norm
2022-06-24 10:18:00 【Wanderer001】
Reference: tf.contrib.layers.batch_norm - Cloud+ Community - Tencent Cloud
Adds a Batch Normalization layer from http://arxiv.org/abs/1502.03167
tf.contrib.layers.batch_norm(
    inputs,
    decay=0.999,
    center=True,
    scale=False,
    epsilon=0.001,
    activation_fn=None,
    param_initializers=None,
    param_regularizers=None,
    updates_collections=tf.GraphKeys.UPDATE_OPS,
    is_training=True,
    reuse=None,
    variables_collections=None,
    outputs_collections=None,
    trainable=True,
    batch_weights=None,
    fused=None,
    data_format=DATA_FORMAT_NHWC,
    zero_debias_moving_mean=False,
    scope=None,
    renorm=False,
    renorm_clipping=None,
    renorm_decay=0.99,
    adjustment=None
)
"Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift"
Sergey Ioffe, Christian Szegedy
Can be used as a normalizer function for conv2d and fully_connected. The normalization is over all but the last dimension if data_format is NHWC and all but the second dimension if data_format is NCHW. In case of a 2D tensor this corresponds to the batch dimension, while in case of a 4D tensor this corresponds to the batch and space dimensions.
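For example, it can be supplied as the normalizer_fn of a convolutional layer. The sketch below is illustrative only (assuming TF 1.x with tf.contrib available); the input shape and layer sizes are assumptions, not values from the original:

import tensorflow as tf

# Illustrative sketch: batch_norm plugged in as the normalizer_fn of conv2d.
# The same pattern works for tf.contrib.layers.fully_connected.
images = tf.compat.v1.placeholder(tf.float32, [None, 224, 224, 3])
net = tf.contrib.layers.conv2d(
    images, num_outputs=64, kernel_size=3,
    normalizer_fn=tf.contrib.layers.batch_norm,
    normalizer_params={'is_training': True, 'decay': 0.9})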
Note: when training, the moving_mean and moving_variance need to be updated. By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. For example:
update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)
One can set updates_collections=None to force the updates in place, but that can have a speed penalty, especially in distributed settings.
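As a minimal sketch of that alternative (assuming TF 1.x; the toy loss and optimizer here are illustrative placeholders):

import tensorflow as tf

# With updates_collections=None the moving statistics are updated in place,
# so no explicit dependency on UPDATE_OPS is required before minimize().
x = tf.compat.v1.placeholder(tf.float32, [None, 10])
net = tf.contrib.layers.batch_norm(x, is_training=True,
                                   updates_collections=None)
loss = tf.reduce_mean(tf.square(net))          # toy loss, assumption
optimizer = tf.compat.v1.train.GradientDescentOptimizer(0.01)
train_op = optimizer.minimize(loss)            # updates happen inline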
Args:
inputs: A tensor with 2 or more dimensions, where the first dimension has batch_size. The normalization is over all but the last dimension if data_format is NHWC and all but the second dimension if data_format is NCHW.
decay: Decay for the moving average. Reasonable values for decay are close to 1.0, typically in the multiple-nines range: 0.999, 0.99, 0.9, etc. Lower the decay value (recommend trying decay=0.9) if the model experiences reasonably good training performance but poor validation and/or test performance. Try zero_debias_moving_mean=True for improved stability.
center: If True, add offset of beta to the normalized tensor. If False, beta is ignored.
scale: If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling can be done by the next layer.
epsilon: Small float added to variance to avoid dividing by zero.
activation_fn: Activation function, default set to None to skip it and maintain a linear activation.
param_initializers: Optional initializers for beta, gamma, moving mean and moving variance.
param_regularizers: Optional regularizer for beta and gamma.
updates_collections: Collections to collect the update ops for computation. The update ops need to be executed with the train_op. If None, a control dependency would be added to make sure the updates are computed in place.
is_training: Whether or not the layer is in training mode. In training mode it accumulates the statistics of the moments into moving_mean and moving_variance using an exponential moving average with the given decay. When it is not in training mode, it uses the values of moving_mean and moving_variance. (A combined usage sketch appears after the Raises section below.)
reuse: Whether or not the layer and its variables should be reused. To be able to reuse the layer, scope must be given.
variables_collections: Optional collections for the variables.
outputs_collections: Collections to add the outputs.
trainable: If True, also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
batch_weights: An optional tensor of shape [batch_size], containing a frequency weight for each batch item. If present, the batch normalization uses weighted mean and variance. (This can be used to correct for bias in training example selection.)
fused: If None or True, use a faster, fused implementation if possible. If False, use the system-recommended implementation.
data_format: A string. NHWC (default) and NCHW are supported.
zero_debias_moving_mean: Use zero_debias for moving_mean. It creates a new pair of variables 'moving_mean/biased' and 'moving_mean/local_step'.
scope: Optional scope for variable_scope.
renorm: Whether to use Batch Renormalization (https://arxiv.org/abs/1702.03275). This adds extra variables during training. The inference is the same for either value of this parameter.
renorm_clipping: A dictionary that may map keys 'rmax', 'rmin', 'dmax' to scalar Tensors used to clip the renorm correction. The correction (r, d) is used as corrected_value = normalized_value * r + d, with r clipped to [rmin, rmax] and d to [-dmax, dmax]. Missing rmax, rmin, dmax are set to inf, 0, inf, respectively.
renorm_decay: Momentum used to update the moving means and standard deviations with renorm. Unlike momentum, this affects training and should be neither too small (which would add noise) nor too large (which would give stale estimates). Note that decay is still applied to get the means and variances for inference.
adjustment: A function taking the Tensor containing the (dynamic) shape of the input tensor and returning a pair (scale, bias) to apply to the normalized values (before gamma and beta), only during training. For example, adjustment = lambda shape: (tf.random.uniform(shape[-1:], 0.93, 1.07), tf.random.uniform(shape[-1:], -0.1, 0.1)) will scale the normalized value by up to 7% up or down, then shift the result by up to 0.1 (with independent scaling and bias for each feature but shared across all examples), and finally apply gamma and/or beta. If None, no adjustment is applied.
Returns:
A Tensor representing the output of the operation.
Raises:
ValueError: If data_format is neither NHWC nor NCHW.
ValueError: If the rank of inputs is undefined.
ValueError: If rank or channels dimension of inputs is undefined.
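To tie the pieces together, here is a hedged end-to-end sketch (TF 1.x assumed; the toy loss and optimizer are illustrative): it builds a training branch with is_training=True and an inference branch that reuses the same variables with is_training=False, while honoring the UPDATE_OPS dependency described above:

import tensorflow as tf

x = tf.compat.v1.placeholder(tf.float32, [None, 10])

# Training branch: accumulates moving_mean/moving_variance with the given decay.
train_out = tf.contrib.layers.batch_norm(
    x, is_training=True, decay=0.9, scope='bn')

# Inference branch: reuses the same beta/gamma and moving statistics.
eval_out = tf.contrib.layers.batch_norm(
    x, is_training=False, decay=0.9, scope='bn', reuse=True)

# The moving-statistics update ops live in UPDATE_OPS by default.
loss = tf.reduce_mean(tf.square(train_out))    # toy loss, assumption
update_ops = tf.compat.v1.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.compat.v1.train.AdamOptimizer().minimize(loss)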