The function signature of tf.nn.moments is:
def moments(x, axes, name=None, keep_dims=False)
x: the input data, typically of shape [batch_size, height, width, channels]
axes: a list of the dimensions along which to compute the statistics, e.g. [0, 1, 2]
name: name of the operation
keep_dims: whether to keep the reduced dimensions
Return values:
mean: the mean
variance: the variance
img = tf.Variable(tf.random_normal([128, 32, 32, 64]))
# reduce over every axis except the last one, i.e. axes = [0, 1, 2]
axis = list(range(len(img.get_shape()) - 1))
mean, variance = tf.nn.moments(img, axis)   # per-channel statistics, shape [64]
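Since keep_dims only changes the shape of the returned statistics, here is a minimal sketch (using the same assumed input shape as above) of the difference:

import tensorflow as tf

img = tf.Variable(tf.random_normal([128, 32, 32, 64]))
# keep_dims=False drops the reduced axes: mean and variance have shape (64,)
mean, variance = tf.nn.moments(img, [0, 1, 2], keep_dims=False)
# keep_dims=True keeps them as size-1 axes: shape (1, 1, 1, 64),
# which broadcasts directly against the original tensor
mean_k, variance_k = tf.nn.moments(img, [0, 1, 2], keep_dims=True)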
def batch_normalization(x, mean, variance, offset, scale, variance_epsilon, name=None)
When using batch_normalization, remove the bias from the preceding layer, since the trainable offset (beta) already plays that role.
x: the input Tensor
mean: the mean of the Tensor
variance: the variance of the Tensor
offset: the offset Tensor (beta), usually initialized to 0 and trainable
scale: the scale Tensor (gamma), usually initialized to 1 and trainable
variance_epsilon: a small float to avoid division by zero, typically 0.001
name: name of the operation
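As a minimal sketch of how the two ops fit together (the variable names here are illustrative, not from the original code), the statistics from tf.nn.moments feed straight into tf.nn.batch_normalization, with offset and scale as trainable variables:

import tensorflow as tf

x = tf.Variable(tf.random_normal([128, 32, 32, 64]))
mean, variance = tf.nn.moments(x, [0, 1, 2])         # per-channel statistics
beta = tf.Variable(tf.zeros([64]))                   # offset, trainable
gamma = tf.Variable(tf.ones([64]))                   # scale, trainable
y = tf.nn.batch_normalization(x, mean, variance, beta, gamma, variance_epsilon=0.001)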

# Complete example: convolution + batch normalization; batch statistics are used
# during training, accumulated population statistics at inference.
def conv_layer(prev_layer, layer_depth, is_training):
    # every third layer downsamples with stride 2
    strides = 2 if layer_depth % 3 == 0 else 1
    in_channels = prev_layer.get_shape().as_list()[3]
    out_channels = layer_depth * 4
    weights = tf.Variable(
        tf.truncated_normal([3, 3, in_channels, out_channels], stddev=0.05))
    # no bias term: the offset (beta) of batch normalization takes its place
    layer = tf.nn.conv2d(prev_layer, weights, strides=[1, strides, strides, 1], padding='SAME')

    gamma = tf.Variable(tf.ones([out_channels]))    # scale, trainable
    beta = tf.Variable(tf.zeros([out_channels]))    # offset, trainable
    # population statistics used at inference time, updated but not trained
    pop_mean = tf.Variable(tf.zeros([out_channels]), trainable=False)
    pop_variance = tf.Variable(tf.ones([out_channels]), trainable=False)
    epsilon = 1e-3

    def batch_norm_training():
        # per-channel batch statistics over batch, height and width
        batch_mean, batch_variance = tf.nn.moments(layer, [0, 1, 2], keep_dims=False)
        # update the population statistics with an exponential moving average
        decay = 0.99
        train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay))
        train_variance = tf.assign(pop_variance, pop_variance * decay + batch_variance * (1 - decay))
        with tf.control_dependencies([train_mean, train_variance]):
            return tf.nn.batch_normalization(layer, batch_mean, batch_variance, beta, gamma, epsilon)

    def batch_norm_inference():
        # use the accumulated population statistics instead of batch statistics
        return tf.nn.batch_normalization(layer, pop_mean, pop_variance, beta, gamma, epsilon)

    batch_normalized_output = tf.cond(is_training, batch_norm_training, batch_norm_inference)
    return tf.nn.relu(batch_normalized_output)
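A usage sketch for conv_layer under assumed input shapes and a made-up three-layer stack; the point is that is_training is a boolean placeholder fed at run time, so the same graph updates the population statistics during training and reads them back at inference:

import numpy as np
import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 28, 28, 1])
is_training = tf.placeholder(tf.bool)

layer = inputs
for layer_i in range(1, 4):
    layer = conv_layer(layer, layer_i, is_training)

batch = np.random.rand(16, 28, 28, 1).astype(np.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # training step: batch statistics are used and the moving averages are updated
    train_out = sess.run(layer, {inputs: batch, is_training: True})
    # inference: the accumulated population statistics are used instead
    test_out = sess.run(layer, {inputs: batch, is_training: False})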