xs, ys — one batch of input image pixels and labels
Suppose BATCH_SIZE is 100. Then xs is [100, 784]: 100 images, each providing 784 pixel values as input. ys holds the matching labels (since the data is read with one_hot=True, ys is [100, 10]).
xs = [[784 pixels][784 pixels][…][784 pixels]]
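A quick way to confirm these shapes (a minimal sketch; the data set path matches the complete code at the end of this post):

from tensorflow.examples.tutorials.mnist import input_data

# Read MNIST with one-hot labels, then fetch one batch of 100 images
mnist = input_data.read_data_sets("../../../datasets/MNIST_data", one_hot=True)
xs, ys = mnist.train.next_batch(100)
print(xs.shape)  # (100, 784): 100 images, 784 pixel values each
print(ys.shape)  # (100, 10): 100 one-hot label vectors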
y, y_ — one batch of output vectors and label vectors
After the forward pass in sess.run(), y (fetched as a in the code below) is [100, 10]: 100 images, each with a 10-value output. In practice these rows are raw logits rather than exact 0/1 values; the row below is just illustrative.
Because the data is read with one_hot=True, y_ is also [100, 10]: 100 images, each with a 10-value one-hot label.
y  = [[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.] [10 outputs][…][10 outputs]]
y_ = [[ 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [10 outputs][…][10 outputs]]
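To see where the [100, 10] shape of y comes from, here is a toy stand-in for mnist_inference.inference (a single zero-initialized dense layer, which is only an assumption for illustration, not the real network):

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784], name='x-input')
# Toy single-layer "inference": 784 inputs -> 10 raw logits per image
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a = sess.run(y, feed_dict={x: np.zeros((100, 784), dtype=np.float32)})
    print(a.shape)  # (100, 10): one 10-value output row per image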
tf.argmax(y_, 1), tf.argmax(y, 1) — one batch of label classes and predicted classes
tf.argmax(y_, 1) means: for each image, find the index of the largest value in [ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.], which here is 6.
tf.argmax(y_, 1) = [6,7,8,9,0,1,4,5,2,5,…6,7]
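A minimal check of this behavior:

import tensorflow as tf

y_ = tf.constant([[0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
                  [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]])
labels = tf.argmax(y_, 1)  # index of the largest value in each row
with tf.Session() as sess:
    print(sess.run(labels))  # [6 7]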
cross_entropy — the cross-entropy loss of one batch (a vector)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
cross_entropy covers the 100 images: for each image, computing the cross entropy between its y vector and its y_ vector yields a single value.
cross_entropy = [3.8, 1.2, 15, 0.5, …, 5.6]
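A small self-contained sketch of the per-image computation (3 classes instead of 10, and made-up logits, purely for illustration):

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.3]])       # stands in for y (raw logits)
labels = tf.constant([0, 1], dtype=tf.int64)  # stands in for tf.argmax(y_, 1)
ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels)
with tf.Session() as sess:
    print(sess.run(ce))  # one loss value per image, roughly [0.42 0.22]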
cross_entropy_mean — the mean cross-entropy loss of one batch (a single value)
cross_entropy_mean = tf.reduce_mean(cross_entropy)
cross_entropy_mean is a single value: the mean of cross_entropy over the 100 images, i.e. the mean of [3.8, 1.2, 15, 0.5, …, 5.6].
cross_entropy_mean = 10.5086
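Continuing the sketch above, the mean is just tf.reduce_mean over the per-image vector (only 5 of the 100 values are used here, so the result differs from the 10.5086 above):

import tensorflow as tf

cross_entropy = tf.constant([3.8, 1.2, 15.0, 0.5, 5.6])  # made-up per-image losses
cross_entropy_mean = tf.reduce_mean(cross_entropy)
with tf.Session() as sess:
    print(sess.run(cross_entropy_mean))  # 5.22, the average of the five values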
Relevant code
The code for the example above comes from the complete program listed at the end of this post.
for i in range(TRAINING_STEPS):
    xs, ys = mnist.train.next_batch(BATCH_SIZE)
    _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: xs, y_: ys})
    a, b, c, d, e = sess.run([logits_op, y__op, labels_op, cross_entroy_op, cross_entroy_mean_op], feed_dict={x: xs, y_: ys})

cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
cross_entropy_mean = tf.reduce_mean(cross_entropy)

logits_op = y
y__op = y_
labels_op = tf.argmax(y_, 1)
cross_entroy_op = cross_entropy
cross_entroy_mean_op = cross_entropy_mean
Conclusion
Because cross_entropy_mean is added into loss, and training minimizes loss, the two are tied together.
In other words, every training step computes the mean loss over the batch of 100 images and drives it down.
So to print the recognition result for a single image, just compute tf.argmax(y, 1).
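A hard-coded illustration of that last point (the single row of made-up logits stands in for one row of y after training):

import tensorflow as tf

y = tf.constant([[0.1, 0.0, 0.2, 0.0, 0.1, 0.0, 4.5, 0.3, 0.1, 0.0]])  # made-up logits for one image
prediction = tf.argmax(y, 1)
with tf.Session() as sess:
    print(sess.run(prediction))  # [6]: the digit this image is recognized as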
Complete code
Listed here to make the analysis easier to follow
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import mnist_inference
import os

BATCH_SIZE = 100
LEARNING_RATE_BASE = 0.8
LEARNING_RATE_DECAY = 0.99
REGULARIZATION_RATE = 0.0001
TRAINING_STEPS = 100
MOVING_AVERAGE_DECAY = 0.99
MODEL_SAVE_PATH = "MNIST_model/"
MODEL_NAME = "mnist_model"

def train(mnist):
    x = tf.placeholder(tf.float32, [None, mnist_inference.INPUT_NODE], name='x-input')
    y_ = tf.placeholder(tf.float32, [None, mnist_inference.OUTPUT_NODE], name='y-input')

    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    y = mnist_inference.inference(x, regularizer)
    global_step = tf.Variable(0, trainable=False)

    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())

    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
    cross_entropy_mean = tf.reduce_mean(cross_entropy)

    # Extra ops, fetched in the training loop only so the intermediate values can be inspected
    logits_op = y
    y__op = y_
    labels_op = tf.argmax(y_, 1)
    cross_entroy_op = cross_entropy
    cross_entroy_mean_op = cross_entropy_mean

    loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        mnist.train.num_examples / BATCH_SIZE, LEARNING_RATE_DECAY,
        staircase=True)
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
    with tf.control_dependencies([train_step, variables_averages_op]):
        train_op = tf.no_op(name='train')

    saver = tf.train.Saver()
    with tf.Session() as sess:
        tf.global_variables_initializer().run()

        for i in range(TRAINING_STEPS):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: xs, y_: ys})
            a, b, c, d, e = sess.run([logits_op, y__op, labels_op, cross_entroy_op, cross_entroy_mean_op], feed_dict={x: xs, y_: ys})
            if i % 1000 == 0:
                print("After %d training step(s), loss on training batch is %g." % (step, loss_value))
                saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)

def main(argv=None):
    mnist = input_data.read_data_sets("../../../datasets/MNIST_data", one_hot=True)
    train(mnist)

if __name__ == '__main__':
    tf.app.run()
References
http://blog.csdn.net/hejunqing14/article/details/52397824
http://www.jianshu.com/p/fb119d0ff6a6