What tf.argmax(y_, 1), cross_entropy, and cross_entropy_mean mean

xs, ys: one batch of input images and labels

Assume BATCH_SIZE is 100. Then xs is [100, 784]: 100 images, each flattened to 784 input pixel values. ys is [100, 10]: the one-hot label row for each of the 100 images.

xs = [[784 values] [784 values] … [784 values]]
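Those shapes can be sketched with NumPy, using random stand-in data instead of the real mnist.train.next_batch (so the values here are illustrative, only the shapes matter):

```python
import numpy as np

BATCH_SIZE = 100

# Stand-in for one batch from mnist.train.next_batch(BATCH_SIZE):
# xs: 100 images, each flattened to 784 pixel values
# ys: 100 one-hot label rows, one column per digit 0-9
xs = np.random.rand(BATCH_SIZE, 784).astype(np.float32)
ys = np.eye(10, dtype=np.float32)[np.random.randint(0, 10, size=BATCH_SIZE)]

print(xs.shape)  # (100, 784)
print(ys.shape)  # (100, 10)
```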

y, y_: one batch of output vectors and label vectors

After the forward pass in sess.run(), y (fetched as a in the code below) is [100, 10]: for each of the 100 images, 10 output values.
Because the data set is read with one_hot=True, y_ also comes out as [100, 10]: for each of the 100 images, a 10-element one-hot label.

y  = [[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.] [10 outputs] … [10 outputs]]
y_ = [[ 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [10 outputs] … [10 outputs]]

tf.argmax(y_, 1), tf.argmax(y, 1): one batch of label classes and predicted classes

For a single image, tf.argmax(y_, 1) finds the index of the largest entry in [ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.], which is 6.

tf.argmax(y_, 1) = [6, 7, 8, 9, 0, 1, 4, 5, 2, 5, … 6, 7]
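The same reduction can be sketched with np.argmax, which behaves like tf.argmax along axis 1 (the two label rows below are made up):

```python
import numpy as np

# Two one-hot label rows; the 1 marks each image's true digit.
y_ = np.array([[0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
               [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]])

# Index of the largest entry in each row -- the class id, like tf.argmax(y_, 1).
labels = np.argmax(y_, axis=1)
print(labels)  # [6 7]
```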

cross_entropy: one batch's cross-entropy loss (a vector)

cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits = y, labels = tf.argmax(y_, 1))

cross_entropy covers all 100 images: for each image, computing the cross entropy between its y vector and its y_ label yields one value.

cross_entropy = [3.8, 1.2, 15, 0.5, …, 5.6]
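What that op computes can be sketched in NumPy: log-softmax each logits row, then take the negative log-probability at the true class. The 3-class logits and labels below are made up for brevity:

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    # Row-wise log-softmax, shifted by the row max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # -log probability of each image's true class: one loss value per image.
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 0.5, 0.1],    # plays the role of y
                   [0.2, 3.0, 0.3]])
labels = np.array([0, 1])              # plays the role of tf.argmax(y_, 1)

ce = sparse_softmax_cross_entropy(logits, labels)
print(ce.shape)  # (2,): one cross-entropy value per image
```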

cross_entropy_mean: one batch's mean cross-entropy loss (a scalar)

cross_entropy_mean = tf.reduce_mean(cross_entropy)

cross_entropy_mean is a single value: the mean of the 100 images' cross_entropy, i.e. the mean of [3.8, 1.2, 15, 0.5, …, 5.6].

cross_entropy_mean = 10.5086
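The final reduction is just an arithmetic mean over that vector; with five hypothetical per-image losses:

```python
import numpy as np

# Hypothetical per-image cross-entropy values for a tiny batch.
cross_entropy = np.array([3.8, 1.2, 15.0, 0.5, 5.6])

# tf.reduce_mean collapses the vector to one scalar, the batch's mean loss.
cross_entropy_mean = np.mean(cross_entropy)
print(cross_entropy_mean)  # 5.22
```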

Related code

The code for the example above comes from here:

cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits = y, labels = tf.argmax(y_, 1))
cross_entropy_mean = tf.reduce_mean(cross_entropy)
logits_op            = y
y__op                = y_
labels_op            = tf.argmax(y_, 1)
cross_entroy_op      = cross_entropy
cross_entroy_mean_op = cross_entropy_mean

for i in range(TRAINING_STEPS):
    xs, ys = mnist.train.next_batch(BATCH_SIZE)
    _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: xs, y_: ys})

    a, b, c, d, e = sess.run([logits_op, y__op, labels_op, cross_entroy_op, cross_entroy_mean_op], feed_dict={x: xs, y_: ys})

Conclusion

cross_entropy_mean is folded into loss, and training minimizes loss, which is what ties the two together:
each training step computes the mean loss over a batch of 100 images and drives it down.
Accordingly, to print the recognition result for a single image, just evaluate tf.argmax(y, 1).
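So for a prediction, only the logits matter; a sketch with made-up logits for two images:

```python
import numpy as np

# Hypothetical logits (y) for two images, 10 classes each.
y = np.array([[0.1, 0.2, 5.0, 0.1, 0.0, 0.0, 0.3, 0.0, 0.1, 0.2],
              [4.2, 0.1, 0.0, 0.3, 0.1, 0.2, 0.0, 0.1, 0.0, 0.1]])

# The predicted digit is the index of the largest logit in each row,
# which is exactly what tf.argmax(y, 1) returns in the graph.
predictions = np.argmax(y, axis=1)
print(predictions)  # [2 0]
```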

Complete code

Reproduced in full to make the analysis easier to follow.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import mnist_inference
import os

BATCH_SIZE = 100
LEARNING_RATE_BASE = 0.8
LEARNING_RATE_DECAY = 0.99
REGULARIZATION_RATE = 0.0001
TRAINING_STEPS = 100
MOVING_AVERAGE_DECAY = 0.99
MODEL_SAVE_PATH="MNIST_model/"
MODEL_NAME="mnist_model"


def train(mnist):

    x = tf.placeholder(tf.float32, [None, mnist_inference.INPUT_NODE], name='x-input')
    y_ = tf.placeholder(tf.float32, [None, mnist_inference.OUTPUT_NODE], name='y-input')

    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    y = mnist_inference.inference(x, regularizer)
    global_step = tf.Variable(0, trainable=False)


    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits = y, labels = tf.argmax(y_, 1))
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    logits_op                   = y
    y__op                       = y_
    labels_op                   = tf.argmax(y_, 1)
    cross_entroy_op             = cross_entropy
    cross_entroy_mean_op        = cross_entropy_mean

    loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        mnist.train.num_examples / BATCH_SIZE, LEARNING_RATE_DECAY,
        staircase=True)
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
    with tf.control_dependencies([train_step, variables_averages_op]):
        train_op = tf.no_op(name='train')


    saver = tf.train.Saver()
    with tf.Session() as sess:
        tf.global_variables_initializer().run()

        for i in range(TRAINING_STEPS):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: xs, y_: ys})

            a, b, c, d, e = sess.run([logits_op, y__op, labels_op, cross_entroy_op, cross_entroy_mean_op], feed_dict={x: xs, y_: ys})


            if i % 1000 == 0:
                print("After %d training step(s), loss on training batch is %g." % (step, loss_value))
                saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)


def main(argv=None):
    mnist = input_data.read_data_sets("../../../datasets/MNIST_data", one_hot=True)
    train(mnist)

if __name__ == '__main__':
    tf.app.run()

References

http://blog.csdn.net/hejunqing14/article/details/52397824
http://www.jianshu.com/p/fb119d0ff6a6


Copyright notice: this is an original article by daska110, released under the CC 4.0 BY-SA license. Please include a link to the original and this notice when reposting.