Introduction

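Training a deep learning model has one goal: to learn the distribution of the training data as closely as possible. In an ordinary exam, the score is the criterion for how well you have mastered the material, so that you can target your later study. Deep learning is the same: it needs a number that evaluates how well training is going.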

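In supervised learning, this number (the evaluation criterion) is called the loss value. For a given input $x$ with label $y$, let the deep learning model define the mapping $f(x;\theta)$, whose output is the prediction $\hat{y}$. The loss function $\mathrm{Loss}$ measures the gap between the label, i.e. the ground-truth value $y$, and the prediction $\hat{y}$. Based on the loss value, the model uses the backpropagation algorithm to keep adjusting $\theta$ so that it fits the distribution of the training set.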
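To make the loop "compute the loss, backpropagate, adjust $\theta$" concrete, here is a minimal sketch of a single update step. The linear layer standing in for $f(x;\theta)$, the random data, the SGD optimizer, and the learning rate are all assumptions chosen for illustration, not part of the network defined later.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Hypothetical stand-in for f(x; theta): a single linear layer.
f = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(f.parameters(), lr=0.01)

x = torch.randn(20, 10)  # 20 inputs with 10 features each
y = torch.randn(20, 1)   # matching ground-truth values

loss_before = loss_fn(f(x), y)  # gap between prediction and ground truth

optimizer.zero_grad()   # clear any stale gradients
loss_before.backward()  # backpropagation: compute d(loss)/d(theta)
optimizer.step()        # move theta a small step against the gradient

loss_after = loss_fn(f(x), y)  # loss on the same data after the update
print(loss_before.item(), loss_after.item())
```

With a small enough learning rate, the loss on the same batch should come out slightly lower after the step, which is the sense in which the loss value steers training.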
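Note

The cost function is the average of the per-sample errors over the whole training set; in essence it is the same thing as the loss function.
The target function is the mapping that the model is trained to fit.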
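The relationship between a per-sample loss and the cost can be sketched in a few lines. The squared error used here and the five random predictions are assumptions for illustration only.

```python
import torch

torch.manual_seed(0)

y_hat = torch.randn(5)  # hypothetical predictions for 5 samples
y = torch.randn(5)      # hypothetical ground-truth values

per_sample_loss = (y_hat - y) ** 2  # one loss value per sample
cost = per_sample_loss.mean()       # cost: average over all samples

print(per_sample_loss)
print(cost.item())
```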

Common loss functions

First, define the network

One hidden layer with 100 units
in_features = 10, out_features = 10, i.e. 10 output classes
ReLU as the activation function

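import torch
import torch.nn as nn
import torch.nn.functional as F

class model(nn.Module):
    def __init__(self):
        super(model, self).__init__()
        self.fc1 = nn.Linear(10, 100)  # input (10 features) -> hidden layer (100 units)
        self.fc2 = nn.Linear(100, 10)  # hidden layer -> output (10 classes)

    def forward(self, x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        out = F.relu(out)

        return out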

Prepare the input and the target, i.e. the ground-truth value y

net = model()
print(net)
x = torch.randn(20,10)
y = torch.randn(20,10)

MSELoss

y_pred = net(x)
mse_loss1 = (y_pred - y).pow(2).mean()  # MSE loss by hand: mean of the squared errors
print(mse_loss1.item())
# Use the MSELoss class that ships with torch.nn
mse_loss2 = torch.nn.MSELoss(reduction='mean')  # reduction selects whether the loss is returned as 'mean' or 'sum'
output = mse_loss2(y_pred, y)
print(output.item())

model(
  (fc1): Linear(in_features=10, out_features=100, bias=True)
  (fc2): Linear(in_features=100, out_features=10, bias=True)
)
0.8617805242538452
0.8617805242538452

The two outputs are exactly the same.
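The `reduction` argument controls how the element-wise squared errors are combined. The following sketch, using random stand-in tensors rather than the network above, checks that `'mean'` is simply `'sum'` divided by the number of elements.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

y_pred = torch.randn(20, 10)  # stand-in for the network output
y = torch.randn(20, 10)       # stand-in for the targets

loss_none = nn.MSELoss(reduction='none')(y_pred, y)  # element-wise, shape (20, 10)
loss_sum = nn.MSELoss(reduction='sum')(y_pred, y)    # scalar: sum of squared errors
loss_mean = nn.MSELoss(reduction='mean')(y_pred, y)  # scalar: sum / number of elements

print(loss_none.shape)
print(torch.allclose(loss_mean, loss_sum / y.numel()))
```

Note that `'mean'` averages over every element (here 20 × 10 = 200), not over the batch dimension alone.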

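Cross-entropy loss

In a binary classification problem, whether the output is a vector or a scalar, each sample is predicted as one of two classes, positive or negative, i.e. 1 or 0.
The most commonly used loss function for classification problems is the cross-entropy loss (Cross Entropy Loss, CE).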

In the binary case, CE takes the form of BCE (Binary CE):
$-\sum_{i=1}^N \left[ y_i\log \left( \hat{y}_i \right) +\left( 1-y_i \right) \log \left( 1-\hat{y}_i \right) \right]$
Code implementation

import torch

out = torch.randn(2,3) # model output for 2 input images; the dataset has 3 classes
print(out)
label = torch.tensor([[1,0,0],[0,0,1]]) # ground-truth labels

my_sigmoid = torch.nn.Sigmoid() # instantiate sigmoid
print(my_sigmoid(out))  # map the output into (0,1)
my_bce_loss = torch.nn.BCELoss(reduction='none') # 'none' keeps one loss value per element
BCE_out = my_bce_loss(my_sigmoid(out),label.float())
print(BCE_out)

The output is:

OUT:
tensor([[ 1.1360,  0.3006, -1.0718],
        [ 1.2148, -2.8384, -0.8209]])
sigmoid_OUT:
tensor([[0.7569, 0.5746, 0.2551],
        [0.7711, 0.0553, 0.3056]])
BCE_OUT:
tensor([[0.2785, 0.8547, 0.2944],
        [1.4747, 0.0569, 1.1856]])
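As a sanity check, the per-element terms of the BCE formula can be computed by hand and compared with PyTorch. The sketch below reuses the logits printed above; the comparison against `torch.nn.BCEWithLogitsLoss`, which applies the sigmoid internally, is an addition for illustration and not part of the original walkthrough.

```python
import torch
import torch.nn as nn

# Logits and labels copied from the printed output above
out = torch.tensor([[1.1360, 0.3006, -1.0718],
                    [1.2148, -2.8384, -0.8209]])
label = torch.tensor([[1., 0., 0.],
                      [0., 0., 1.]])

p = torch.sigmoid(out)  # probabilities in (0, 1)

# Per-element BCE term: -[y*log(p) + (1-y)*log(1-p)]
manual = -(label * torch.log(p) + (1 - label) * torch.log(1 - p))
builtin = nn.BCELoss(reduction='none')(p, label)
from_logits = nn.BCEWithLogitsLoss(reduction='none')(out, label)

print(manual)
print(torch.allclose(manual, builtin), torch.allclose(builtin, from_logits))
```

Working directly from logits with `BCEWithLogitsLoss` is generally the more numerically stable choice when the sigmoid output is not needed on its own.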

Original link: https://blog.csdn.net/weixin_40756000/article/details/118017564