BCEWithLogitsLoss – 源码巴士

import torch
bce = torch.nn.BCEWithLogitsLoss()

pred is the network output (the raw logits, before any sigmoid), and y is the label.

Case 1: pred and y each hold a single value

pred1.shape:[1],pred1.dtype:torch.float32

y1.shape:[1],y1.dtype:torch.float32

>>> pred1 = torch.Tensor([0.3])
>>> y1 = torch.Tensor([1])
>>> bce(pred1,y1)
tensor(0.5544)

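By the cross-entropy formula, loss = -(1-y)*log(1-pred)-y*log(pred). When y is 0, only the term -(1-y)*log(1-pred) remains; when y is 1, only -y*log(pred) remains.
Note that the raw output 0.3 must first be passed through the sigmoid function: 1/(1+1/math.pow(e,0.3)) is the sigmoid of 0.3. So the pred in the formula is not 0.3 but 1/(1+1/math.pow(e,0.3))=0.5744.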

>>> import math
>>> e = math.e
>>> log = math.log
>>> -log(1/(1+1/math.pow(e,0.3)))
0.5543552444685272
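The hand calculation above can be wrapped into a small function so that both label values are easy to check. This is only a minimal sketch in plain Python: the names `sigmoid` and `bce_with_logits_scalar` are hypothetical helpers introduced for illustration, not part of PyTorch.

```python
import math

def sigmoid(x):
    """Logistic function: maps a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def bce_with_logits_scalar(logit, target):
    """Hypothetical helper: apply sigmoid, then binary cross-entropy."""
    p = sigmoid(logit)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

loss_pos = bce_with_logits_scalar(0.3, 1.0)  # label 1, as in the example above
loss_neg = bce_with_logits_scalar(0.3, 0.0)  # label 0, same logit

print(round(loss_pos, 4))  # 0.5544
print(round(loss_neg, 4))  # 0.8544
```

With label 1 the result matches the tensor(0.5544) returned by bce(pred1,y1); with label 0 the same logit is penalized more, because a positive logit leans toward class 1.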

Case 2: pred and y are lists (1-D tensors) rather than single values

pred2.shape:[10],pred2.dtype:torch.float32

y2.shape:[10],y2.dtype:torch.float32

>>> y2 = torch.ones([10], dtype=torch.float32)
>>> pred2 = torch.full([10], 1.5)
>>> bce(pred2, y2)
tensor(0.2014)

Every element is 1.5, so each per-element loss is -log(1/(1+1/math.pow(e,1.5)))=0.2014. Summing the ten values and taking the mean still gives 0.2014.
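The "sum, then average" step can be reproduced in plain Python. The sketch below assumes the default reduction='mean' of BCEWithLogitsLoss; `bce_with_logits_scalar` is again a hypothetical helper, not a PyTorch function.

```python
import math

def bce_with_logits_scalar(logit, target):
    """Hypothetical helper: apply sigmoid, then binary cross-entropy."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

logits = [1.5] * 10   # same values as pred2
targets = [1.0] * 10  # same values as y2

# One loss value per element, as reduction='none' would return.
per_element = [bce_with_logits_scalar(x, y) for x, y in zip(logits, targets)]

mean_loss = sum(per_element) / len(per_element)  # what reduction='mean' averages to
total_loss = sum(per_element)                    # what reduction='sum' adds up to

print(round(mean_loss, 4))   # 0.2014
print(round(total_loss, 4))  # 2.0141
```

Because all ten elements are identical, the mean equals the per-element loss; with reduction='sum' the value would instead scale with the number of elements.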

Case 3: pred and y are matrices (2-D tensors) rather than single values

pred3.shape:[4,3],pred3.dtype:torch.float32

y3.shape:[4,3],y3.dtype:torch.float32

>>> y3 = torch.ones([4, 3], dtype=torch.float32)
>>> pred3 = torch.full([4, 3], 1.5)
>>> criterion = torch.nn.BCEWithLogitsLoss()
>>> y3[0] = 0
>>> pred3
tensor([[1.5000, 1.5000, 1.5000],
        [1.5000, 1.5000, 1.5000],
        [1.5000, 1.5000, 1.5000],
        [1.5000, 1.5000, 1.5000]])
>>> y3
tensor([[0., 0., 0.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
>>> criterion(pred3, y3)
tensor(0.5764)
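'''
If the label is 1 and the output is 1.5, the loss is: -log(1/(1+1/math.pow(e,1.5)))=0.2014
If the label is 0 and the output is 1.5, the loss is: -log(1-1/(1+1/math.pow(e,1.5)))=1.7014
The y3 matrix contains 3 zeros and 9 ones, so the mean is (3*1.7014+9*0.2014)/12=0.5764
'''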
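The matrix case can be checked the same way, this time with a numerically stable rearrangement of the formula, max(x,0) - x*y + log(1+exp(-|x|)), which avoids taking the log of a value that rounds to 0 for large logits. This is a sketch in plain Python: `stable_bce_with_logits` is a hypothetical helper, and whether it matches PyTorch's internal implementation term for term is an assumption.

```python
import math

def stable_bce_with_logits(x, y):
    """Hypothetical helper: BCE on a raw logit x and label y, stable for large |x|."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

# Same values as pred3 and y3: a 4x3 matrix of 1.5, first row labeled 0.
pred3 = [[1.5] * 3 for _ in range(4)]
y3 = [[0.0] * 3] + [[1.0] * 3 for _ in range(3)]

# Flatten both matrices and compute one loss per element.
losses = [stable_bce_with_logits(x, y)
          for row_x, row_y in zip(pred3, y3)
          for x, y in zip(row_x, row_y)]

mean_loss = sum(losses) / len(losses)  # mean over all 12 elements
print(round(mean_loss, 4))  # 0.5764
```

The mean is taken over all 12 elements regardless of the matrix shape, which is why the 3 zero-labeled entries and 9 one-labeled entries are simply weighted by their counts.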

Original link: https://blog.csdn.net/zhuhuigege/article/details/124197998
