A short note on ReLU functions

torch.nn.ReLU(inplace=False): output = max(0, x)

torch.nn.PReLU(num_parameters=1, init=0.25): $PReLU(x) = max(0, x) + a * min(0, x)$

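Here a is a learnable parameter. Called without arguments, nn.PReLU() uses a single parameter a shared across all input channels; called as nn.PReLU(nChannels), it uses a separate a for each input channel.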

Note: for good performance, weight decay should not be applied to a while it is being learned.

Parameters:

num_parameters: number of a values to learn, either 1 or the number of input channels; default 1
init: initial value of a; default 0.25
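
To make the shared versus per-channel behaviour of a concrete, here is a minimal plain-Python sketch of the PReLU formula. It is not PyTorch's implementation: the helper names `prelu` and `prelu_channels` are hypothetical, and the slopes are fixed numbers here rather than learned parameters.

```python
def prelu(x, a=0.25):
    """PReLU(x) = max(0, x) + a * min(0, x) for a single value."""
    return max(0.0, x) + a * min(0.0, x)


def prelu_channels(channels, slopes):
    """Apply PReLU to a list of channels (each channel is a list of floats).

    `slopes` holds either one shared slope (like num_parameters=1)
    or one slope per channel (like num_parameters=nChannels).
    """
    if len(slopes) == 1:
        slopes = slopes * len(channels)  # share the single slope
    if len(slopes) != len(channels):
        raise ValueError("need one slope or one slope per channel")
    return [[prelu(x, a) for x in ch] for ch, a in zip(channels, slopes)]


channels = [[-2.0, 1.0], [-4.0, 3.0]]
shared = prelu_channels(channels, [0.25])            # one a for every channel
per_channel = prelu_channels(channels, [0.25, 0.5])  # one a per channel
print(shared)       # [[-0.5, 1.0], [-1.0, 3.0]]
print(per_channel)  # [[-0.5, 1.0], [-2.0, 3.0]]
```

Only the negative inputs change between the two calls, since a multiplies min(0, x) and that term is zero for positive x.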

torch.nn.LeakyReLU(negative_slope=0.01, inplace=False)

Applies $f(x) = max(0, x) + \text{negative\_slope} * min(0, x)$ to each element of the input.

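Parameters:

negative_slope: slope applied to negative inputs; default 0.01
inplace: whether to perform the operation in place; default False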
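
The LeakyReLU formula can be sketched in plain Python as follows. This is an illustration of the element-wise formula only, not PyTorch's code; the function name `leaky_relu` and the sample values are chosen for the example.

```python
def leaky_relu(x, negative_slope=0.01):
    """f(x) = max(0, x) + negative_slope * min(0, x) for a single value."""
    return max(0.0, x) + negative_slope * min(0.0, x)


values = [-100.0, -1.0, 0.0, 2.5]
# Negative inputs are scaled by negative_slope; non-negative inputs pass through.
outputs = [leaky_relu(v) for v in values]
print(outputs)
```

Unlike PReLU, the slope here is a fixed hyperparameter rather than a learned one.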

torch.nn.ReLU6(inplace=False): output = min(max(0,x), 6)
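
As a quick side-by-side of ReLU and ReLU6, the two formulas can be written in plain Python. This is a sketch of the math only; the names `relu` and `relu6` are illustrative and do not refer to PyTorch functions.

```python
def relu(x):
    """output = max(0, x)"""
    return max(0.0, x)


def relu6(x):
    """output = min(max(0, x), 6)"""
    return min(max(0.0, x), 6.0)


values = [-3.0, 0.5, 6.0, 10.0]
relu_out = [relu(v) for v in values]
relu6_out = [relu6(v) for v in values]
print(relu_out)   # [0.0, 0.5, 6.0, 10.0]
print(relu6_out)  # [0.0, 0.5, 6.0, 6.0]
```

The two agree everywhere except above 6, where ReLU6 saturates.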

torch.nn.RReLU(lower=0.125, upper=0.3333333333333333, inplace=False)

$RReLU(x)=\begin{cases} x & \text{if } x > 0 \\ ax & \text{otherwise} \end{cases}$

where $a$ is randomly sampled from the uniform distribution $U(lower, upper)$.
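
The random slope can be illustrated with a small plain-Python sketch. It is not PyTorch's implementation: the `rrelu` helper, its `training` flag, and its `rng` argument are hypothetical names for this example. The evaluation branch assumes the behaviour described in the PyTorch documentation, where a is fixed to the midpoint (lower + upper) / 2 outside of training.

```python
import random


def rrelu(x, lower=0.125, upper=1.0 / 3.0, training=True, rng=random):
    """RReLU for a single value: x if x > 0, otherwise a * x."""
    if x > 0:
        return x
    if training:
        a = rng.uniform(lower, upper)  # fresh random slope on every call
    else:
        a = (lower + upper) / 2.0      # assumed fixed slope at evaluation time
    return a * x


rng = random.Random(0)  # seeded generator so the run is repeatable
train_out = rrelu(-1.0, training=True, rng=rng)
eval_out = rrelu(-1.0, training=False)
print(train_out)  # some value between -upper and -lower
print(eval_out)   # -(lower + upper) / 2
```

In this sketch each call draws its own a, which mirrors the idea of sampling the slope independently for each negative element.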

See: https://arxiv.org/pdf/1505.00853.pdf