Keras Hands-On Series: ResNet-34 Explained


Compared with a traditional network, a residual network adds an identity (y = x) branch. Its main purpose is to counter the degradation that sets in as network depth increases, and it also gives fairly good convergence.

Actually, I think the motivation boils down to preventing the exploding/vanishing gradients caused by overly deep networks.

I once ran an experiment where, for certain objects, an 8-layer network performed the same as a 16-layer one. In that situation you should inspect the weights to check whether the gradients are exploding or vanishing. I find plain "straight-line" networks quite appealing, simple and brute-force, but for complex datasets, or when you want higher accuracy or faster runtimes, you have to take the simple design and make it more elaborate.

Residual networks, like the Inception family, are built around one repeated module.

Let's call it the residual module:

Here F(x) denotes the residual and H(x) denotes the learned mapping (the network output), so the output can be written as H(x) = F(x, {Wi}) + x.

Since the basic building block has two weight layers in the middle, the output works out to H(x) = y = W2·σ(W1·x) + Ws·x, where σ is the ReLU nonlinearity and Ws is the identity when the dimensions already match (otherwise a linear projection).
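As a quick numerical illustration of this formula, here is a minimal NumPy sketch; the input x and the weight matrices W1 and W2 are made-up toy values, and the dimensions match, so Ws is the identity:

import numpy as np

def relu(z):
    return np.maximum(z, 0)

rng = np.random.RandomState(0)
x = rng.randn(4)            # toy input vector
W1 = rng.randn(4, 4)        # first weight layer
W2 = rng.randn(4, 4)        # second weight layer

F = W2 @ relu(W1 @ x)       # residual branch F(x, {W1, W2})
H = F + x                   # H(x) = F(x) + x, identity shortcut (Ws = I)
print(H)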

Shortcut connections are simply added on top of the plain network. When the input and output of a block have the same dimensions (drawn as solid lines in the paper), the identity shortcut can be used directly; only when the dimensions increase (the dashed lines) is a projection shortcut needed.
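To make the two cases concrete, here is a minimal Keras sketch; the shapes and filter counts are illustrative assumptions, not from the original post:

from keras.layers import Input, Conv2D, add

inp = Input(shape=(56, 56, 64))

# solid line: same dimensions, so the identity shortcut is used directly
f = Conv2D(64, (3, 3), padding='same', activation='relu')(inp)
f = Conv2D(64, (3, 3), padding='same')(f)
out_same = add([f, inp])

# dashed line: dimensions increase, so a 1x1 projection matches them
g = Conv2D(128, (3, 3), strides=(2, 2), padding='same', activation='relu')(inp)
g = Conv2D(128, (3, 3), padding='same')(g)
proj = Conv2D(128, (1, 1), strides=(2, 2))(inp)
out_proj = add([g, proj])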

Reference: http://blog.csdn.net/circleyuanquan/article/details/60875016

Code:

#coding=utf-8
from keras.models import Model
from keras.layers import Input,Dense,BatchNormalization,Conv2D,MaxPooling2D,AveragePooling2D,Activation,ZeroPadding2D
from keras.layers import add,Flatten
import numpy as np
seed = 7
np.random.seed(seed)

def Conv2d_BN(x, nb_filter,kernel_size, strides=(1,1), padding='same',name=None):
    if name is not None:
        bn_name = name + '_bn'
        conv_name = name + '_conv'
    else:
        bn_name = None
        conv_name = None

    # conv -> BN -> ReLU, the layer ordering used in the ResNet paper
    x = Conv2D(nb_filter,kernel_size,padding=padding,strides=strides,name=conv_name)(x)
    x = BatchNormalization(axis=3,name=bn_name)(x)
    x = Activation('relu')(x)
    return x

def Conv_Block(inpt,nb_filter,kernel_size,strides=(1,1), with_conv_shortcut=False):
    # residual branch F(x): two stacked conv-BN-ReLU layers
    x = Conv2d_BN(inpt,nb_filter=nb_filter,kernel_size=kernel_size,strides=strides,padding='same')
    x = Conv2d_BN(x, nb_filter=nb_filter, kernel_size=kernel_size,padding='same')
    if with_conv_shortcut:
        # projection shortcut (dashed line): a 1x1 convolution matches the
        # downsampled spatial size and the increased filter count
        shortcut = Conv2d_BN(inpt,nb_filter=nb_filter,strides=strides,kernel_size=(1,1))
        x = add([x,shortcut])
        return x
    else:
        # identity shortcut (solid line): dimensions already match
        x = add([x,inpt])
        return x

inpt = Input(shape=(224,224,3))
x = ZeroPadding2D((3,3))(inpt)
x = Conv2d_BN(x,nb_filter=64,kernel_size=(7,7),strides=(2,2),padding='valid')
x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
# (56,56,64): stage 1, three 64-filter residual blocks
x = Conv_Block(x,nb_filter=64,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=64,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=64,kernel_size=(3,3))
# (28,28,128): stage 2, four 128-filter blocks (the first downsamples)
x = Conv_Block(x,nb_filter=128,kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
x = Conv_Block(x,nb_filter=128,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=128,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=128,kernel_size=(3,3))
# (14,14,256): stage 3, six 256-filter blocks (the first downsamples)
x = Conv_Block(x,nb_filter=256,kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
# (7,7,512): stage 4, three 512-filter blocks (the first downsamples)
x = Conv_Block(x,nb_filter=512,kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
x = Conv_Block(x,nb_filter=512,kernel_size=(3,3))
x = Conv_Block(x,nb_filter=512,kernel_size=(3,3))
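# 7x7 global average pooling, then the 1000-way softmax classifier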
x = AveragePooling2D(pool_size=(7,7))(x)
x = Flatten()(x)
x = Dense(1000,activation='softmax')(x)

model = Model(inputs=inpt,outputs=x)
model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
model.summary()
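To sanity-check that the model builds and the forward pass works, you can feed it a random batch. This is a minimal sketch; the input here is made-up noise rather than a real dataset:

import numpy as np
dummy = np.random.rand(1,224,224,3).astype('float32')  # one fake RGB image
preds = model.predict(dummy)                           # shape (1, 1000)
print(preds.shape, preds.sum())                        # the softmax row sums to ~1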

Reference: http://blog.csdn.net/wmy199216/article/details/71171401

If this article helped you, feel free to leave a tip!


Copyright notice: this is an original article by googler_offer, released under the CC 4.0 BY-SA license; please include a link to the original source and this notice when reposting.