
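Introduction

ResNet is a network architecture proposed by Kaiming He in 2015. It won the ILSVRC-2015 classification task and also took first place in ImageNet detection, ImageNet localization, and COCO detection.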

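ResNet, also known as the residual neural network, brings the idea of residual learning into conventional convolutional neural networks. It addresses the vanishing gradient problem and the drop in (training) accuracy seen in deep networks, which allows networks to be made much deeper.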

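Motivation

As a network gets deeper, the vanishing gradient problem becomes more severe, and the network becomes hard to train to convergence or fails to converge at all. There are now many remedies for vanishing gradients, such as normalized initialization, normalization of the input data, and normalization of intermediate layers. But deepening the network brings a second problem: as depth increases, accuracy on the training set goes down, as shown in the figure below.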

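Many readers' first reaction will be: "Isn't this overfitting?" In fact, it is not. Overfitting usually means that a model does well on the training set but poorly on the test set, whereas here accuracy drops on the training set itself. He proposed the idea of residual learning to deal with this problem.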

What does residual learning mean?

The idea of residual learning is shown in the figure above. It can be viewed as a block, defined as y = F(x, {W_i}) + x.

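A residual learning block contains two branches, or two mappings:

1. The identity mapping is the curved line on the right of the figure above. As the name suggests, an identity mapping maps the input to itself, that is, x itself;

2. The residual mapping is the other branch, the F(x) part. This part is called the residual mapping.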
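
The two branches can be sketched in a few lines of plain Python. This is only a toy illustration: it uses small fully connected layers in place of convolutions, and the helper names (`relu`, `linear`, `toy_residual_block`) and the weights are made up for the example rather than taken from the paper.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def linear(weights, v):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def toy_residual_block(x, w1, w2):
    """Return relu(F(x) + x), with F(x) = w2 * relu(w1 * x)."""
    residual = linear(w2, relu(linear(w1, x)))          # residual mapping F(x)
    return relu([r + a for r, a in zip(residual, x)])   # add the identity branch x


x = [1.0, 2.0]
w1 = [[0.5, 0.0], [0.0, 0.5]]   # hypothetical weights
w2 = [[1.0, 0.0], [0.0, -1.0]]  # hypothetical weights
y = toy_residual_block(x, w1, w2)
print(y)  # [1.5, 1.0]
```

The block only has to learn the correction F(x) = [0.5, -1.0]; the input itself is carried across unchanged by the identity branch.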

Why can residual learning solve the problem of accuracy dropping as the network gets deeper?

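For a given neural network, if the model is already optimal, training can easily drive the residual mapping to 0, leaving only the identity mapping. No matter how much depth is then added, the network in theory stays at its optimum. Every layer added afterwards simply passes information along the identity mapping, so the layers beyond the optimal network can be thought of as effectively discarded: they do almost nothing. As a result, the network's performance does not degrade as depth increases.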
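
A small experiment makes this argument concrete. The sketch below, again with toy fully connected layers and made-up helper names, sets every layer's weights to zero and stacks 50 layers. It assumes a non-negative input, as would come out of a preceding ReLU.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def linear(weights, v):
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def plain_layer(x, w):
    """A layer without a shortcut: relu(w * x)."""
    return relu(linear(w, x))


def residual_layer(x, w):
    """A layer with a shortcut: relu(w * x + x)."""
    return relu([f + a for f, a in zip(linear(w, x), x)])


zero = [[0.0, 0.0], [0.0, 0.0]]  # residual mapping driven to 0
x = [3.0, 1.0]                   # non-negative input (assumption)

plain_out, residual_out = x, x
for _ in range(50):              # 50 extra layers
    plain_out = plain_layer(plain_out, zero)
    residual_out = residual_layer(residual_out, zero)

print(plain_out)     # [0.0, 0.0]  the signal is lost
print(residual_out)  # [3.0, 1.0]  the input passes through unchanged
```

With shortcuts, "do nothing" corresponds to zero weights, which is easy to reach. Without them, the extra layers would have to learn an exact identity function to leave the signal untouched.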

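Network structure

The paper uses the term "shortcut connection", which in practice refers to the identity mapping. I mention it here so that it does not cause confusion later. For ResNets of different depths, the authors propose two kinds of residual block: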

Some notes on the figure above:

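1. The left figure is the basic residual block. Its residual mapping consists of two 3x3 convolutions with 64 channels, and the input and output also have 64 channels, so they can be added directly. This block is mainly used in relatively shallow networks such as ResNet-34;

2. The right figure is the block proposed for deeper networks, called the "bottleneck" block. Its main purpose is dimensionality reduction: a 1x1 convolution first reduces the 256-channel input to 64 channels.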
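
A quick weight count shows why the reduction matters. The sketch below assumes the bottleneck layout given in the ResNet paper (1x1 down to 64 channels, a 3x3 convolution at 64 channels, then 1x1 back up to 256 channels); only the first step is spelled out in the text above. Biases and batch normalization parameters are ignored, and `conv_params` is a helper made up for this example.

```python
def conv_params(kernel, in_ch, out_ch):
    """Number of weights in a kernel x kernel convolution (no bias)."""
    return kernel * kernel * in_ch * out_ch


# Basic block: two 3x3 convolutions at 64 channels.
basic = 2 * conv_params(3, 64, 64)

# Bottleneck block: 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256).
bottleneck = (conv_params(1, 256, 64)
              + conv_params(3, 64, 64)
              + conv_params(1, 64, 256))

# For comparison: two 3x3 convolutions applied directly at 256 channels.
wide_basic = 2 * conv_params(3, 256, 256)

print(basic)       # 73728
print(bottleneck)  # 69632
print(wide_basic)  # 1179648
```

Under these assumptions the bottleneck block handles a 256-channel input with roughly the same number of weights as a 64-channel basic block, and about 17 times fewer than two 3x3 convolutions at full width.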

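As described above, the residual mapping and the identity mapping are added element-wise, channel by channel. What happens when their channel counts differ?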

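The authors handle this by applying a 1x1 convolution on the identity mapping branch, which can be written as y = F(x, {W_i}) + W_s x.

Here, W_s denotes the 1x1 convolution.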
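
A 1x1 convolution is just the same small matrix applied to the channel vector at every pixel, optionally skipping pixels when a stride is used. The following plain-Python sketch shows how such a projection makes the shortcut match a residual branch that has a different channel count and spatial size. The shapes, weights, and helper names are all invented for the illustration.

```python
def conv1x1(feature_map, weights, stride=1):
    """1x1 convolution without bias.

    feature_map[h][w] is a list of C_in values;
    weights is a list of C_out rows, each with C_in entries.
    """
    return [
        [[sum(w * v for w, v in zip(row, pixel)) for row in weights]
         for pixel in line[::stride]]
        for line in feature_map[::stride]
    ]


def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[p + q for p, q in zip(pa, pb)] for pa, pb in zip(la, lb)]
            for la, lb in zip(a, b)]


# Input x: 4x4 spatial size, 2 channels.
x = [[[float(h), float(w)] for w in range(4)] for h in range(4)]
# Residual branch output F(x): 2x2 spatial size, 3 channels (all ones here).
residual = [[[1.0, 1.0, 1.0] for _ in range(2)] for _ in range(2)]

w_s = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical 3x2 projection
projected = conv1x1(x, w_s, stride=2)        # now 2x2 with 3 channels
y = add_maps(projected, residual)

print(len(y), len(y[0]), len(y[0][0]))  # 2 2 3
print(y[1][1])                          # [3.0, 3.0, 5.0]
```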

The figure below compares the network structures of VGG-19, plain-34 (no residual structure), and ResNet-34.

Some notes on the figure above:

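1. Compared with VGG-19, ResNet uses a global average pooling layer instead of a stack of fully connected layers, which removes a large number of parameters. Most of VGG-19's parameters are concentrated in its fully connected layers;

2. In ResNet-34, a solid line means the identity mapping and the residual mapping have the same number of channels, while a dashed line means their channel counts differ.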
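
The saving from the first point can be estimated with simple arithmetic. The layer sizes below are taken from the published VGG and ResNet architectures rather than from this article (VGG-19 ends in a 7x7x512 feature map followed by fully connected layers of 4096, 4096, and 1000 units; ResNet-34 pools to 512 values before a single 1000-way layer), so treat them as assumptions of the sketch.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# VGG-19 classifier head: flatten(7x7x512) -> 4096 -> 4096 -> 1000.
vgg_head = (dense_params(7 * 7 * 512, 4096)
            + dense_params(4096, 4096)
            + dense_params(4096, 1000))

# ResNet-34 head: global average pooling (no parameters) -> 1000.
resnet_head = dense_params(512, 1000)

print(vgg_head)     # 123642856
print(resnet_head)  # 513000
```

Global average pooling collapses each 7x7 channel to a single number, so the final layer sees 512 inputs instead of 25088.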

The paper proposes five ResNet variants. The table of network parameters is as follows.
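
As a cross-check on the five variants, the sketch below lists the number of blocks per stage (conv2_x to conv5_x) as given in the ResNet paper and recovers each network's nominal depth. The figures come from the paper, not from the table in this article, and the `VARIANTS` dictionary and helper name are made up for the example.

```python
# name -> (convolutions per block, blocks in conv2_x, conv3_x, conv4_x, conv5_x)
VARIANTS = {
    "ResNet-18":  (2, [2, 2, 2, 2]),    # basic block
    "ResNet-34":  (2, [3, 4, 6, 3]),    # basic block
    "ResNet-50":  (3, [3, 4, 6, 3]),    # bottleneck block
    "ResNet-101": (3, [3, 4, 23, 3]),   # bottleneck block
    "ResNet-152": (3, [3, 8, 36, 3]),   # bottleneck block
}


def weighted_layers(convs_per_block, repetitions):
    """Convolutions in all blocks, plus conv1 and the final dense layer."""
    return convs_per_block * sum(repetitions) + 2


depths = {name: weighted_layers(c, reps) for name, (c, reps) in VARIANTS.items()}
for name, depth in depths.items():
    print(name, depth)
```

Each computed depth matches the number in the network's name, which is a handy sanity check when building one of the variants by hand.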

Code implementation

This section implements ResNet-18 with keras.

from keras.layers import Input
from keras.layers import Conv2D, MaxPool2D, Dense, BatchNormalization, Activation, add, GlobalAvgPool2D
from keras.models import Model
from keras import regularizers
from keras.utils import plot_model
from keras import backend as K


def conv2d_bn(x, nb_filter, kernel_size, strides=(1, 1), padding='same'):
    """
    conv2d -> batch normalization -> relu activation
    """
    x = Conv2D(nb_filter, kernel_size=kernel_size,
               strides=strides,
               padding=padding,
               kernel_regularizer=regularizers.l2(0.0001))(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x


def shortcut(input, residual):
    """
    Shortcut connection, i.e. the identity mapping branch.
    """
    input_shape = K.int_shape(input)
    residual_shape = K.int_shape(residual)
    # Shapes are (batch, height, width, channels).
    stride_height = int(round(input_shape[1] / residual_shape[1]))
    stride_width = int(round(input_shape[2] / residual_shape[2]))
    equal_channels = input_shape[3] == residual_shape[3]

    identity = input
    # If the shapes differ, use a 1x1 convolution to adjust the identity branch.
    if stride_width > 1 or stride_height > 1 or not equal_channels:
        identity = Conv2D(filters=residual_shape[3],
                          kernel_size=(1, 1),
                          strides=(stride_height, stride_width),
                          padding="valid",
                          kernel_regularizer=regularizers.l2(0.0001))(input)

    return add([identity, residual])


def basic_block(nb_filter, strides=(1, 1)):
    """
    Basic ResNet building block, used in ResNet-18 and ResNet-34.
    """
    def f(input):
        conv1 = conv2d_bn(input, nb_filter, kernel_size=(3, 3), strides=strides)
        residual = conv2d_bn(conv1, nb_filter, kernel_size=(3, 3))

        return shortcut(input, residual)

    return f


def residual_block(nb_filter, repetitions, is_first_layer=False):
    """
    Build the residual module for one stage, corresponding to
    conv2_x -> conv5_x in the paper's parameter table.
    """
    def f(input):
        for i in range(repetitions):
            strides = (1, 1)
            # The first block of every stage except conv2_x downsamples by 2.
            if i == 0 and not is_first_layer:
                strides = (2, 2)
            input = basic_block(nb_filter, strides)(input)
        return input

    return f


def resnet_18(input_shape=(224, 224, 3), nclass=1000):
    """
    build resnet-18 model using keras with TensorFlow backend.
    :param input_shape: input shape of network, default as (224,224,3)
    :param nclass: numbers of class(output shape of network), default as 1000
    :return: resnet-18 model
    """
    input_ = Input(shape=input_shape)

    conv1 = conv2d_bn(input_, 64, kernel_size=(7, 7), strides=(2, 2))
    pool1 = MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding='same')(conv1)

    # Only conv2_x follows the max pooling directly, so only it skips downsampling.
    conv2 = residual_block(64, 2, is_first_layer=True)(pool1)
    conv3 = residual_block(128, 2)(conv2)
    conv4 = residual_block(256, 2)(conv3)
    conv5 = residual_block(512, 2)(conv4)

    pool2 = GlobalAvgPool2D()(conv5)
    output_ = Dense(nclass, activation='softmax')(pool2)

    model = Model(inputs=input_, outputs=output_)
    model.summary()

    return model


if __name__ == '__main__':
    model = resnet_18()
    plot_model(model, 'ResNet-18.png')  # save a diagram of the model
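
To check a ResNet-18 build without installing keras, the expected feature map size after each stage can be traced by hand. The sketch below assumes the layout in the ResNet paper (stride 2 in conv1, in the max pooling, and in the first block of conv3_x, conv4_x, and conv5_x) and 'same' padding, for which the output size is the input size divided by the stride, rounded up. The `STAGES` list and helper name are made up for the example.

```python
import math


def same_padding_out(size, stride):
    """Output size of a 'same'-padded convolution or pooling layer."""
    return math.ceil(size / stride)


# (stage name, stride of the stage, output channels)
STAGES = [
    ("conv1", 2, 64),
    ("pool1", 2, 64),
    ("conv2_x", 1, 64),
    ("conv3_x", 2, 128),
    ("conv4_x", 2, 256),
    ("conv5_x", 2, 512),
]

size = 224
trace = []
for name, stride, channels in STAGES:
    size = same_padding_out(size, stride)
    trace.append((name, size, channels))
    print(f"{name}: {size}x{size}x{channels}")
```

The last stage should come out at 7x7x512, which global average pooling then reduces to a 512-element vector. Comparing this trace with the shapes printed by `model.summary()` is a quick way to spot a stage that fails to downsample.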