An Overview Chart of the History of CNN Development

2023-12-13 12:37:38




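This is an overview chart of the history of CNN development, compiled from my study notes and offered for reference. Corrections are welcome.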







Related papers:

LeNet: Handwritten Digit Recognition with a Back-Propagation Network;
Gradient-Based Learning Applied to Document Recognition (the starting point of CNNs);

AlexNet: ImageNet Classification with Deep Convolutional Neural Networks (laid the foundation for CNNs);

OverFeat: OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks;

ZFNet: Visualizing and Understanding Convolutional Networks (visualization and interpretability work built on AlexNet);

VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition (pushes module stacking to the extreme);

Inception V1/GoogLeNet: Going Deeper with Convolutions (takes an unconventional path, proposing factorized and parallel modules; the basis of the Inception architecture);

BN-Inception: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Inception + Batch Normalization);

Inception V2/Inception V3: Rethinking the Inception Architecture for Computer Vision (builds on Inception V1 and leads to Inception-V4 and Xception; continues factorizing the modules);

Inception-V4, Inception-ResNet: Inception-V4, Inception-ResNet and the Impact of Residual Connections on Learning (a network of pure Inception blocks, plus a combination of ResNet and Inception);

Xception: Deep Learning with Depthwise Separable Convolutions (Xception stands for "extreme Inception": Inception factorized to the extreme);

ResNet V1: Deep Residual Learning for Image Recognition (Kaiming He; introduces the residual connection and opens the ResNet series);

ResNet V2: Identity Mappings in Deep Residual Networks (Kaiming He; an improvement on V1 by the same authors);

DenseNet: Densely Connected Convolutional Networks;

ResNeXt: Aggregated Residual Transformations for Deep Neural Networks (Kaiming He's team);

DualPathNet: Dual Path Networks;

SENet: Squeeze-and-Excitation Networks (proposes the SE module, which can be easily inserted into other networks, giving rise to a series of SE-X networks);

Res2Net: Res2Net: A New Multi-scale Backbone Architecture;

ResNeSt: ResNeSt: Split-Attention Networks (brings together the ideas of the preceding networks);

NAS: Neural Architecture Search with Reinforcement Learning (the pioneering work on neural architecture search, in which the network is designed by AI);

NASNet: Learning Transferable Architectures for Scalable Image Recognition (switches from predicting per-layer parameters to predicting block parameters);

MnasNet: Platform-Aware Neural Architecture Search for Mobile (suited to compute-constrained devices such as mobile phones);

MobileNets series:
MobileNet V1: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications;

MobileNetV2: Inverted Residuals and Linear Bottlenecks;

MobileNetV3: Searching for MobileNetV3 (an architecture found through AI-driven search);

SqueezeNet: AlexNet-level Accuracy with 50x Fewer Parameters and <0.5MB Model Size (matches AlexNet's accuracy with 50x fewer parameters and a model size under 0.5 MB);

ShuffleNet V1: ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices;

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design;

EfficientNet V1: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks;

EfficientNetV2: Smaller Models and Faster Training;

Transformer: Attention Is All You Need (the pioneering work);

ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (a milestone in applying the Transformer to computer vision);

Swin: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (a vision Transformer);

VAN: Visual Attention Network (not a Transformer; it only borrows ideas from the Transformer into a CNN);

PVT: Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (pyramid structure + Transformer);

TNT: Transformer in Transformer;

MLP-Mixer: MLP-Mixer: An all-MLP Architecture for Vision;

ConvMixer: Patches Are All You Need? (supports the hypothesis that ViT's performance is mainly attributable to using patches as the input representation);
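
Several entries above (Xception and the MobileNets series) build on depthwise separable convolutions. As a rough illustration of why this factorization suits efficient models, the sketch below compares the number of weights in a standard convolution with its depthwise separable counterpart. The function names and the example layer sizes are hypothetical and chosen only for illustration; bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
    c_in, c_out, k = 128, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For a 3 x 3 kernel the weight count shrinks by a factor that approaches 9 as the number of output channels grows, which is part of why this building block recurs in the mobile-oriented networks listed above.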

Source: https://blog.csdn.net/COINVK/article/details/134893627