Classification Algorithms

Links to important articles for quick revision:

Residual Networks: ResNet vs ResNeXt, ResNet Variants

ResNet - Addresses the vanishing gradient problem in deeper networks. Per the ablation studies, identity shortcut connections combined with pre-activation residual units work best
ResNeXt - Introduces cardinality (the number of parallel paths in a block)

Split-transform-aggregate strategy (the 64-wide bottleneck layers are replaced by 32 parallel paths of width 4, i.e. cardinality 32), a kind of ensemble technique
FLOPs and parameter counts stay roughly the same for ResNet-50 and ResNeXt-50
Width vs cardinality - Increasing cardinality is preferred over increasing width
ResNeXt 32x4d and 64x4d are the configurations mostly used in PyTorch
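
A quick check of the "same parameters" claim, as a rough sketch: it counts only convolution weights (no biases, no batch norm) for a single bottleneck block with 256 input/output channels. The helper names are made up for illustration, and the grouped 3x3 convolution stands in for the 32 parallel paths.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weights in a k x k convolution split into `groups` groups."""
    return (c_in // groups) * c_out * k * k


def resnet_bottleneck(channels=256, width=64):
    """1x1 reduce -> 3x3 -> 1x1 expand, all dense convolutions."""
    return (conv_params(channels, width, 1)
            + conv_params(width, width, 3)
            + conv_params(width, channels, 1))


def resnext_bottleneck(channels=256, cardinality=32, group_width=4):
    """Same layout, but the 3x3 layer is a grouped convolution."""
    inner = cardinality * group_width  # 32 paths x 4 channels = 128
    return (conv_params(channels, inner, 1)
            + conv_params(inner, inner, 3, groups=cardinality)
            + conv_params(inner, channels, 1))


print(resnet_bottleneck())   # 69632
print(resnext_bottleneck())  # 70144
```

Both blocks land near 70k weights, so the extra paths come at essentially no parameter cost.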

Inception: Inception Blog

Inception v1 - Multiple kernel sizes convolved in parallel (more width than depth)
Inception v2 - Factorized convolutions (two 3x3 rather than one 5x5, 1xn followed by nx1 rather than nxn), filter banks in the module were expanded
Inception v3 - [2016 CVPR] RMSProp optimizer, factorized 7x7 convolutions, batchnorm in auxiliary classifier, label smoothing
Inception v4 - Stem redesigned with Inception-style branches, explicit reduction blocks introduced, more uniform Inception A, B, C blocks
Inception-ResNet (v1 and v2) - Skip connections were added to each Inception block; Inception-ResNet-v1 is similar to v3 and Inception-ResNet-v2 is similar to v4 in terms of computational cost
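
Why factorized convolutions help, as a small sketch: it counts weights per input/output channel pair and assumes the channel count stays constant across the stacked layers. The function name is illustrative only.

```python
def weights_per_channel_pair(kernel_shapes):
    """Sum of kernel areas for a stack of (height, width) kernels."""
    return sum(h * w for h, w in kernel_shapes)


single_5x5 = weights_per_channel_pair([(5, 5)])           # 25
two_3x3 = weights_per_channel_pair([(3, 3), (3, 3)])      # 18, same 5x5 receptive field
single_7x7 = weights_per_channel_pair([(7, 7)])           # 49
asymmetric_7 = weights_per_channel_pair([(1, 7), (7, 1)]) # 14, same 7x7 receptive field

print(1 - two_3x3 / single_5x5)       # fraction saved by 3x3 + 3x3
print(1 - asymmetric_7 / single_7x7)  # fraction saved by 1x7 + 7x1
```

The savings grow with n, which is why the asymmetric 1xn / nx1 split pays off most for the larger kernels.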
Xception: Xception Blog [2017 CVPR]

Extreme version of Inception
Modified depthwise separable convolution with the order reversed (pointwise 1x1 convolution first, then depthwise convolution)
Slightly outperforms Inception v3 on ImageNet classification accuracy
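
A rough weight count for a standard convolution versus the two orderings of a depthwise separable convolution. This is a sketch that ignores biases; the function names and the 128 -> 256 channel example are assumptions for illustration.

```python
def standard_conv(c_in, c_out, k):
    """Dense k x k convolution."""
    return k * k * c_in * c_out


def depthwise_then_pointwise(c_in, c_out, k):
    """Usual order: k x k depthwise on c_in channels, then 1x1 pointwise."""
    return k * k * c_in + c_in * c_out


def pointwise_then_depthwise(c_in, c_out, k):
    """Reversed order: 1x1 pointwise first, then k x k depthwise on c_out channels."""
    return c_in * c_out + k * k * c_out


print(standard_conv(128, 256, 3))             # 294912
print(depthwise_then_pointwise(128, 256, 3))  # 33920
print(pointwise_then_depthwise(128, 256, 3))  # 35072
```

Either ordering costs roughly a ninth of the dense 3x3 convolution here; the two orderings differ only in which channel count the depthwise step runs over.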

MobileNet: MobileNet Blog [2018 CVPR]

MobileNet v1 - Computationally efficient, depthwise separable convolutions, width and resolution multipliers for scaling
MobileNet v2 - Inverted residual structure, linear bottlenecks, shortcut between bottlenecks
MobileNet v3 - Explanation [2019 ICCV]
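
How the MobileNet v1 width multiplier (alpha) and resolution multiplier (rho) shrink the cost of one depthwise separable layer, following the mult-add formula from the paper. The function names and the 512-channel, 14x14 example layer are assumptions for illustration; rounding via int() is a simplification.

```python
def separable_mult_adds(k, m, n, f, alpha=1.0, rho=1.0):
    """Mult-adds for a depthwise separable layer.

    k: kernel size, m/n: input/output channels, f: feature map side,
    alpha: width multiplier, rho: resolution multiplier.
    """
    m_s, n_s = int(alpha * m), int(alpha * n)
    f_s = int(rho * f)
    depthwise = k * k * m_s * f_s * f_s
    pointwise = m_s * n_s * f_s * f_s
    return depthwise + pointwise


def standard_mult_adds(k, m, n, f):
    """Mult-adds for a dense k x k convolution on the same layer."""
    return k * k * m * n * f * f


base = separable_mult_adds(3, 512, 512, 14)
print(base)                                           # 52283392
print(base / standard_mult_adds(3, 512, 512, 14))     # about 1/n + 1/k^2
print(separable_mult_adds(3, 512, 512, 14, alpha=0.5))  # 13296640
print(separable_mult_adds(3, 512, 512, 14, rho=0.5))    # 13070848
```

Halving alpha cuts the cost by roughly alpha squared, while halving rho cuts it by exactly rho squared for this layer.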

EfficientNet: Medium Blog, Blog2 [2019 ICML]

EfficientNet [B0-B7] - Compound scaling: depth, width and resolution are scaled together by a single coefficient under a fixed resource constraint
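
The compound scaling rule can be sketched as below, using the base coefficients reported in the EfficientNet paper (alpha = 1.2, beta = 1.1, gamma = 1.15, chosen so that alpha * beta^2 * gamma^2 is approximately 2). The function names are illustrative, and this shows only the rule itself, not the exact configuration of any released model.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scale(phi):
    """Depth, width and resolution multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """FLOPs grow linearly with depth and quadratically with width and resolution."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(phi, round(depth, 3), round(width, 3), round(resolution, 3),
          round(flops_multiplier(phi), 3))
```

Each increment of phi therefore costs roughly twice the FLOPs of the previous step.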
