Contrasting Convolutional Neural Networks with alternative architectures for transformation invariance
Abstract
Convolutional Neural Networks (CNNs) have become the standard for image classification tasks; however, they are not completely invariant to transformations of the input image. We empirically investigate to what degree CNNs can handle transformed input images, and compare their abilities to those of multilayer perceptrons (MLPs) and spatial transformer networks (STNs). We measure invariance to three affine transformations, namely translation, rotation, and scale, with a specific focus on translation.
The lack of translation invariance in CNNs is attributed to the use of stride, which sub-samples the input and results in a loss of information, and to fully connected layers, which lack spatial reasoning. We first show theoretically that stride can greatly benefit translation invariance, provided that it is combined with sufficient similarity between neighbouring pixels, a characteristic we refer to as local homogeneity. We then verify this hypothesis empirically, and observe that this characteristic is dataset-specific and dictates the relationship between pooling kernel size and stride required for translation invariance. Furthermore, we find that a trade-off exists between generalization and translation invariance with respect to pooling kernel size and stride: larger kernel sizes and strides lead to better invariance but poorer generalization. We then compare the translation, scale, and rotation invariance of CNNs to that of STN-CNNs and MLPs. As expected, we find that MLPs fare far worse than CNNs and STN-CNNs in terms of transformation invariance and generalization. We find that STNs can improve the transformation invariance of a CNN architecture, provided that the network is exposed to enough transformed samples during training. Finally, we observe that, without explicit regularization, STNs do not provide any benefit over CNNs in terms of generalization ability.
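The local homogeneity argument can be illustrated with a toy one-dimensional sketch. This is not the experimental setup of the thesis: the function names (`max_pool_1d`, `shift_right`, `pooled_change`), the edge-padded shift, and the two example signals are all assumptions chosen for illustration. The sketch shifts a signal by one position and reports what fraction of the strided max-pooling outputs change.

```python
def max_pool_1d(signal, kernel, stride):
    """Max pooling over a 1D list, without padding."""
    last_start = len(signal) - kernel
    return [max(signal[i:i + kernel]) for i in range(0, last_start + 1, stride)]


def shift_right(signal, steps=1):
    """Translate the signal to the right, repeating the first value (edge padding)."""
    return [signal[0]] * steps + signal[:len(signal) - steps]


def pooled_change(signal, kernel, stride, steps=1):
    """Fraction of pooled outputs that differ after translating the input."""
    before = max_pool_1d(signal, kernel, stride)
    after = max_pool_1d(shift_right(signal, steps), kernel, stride)
    return sum(a != b for a, b in zip(before, after)) / len(before)


# Hypothetical signals: one with near-identical neighbours, one without.
homogeneous = [1] * 8 + [9] * 8
heterogeneous = [3, 8, 1, 6, 9, 2, 7, 4, 5, 0, 8, 3, 6, 1, 9, 2]

for name, signal in [("homogeneous", homogeneous), ("heterogeneous", heterogeneous)]:
    for kernel, stride in [(1, 2), (2, 2)]:
        change = pooled_change(signal, kernel, stride)
        print(f"{name:13s} kernel={kernel} stride={stride} changed={change:.3f}")
```

In this toy setting, the locally homogeneous signal yields identical pooled outputs under a one-position shift when the kernel size matches the stride, whereas the heterogeneous signal does not. The exact numbers depend on the assumed signals and padding and should not be read as results from the thesis.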