NWU Institutional Repository

Contrasting Convolutional Neural Networks with alternative architectures for transformation invariance

dc.contributor.advisor: Davel, M.H.
dc.contributor.author: Mouton, Coenraad
dc.contributor.researchID: 23607955 - Davel, Marelie Hattingh (Supervisor)
dc.date.accessioned: 2021-11-09T14:08:02Z
dc.date.available: 2021-11-09T14:08:02Z
dc.date.issued: 2021
dc.description: MEng (Computer and Electronic Engineering), North-West University, Potchefstroom Campus
dc.description.abstract: Convolutional Neural Networks (CNNs) have become the standard for image classification tasks; however, they are not completely invariant to transformations of the input image. We empirically investigate to what degree CNNs can handle transformed input images, and compare their abilities to those of multilayer perceptrons (MLPs) and spatial transformer networks (STNs). We measure invariance to three affine transformations, namely translation, rotation, and scale, with a specific focus on translation. The lack of translation invariance in CNNs is attributed to the use of stride, which sub-samples the input and results in a loss of information, and to fully connected layers, which lack spatial reasoning. We first show theoretically that stride can greatly benefit translation invariance, provided it is combined with sufficient similarity between neighbouring pixels, a characteristic we refer to as local homogeneity. We then empirically verify this hypothesis, and observe that this characteristic is dataset-specific, which dictates the relationship between pooling kernel size and stride required for translation invariance. Furthermore, we find that pooling kernel size and stride involve a trade-off between generalization and translation invariance: larger kernel sizes and strides lead to better invariance but poorer generalization. We then compare the translation, scale, and rotation invariance of CNNs to that of STN-CNNs and MLPs. As expected, we find that MLPs fare far worse than CNNs and STN-CNNs in terms of both transformation invariance and generalization. We find that STNs can improve the transformation invariance of a CNN architecture, provided the network is exposed to enough transformed samples during training. Furthermore, we observe that without explicit regularization, STNs provide no benefit over CNNs in terms of generalization ability.
dc.description.thesistype: Masters
dc.identifier.uri: https://orcid.org/0000-0001-8610-2478
dc.identifier.uri: http://hdl.handle.net/10394/37744
dc.language.iso: en
dc.publisher: North-West University (South Africa)
dc.subject: Convolutional Neural Network
dc.subject: Spatial Transformer Network
dc.subject: transformation invariance
dc.subject: scale invariance
dc.subject: rotation invariance
dc.subject: translation invariance
dc.subject: architectural comparison
dc.subject: subsampling
dc.title: Contrasting Convolutional Neural Networks with alternative architectures for transformation invariance
dc.type: Thesis
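The abstract's central mechanism, that stride sub-samples the input and so breaks translation invariance unless neighbouring values are sufficiently similar (local homogeneity), can be illustrated with a toy example. The following sketch is illustrative only and is not code from the thesis; it uses a 1D strided max-pooling function of my own construction to show the effect.

```python
# Illustrative sketch only (not from the thesis): 1D strided max-pooling,
# showing how sub-sampling breaks translation invariance, and how local
# homogeneity plus a kernel larger than the stride can preserve it.

def max_pool_1d(signal, kernel, stride):
    """Max-pool a 1D list with the given kernel size and stride."""
    return [max(signal[i:i + kernel])
            for i in range(0, len(signal) - kernel + 1, stride)]

# A signal with sharp changes: a shift by one position changes the
# pooled output, so the representation is not translation invariant.
sharp = [0, 9, 0, 0, 9, 0, 0, 0]
print(max_pool_1d(sharp, 2, 2))             # [9, 0, 9, 0]
print(max_pool_1d([0] + sharp[:-1], 2, 2))  # [0, 9, 9, 0]  (differs)

# A locally homogeneous signal: with kernel > stride (overlapping
# windows), the pooled output survives the same one-step shift.
smooth = [1, 5, 5, 5, 5, 5, 5, 1]
print(max_pool_1d(smooth, 3, 2))             # [5, 5, 5]
print(max_pool_1d([1] + smooth[:-1], 3, 2))  # [5, 5, 5]  (identical)
```

Note that with kernel equal to stride (non-overlapping windows, e.g. kernel=2, stride=2), even the smooth signal's pooled output changes under the same one-step shift, consistent with the abstract's observation that the required relationship between pooling kernel size and stride depends on the data's local homogeneity.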

Files

Original bundle

Name: Mouton_C.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format
