NWU Institutional Repository

Contrasting Convolutional Neural Networks with alternative architectures for transformation invariance

dc.contributor.advisor: Davel, M.H.
dc.contributor.author: Mouton, Coenraad
dc.contributor.researchID: 23607955 - Davel, Marelie Hattingh (Supervisor)
dc.date.accessioned: 2021-11-09T14:08:02Z
dc.date.available: 2021-11-09T14:08:02Z
dc.date.issued: 2021
dc.description: MEng (Computer and Electronic Engineering), North-West University, Potchefstroom Campus
dc.description.abstract: Convolutional Neural Networks (CNNs) have become the standard for image classification tasks; however, they are not completely invariant to transformations of the input image. We empirically investigate to what degree CNNs can handle transformed input images, and compare their abilities to those of multilayer perceptrons (MLPs) and spatial transformer networks (STNs). We measure invariance to three affine transformations, namely translation, rotation, and scale, with a specific focus on translation. The lack of translation invariance in CNNs is attributed to the use of stride, which sub-samples the input and results in a loss of information, and to fully connected layers, which lack spatial reasoning. We first show theoretically that stride can greatly benefit translation invariance, provided it is combined with sufficient similarity between neighbouring pixels, a characteristic we refer to as local homogeneity. We then empirically verify this hypothesis, and observe that this characteristic is dataset-specific, which dictates the relationship between pooling kernel size and stride required for translation invariance. Furthermore, we find that pooling kernel size and stride involve a trade-off between generalization and translation invariance: larger kernel sizes and strides lead to better invariance but poorer generalization. We then compare the translation, scale, and rotation invariance of CNNs to that of STN-CNNs and MLPs. As expected, we find that MLPs fare far worse than CNNs and STN-CNNs in terms of both transformation invariance and generalization. We find that STNs can improve the transformation invariance of a CNN architecture, provided the network is exposed to enough transformed samples during training. Furthermore, we observe that without explicit regularization, STNs provide no benefit over CNNs in terms of generalization ability.
dc.description.thesistype: Masters
dc.identifier.uri: https://orcid.org/0000-0001-8610-2478
dc.identifier.uri: http://hdl.handle.net/10394/37744
dc.language.iso: en
dc.publisher: North-West University (South Africa)
dc.subject: Convolutional Neural Network
dc.subject: Spatial Transformer Network
dc.subject: transformation invariance
dc.subject: scale invariance
dc.subject: rotation invariance
dc.subject: translation invariance
dc.subject: architectural comparison
dc.subject: subsampling
dc.title: Contrasting Convolutional Neural Networks with alternative architectures for transformation invariance
dc.type: Thesis
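The abstract's central mechanism, that stride sub-samples the input and so breaks translation invariance unless neighbouring values are sufficiently similar (local homogeneity), can be illustrated with a toy example. The following sketch is illustrative only and is not code from the thesis; it uses a 1D strided max-pooling function of my own construction to show the effect.

```python
# Illustrative sketch only (not from the thesis): 1D strided max-pooling,
# showing how sub-sampling breaks translation invariance, and how local
# homogeneity plus a kernel larger than the stride can preserve it.

def max_pool_1d(signal, kernel, stride):
    """Max-pool a 1D list with the given kernel size and stride."""
    return [max(signal[i:i + kernel])
            for i in range(0, len(signal) - kernel + 1, stride)]

# A signal with sharp changes: a shift by one position changes the
# pooled output, so the representation is not translation invariant.
sharp = [0, 9, 0, 0, 9, 0, 0, 0]
print(max_pool_1d(sharp, 2, 2))             # [9, 0, 9, 0]
print(max_pool_1d([0] + sharp[:-1], 2, 2))  # [0, 9, 9, 0]  (differs)

# A locally homogeneous signal: with kernel > stride (overlapping
# windows), the pooled output survives the same one-step shift.
smooth = [1, 5, 5, 5, 5, 5, 5, 1]
print(max_pool_1d(smooth, 3, 2))             # [5, 5, 5]
print(max_pool_1d([1] + smooth[:-1], 3, 2))  # [5, 5, 5]  (identical)
```

Note that with kernel equal to stride (non-overlapping windows, e.g. kernel=2, stride=2), even the smooth signal's pooled output changes under the same one-step shift, consistent with the abstract's observation that the required relationship between pooling kernel size and stride depends on the data's local homogeneity.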

Files

Original bundle

Name: Mouton_C.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format
