Show simple item record

dc.contributor.advisor       Barnard, E.
dc.contributor.advisor       Davel, M.H.
dc.contributor.author        Theunissen, Marthinus Wilhelmus
dc.date.accessioned          2021-11-30T12:08:59Z
dc.date.available            2021-11-30T12:08:59Z
dc.date.issued               2021
dc.identifier.uri            https://orcid.org/0000-0002-7456-7769
dc.identifier.uri            http://hdl.handle.net/10394/38073
dc.description               PhD (Computer and Electronic Engineering), North-West University, Potchefstroom Campus    en_US
dc.description.abstract      We present an investigation of how simple artificial neural networks (specifically, feed-forward networks with full connections between each successive pair of layers) generalize to out-of-sample data. By emphasizing the substructures formed within these networks, we are able to shed light on several phenomena and relevant open questions in the literature. Specifically, we show that hidden units with piecewise linear activation functions are optimized on the training set in a distributed manner, meaning each sub-unit is optimized only to reduce the loss of a specific sub-population of the training set. This mechanism gives rise to a type of modularity that is not often considered in investigations of artificial neural networks and generalization. We are able to uncover informative regularity in sub-unit behavior and elucidate known phenomena such as: different artificial neural networks tend to prioritize similar samples; over-parametrization does not necessarily lead to poor generalization; artificial neural networks are able to interpolate large amounts of noise and still generalize appropriately; and generalization error as a function of representational capacity undergoes a second descent beyond the point of interpolation (a.k.a. the double descent phenomenon). We motivate a perspective on generalization in deep learning that is less focused on the complexity of hypothesis spaces and instead looks to substructures, and the manner in which training data is compartmentalized, as a way of understanding the observed ability of these networks to generalize. This perspective contradicts classical ideas of generalization and complexity under certain conditions.    en_US
dc.language.iso              en    en_US
dc.publisher                 North-West University (South Africa).    en_US
dc.subject                   Deep learning    en_US
dc.subject                   Generalization    en_US
dc.subject                   Learning theory    en_US
dc.subject                   Interpolation    en_US
dc.title                     Generalization in deep learning : bilateral synergies in MLP learning    en_US
dc.type                      Thesis    en_US
dc.description.thesistype    Doctoral    en_US
dc.contributor.researchID    21021287 - Barnard, Etienne (Supervisor)
dc.contributor.researchID    23607955 - Davel, Marelie Hattingh (Supervisor)

