Insights regarding overfitting on noise in deep learning
Date
2019-12
Author
Theunissen, Marthinus W.
Davel, Marelie H.
Barnard, Etienne
Abstract
The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24].
We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data.
The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.
Collections
- Faculty of Engineering