Insights regarding overfitting on noise in deep learning

Theunissen, Marthinus W.; Davel, Marelie H.; Barnard, Etienne

Date

2019-12

Author

Theunissen, Marthinus W.

Davel, Marelie H.

Barnard, Etienne

Abstract

The understanding of generalization in machine learning is in a state of flux. This is partly due to the relatively recent revelation that deep learning models are able to completely memorize training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about generalization. The phenomenon was brought to light and discussed in a seminal paper by Zhang et al. [24]. We expand upon this work by discussing local attributes of neural network training within the context of a relatively simple and generalizable framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the global deep learning model to generalize in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterized multilayer perceptrons and controlled noise in the training data. The main insights are that deep learning models are optimized for training data modularly, with different regions in the function space dedicated to fitting distinct kinds of sample information. Detrimental overfitting is largely prevented by the fact that different regions in the function space are used for prediction based on the similarity between new input data and that which has been optimized for.

URI

http://hdl.handle.net/10394/33955

Collections

Faculty of Engineering