Exploring neural network training dynamics through binary node activations
Abstract
Each node in a neural network is trained to activate for a specific region in the input domain. Any training samples that fall within
this region are therefore implicitly clustered together. Recent work has highlighted the importance of these clusters during the training process
but has not yet investigated their evolution during training. Towards this goal, we train several ReLU-activated MLPs on a simple classification
task (MNIST) and show that a consistent training process emerges: (1) sample clusters initially increase in size and then decrease as training
progresses, (2) the size of sample clusters in the first layer decreases more rapidly than in deeper layers, (3) binary node activations, especially of
nodes in deeper layers, become more sensitive to class membership as training progresses, (4) individual nodes remain poor predictors of class
membership, even though they are accurate when applied as a group. We report on these findings in detail and interpret them from the perspective of a
high-dimensional clustering process.
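
As an illustration of the notion of a sample cluster used above, the following is a minimal sketch, not the authors' implementation. It assumes a single fully connected ReLU layer, treats a node as active for a sample when its pre-activation is positive, and takes a node's cluster to be the set of samples for which it is active. The function names, the synthetic data, and the random weights are all hypothetical stand-ins for a trained MNIST network.

```python
import numpy as np

def binary_activations(X, W, b):
    """Return a boolean (n_samples, n_nodes) matrix.

    Entry [i, j] is True when ReLU node j is active (pre-activation > 0)
    for sample i. Assumes a single fully connected layer.
    """
    return (X @ W + b) > 0

def cluster_sizes(active):
    """Count, for each node, the samples that activate it (its cluster size)."""
    return active.sum(axis=0)

# Synthetic stand-ins (assumptions): random inputs and random layer weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))   # 200 samples with 8 features each
W = rng.normal(size=(8, 5))     # weights of a hypothetical 5-node layer
b = rng.normal(size=5)          # biases of the same layer

active = binary_activations(X, W, b)
sizes = cluster_sizes(active)
print(sizes)  # one cluster size per node
```

Recomputing `sizes` at successive checkpoints of a real training run would give one way to track how cluster sizes change over training, which is the kind of quantity observations (1) and (2) refer to.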