Document Type: Research article

Authors

Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany

Abstract

The predictive performance of a neural network depends on two factors: the difficulty of the problem, defined by the number of classes and the complexity of the visual domain, and the capacity of the model, determined by its number of parameters and its structure. Using layer saturation and logistic regression probes, we confirm that these two factors influence the inference process in an antagonistic manner. This analysis allows over- and under-parameterization of convolutional neural networks to be detected. We show that the observed effects are independent of previously reported pathological patterns, such as the “tail pattern”. In addition, we study how saturation patterns emerge during training and show that they appear early in the optimization process. This allows problems to be detected quickly and potentially shortens cycle times during experiments. We also demonstrate that the emergence of tail patterns is independent of the capacity of the networks. Finally, we show that information processing within a tail of unproductive layers differs depending on the topology of the neural network architecture.

Highlights

  • A study of the properties of principal eigenfeatures and the saturation metric, an architecture-independent approach that allows quick detection of over- and under-parameterization of neural networks
  • The effectiveness of principal eigenfeatures is demonstrated using an autoencoder
  • Model capacity and problem difficulty have opposite effects on the saturation value
  • The final saturation patterns are visible very early in the training of fully connected networks, whereas in convolutional neural networks the patterns converge gradually towards their final form
  • Neural networks are unable to shift processing to other layers when some layers have higher or lower capacity
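
The saturation metric named in the highlights can be illustrated with a short sketch. This section does not define the metric, so the code below rests on an assumption: saturation is taken to be the fraction of principal directions of a layer's output covariance needed to explain a fixed share of the variance (99% here). The function name `layer_saturation` and the parameter `delta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def layer_saturation(activations, delta=0.99):
    """Estimate the saturation of one layer (assumed definition).

    activations: array of shape (n_samples, n_features) holding the
    layer outputs collected over a dataset.
    delta: share of variance the retained eigendirections must explain
    (0.99 is an assumed default, not taken from this section).
    """
    acts = np.asarray(activations, dtype=np.float64)
    # Covariance of the features; rows are samples, columns are features.
    cov = np.atleast_2d(np.cov(acts, rowvar=False))
    # Eigenvalues of a symmetric matrix, sorted from largest to smallest.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    # Remove tiny negative values caused by floating-point error.
    eigvals = np.clip(eigvals, 0.0, None)
    total = eigvals.sum()
    if total == 0.0:
        return 0.0
    explained = np.cumsum(eigvals) / total
    # Smallest number of eigendirections whose cumulative share >= delta.
    k = min(int(np.searchsorted(explained, delta)) + 1, eigvals.size)
    return k / acts.shape[1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Outputs confined to a 2-dimensional subspace of a 20-unit layer.
    low_rank = rng.normal(size=(1000, 2)) @ rng.normal(size=(2, 20))
    # Outputs that use all 20 units about equally.
    full_rank = rng.normal(size=(1000, 20))
    print(layer_saturation(low_rank), layer_saturation(full_rank))
```

Under this assumed definition, a low value means the layer uses only a small part of its available dimensions, and a value near 1 means it uses almost all of them.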
