Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy

Kristoffer K. Wickstrøm*, Sigurd Løkse, Michael C. Kampffmeyer, Shujian Yu, José C. Príncipe, Robert Jenssen

*Corresponding author for this work

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Analyzing deep neural networks (DNNs) via information plane (IP) theory has recently gained tremendous attention as a means of gaining insight into, among other things, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators that are robust to the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while remaining computationally tractable enough to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.
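
To make the estimator named in the abstract concrete, the following is a minimal sketch of the matrix-based Rényi α-entropy and the mutual information built from it, not the authors' implementation. It assumes an RBF kernel applied to flattened layer outputs as a stand-in for a tensor kernel, a fixed kernel width `sigma`, and α = 1.01; the function names, `sigma`, and the random demo data are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_gram(x, sigma):
    """RBF Gram matrix for a batch; each sample is flattened to a vector first.

    Flattening a feature map and using the Euclidean distance is an assumption
    made here to stand in for a tensor kernel.
    """
    flat = x.reshape(x.shape[0], -1)
    sq_norms = np.sum(flat ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * flat @ flat.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def normalize_trace(k):
    """Scale a Gram matrix so that its trace equals one."""
    return k / np.trace(k)


def renyi_entropy(a, alpha=1.01):
    """Matrix-based Renyi alpha-entropy (in bits) of a unit-trace PSD matrix."""
    eigvals = np.linalg.eigvalsh(a)
    eigvals = np.clip(eigvals, 0.0, None)  # remove round-off negatives
    return np.log2(np.sum(eigvals ** alpha)) / (1.0 - alpha)


def mutual_information(k_x, k_y, alpha=1.01):
    """MI estimate: S(A) + S(B) - S(A, B), with the joint term taken from
    the trace-normalized Hadamard product of the two Gram matrices."""
    a = normalize_trace(k_x)
    b = normalize_trace(k_y)
    joint = normalize_trace(k_x * k_y)
    return (renyi_entropy(a, alpha) + renyi_entropy(b, alpha)
            - renyi_entropy(joint, alpha))


# Hypothetical usage: random arrays standing in for a batch of inputs and
# the corresponding outputs of one convolutional layer.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(32, 8, 8, 3))
layer_out = rng.normal(size=(32, 4, 4, 16))
mi_estimate = mutual_information(rbf_gram(inputs, sigma=10.0),
                                 rbf_gram(layer_out, sigma=10.0))
```

One point in the IP would pair an estimate like this, computed between a layer and the input, with the same quantity computed between the layer and the desired output. The kernel width strongly affects the estimate, so the fixed `sigma` used here is only a placeholder.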

Original language: English
Article number: 899
Pages (from-to): 1-21
Number of pages: 21
Journal: Entropy
Volume: 25
Issue number: 6
Early online date: 3 Jun 2023
Publication status: Published - Jun 2023

Bibliographical note

Funding Information:
This work was supported by The Research Council of Norway (RCN) through its Centre for Research-based Innovation funding scheme (grant number 309439) and Consortium Partners; RCN FRIPRO (grant number 315029); RCN IKTPLUSS (grant number 303514) and the UiT Thematic Initiative.

Publisher Copyright:
© 2023 by the authors.

Keywords

  • deep learning
  • information plane
  • information theory
  • kernel methods
