The Role of Statistical Theory in Understanding Deep Learning

Research Seminar

Time and Place: February 13, 2024, 15:45; SR 2.059, Mathematics Building 20.30

Abstract:

In recent years, there has been a surge of interest across different research areas in improving the theoretical understanding of deep learning. A very promising approach is the statistical one, which interprets deep learning as a nonlinear or nonparametric generalization of existing statistical models. For instance, a simple fully connected neural network is equivalent to a recursive generalized linear model with a hierarchical structure. Building on this connection, many recent papers have derived convergence rates for neural networks in nonparametric regression or classification settings. Nevertheless, phenomena such as overparameterization seem to contradict the statistical principle of the bias-variance trade-off. Deep learning therefore cannot be explained solely by existing techniques of mathematical statistics; it also requires a radical rethinking. In this talk, we will explore both the importance of statistics for the understanding of deep learning and its limitations, i.e., the necessity of connecting with other research areas.