Deep Learning Generalization, Extrapolation, and Over-parameterization

R Yousefzadeh - arXiv preprint arXiv:2203.10366, 2022 - arxiv.org
… The training loss function of these models has infinite … models to explain the generalization
of deep networks [4, 3, 2, 11, 20… on the output of a neural network in under-parameterized and …

The benefits of over-parameterization at initialization in deep ReLU networks

D Arpit, Y Bengio - arXiv preprint arXiv:1901.03611, 2019 - arxiv.org
… ) networks are popular in deep learning due to their ease of training and state-of-the-art
generalization… However for our purpose, we do not need to restrict l(.) to a specific choice, we only …

Methods and analysis of the first competition in predicting generalization of deep learning

…, S Yak, H Mobahi, B Neyshabur… - NeurIPS 2020 …, 2021 - proceedings.mlr.press
… the generalization measures are functions that are evaluated on a trained neural network and
… data points, we will refer to a set of neural networks trained on the same dataset as a task. …

Regularization matters: Generalization and optimization of neural nets vs their induced kernel

C Wei, JD Lee, Q Liu, T Ma - Advances in Neural …, 2019 - proceedings.neurips.cc
neural network generalization defies conventional explanations and requires new ones.
Neyshabur … NTK prediction function and apply even with infinite over-parametrization for both …

Fast convergence of natural gradient descent for over-parameterized neural networks

G Zhang, J Martens, RB Grosse - Advances in Neural …, 2019 - proceedings.neurips.cc
… order optimization to speed up training [Becker and LeCun… -convex function is an NP-complete
problem, and neural network … • We analyze the generalization properties of NGD, showing …

Finite versus infinite neural networks: an empirical study

J Lee, S Schoenholz, J Pennington… - … in Neural …, 2020 - proceedings.neurips.cc
neural networks. Because of this, we believe they will continue to play a transformative role
in … We quantified phenomena having to do with generalization, architecture dependence, …

Generalization bounds for deep convolutional neural networks

PM Long, H Sedghi - arXiv preprint arXiv:1905.12600, 2019 - arxiv.org
… on the generalization error of convolutional networks. The … to role of overparametrization
on generalization (Neyshabur2019). An explanation of this phenomenon that is consistent …

Random deep neural networks are biased towards simple functions

G De Palma, B Kiani, S Lloyd - Advances in Neural …, 2019 - proceedings.neurips.cc
… wide deep neural networks with ReLU activation function are biased towards simple functions.
… [10] explores the generalization properties of deep neural networks trained on partially …

Observational overfitting in reinforcement learning

X Song, Y Jiang, S Tu, Y Du, B Neyshabur - arXiv preprint arXiv …, 2019 - arxiv.org
Generalization for RL has recently grown to be an important … -linear function approximator
such as a neural network. On … in a synthetic environment and neural networks such as multi-…

The generalization-stability tradeoff in neural network pruning

B Bartoldson, A Morcos, A Barbu… - Advances in Neural …, 2020 - proceedings.neurips.cc
… Consistent with the nature of the pruning algorithm playing a role in generalization, we …
generalization, 2019. [13] Zeyuan Allen-Zhu, Yuanzhi Li, and Yingyu Liang. Learning and …