On Excess Risk Convergence Rates of Neural Network Classifiers

Event Date: 

Monday, March 18, 2024 - 3:30pm to 4:45pm

Event Location: 

  • Sobel Seminar Room & Zoom

Event Price: 

FREE

Event Contact: 

Dr. Xiaoming Huo

School of Industrial and Systems Engineering

Georgia Institute of Technology (Georgia Tech)

  • Department Seminar

The recent success of neural networks (a.k.a. deep learning) in classification problems suggests that neural networks possess qualities distinct from more classical classifiers, such as SVMs or boosting classifiers. We study the performance of plug-in classifiers based on neural networks in a binary classification setting, as measured by their excess risks. Compared to the typical settings in the literature, we consider a more general scenario that resembles actual practice in two respects: first, the function class to be approximated includes, as a proper subset, the functions with known Kolmogorov-Donoho optimal exponent, and hence smooth functions; second, the neural network classifier is constructed as the minimizer of a surrogate loss rather than the 0-1 loss, so that gradient-descent-based numerical optimization can be readily applied. We study the estimation and approximation properties of neural networks to obtain a dimension-free, uniform rate of convergence. In the analysis of the estimation error, we obtain a novel result that relates the approximate excess risk to the approximate excess ϕ-risk, which is of independent interest. Finally, we show that the rate obtained is in fact minimax optimal up to a logarithmic factor, and the accompanying lower bound reveals the effect of the margin assumption in this regime. This is joint work with Andy Ko.
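To make the surrogate-loss construction mentioned in the abstract concrete, the following is a minimal sketch, not the speaker's actual setting: a one-hidden-layer ReLU network is trained by plain gradient descent on the logistic loss (a standard convex surrogate ϕ), and the plug-in classifier is then obtained by thresholding the learned score at zero. The synthetic data model, network width, step size, and the choice of logistic loss are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the speaker's construction): train a small
# ReLU network on a convex surrogate loss, then form the plug-in classifier
# sign(f(x)) and evaluate its empirical 0-1 risk.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data with labels in {-1, +1} (assumed model).
n, d = 500, 2
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n))

# One-hidden-layer ReLU network f(x) = w2 . relu(W1^T x + b1) + b2.
width = 32
W1 = rng.normal(size=(d, width)) / np.sqrt(d)
b1 = np.zeros(width)
w2 = rng.normal(size=width) / np.sqrt(width)
b2 = 0.0

def forward(X):
    H = np.maximum(X @ W1 + b1, 0.0)   # hidden ReLU activations
    return H, H @ w2 + b2              # scores f(x)

lr = 0.1
for step in range(2000):
    H, f = forward(X)
    # Logistic surrogate: phi(y f(x)) = log(1 + exp(-y f(x))).
    margin = np.clip(y * f, -30.0, 30.0)
    # Gradient of the average surrogate loss with respect to the scores.
    g = -y / (1.0 + np.exp(margin)) / n
    # Backpropagate through the linear output layer and the ReLU.
    grad_w2 = H.T @ g
    grad_b2 = g.sum()
    G = np.outer(g, w2) * (H > 0)
    grad_W1 = X.T @ G
    grad_b1 = G.sum(axis=0)
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    w2 -= lr * grad_w2; b2 -= lr * grad_b2

# Plug-in classifier: predict the sign of the learned score.
_, f = forward(X)
train_01_risk = np.mean(np.sign(f) != y)
print(f"empirical 0-1 risk of the plug-in classifier: {train_01_risk:.3f}")
```

In the framework described in the abstract, the quantity of interest is the 0-1 excess risk of this thresholded classifier, while the optimization only ever sees the differentiable ϕ-risk; relating the two is the role of the excess-risk versus excess-ϕ-risk result mentioned above.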