Statistics Seminar - 11-27-17 Mondays, 11-12 in SH304
Yuhong Yang, University of Minnesota
"Cross-Validation for Optimal and Reproducible Statistical Learning"
In data mining and statistical learning, we frequently encounter the task of comparing different methods/algorithms to reach a final choice for pure prediction or a scientific understanding/interpretation of a regression relationship. Cross-validation provides a powerful tool to address the matter. Unfortunately, there are seemingly widespread misconceptions about its use, which can lead to unreliable conclusions. In this talk, we will address the subtle issues involved and present results on minimax-optimal regression learning and consistent selection of the best method for the data. In addition, we will propose proper cross-validation tools for model selection diagnostics that will cry foul at an impressive-looking but not really reproducible outcome from a sparse-pattern-hunting method in the wild west of learning with a huge number of covariates.
Yuhong Yang received his Ph.D. in statistics from Yale in 1996. He then joined
the Department of Statistics at Iowa State University and moved to the University of Minnesota in 2004.
He has been a full professor there since 2007. His research interests include model selection, multi-armed
bandit problems, forecasting, high-dimensional data analysis, and machine learning. He has published
in journals in several fields, including Annals of Statistics, JASA, Biometrika, JRSSB, IEEE Transactions on Information Theory,
Journal of Econometrics, Journal of Approximation Theory, Proceedings of the AMS, Journal of Machine Learning Research, and
International Journal of Forecasting. He has served on the editorial boards of several journals, including Annals of Statistics, Annals of the Institute of Statistical Mathematics,
Statistica Sinica, and Statistics Surveys. He is a fellow of the Institute of Mathematical Statistics.