U.S. Department of Energy
Office of Scientific and Technical Information

Approximate l-Fold Cross-Validation with Least Squares SVM and Kernel Ridge Regression

Conference · 2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 1

Kernel methods have difficulty scaling to large modern data sets. The scalability issues stem from the computational and memory requirements of working with a large matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity of solving a single model and the overall computational cost of tuning hyperparameters remain major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately one million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real-world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
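To illustrate why a circulant approximation brings the cost down from O(n^3) to O(n log n), here is a minimal sketch of solving a ridge-style linear system with a circulant matrix using the FFT. It is not the paper's algorithm: it uses a single-level circulant matrix rather than a multi-level one, it omits the LS-SVM bias term and the cross-validation step, and the function name `solve_circulant_ridge`, the Gaussian-like first column, and the values of `n` and `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def solve_circulant_ridge(first_col, y, lam):
    """Solve (C + lam * I) alpha = y, where C is the circulant matrix
    whose first column is `first_col`. Costs O(n log n) via the FFT."""
    # The eigenvalues of a circulant matrix are the DFT of its first column.
    eig = np.fft.fft(first_col)
    # In the Fourier basis the system is diagonal, so the solve is a division.
    alpha = np.fft.ifft(np.fft.fft(y) / (eig + lam))
    return alpha.real

n = 256
idx = np.arange(n)
# Wrap-around distance makes the first column symmetric, so C is symmetric.
dist = np.minimum(idx, n - idx)
first_col = np.exp(-(dist / 10.0) ** 2)  # assumed Gaussian-like kernel values

rng = np.random.default_rng(0)
y = rng.standard_normal(n)
lam = 0.1  # assumed regularization strength

alpha_fft = solve_circulant_ridge(first_col, y, lam)

# Dense O(n^3) reference: build C explicitly with C[i, j] = c[(i - j) mod n].
C = first_col[(idx[:, None] - idx[None, :]) % n]
alpha_dense = np.linalg.solve(C + lam * np.eye(n), y)

print(np.allclose(alpha_fft, alpha_dense))
```

The FFT-based solve never forms the n-by-n matrix, which is also why the memory requirement drops from O(n^2) to O(n) in this simplified setting.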

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
OSTI ID:
1567338
Journal Information:
Journal Name: 2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 1
Country of Publication:
United States
Language:
English

Similar Records

Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression
Conference · Mon Dec 31 23:00:00 EST 2012 · OSTI ID:1111451

Scalable Hyper-parameter Estimation for Gaussian Process Based Time Series Analysis
Conference · Thu Dec 31 23:00:00 EST 2009 · OSTI ID:1081657

Randomized Sampling for Large Data Applications of SVM
Conference · Sat Dec 31 23:00:00 EST 2011 · OSTI ID:1059336
