U.S. Department of Energy
Office of Scientific and Technical Information

Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression

Conference · OSTI ID:1111451

Kernel methods have difficulty scaling to large modern data sets. The scalability issues stem from the computational and memory requirements of working with a large matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity of solving a single model and the overall computational complexity of tuning hyperparameters remain major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real-world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
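To illustrate why a circulant approximation can bring the cost of one solve down to O(n log n), here is a minimal sketch of a ridge-type system (C + lam*I) alpha = y solved with the FFT. It is not the paper's multi-level construction or its cross-validation procedure. It assumes a single-level circulant matrix C built from a Gaussian kernel on n equally spaced points on a ring, and the names circulant_first_column, solve_circulant_ridge, gamma, and lam are hypothetical, chosen only for this example.

```python
import numpy as np


def circulant_first_column(n, gamma):
    """First column of an assumed single-level circulant kernel matrix.

    Assumption for illustration: n equally spaced points on a ring, with a
    Gaussian kernel evaluated on the wrap-around distance between indices.
    """
    idx = np.arange(n)
    dist = np.minimum(idx, n - idx).astype(float)
    return np.exp(-gamma * dist ** 2)


def solve_circulant_ridge(c, y, lam):
    """Solve (C + lam * I) alpha = y, where C is circulant with first column c.

    A circulant matrix is diagonalized by the discrete Fourier transform and
    its eigenvalues are fft(c), so the solve costs O(n log n) rather than the
    O(n^3) of a dense factorization.
    """
    eigenvalues = np.fft.fft(c)
    alpha_hat = np.fft.fft(y) / (eigenvalues + lam)
    # c is symmetric and y is real, so the imaginary part is round-off only.
    return np.fft.ifft(alpha_hat).real


if __name__ == "__main__":
    n = 1024
    c = circulant_first_column(n, gamma=0.5)
    y = np.random.default_rng(0).standard_normal(n)
    alpha = solve_circulant_ridge(c, y, lam=0.1)
    # C @ alpha is a circular convolution of c with alpha, computed via FFT.
    residual = np.fft.ifft(np.fft.fft(c) * np.fft.fft(alpha)).real + 0.1 * alpha - y
    print("max residual:", np.abs(residual).max())
```

The dense matrix is never formed: only its first column is stored, which is also where the memory savings of a circulant approximation would come from.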

Research Organization:
Oak Ridge National Laboratory (ORNL)
Sponsoring Organization:
EE USDOE - Office of Energy Efficiency and Renewable Energy (EE)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1111451
Country of Publication:
United States
Language:
English

Similar Records

Approximate l-Fold Cross-Validation with Least Squares SVM and Kernel Ridge Regression
Conference · Thu Apr 10 00:00:00 EDT 2014 · 2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 1 · OSTI ID:1567338

Randomized Sampling for Large Data Applications of SVM
Conference · Sat Dec 31 23:00:00 EST 2011 · OSTI ID:1059336

A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression
Journal Article · Mon Aug 06 00:00:00 EDT 2018 · Proceedings - IEEE International Parallel and Distributed Processing Symposium (IPDPS) · OSTI ID:1563957
