Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression
Conference · OSTI ID:1111451
- ORNL
Kernel methods have difficulty scaling to large modern data sets. The scalability issues stem from the computational and memory requirements of working with a large matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity of solving a single model and the overall computational complexity of tuning hyperparameters remain major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real-world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
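To illustrate why a circulant approximation can reduce the cost from O(n^3) to O(n log n), the following is a minimal sketch, not the method from the paper. It assumes a single-level circulant matrix C (the paper uses a multi-level one) and solves the ridge-style system (C + lam*I) alpha = y with the FFT, relying on the standard fact that the discrete Fourier transform diagonalizes any circulant matrix. The function name `circulant_ridge_solve`, the Gaussian first column, and the values of `n`, `sigma`, and `lam` are all illustrative assumptions.

```python
import numpy as np


def circulant_ridge_solve(c, y, lam):
    """Solve (C + lam*I) alpha = y, where C is the circulant matrix
    whose first column is c. Costs O(n log n) via the FFT.

    Hypothetical helper for illustration only; assumes every
    eigenvalue of C plus lam is nonzero.
    """
    eigenvalues = np.fft.fft(c)          # eigenvalues of C
    y_hat = np.fft.fft(y)                # y in the Fourier basis
    alpha_hat = y_hat / (eigenvalues + lam)
    # For real c and y the exact solution is real; drop round-off.
    return np.real(np.fft.ifft(alpha_hat))


# Illustrative data: a Gaussian profile over circular distance on a grid.
n, sigma, lam = 64, 3.0, 0.1
k = np.arange(n)
dist = np.minimum(k, n - k)              # wrap-around distance
c = np.exp(-dist**2 / (2.0 * sigma**2))  # first column of C

rng = np.random.default_rng(0)
y = rng.standard_normal(n)
alpha = circulant_ridge_solve(c, y, lam)

# Dense check (O(n^2) memory), used here only to verify the sketch.
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
residual = np.linalg.norm((C + lam * np.eye(n)) @ alpha - y)
```

Only the first column `c` is stored, so memory is O(n) rather than O(n^2). The dense matrix `C` is built here solely to check the residual.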
- Research Organization:
- Oak Ridge National Laboratory (ORNL)
- Sponsoring Organization:
- EE USDOE - Office of Energy Efficiency and Renewable Energy (EE)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1111451
- Country of Publication:
- United States
- Language:
- English
Similar Records
Approximate l-Fold Cross-Validation with Least Squares SVM and Kernel Ridge Regression
Conference · Thu Apr 10 00:00:00 EDT 2014 · 2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 1 · OSTI ID:1567338
Randomized Sampling for Large Data Applications of SVM
Conference · Sat Dec 31 23:00:00 EST 2011 · OSTI ID:1059336
A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression
Journal Article · Sun Aug 05 20:00:00 EDT 2018 · Proceedings - IEEE International Parallel and Distributed Processing Symposium (IPDPS) · OSTI ID:1563957