Abstract
Support Vector Machines (SVMs) are a popular machine learning technique for supervised learning and have been applied to a wide range of domains such as science, finance, and social networks. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large-scale systems, including commodity multi-core machines, tightly connected supercomputers, and cloud computing systems. Several techniques are proposed for improved speed and memory usage, including adaptive and aggressive elimination of samples for faster convergence and a sparse-format representation of data samples. Several heuristics, ranging from earliest-possible to lazy elimination of non-contributing samples, are considered in MaTEx. For the cases where an early sample elimination might result in a false positive, low-overhead mechanisms for reconstructing key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.
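To make the sample-elimination and sparse-storage ideas in the abstract concrete, the sketch below is a minimal, hypothetical illustration, not MaTEx's actual code or API: a serial dual coordinate descent trainer for a linear SVM that stores samples in SciPy CSR format and shrinks samples stuck at the box bounds out of later passes. The function name, parameters, and the simplified shrinking rule are illustrative assumptions only.

# Hypothetical illustration only: a serial, simplified sketch of the ideas the
# abstract mentions (sparse sample storage plus shrinking/elimination of
# non-contributing samples). It is NOT the MaTEx implementation or its API.
import numpy as np
from scipy.sparse import csr_matrix

def train_linear_svm(X, y, C=1.0, max_epochs=50, tol=1e-3, shrink_tol=1.0):
    """X: scipy.sparse CSR matrix (n_samples x n_features); y: labels in {+1, -1}."""
    n, d = X.shape
    alpha = np.zeros(n)            # dual variables, one per sample
    w = np.zeros(d)                # primal weights, maintained incrementally
    qii = np.asarray(X.multiply(X).sum(axis=1)).ravel()  # diagonal of the Gram matrix
    active = list(range(n))        # samples still considered "contributing"

    for epoch in range(max_epochs):
        max_violation = 0.0
        still_active = []
        np.random.shuffle(active)
        for i in active:
            row = X.getrow(i)
            G = y[i] * row.dot(w)[0] - 1.0     # gradient of the dual objective at i

            # Shrinking heuristic (simplified): a sample stuck at a bound with a
            # strongly non-violating gradient is eliminated from later passes.
            if (alpha[i] == 0.0 and G > shrink_tol) or \
               (alpha[i] == C and G < -shrink_tol):
                continue
            still_active.append(i)

            # Projected gradient respecting the box constraint 0 <= alpha_i <= C.
            if alpha[i] == 0.0:
                PG = min(G, 0.0)
            elif alpha[i] == C:
                PG = max(G, 0.0)
            else:
                PG = G
            max_violation = max(max_violation, abs(PG))

            if abs(PG) > 1e-12 and qii[i] > 0:
                new_alpha = min(max(alpha[i] - G / qii[i], 0.0), C)
                delta = new_alpha - alpha[i]
                # Sparse update: only the features present in sample i are touched.
                w[row.indices] += delta * y[i] * row.data
                alpha[i] = new_alpha

        active = still_active
        if max_violation < tol or not active:
            break
    return w, alpha

Shrunk samples keep their accumulated contribution to the weight vector, so eliminating them only removes work from later iterations. As the abstract notes, an aggressive elimination can occasionally be a false positive, which is why a production system would additionally need a low-overhead way to re-check shrunk samples or reconstruct the associated state.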
- Release Date: 2014-03-30
- Project Type: Open Source, Publicly Available Repository
- Software Type: Scientific
- Licenses: Other (Commercial or Open-Source): https://github.com/matex-org/matex/blob/master/LICENSE
- Sponsoring Org.: USDOE
- Primary Award/Contract Number: AC05-76RL01830
- Code ID: 2638
- Site Accession Number: 5118
- Research Org.: Pacific Northwest National Laboratory
- Country of Origin: United States
Citation Formats
- MLA: Machine Learning Toolkit for Extreme Scale. Computer Software. https://github.com/matex-org/matex. USDOE. 30 Mar. 2014. Web. doi:10.11578/dc.20171025.1458.
- APA: (2014, March 30). Machine Learning Toolkit for Extreme Scale. [Computer software]. https://github.com/matex-org/matex. https://doi.org/10.11578/dc.20171025.1458.
- Chicago: "Machine Learning Toolkit for Extreme Scale." Computer software. March 30, 2014. https://github.com/matex-org/matex. https://doi.org/10.11578/dc.20171025.1458.
@misc{doecode_2638,
title = {Machine Learning Toolkit for Extreme Scale},
author = {},
abstractNote = {Support Vector Machines (SVMs) are a popular machine learning technique for supervised learning and have been applied to a wide range of domains such as science, finance, and social networks. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large-scale systems, including commodity multi-core machines, tightly connected supercomputers, and cloud computing systems. Several techniques are proposed for improved speed and memory usage, including adaptive and aggressive elimination of samples for faster convergence and a sparse-format representation of data samples. Several heuristics, ranging from earliest-possible to lazy elimination of non-contributing samples, are considered in MaTEx. For the cases where an early sample elimination might result in a false positive, low-overhead mechanisms for reconstructing key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.},
doi = {10.11578/dc.20171025.1458},
url = {https://doi.org/10.11578/dc.20171025.1458},
howpublished = {[Computer Software] \url{https://doi.org/10.11578/dc.20171025.1458}},
year = {2014},
month = {mar}
}