Machine Learning Toolkit for Extreme Scale

RESOURCE

Abstract

Support Vector Machines (SVMs) are a popular machine learning technique that has been applied for supervised learning in a wide range of domains, such as science, finance, and social networks. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large-scale systems, including commodity multi-core machines, tightly connected supercomputers, and cloud computing systems. Several techniques are proposed for improved speed and memory usage, including adaptive and aggressive elimination of samples for faster convergence and a sparse-format representation of data samples. MaTEx considers several heuristics, ranging from earliest-possible to lazy elimination of non-contributing samples. For the cases where an early sample elimination might turn out to be a false positive, low-overhead mechanisms for reconstructing key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.
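The sample-elimination idea described above can be illustrated with a minimal sketch. In SMO-style dual SVM training, a sample whose multiplier is pinned at a bound and whose KKT condition holds with some slack is unlikely to re-enter the solution, so it can be dropped from the active working set. The function below is a hypothetical illustration of that shrinking rule, not the MaTEx API; the names `shrink_working_set`, `alphas`, and `margins` are assumptions for this sketch.

```python
def shrink_working_set(alphas, margins, C, tol=1e-3):
    """Sketch of a shrinking heuristic for dual SVM training.

    alphas[i]  -- Lagrange multiplier of sample i (0 <= alpha <= C)
    margins[i] -- functional margin y_i * f(x_i) under the current model
    Returns the indices of samples kept in the active working set.
    """
    keep = []
    for i, (a, m) in enumerate(zip(alphas, margins)):
        # alpha pinned at 0 and confidently classified: KKT satisfied with slack.
        at_zero = a <= 0.0 and m > 1.0 + tol
        # alpha pinned at C and inside/violating the margin: bounded support vector.
        at_bound = a >= C and m < 1.0 - tol
        if not (at_zero or at_bound):
            keep.append(i)
    return keep

# Samples 0 and 2 satisfy their KKT conditions at a bound and are eliminated.
print(shrink_working_set([0.0, 0.5, 1.0, 0.0], [1.5, 1.0, 0.4, 0.9], C=1.0))
```

An elimination can later prove to be a false positive (a shrunk sample violates KKT once the model moves), which is why the abstract pairs aggressive elimination with low-overhead reconstruction of the eliminated state.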
Release Date:
2014-03-30
Project Type:
Open Source, Publicly Available Repository
Software Type:
Scientific
Licenses:
Other (Commercial or Open-Source): https://github.com/matex-org/matex/blob/master/LICENSE
Sponsoring Org.:
Code ID:
2638
Site Accession Number:
5118
Research Org.:
Pacific Northwest National Laboratory
Country of Origin:
United States

Citation Formats

Machine Learning Toolkit for Extreme Scale. Computer Software. https://github.com/matex-org/matex. USDOE. 30 Mar. 2014. Web. doi:10.11578/dc.20171025.1458.
(2014, March 30). Machine Learning Toolkit for Extreme Scale. [Computer software]. https://github.com/matex-org/matex. https://doi.org/10.11578/dc.20171025.1458.
"Machine Learning Toolkit for Extreme Scale." Computer software. March 30, 2014. https://github.com/matex-org/matex. https://doi.org/10.11578/dc.20171025.1458.
@misc{ doecode_2638,
title = {Machine Learning Toolkit for Extreme Scale},
author = {},
abstractNote = {Support Vector Machines (SVMs) are a popular machine learning technique that has been applied for supervised learning in a wide range of domains, such as science, finance, and social networks. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large-scale systems, including commodity multi-core machines, tightly connected supercomputers, and cloud computing systems. Several techniques are proposed for improved speed and memory usage, including adaptive and aggressive elimination of samples for faster convergence and a sparse-format representation of data samples. MaTEx considers several heuristics, ranging from earliest-possible to lazy elimination of non-contributing samples. For the cases where an early sample elimination might turn out to be a false positive, low-overhead mechanisms for reconstructing key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.},
doi = {10.11578/dc.20171025.1458},
url = {https://doi.org/10.11578/dc.20171025.1458},
howpublished = {[Computer Software] \url{https://doi.org/10.11578/dc.20171025.1458}},
year = {2014},
month = {mar}
}