Title: Imputing data that are missing at high rates using a boosting algorithm.

Abstract

Abstract not provided.

Authors:
Cauthen, Katherine Regina; Lambert, Gregory Joseph; Ray, Jaideep; Lefantzi, Sophia
Publication Date:
2016-07
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Sandia National Laboratories, Livermore, CA
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1373460
Report Number(s):
SAND2016-7230C
646167
DOE Contract Number:
AC04-94AL85000
Resource Type:
Conference
Resource Relation:
Conference: Proposed for presentation at the Joint Statistical Meetings.
Country of Publication:
United States
Language:
English

Citation Formats

Cauthen, Katherine Regina, Lambert, Gregory Joseph, Ray, Jaideep, and Lefantzi, Sophia. Imputing data that are missing at high rates using a boosting algorithm. United States: N. p., 2016. Web.
Cauthen, Katherine Regina, Lambert, Gregory Joseph, Ray, Jaideep, & Lefantzi, Sophia. Imputing data that are missing at high rates using a boosting algorithm. United States.
Cauthen, Katherine Regina, Lambert, Gregory Joseph, Ray, Jaideep, and Lefantzi, Sophia. 2016. "Imputing data that are missing at high rates using a boosting algorithm." United States. https://www.osti.gov/servlets/purl/1373460.
@article{osti_1373460,
title = {Imputing data that are missing at high rates using a boosting algorithm.},
author = {Cauthen, Katherine Regina and Lambert, Gregory Joseph and Ray, Jaideep and Lefantzi, Sophia},
abstractNote = {Abstract not provided.},
place = {United States},
year = 2016,
month = 7
}


  • There has been much recent interest in the application of artificial intelligence systems to real-world problems. Substantial interest has been shown in their application to investment markets. Artificial Neural Networks (ANNs) are the most common technique here. This paper is concerned with the use of ANNs in forecasting exchange rates. Much research has been carried out in currency markets. However, many of the studies use end-of-day or average quotes for currencies as a basis for prediction. A growing school of thought proposes that markets are non-random in the short term and can be shown to follow patterns. This short-term time span can be described as being a period when the markets are inefficient at price adjustments. The use of intraday data is an ideal testing ground for ANN-based research. This paper aims to study the intraday forecasting of the US Dollar/German Deutschmark and to address the question of whether ANNs can make acceptable predictions. The problems of forecasting in such a complex environment will be addressed.
  • A serious problem in mining industrial databases is that they are often incomplete, and a significant amount of data is missing or erroneously entered. This paper explores the use of machine-learning based alternatives to standard statistical data completion (data imputation) methods for dealing with missing data. We have approached the data completion problem using two well-known machine learning techniques. The first is an unsupervised clustering strategy which uses a Bayesian approach to cluster the data into classes. The classes so obtained are then used to predict multiple choices for the attribute of interest. The second technique involves modeling missing variables by supervised induction of a decision tree-based classifier. This predicts the most likely value for the attribute of interest. Empirical tests using extracts from industrial databases maintained by Honeywell customers have been done in order to compare the two techniques. These tests show both approaches are useful and have advantages and disadvantages. We argue that the choice between unsupervised and supervised classification techniques should be influenced by the motivation for solving the missing data problem, and discuss potential applications for the procedures we are developing.
  • Abstract not provided.
  • One of the limitations in achieving high energy resolution in X and gamma-ray spectrometers at high counting rates is incomplete charge collection in the germanium detector. In this paper the authors discuss some of the limiting factors which are relevant in detector fabrication and in the optical reset configurations in the input integrator circuit and the associated electronic processing. Experimental results demonstrate input counting rates up to 1.5 million counts per sec without any serious degradation in energy resolution.