OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Data base simulator

Technical Report
DOI: https://doi.org/10.2172/5221737 · OSTI ID: 5221737

This document describes the features of, and input to, a computer program that generates data bases whose data values contain deterministically known errors. The program was developed to support the assessment of automatic data editing procedures used to validate real data bases. The observed values in the simulated data are the sum of generated true values and generated error values. For a given variable, true data values may be generated by any of the following six methods: frequency distribution, conditional frequency distribution, analysis of variance model, multiple regression model, ARIMA time series model, or membership within a defined constrained region. The error values for a given variable may be simulated from an independent distribution or from a distribution that depends on the error values of other specified variables. The program can also serve other data simulation needs beyond the one described above: because the addition of errors to the true values is optional, one may readily simulate observed data for variables using one or more of the six methods listed.
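The core idea of the abstract (observed value = generated true value + generated error value, with the errors known exactly) can be illustrated with a minimal Python sketch. This is not the report's program: the function names, parameters, the Gaussian error model, the error rate, and the linear form of the dependence between error variables are all assumptions chosen for illustration. Only the first of the six generation methods (a frequency distribution) is shown.

```python
import random

def simulate_variable(n, values, weights, error_sd, error_rate, rng):
    """Return (true, error, observed) lists of length n.

    Hypothetical helper: true values come from a discrete frequency
    distribution; errors are independent of other variables.
    """
    # True values: sample categories according to their relative frequencies.
    true = rng.choices(values, weights=weights, k=n)
    # Errors (assumed model): most records are error-free; a fraction
    # error_rate receives a Gaussian error with standard deviation error_sd.
    error = [rng.gauss(0.0, error_sd) if rng.random() < error_rate else 0.0
             for _ in range(n)]
    # Observed values are the sum of true values and error values.
    observed = [t + e for t, e in zip(true, error)]
    return true, error, observed

def dependent_errors(base_errors, coupling, noise_sd, rng):
    """Errors for a second variable that depend on another variable's errors.

    Assumed form: a record is in error only if the base record is, and the
    new error is coupling * base_error plus Gaussian noise.
    """
    return [coupling * e + rng.gauss(0.0, noise_sd) if e != 0.0 else 0.0
            for e in base_errors]

rng = random.Random(0)  # fixed seed so the simulated errors are reproducible
true_a, err_a, obs_a = simulate_variable(
    n=1000, values=[10.0, 20.0, 30.0], weights=[0.5, 0.3, 0.2],
    error_sd=5.0, error_rate=0.1, rng=rng)
err_b = dependent_errors(err_a, coupling=2.0, noise_sd=0.0, rng=rng)
# Setting error_rate=0.0 skips the error step, so observed equals true.
true_c, err_c, obs_c = simulate_variable(
    n=100, values=[1.0, 2.0], weights=[1, 1],
    error_sd=5.0, error_rate=0.0, rng=rng)
```

Because the error lists are kept alongside the observed values, the output of a data editing procedure applied to `obs_a` could be compared against the records where `err_a` is nonzero, which is the kind of assessment the abstract describes.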

Research Organization:
Union Carbide Corp., Oak Ridge, TN (USA). Computer Sciences Div.
DOE Contract Number:
W-7405-ENG-26
OSTI ID:
5221737
Report Number(s):
ORNL/CSD/TM-171; ON: DE82013342
Resource Relation:
Other Information: Portions of document are illegible
Country of Publication:
United States
Language:
English