OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: SchemaOnRead: A Package for Schema-on-Read in R

Abstract

Schema-on-read is an agile approach to data storage and retrieval that works with data directly in its native form, deferring investments in data organization until production queries need to be run. Schema-on-read functions have been implemented in a wide range of analytical systems, most notably Hadoop. SchemaOnRead is a CRAN package that uses R’s flexible data representations to provide transparent and convenient support for the schema-on-read paradigm in R. The schema-on-read tools within the package include a single function call that recursively reads folders containing text, comma-separated value, raster image, R data, HDF5, NetCDF, spreadsheet, Weka, Epi Info, Pajek network, R network, HTML, SPSS, Systat, and Stata files. The provided tools can be used as-is or easily adapted to implement customized schema-on-read tool chains in R. This paper introduces and describes SchemaOnRead, the first R package specifically focused on providing explicit schema-on-read support in R.
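
The recursive, read-by-file-type idea in the abstract can be illustrated with a minimal sketch. This is not the SchemaOnRead package or its API: it is a language-neutral illustration written in Python, and the names `schema_on_read`, `READERS`, `read_text`, and `read_csv` are hypothetical. It assumes only two of the listed formats (plain text and comma-separated value) and leaves unrecognized files unparsed.

```python
import csv
from pathlib import Path


def read_text(path):
    """Return the whole file as one string."""
    return path.read_text(encoding="utf-8")


def read_csv(path):
    """Return the rows as a list of dicts keyed by the header line."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical dispatch table: file extension -> reader function.
# Adding an entry is how a customized tool chain would be built here.
READERS = {".txt": read_text, ".csv": read_csv}


def schema_on_read(path, readers=READERS):
    """Recursively read a file or folder without a predefined schema.

    A folder becomes a dict of its entries, a recognized file becomes
    whatever its reader returns, and an unrecognized file becomes None.
    """
    path = Path(path)
    if path.is_dir():
        return {
            child.name: schema_on_read(child, readers)
            for child in sorted(path.iterdir())
        }
    reader = readers.get(path.suffix.lower())
    if reader is None:
        return None  # unknown format: keep the entry, skip parsing
    return reader(path)
```

Calling `schema_on_read("some_folder")` returns a nested dictionary that mirrors the folder tree, so structure is imposed only when the data is read. Passing a different `readers` mapping is one way such a sketch could be adapted to other formats.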

Authors:
North, Michael J.
Publication Date:
2016-08
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE; Argonne National Laboratory
OSTI Identifier:
1392324
DOE Contract Number:
AC02-06CH11357
Resource Type:
Journal Article
Resource Relation:
Journal Name: The R journal; Journal Volume: 8; Journal Issue: 1
Country of Publication:
United States
Language:
English
Subject:
Data Science; R; schema-on-read

Citation Formats

North, Michael J. SchemaOnRead: A Package for Schema-on-Read in R. United States: N. p., 2016. Web.
North, Michael J. SchemaOnRead: A Package for Schema-on-Read in R. United States.
North, Michael J. 2016. "SchemaOnRead: A Package for Schema-on-Read in R". United States.
@article{osti_1392324,
title = {SchemaOnRead: A Package for Schema-on-Read in R},
author = {North, Michael J.},
abstractNote = {Schema-on-read is an agile approach to data storage and retrieval that works with data directly in its native form, deferring investments in data organization until production queries need to be run. Schema-on-read functions have been implemented in a wide range of analytical systems, most notably Hadoop. SchemaOnRead is a CRAN package that uses R’s flexible data representations to provide transparent and convenient support for the schema-on-read paradigm in R. The schema-on-read tools within the package include a single function call that recursively reads folders containing text, comma-separated value, raster image, R data, HDF5, NetCDF, spreadsheet, Weka, Epi Info, Pajek network, R network, HTML, SPSS, Systat, and Stata files. The provided tools can be used as-is or easily adapted to implement customized schema-on-read tool chains in R. This paper introduces and describes SchemaOnRead, the first R package specifically focused on providing explicit schema-on-read support in R.},
doi = {},
journal = {The R journal},
number = 1,
volume = 8,
place = {United States},
year = 2016,
month = 8
}
  • This article presents one form of computational schema for recognition algorithms in the form of modified Petri nets. Associated concepts are introduced and a description is given of the fundamental properties of the proposed nets that make it possible to treat them as schema for programs for artificial intelligence systems. 13 refs., 4 figs.