DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: gcamdata: An R Package for Preparation, Synthesis, and Tracking of Input Data for the GCAM Integrated Human-Earth Systems Model

Abstract

The increasing data requirements of complex models demand robust, reproducible, and transparent systems to track and prepare models’ inputs. Here we describe version 1.0 of the gcamdata R package that processes raw inputs to produce the hundreds of XML files needed by the GCAM integrated human-earth systems model. It features extensive functional and unit testing, data tracing and visualization, and enforces metadata, documentation, and flexibility in its component data-processing subunits. Although this package is specific to GCAM, many of its structural pieces and approaches should be broadly applicable to, and reusable by, other complex model/data systems aiming to improve transparency, reproducibility, and flexibility.

Authors:
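Bond-Lamberty, Ben [1]; Dorheim, Kalyn [1]; Cui, Ryna [1]; Horowitz, Russell [2]; Snyder, Abigail [1]; Calvin, Katherine [1]; Feng, Leyang [3]; Hoesly, Rachel [1]; Horing, Jill [4]; Kyle, G. Page [1]; Link, Robert [1]; Patel, Pralit [1]; Roney, Christopher [1]; Staniszewski, Aaron [1]; Turner, Sean [1]; Chen, Min [1]; Feijoo, Felipe [5]; Hartin, Corinne [1]; Hejazi, Mohamad [1]; Iyer, Gokul [1]; Kim, Sonny [1]; Liu, Yaling [6]; Lynch, Cary [1]; McJeon, Haewon [1]; Smith, Steven [1]; Waldhoff, Stephanie [1]; Wise, Marshall [1]; Clarke, Leon [1]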
  1. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Joint Global Change Research Inst.
  2. Univ. of California, Los Angeles, CA (United States)
  3. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Joint Global Change Research Inst.; American Univ., Washington, DC (United States)
  4. Stanford Univ., CA (United States)
  5. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Joint Global Change Research Inst.; Pontifical Catholic Univ. of Valparaiso (Chile)
  6. Columbia Univ., New York, NY (United States)
Publication Date:
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); USDOE Office of Fossil Energy (FE); USDOE Office of Nuclear Energy (NE); USDOE Office of Energy Efficiency and Renewable Energy (EERE); USEPA
OSTI Identifier:
1503168
Report Number(s):
PNNL-SA-135553
Journal ID: ISSN 2049-9647
Grant/Contract Number:  
AC05-76RL01830
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Open Research Software
Additional Journal Information:
Journal Volume: 7; Journal Issue: 1; Journal ID: ISSN 2049-9647
Publisher:
Software Sustainability Institute
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 54 ENVIRONMENTAL SCIENCES; 58 GEOSCIENCES

Citation Formats

Bond-Lamberty, Ben, Dorheim, Kalyn, Cui, Ryna, Horowitz, Russell, Snyder, Abigail, Calvin, Katherine, Feng, Leyang, Hoesly, Rachel, Horing, Jill, Kyle, G. Page, Link, Robert, Patel, Pralit, Roney, Christopher, Staniszewski, Aaron, Turner, Sean, Chen, Min, Feijoo, Felipe, Hartin, Corinne, Hejazi, Mohamad, Iyer, Gokul, Kim, Sonny, Liu, Yaling, Lynch, Cary, McJeon, Haewon, Smith, Steven, Waldhoff, Stephanie, Wise, Marshall, and Clarke, Leon. gcamdata: An R Package for Preparation, Synthesis, and Tracking of Input Data for the GCAM Integrated Human-Earth Systems Model. United States: N. p., 2019. Web. doi:10.5334/jors.232.
Bond-Lamberty, Ben, Dorheim, Kalyn, Cui, Ryna, Horowitz, Russell, Snyder, Abigail, Calvin, Katherine, Feng, Leyang, Hoesly, Rachel, Horing, Jill, Kyle, G. Page, Link, Robert, Patel, Pralit, Roney, Christopher, Staniszewski, Aaron, Turner, Sean, Chen, Min, Feijoo, Felipe, Hartin, Corinne, Hejazi, Mohamad, Iyer, Gokul, Kim, Sonny, Liu, Yaling, Lynch, Cary, McJeon, Haewon, Smith, Steven, Waldhoff, Stephanie, Wise, Marshall, and Clarke, Leon. 2019. "gcamdata: An R Package for Preparation, Synthesis, and Tracking of Input Data for the GCAM Integrated Human-Earth Systems Model". United States. doi:10.5334/jors.232. https://www.osti.gov/servlets/purl/1503168.
@article{osti_1503168,
title = {gcamdata: An R Package for Preparation, Synthesis, and Tracking of Input Data for the GCAM Integrated Human-Earth Systems Model},
author = {Bond-Lamberty, Ben and Dorheim, Kalyn and Cui, Ryna and Horowitz, Russell and Snyder, Abigail and Calvin, Katherine and Feng, Leyang and Hoesly, Rachel and Horing, Jill and Kyle, G. Page and Link, Robert and Patel, Pralit and Roney, Christopher and Staniszewski, Aaron and Turner, Sean and Chen, Min and Feijoo, Felipe and Hartin, Corinne and Hejazi, Mohamad and Iyer, Gokul and Kim, Sonny and Liu, Yaling and Lynch, Cary and McJeon, Haewon and Smith, Steven and Waldhoff, Stephanie and Wise, Marshall and Clarke, Leon},
abstractNote = {The increasing data requirements of complex models demand robust, reproducible, and transparent systems to track and prepare models’ inputs. Here we describe version 1.0 of the gcamdata R package that processes raw inputs to produce the hundreds of XML files needed by the GCAM integrated human-earth systems model. It features extensive functional and unit testing, data tracing and visualization, and enforces metadata, documentation, and flexibility in its component data-processing subunits. Although this package is specific to GCAM, many of its structural pieces and approaches should be broadly applicable to, and reusable by, other complex model/data systems aiming to improve transparency, reproducibility, and flexibility.},
doi = {10.5334/jors.232},
journal = {Journal of Open Research Software},
number = 1,
volume = 7,
place = {United States},
year = {2019},
month = {3}
}

Figures / Tables:

Figure 1: High level view of the code-data dependencies in the gcamdata package. This plot of the system architecture shows nodes (“chunks”, units of code charged with processing data and producing specific outputs) and edges (data flows between chunks). Nodes are colored by discipline, e.g., agriculture and land use-related code is black, energy system code is blue, etc. For clarity neither the initial data inputs nor the final XML outputs (i.e. the GCAM input files) are shown; this means that seemingly isolated nodes or groups of nodes actually contribute data directly into the model.

Works referenced in this record:

The case for open computer programs
journal, February 2012

  • Ince, Darrel C.; Hatton, Leslie; Graham-Cumming, John
  • Nature, Vol. 482, Issue 7386
  • DOI: 10.1038/nature10836

Archiving numerical models of biogeochemical dynamics
journal, January 2005

  • Thornton, Peter E.; Cook, Robert B.; Braswell, Bobby H.
  • Eos, Transactions American Geophysical Union, Vol. 86, Issue 44
  • DOI: 10.1029/2005EO440003

The representative concentration pathways: an overview
journal, August 2011


Global energy-climate scenarios and models: a review
journal, February 2014

  • Krey, Volker
  • Wiley Interdisciplinary Reviews: Energy and Environment, Vol. 3, Issue 4
  • DOI: 10.1002/wene.98

The SSP4: A world of deepening inequality
journal, January 2017


Programming tools: Adventures with R
journal, December 2014


Historical (1750–2014) anthropogenic emissions of reactive gases and aerosols from the Community Emissions Data System (CEDS)
journal, January 2018

  • Hoesly, Rachel M.; Smith, Steven J.; Feng, Leyang
  • Geoscientific Model Development, Vol. 11, Issue 1
  • DOI: 10.5194/gmd-11-369-2018

Reproducible Research in Computational Science
journal, December 2011


Shedding Light on the Dark Data in the Long Tail of Science
journal, January 2008


How well do integrated assessment models simulate climate change?
journal, December 2009


The Global Land Data Assimilation System
journal, March 2004

  • Rodell, M.; Houser, P. R.; Jambor, U.
  • Bulletin of the American Meteorological Society, Vol. 85, Issue 3
  • DOI: 10.1175/BAMS-85-3-381

A functional test platform for the Community Land Model
journal, May 2014


The rise of research networks
journal, October 2012


    Works referencing / citing this record:

    GCAM v5.1: representing the linkages between energy, water, land, climate, and economic systems
    journal, January 2019

    • Calvin, Katherine; Patel, Pralit; Clarke, Leon
    • Geoscientific Model Development, Vol. 12, Issue 2
    • DOI: 10.5194/gmd-12-677-2019

      Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.