OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information
  1. The Materials Data Facility: Data Services to Advance Materials Science Research

    With increasingly strict data management requirements from funding agencies and institutions, expanding focus on the challenges of research replicability, and growing data sizes and heterogeneity, new data needs are emerging in the materials community. The Materials Data Facility (MDF) operates two cloud-hosted services, data publication and data discovery, with features that promote open data sharing and self-service data publication and curation, and that encourage data reuse, layered with powerful data discovery tools. The data publication service simplifies the process of copying data to a secure storage location, assigning data a citable persistent identifier, and recording custom (e.g., material, technique, or instrument specific) and automatically extracted metadata in a registry, while the data discovery service will provide advanced search capabilities (e.g., faceting, free text range querying, and full text search) against the registered data and metadata. The MDF services empower individual researchers, research projects, and institutions to (I) publish research datasets, regardless of size, from local storage, institutional data stores, or cloud storage, without involvement of third-party publishers; (II) build, share, and enforce extensible domain-specific custom metadata schemas; (III) interact with published data and metadata via representational state transfer (REST) application program interfaces (APIs) to facilitate automation, analysis, and feedback; and (IV) access a data discovery model that allows researchers to search, interrogate, and eventually build on existing published data. We describe MDF’s design, current status, and future plans.
  2. Machine learning algorithms for modeling groundwater level changes in agricultural regions of the U.S.

    Climate, groundwater extraction, and surface water flows have complex nonlinear relationships with groundwater level in agricultural regions. To better understand the relative importance of each driver and predict groundwater level change, we develop a new ensemble modeling framework based on spectral analysis, machine learning, and uncertainty analysis, as an alternative to complex and computationally expensive physical models. We apply and evaluate this new approach in the context of two aquifer systems supporting agricultural production in the United States: the High Plains aquifer (HPA) and the Mississippi River Valley alluvial aquifer (MRVA). We select input data sets by using a combination of mutual information, genetic algorithms, and lag analysis, and then use the selected data sets in a Multilayer Perceptron network architecture to simulate seasonal groundwater level change. As expected, model results suggest that irrigation demand has the highest influence on groundwater level change for a majority of the wells. The subset of groundwater observations not used in model training or cross-validation correlates strongly (R > 0.8) with model results for 88 and 83% of the wells in the HPA and MRVA, respectively. In both aquifer systems, the error in the modeled cumulative groundwater level change during testing (2003-2012) was less than 2 m over a majority of the area. Here, we conclude that our modeling framework can serve as an alternative approach to simulating groundwater level change and water availability, especially in regions where subsurface properties are unknown.
  3. High-Performance data flows using analytical models and measurements

    The combination of analytical models and measurements provides practical configurations and parameters to achieve high data transport rates: (a) buffer sizes and number of parallel streams for improved memory and file transfer rates, (b) Hamilton and Scalable TCP congestion control modules for memory transfers in place of default CUBIC, and (c) direct IO mode for Lustre file systems for wide-area transfers. Conventional parameter selection using full sweeps is impractical in many cases since it takes months. By exploiting the unimodality of throughput profiles, we developed the d-w method that significantly reduces the number of measurements needed for parameter identification. This heuristic method was effective in practice in reducing the measurements by about 90% for Lustre and XFS file transfers.
  4. Accelerating and democratizing science through cloud-based services.

    Many businesses today save time and money, and increase their agility, by outsourcing mundane IT tasks to cloud providers. The author argues that similar methods can be used to overcome the complexities inherent in increasingly data-intensive, computational, and collaborative scientific research. He describes Globus Online, a system that he and his colleagues are developing to realize this vision. The scientific community today has unprecedented opportunities to effect transformational change in how individuals and teams engage in discovery. The driving force is a set of interrelated new capabilities that, when harnessed, can enable dramatic acceleration in the discovery process: greater availability of massive data, exponentially faster computers, ultra-high-speed networks, and deep interdisciplinary collaboration. The opportunity - and challenge - is to make these capabilities accessible not just to a few 'big science' projects but to every researcher at every level. Here, I argue that the key to seizing this opportunity is embracing software delivery methods that haven't been widely adopted in research, notably software as a service (SaaS) - a technology that forms an important part of what people refer to as the cloud. I also describe projects in the Computation Institute at the University of Chicago and Argonne National Laboratory that aim to realize this vision, focusing initially on data movement and management.
  5. Climate Science for a Sustainable Energy Future Test Bed and Data Infrastructure Final Report

    The collaborative Climate Science for a Sustainable Energy Future (CSSEF) project started in July 2011 with the goal of accelerating the development of climate model components (i.e., atmosphere, ocean and sea ice, and land surface) and enhancing their predictive capabilities while incorporating uncertainty quantification (UQ). This effort required accessing and converting observational data sets into specialized model testing and verification data sets and building a model development test bed, where model components and sub-models can be rapidly evaluated. CSSEF’s prototype test bed demonstrated how an integrated test bed could eliminate tedious activities associated with model development and evaluation, by providing the capability to constantly compare model output—where scientists store, acquire, reformat, regrid, and analyze data sets one-by-one—to observational measurements in a controlled test bed.
  6. Propagation of data error and parametric sensitivity in computable general equilibrium model.

    While computable general equilibrium (CGE) models are a well-established tool in economic analyses, it is often difficult to disentangle the effects of policies of interest from those of the assumptions made regarding the underlying calibration data and model parameters. To characterize the behavior of a CGE model of carbon output with respect to two of these assumptions, we perform a large-scale Monte Carlo experiment to examine its sensitivity to base year calibration data and elasticity of substitution parameters in the absence of a policy change. By examining a variety of output variables at different levels of economic and geographic aggregation, we assess how these forms of uncertainty impact the conclusions that can be drawn from the model simulations. We find greater sensitivity to uncertainty in the elasticity of substitution parameters than to uncertainty in the base-year data as the projection period increases. While many model simulations were conducted to generate large output samples, we find that few are required to capture the mean model response of the variables tested. However, characterizing standard errors and empirical probability distribution functions is not possible without a large number of simulations.
  7. The global gridded crop model intercomparison: Data and modeling protocols for Phase 1 (v1.0)

    We present protocols and input data for Phase 1 of the Global Gridded Crop Model Intercomparison, a project of the Agricultural Model Intercomparison and Improvement Project (AgMIP). The project consists of global simulations of yields, phenologies, and many land-surface fluxes using 12–15 modeling groups for many crops, climate forcing data sets, and scenarios over the historical period from 1948 to 2012. The primary outcomes of the project include (1) a detailed comparison of the major differences and similarities among global models commonly used for large-scale climate impact assessment, (2) an evaluation of model and ensemble hindcasting skill, (3) quantification of key uncertainties from climate input data, model choice, and other sources, and (4) a multi-model analysis of the agricultural impacts of large-scale climate extremes from the historical record.
  8. CIM-EARTH : Framework and case study.

    General equilibrium models have been used for decades to obtain insights into the economic implications of policies and decisions. Despite successes, however, these economic models have substantive limitations. Many of these limitations are due to computational and methodological constraints that can be overcome by leveraging recent advances in computer architecture, numerical methods, and economics research. Motivated by these considerations, we are developing a new modeling framework: the Community Integrated Model of Economic and Resource Trajectories for Humankind (CIM-EARTH). In this paper, we describe the key features of the CIM-EARTH framework and initial implementation, detail the model instance we use for studying the impacts of a carbon tax on international trade and the sensitivity of these impacts to assumptions on the rate of change in energy efficiency and labor productivity, and present results on the extent to which carbon leakage limits global reductions in emissions for some policy scenarios.
  9. CIM-EARTH: Community integrated model of economic and resource trajectories for humankind.

    Climate change is a global problem with local climatic and economic impacts. Mitigation policies can be applied on large geographic scales, such as a carbon cap-and-trade program for the entire U.S., on medium geographic scales, such as the NOx program for the northeastern U.S., or on smaller scales, such as statewide renewable portfolio standards and local gasoline taxes. To enable study of the environmental benefits, transition costs, capitalization effects, and other consequences of mitigation policies, we are developing dynamic general equilibrium models capable of incorporating important climate impacts. This report describes the economic framework we have developed and the current Community Integrated Model of Economic and Resource Trajectories for Humankind (CIM-EARTH) instance.

Search for:
All Records
Creator / Author
"Foster, I."
