U.S. Department of Energy
Office of Scientific and Technical Information
  1. Merged Observatory Data Files (MODFs): an integrated observational data product supporting process-oriented investigations and diagnostics

    A large and ever-growing body of geophysical information is measured in campaigns and at specialized observatories as part of scientific expeditions and experiments. These collections of observed data include many essential climate variables (as defined by the Global Climate Observing System) but are often distinguished by a wide range of additional non-routine measurements that are designed to document not only the state of the environment but also the drivers that contribute to that state. These field data are used not only to further understand environmental processes through observation-based studies but also to provide baseline data to test model performance and to codify understanding to improve predictive capabilities. To address the considerable barriers to using these diverse and complex data for observation–model research, the Merged Observatory Data File (MODF) concept has been developed. A MODF combines measurements from multiple instruments into a single file that complies with well-established data format and metadata practices and has been designed to parallel the development of corresponding Merged Model Data Files (MMDFs). Using the MODF and MMDF protocols will facilitate the evolution of model intercomparison projects into model intercomparison and improvement projects by putting observation and model data “on the same page” in a timely manner. The MODF concept was developed especially for weather forecast model studies in the Arctic. The surprisingly complex process of implementing MODFs in that context refined the concept itself. Thus, this article explains the concept of MODFs by providing details on the issues that were revealed and resolved during that first specific implementation. Detailed instructions are provided on how to make MODFs, and this article can be considered a MODF creation manual.
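
    As a rough illustration of what "combining measurements from multiple instruments into a single file" involves, the sketch below places several instrument streams on one shared time axis and fills gaps with a missing-data marker. It is not the MODF procedure itself: the function name, the input layout, the fill value, and the sample numbers are all assumptions made for this example, and a real MODF would also carry the metadata conventions described in the article.

```python
from datetime import datetime, timedelta

FILL_VALUE = -9999.0  # assumed missing-data marker, not taken from the MODF specification


def merge_on_common_time(streams, start, step, count):
    """Place each instrument's samples on one shared time axis.

    streams: {variable_name: {datetime: value}} (hypothetical input layout).
    Returns (times, merged), where merged[variable_name] is a list aligned with times.
    """
    times = [start + i * step for i in range(count)]
    merged = {}
    for name, samples in streams.items():
        # Samples that do not fall exactly on a grid time are ignored in this sketch.
        merged[name] = [samples.get(t, FILL_VALUE) for t in times]
    return times, merged


# Made-up sample values for two hypothetical instruments.
t0 = datetime(2018, 2, 1, 0, 0)
streams = {
    "air_temperature": {t0: 250.1, t0 + timedelta(minutes=1): 250.3},
    "wind_speed": {t0 + timedelta(minutes=1): 4.2},
}
times, merged = merge_on_common_time(streams, t0, timedelta(minutes=1), 3)
```

    Every variable ends up with the same length as the time axis, which is what allows the merged record to be written out as a single table or file.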

  2. Special Observing Period (SOP) data for the Year of Polar Prediction site Model Intercomparison Project (YOPPsiteMIP)

    The rapid changes occurring in the polar regions require an improved understanding of the processes that are driving these changes. At the same time, increased human activities such as marine navigation, resource exploitation, aviation, commercial fishing, and tourism require reliable and relevant weather information. One of the primary goals of the World Meteorological Organization's Year of Polar Prediction (YOPP) project is to improve the accuracy of numerical weather prediction (NWP) at high latitudes. During YOPP, two Canadian “supersites” were commissioned and equipped with new ground-based instruments for enhanced meteorological and system process observations. Additional pre-existing supersites in Canada, the United States, Norway, Finland, and Russia also provided data from ongoing long-term observing programs. These supersites collected a wealth of observations that are well suited to address YOPP objectives. In order to increase data usability and station interoperability, novel Merged Observatory Data Files (MODFs) were created for the seven supersites over two Special Observing Periods (February to March 2018 and July to September 2018). All observations collected at the supersites were compiled into this standardized NetCDF MODF format, simplifying the process of conducting pan-Arctic NWP verification and process evaluation studies. This paper describes the seven Arctic YOPP supersites, their instrumentation, data collection and processing methods, the novel MODF format, and examples of the observations contained therein. MODFs comprise the observational contribution to the model intercomparison effort, termed YOPP site Model Intercomparison Project (YOPPsiteMIP).
    All YOPPsiteMIP MODFs are publicly accessible via the YOPP Data Portal (Whitehorse: https://doi.org/10.21343/a33e-j150, Huang et al., 2023a; Iqaluit: https://doi.org/10.21343/yrnf-ck57, Huang et al., 2023b; Sodankylä: https://doi.org/10.21343/m16p-pq17, O'Connor, 2023; Utqiaġvik: https://doi.org/10.21343/a2dx-nq55, Akish and Morris, 2023c; Tiksi: https://doi.org/10.21343/5bwn-w881, Akish and Morris, 2023b; Ny-Ålesund: https://doi.org/10.21343/y89m-6393, Holt, 2023; and Eureka: https://doi.org/10.21343/r85j-tc61, Akish and Morris, 2023a), which is hosted by MET Norway, with corresponding output from NWP models.

  3. Architecture of a Data Portal for Publishing and Delivering Open Data for Atmospheric Measurement

    Atmospheric data are collected by researchers every day. Campaigns such as GOAmazon 2014/2015 and the Amazon Tall Tower Observatory collect essential data on aerosols, gases, cloud properties, and meteorological parameters in the Brazilian Amazon basin. These data products provide insights and essential information for analyzing and predicting natural processes. However, in Brazil, it is estimated that more than 80% of the scientific data collected are not published due to the lack of web portals that collect and store these data. This makes it difficult, or even impossible, to access and integrate the data, which can result in the loss of significant amounts of information and significantly affect the understanding of the overall data. To address this problem, we propose a data portal architecture and open data deployment that enable Big Data processing, human interaction, and download-oriented approaches with tools that help users catalog, publish, and visualize atmospheric data. Thus, we describe the architecture developed, based on the experience of the Atmospheric Radiation Measurement Data Center, which incorporates the principles of FAIR, the infrastructure and content management system for managing scientific data. Partial results from the portal were tested with environmental data from contaminated areas at the University of São Paulo. Overall, this data portal creates more shared knowledge about atmospheric processes by providing users with access to open environmental data.

  4. AGU/AMS Abstract Search and Display Software

    The AGU/AMS Abstract Search and Display Software is a standalone web application which enables the searching, storing, and displaying of abstracts featured at the annual American Geophysical Union (AGU) and American Meteorological Society (AMS) meetings. This application is designed for those who wish to host a standalone web application and feature a select subset of posters and talks scheduled for the AGU/AMS meetings. Please read the entirety of this README.md file before attempting to download and use the application. There are three views available via the UI:
      • Lookup - enables searching and submitting posters for displaying on the summary view
      • Manual Submission - allows individual manual submission of posters given a poster ID
      • Summary - displays all posters submitted by users from the lookup view

  5. Spatial Interpolation of Air Pollutant and Meteorological Variables in Central Amazonia

    The Amazon Rainforest is highlighted by the global community both for its extensive vegetation cover that constantly suffers the effects of anthropic action and for its substantial biodiversity. This dataset presents data of meteorological variables from the Amazon Rainforest region with a spatial resolution of 0.001° in latitude and longitude, resulting from an interpolation process. The original data were obtained from the GoAmazon 2014/5 project, in the Atmospheric Radiation Measurement (ARM) repository, and then processed through mathematical and statistical methods. The dataset presented here can be used in experiments in the field of Data Science, such as training models for predicting climate variables or modeling the distribution of species.
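
    The abstract does not say which interpolation method was used, so the following is only a generic illustration of spatial interpolation, using inverse distance weighting as a stand-in technique. The site coordinates and values are made up, and distances are computed directly in degrees, which is a simplification that is only reasonable over a small area.

```python
import math


def idw(points, lat, lon, power=2.0):
    """Inverse-distance-weighted estimate at (lat, lon).

    points: list of (lat, lon, value) tuples from measurement sites.
    Distances are plain Euclidean distances in degrees (a simplification).
    """
    num = 0.0
    den = 0.0
    for p_lat, p_lon, value in points:
        d = math.hypot(lat - p_lat, lon - p_lon)
        if d == 0.0:
            return value  # the query point coincides with a site
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Two hypothetical sites with made-up values; the query point lies midway between them.
sites = [(-3.10, -60.00, 30.0), (-3.10, -60.02, 26.0)]
estimate = idw(sites, -3.10, -60.01)
```

    Evaluating such a function at every node of a regular latitude/longitude grid is one way a gridded product could be produced from point measurements.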

  6. Clustering-Based Predictive Analytics to Improve Scientific Data Discovery

    Given the sheer volume of scientific data archived within the data-intensive projects at the US Department of Energy's Oak Ridge National Laboratory, finding precisely what data we are looking for may not be a trivial task; conversely, we may also miss a more prominent data product. To address such issues, we propose improving the data discovery system and using data analytics methods to comprehend what specific users might be interested in based on their physiological state, search patterns, and past data usage history. This work's primary goal is to prune the complexity, increase the visibility of popular data products, and direct users toward the data that best meet their needs. The proposed algorithm constructs a user profile based on the user's explicit or implicit interactions with the system, such as items they are currently looking at on-site and the key metadata mappings related to the data set. The pattern is then used to build a training data set, which will help find relevant data to recommend to the user.
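
    The profile-then-recommend idea can be sketched in a few lines. This is not the clustering algorithm proposed in the abstract: it is a simpler keyword-overlap ranking using cosine similarity, and the dataset names, keywords, and function names are hypothetical.

```python
import math
from collections import Counter


def build_profile(viewed_items, metadata):
    """Count the metadata keywords of every item the user interacted with."""
    profile = Counter()
    for item in viewed_items:
        profile.update(metadata[item])
    return profile


def cosine(a, b):
    """Cosine similarity between two keyword-count mappings."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(viewed_items, metadata, top_n=1):
    """Rank items the user has not seen by similarity to the user's profile."""
    profile = build_profile(viewed_items, metadata)
    candidates = [i for i in metadata if i not in viewed_items]
    ranked = sorted(candidates, key=lambda i: cosine(profile, Counter(metadata[i])), reverse=True)
    return ranked[:top_n]


# Hypothetical catalog: dataset name -> metadata keywords.
metadata = {
    "dataset_a": ["aerosol", "amazon"],
    "dataset_b": ["aerosol", "cloud"],
    "dataset_c": ["streamflow", "hydrology"],
}
suggestions = recommend(["dataset_a"], metadata)
```

    A user who has looked at `dataset_a` shares the keyword "aerosol" with `dataset_b` and nothing with `dataset_c`, so `dataset_b` ranks first.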

  7. FAIR data infrastructure and tools for AI-assisted streamflow prediction

    Focal Area(s): We discuss how the integration of AI into Earth Science models can impact streamflow predictions at both the science and data levels. Doing so, we address cross-cutting needs related to the goal of making data FAIR (Findable, Accessible, Interoperable, and Re-usable [1]) for seamless use with Artificial Intelligence/Machine Learning (AI/ML) in Earth System Science at DOE. A novel idea is that AI/ML itself can help with the FAIR data goal and address issues in targeted areas, e.g., missing data, data quality and reduction. In addition, the interpretability of results obtained with new AI methods is poised to impact broader scientific challenges in hydrology.

  8. Automated Indexing of Structured Scientific Metadata Using Apache Solr

    Scientific datasets are continuously growing with the amount of raw data being collected worldwide. This amount of data poses the biggest challenge for web search engines: how to retrieve the data efficiently. This paper discusses how major scientific data centers are using popular open-source search platforms such as Solr [1] to retrieve structured data stored in data sources such as relational database management systems using its import handler mechanisms [2]. We also focus on how Solr can be configured to serve advanced full-text, faceted search capabilities, along with its key features, which simplify data representation and deliver better performance for scientific search interfaces.
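
    For readers unfamiliar with faceted search in Solr, a request can be expressed as a URL against the `/select` handler using the standard `q`, `rows`, `wt`, `facet`, and `facet.field` parameters. The sketch below only builds such a URL; the host, the core name `datasets`, and the field names `instrument` and `site` are assumptions for illustration and are not taken from the paper.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Build a Solr /select URL with full-text search and field faceting."""
    params = [("q", text), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field whose value counts should be returned.
    params += [("facet.field", f) for f in facet_fields]
    return base_url + "/select?" + urlencode(params)


# Hypothetical local Solr core and field names.
url = build_facet_query("http://localhost:8983/solr/datasets", "aerosol", ["instrument", "site"])
query = parse_qs(urlsplit(url).query)
```

    Repeating `facet.field` once per field asks Solr to return value counts for each of those fields alongside the matching documents.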

  9. AI-Driven Data Discovery to Improve Earth System Predictability

    Focal Area(s): (3) Insight gleaned from complex data (both observed and simulated) using AI, big data analytics, and other advanced methods, including explainable AI and physics- or knowledge-guided AI.

  10. AI-Based Upgrades to Observational Data Centers to Facilitate Data Interoperability

    Focal Areas: (1) Data acquisition and assimilation enabled by machine learning, AI, and advanced methods including experimental/network design/optimization, unsupervised learning (including deep learning), and hardware-related efforts involving AI (e.g., edge computing). Focal areas 2 and 3 have critical dependencies on the modernization described. Key benefits to the focal areas: (1) Modernized observatory framework capable of agile adaptive observation, (2) Advanced instrument and data tagging supporting AI data acquisition for assimilation or validation, and (3) Widespread data interoperability bridging Earth system prediction scales.


Search for:
All Records
Author / Contributor
"Prakash, Giri"
