OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report

Abstract

This research showed the feasibility of applying the concepts of the Semantic Web to computational chemistry. We have created the first web portal (www.chemsem.com) that allows data produced by quantum chemistry and other chemistry calculations to be placed on the web in a way that makes the data accessible to scientists in a semantic form never before possible. The semantic web nature of the portal allows data to be searched, found, and used, an advance over the usual relational database approach. The semantic data on our portal forms part of a Giant Global Graph (GGG) that can be easily merged with related data and searched globally with the SPARQL Protocol and RDF Query Language (SPARQL), which makes global searches for data easier than with traditional methods. Our Semantic Web portal requires that the data be understood by a computer and hence defined by an ontology (vocabulary), which the computer uses to interpret the data. We have created such an ontology for computational chemistry (purl.org/gc) that encapsulates a broad knowledge of the field. We refer to this ontology as the Gainesville Core. While it is perhaps the first ontology for computational chemistry and is used by our portal, it is only the start of what must be a long multi-partner effort to define computational chemistry. In conjunction with these efforts we have defined a new potential file standard for computational chemistry data, the Common Standard for eXchange (CSX). A CSX file is the precursor of data in the Resource Description Framework (RDF) form that the semantic web requires. Our portal translates CSX files (as well as other computational chemistry data files) into RDF files that become part of the graph database that the semantic web employs. We propose the CSX file as a convenient way to encapsulate computational chemistry data.
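The graph-merging and pattern-search idea in the abstract can be illustrated with a minimal sketch. It is not the portal's implementation: it uses plain Python sets in place of an RDF store and a hand-written join in place of a SPARQL engine. The property names (`hasMolecule`, `hasMethod`, `hasTotalEnergy`), the `example.org` identifiers, and the energy values are all hypothetical placeholders, not terms or results taken from the Gainesville Core or the report.

```python
from typing import Dict, Iterator, Optional, Set, Tuple

# An RDF statement is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]

# Hypothetical prefixes: the real Gainesville Core terms may differ.
GC = "http://purl.org/gc/"
EX = "http://example.org/calc/"


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


# Two independently published graphs (placeholder values, not real results).
lab_a: Set[Triple] = {
    (EX + "calc1", GC + "hasMolecule", EX + "water"),
    (EX + "calc1", GC + "hasMethod", "HF"),
    (EX + "calc1", GC + "hasTotalEnergy", "-1.0"),
}
lab_b: Set[Triple] = {
    (EX + "calc2", GC + "hasMolecule", EX + "water"),
    (EX + "calc2", GC + "hasMethod", "B3LYP"),
    (EX + "calc2", GC + "hasTotalEnergy", "-2.0"),
}

# Merging graphs that share a vocabulary is just a set union of triples.
merged = lab_a | lab_b


def energies_for(graph: Set[Triple], molecule: str) -> Dict[str, str]:
    """Join two triple patterns on a shared subject, roughly analogous to:
    SELECT ?calc ?e WHERE { ?calc gc:hasMolecule <molecule> .
                            ?calc gc:hasTotalEnergy ?e }
    """
    results: Dict[str, str] = {}
    for calc, _, _ in match(graph, p=GC + "hasMolecule", o=molecule):
        for _, _, energy in match(graph, s=calc, p=GC + "hasTotalEnergy"):
            results[calc] = energy
    return results


if __name__ == "__main__":
    for calc, energy in sorted(energies_for(merged, EX + "water").items()):
        print(calc, energy)
```

Because both graphs identify water by the same URI and use the same property names, the union needs no schema mapping before it can be queried, which is the advantage over separate relational databases that the abstract describes.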

Authors:
Ostlund, Neil [1]
  1. Chemical Semantics, Inc., Gainesville, FL (United States)
Publication Date:
Research Org.:
Chemical Semantics, Inc., Gainesville, FL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
OSTI Identifier:
1371962
Report Number(s):
DOE-CSI-11735
EIN 46-2540835
DOE Contract Number:
SC0011735
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY

Citation Formats

Ostlund, Neil. Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report. United States: N. p., 2017. Web. doi:10.2172/1371962.
Ostlund, Neil. Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report. United States. doi:10.2172/1371962.
Ostlund, Neil. 2017. "Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report". United States. doi:10.2172/1371962. https://www.osti.gov/servlets/purl/1371962.
@article{osti_1371962,
title = {Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report},
author = {Ostlund, Neil},
doi = {10.2172/1371962},
journal = {},
number = {},
volume = {},
place = {United States},
year = 2017,
month = 7
}

Similar Records:
  • National Stream Survey Phase I (NSS-I) field activities were conducted in the Mid-Atlantic and Southeastern U.S. in the spring of 1986 by the U.S. EPA as part of the National Surface Water Survey and the National Acid Precipitation Assessment Program. The Survey employed a probability sample of 500 stream reaches representing a regional population of 64,700 reaches that are portrayed as blue lines on 1:250,000 scale maps and have drainage areas less than 155 sq km. The NSS-I design allows regional extrapolation, with known confidence, of the number and total length of reaches with specified chemical characteristics. Excluding Florida, the Southeastern subregions of the NSS-I were estimated to contain very few acidic reaches. Acidic reaches were found primarily in lowland reaches of Florida and the Mid-Atlantic Coastal Plain and in upland reaches of the Mid-Atlantic subregions. Analysis of ion composition and evidence of stream acidification are presented.
  • Document retrieval systems accept a user request for information and respond with a list of documents that contain information relevant to the request. When the documents (or abstracts of the documents) are stored in computer memory, a function can be defined that estimates the semantic distance between documents. If this function together with the set of documents forms a metric space, a graph, which I call a progressive graph, can be constructed to aid the search for the documents with relevant information. Progressive graphs are studied and the search algorithms that use this graph structure are presented. The search algorithms always perform correctly on any progressive graph, but the presence of the progressive property in a graph is not sufficient to ensure that the algorithms will work efficiently. The characteristics of a progressive graph that optimize the search algorithms are discussed, and algorithms to build and optimize progressive graphs are given. The results of a small problem show that the search process using the graph created by these algorithms can be very efficient. Finally, the distance function property that determines when a graph is a progressive graph is isolated and studied. 10 figures, 4 tables.
  • Swiftsure was a project to destroy old chemical warfare agent waste at the Defence Research Establishment Suffield Experimental Proving Ground. This report begins with an overview of the project and the consultation process, and describes the project planning and development process, the methods used to destroy the nerve agents, the contracting of a waste incinerator, the environmental protection plan, incinerator installation and testing, waste preparation and incineration operations, final waste product disposal and the environmental monitoring program. Appendices include details on the properties of the agents destroyed, sampling and analysis methods, and air quality monitoring specifications.
  • The Image Retrieval (IR) problem is concerned with retrieving images that are relevant to users' requests from a large collection of images, referred to as the image database. A taxonomy for and the limitations of the existing approaches for image retrieval are discussed. Also, to alleviate some of the problems associated with these approaches, a unified framework for retrieval in image databases for a class of application areas is proposed. The framework provides a taxonomy for image attributes and identifies four generic types of retrieval based on the attribute taxonomy. Semantic attributes play a central role in supporting one of those generic retrieval types, referred to as Retrieval by Semantic Attributes (RSA). Semantic attributes are those attributes the specification of which necessarily involves some subjectivity, imprecision, and/or uncertainty. In this paper, we introduce Personal Construct Theory (PCT) as a knowledge elicitation tool for systematically deriving semantic attributes to support RSA in image retrieval applications. As a case study, we use a prototype database system consisting of human face images. The knowledge elicited from the face images is stored in a matrix form referred to as a repertory grid. We propose an algorithm for RSA based on the repertory grid. The algorithm incorporates user relevance judgments as a means to deal with the inherent problems associated with the specification of semantic attributes. The algorithm is implemented and tested on the human face image database and the initial results are encouraging. In essence, we have developed an overall methodology/test bed to facilitate experimentation with different algorithms for RSA.
  • This activity supports the retrieval data quality objective (DQO) process by identifying the material properties that are important to the design, development, and operation of retrieval equipment; the activity also provides justification for characterizing those properties. These properties, which control tank waste behavior during retrieval operations, are also critical to the development of valid physical simulants for designing retrieval equipment. The waste is to be retrieved in a series of four steps. First, a selected retrieval technology breaks up or dislodges the waste into successively smaller pieces. Then, the dislodged waste is conveyed out of the tank through the conveyance line. Next, the waste flows into a separator unit that separates the gaseous phase from the liquid and solid phases. Finally, a unit may be present to condition the slurried waste before transporting it to the treatment facility. This document describes the characterization needs for the proposed processes to accomplish waste retrieval. Baseline mobilization technologies include mixer pump technology, sluicing, and high-pressure water-jet cutting. Other processes that are discussed in this document include slurry formation, pneumatic conveyance, and slurry transport. Section 2.0 gives a background of the DQO process and the different retrieval technologies. Section 3.0 provides the mechanistic descriptions and material properties critical to the different technologies and processes. Supplemental information on specific technologies and processes is provided in the appendices. Appendix A contains a preliminary sluicing model, and Appendices B and C cover pneumatic transport and slurry transport, respectively, as prepared for this document. Appendix D contains sample calculations for various equations.