National Library of Energy BETA

Sample records for queries sparql queries

  1. SPARQL Query Form | OpenEI

    Open Energy Info (EERE)

    where a ?Concept LIMIT 100. Display Results As: Auto | HTML | Spreadsheet | XML | JSON | Javascript | NTriples | RDF/XML. Rigorous check of the query. Execution timeout, in milliseconds, ...
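
    The snippet above preserves the endpoint's form controls. As a minimal sketch of the kind of request such a form issues (the endpoint URL and the Virtuoso-style default query are assumptions, not taken from the record):

        import requests

        ENDPOINT = "http://en.openei.org/sparql"  # assumed endpoint location
        QUERY = """
        SELECT DISTINCT ?Concept
        WHERE { [] a ?Concept }
        LIMIT 100
        """

        resp = requests.get(
            ENDPOINT,
            params={"query": QUERY, "format": "application/sparql-results+json"},
            timeout=30,  # seconds; the form exposes an analogous execution timeout in milliseconds
        )
        for binding in resp.json()["results"]["bindings"]:
            print(binding["Concept"]["value"])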

  2. Query optimization for graph analytics on linked data using SPARQL

    SciTech Connect (OSTI)

    Hong, Seokyong; Lee, Sangkeun; Lim, Seung-Hwan; Sukumar, Sreenivas R.; Vatsavai, Ranga Raju

    2015-07-01

    Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.
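
    As an illustration of running a graph operation directly through the SPARQL interface, here is a hedged sketch of triangle counting against a generic endpoint; the endpoint URL and the single-count FILTER convention are assumptions, and the paper's optimized formulations differ:

        import requests

        # Count each triangle once by imposing a lexicographic order on vertices.
        # In a multigraph this can still count one triangle per predicate combination.
        TRIANGLES = """
        SELECT (COUNT(*) AS ?n)
        WHERE {
          ?a ?p1 ?b . ?b ?p2 ?c . ?c ?p3 ?a .
          FILTER (STR(?a) < STR(?b) && STR(?b) < STR(?c))
        }
        """

        def count_triangles(endpoint):
            r = requests.get(endpoint, params={
                "query": TRIANGLES,
                "format": "application/sparql-results+json"})
            return int(r.json()["results"]["bindings"][0]["n"]["value"])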

  3. Query | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Query_1.pdf (PDF). More Documents & Publications: DOE Retrospective Review Plan and Burden Reduction Report July 29, 2013; DOE EO 13563 January 2014 Update Report and Burden Reduction Efforts; DOE Retrospective Review Plan and Burden Reduction Report - December 18, 2012

  4. How can I query data on OpenEI and generate a map? | OpenEI Community

    Open Energy Info (EERE)

    How can I query data on OpenEI and generate a map? I'd like to have an Ask or SPARQL query display as a map in the OpenEI wiki section. What are the...

  5. Query

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Query: Detailees (Name | From | To | Proposed Start Date | Proposed End Date): Alex, Aileen | DOE/CFO | Office of the Director of National Intelligence | 5/27/2007 | 5/27/2009; Alexander, Alice | Federal Railroad Administration | DOE/CFO | 7/1/2007 | 1/1/2008; Arnaudo, Raymond | Dept of State | DOE/NNSA Moscow Office | 3/5/2006 | 3/3/2008; Ashley, Peter | DOE/PI | Cambridge Energy Research Associates (CERA) | 1/4/2009 | 3/27/2009; Barringer, Jody | DOE/EE | Executive Office of the President/OMB | 11/16/2008 | 3/14/2009; Benigni, Deborah | Dept of State ...

  6. HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets...

    Office of Scientific and Technical Information (OSTI)

    Using Fast Bitmap Indices. Citation Details. In-Document Search. Title: HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast Bitmap Indices. Large scale scientific ...

  7. compound queries | OpenEI Community

    Open Energy Info (EERE)

    queries developer Google maps maps multicolor result formats results Semantic Mediawiki Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  8. ask queries | OpenEI Community

    Open Energy Info (EERE)

    queries developer Google maps maps multicolor result formats results Semantic Mediawiki Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  9. HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast

    Office of Scientific and Technical Information (OSTI)

    Bitmap Indices (Conference) | SciTech Connect. Using Fast Bitmap Indices. Citation Details. In-Document Search. Title: HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast Bitmap Indices. Large scale scientific data is often stored in scientific data formats such as FITS, netCDF and HDF. These storage formats are of particular interest to the scientific user community since they provide multi-dimensional storage and retrieval. However, one of the drawbacks of these storage

  10. HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast

    Office of Scientific and Technical Information (OSTI)

    Bitmap Indices (Conference) | SciTech Connect. Using Fast Bitmap Indices. Citation Details. In-Document Search. Title: HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast Bitmap Indices. Large scale scientific data is often stored in scientific data formats such as FITS, netCDF and HDF. These storage formats are of particular interest to the scientific user community since they provide multi-dimensional storage and retrieval. However, one of the drawbacks of these storage

  11. Increasing ask query limit | OpenEI Community

    Open Energy Info (EERE)

    via JSON. For example, this query only returns two entries: http://en.openei.org/services/rest/utility_rates?version=latest&format=json_plain&offset=9998&limit=30&detail=b...

  12. T-703: Cisco Unified Communications Manager Open Query Interface...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    T-703: Cisco Unified Communications Manager Open Query Interface Lets Remote Users Obtain ... Authentication Bypass Vulnerability T-614: Cisco Unified Communications Manager ...

  13. Oregon Certified Water Right Examiners Query Webpage | Open Energy...

    Open Energy Info (EERE)

    Not Provided. DOI: Not Provided. Check for DOI availability: http://crossref.org. Online Internet link for Oregon Certified Water Right Examiners Query Webpage. Citation: State of...

  14. Query-Driven Visualization and Analysis

    SciTech Connect (OSTI)

    Ruebel, Oliver; Bethel, E. Wes; Prabhat, Mr.; Wu, Kesheng

    2012-11-01

    This report focuses on an approach to high performance visualization and analysis, termed query-driven visualization and analysis (QDV). QDV aims to reduce the amount of data that needs to be processed by the visualization, analysis, and rendering pipelines. The goal of the data reduction process is to separate out data that is "scientifically interesting" and to focus visualization, analysis, and rendering on that interesting subset. The premise is that for any given visualization or analysis task, the data subset of interest is much smaller than the larger, complete data set. This strategy, extracting smaller data subsets of interest and focusing the visualization processing on these subsets, is complementary to the approach of increasing the capacity of the visualization, analysis, and rendering pipelines through parallelism. This report discusses the fundamental concepts in QDV, their relationship to different stages in the visualization and analysis pipelines, and presents QDV's application to problems in diverse areas, ranging from forensic cybersecurity to high energy physics.
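
    A toy illustration of the QDV premise, with an in-memory array standing in for the data (field names and thresholds are invented): evaluate the Boolean query first, then hand only the qualifying subset to analysis and rendering.

        import numpy as np

        n = 1_000_000
        data = {"temperature": np.random.rand(n) * 500.0,
                "pressure":    np.random.rand(n) * 10.0}

        # The "scientifically interesting" subset, expressed as a boolean query.
        mask = (data["temperature"] > 400.0) & (data["pressure"] < 2.0)
        subset = {name: values[mask] for name, values in data.items()}

        # Downstream visualization/analysis sees only the (much smaller) subset.
        print(f"processing {int(mask.sum())} of {n} records")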

  15. HDF5-FastQuery: An API for Simplifying Access to Data Storage,Retrieval, Indexing and Querying

    SciTech Connect (OSTI)

    Bethel, E. Wes; Gosink, Luke; Shalf, John; Stockinger, Kurt; Wu,Kesheng

    2006-06-15

    This work focuses on research and development activities that bridge a gap between fundamental data management technology (index, query, storage, and retrieval) and the use of such technology in computational and computer science algorithms and applications. The work has resulted in a streamlined applications programming interface (API) that simplifies data storage and retrieval using the HDF5 data I/O library, and eases use of the FastBit compressed bitmap indexing software for data indexing/querying. The API, which we call HDF5-FastQuery, will have broad applications in domain sciences as well as associated data analysis and visualization applications.
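
    HDF5-FastQuery itself is a compiled API layered over HDF5 and FastBit; the following is only an h5py sketch of the access pattern it streamlines, with invented file and dataset names (the real gain comes from answering the range condition through compressed bitmap indexes rather than the full scan shown here):

        import h5py
        import numpy as np

        with h5py.File("simulation.h5", "r") as f:
            energy = f["particles/energy"][...]       # read the indexed variable
            hits = np.nonzero(energy > 1.0e5)[0]      # ad-hoc search criterion
            x_selected = f["particles/x"][...][hits]  # retrieve matching records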

  16. Multicolor Maps from Compound Queries | OpenEI Community

    Open Energy Info (EERE)

    queries developer Google maps maps multicolor result formats results Semantic Mediawiki Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  17. T-703: Cisco Unified Communications Manager Open Query Interface Lets

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Remote Users Obtain Database Contents | Department of Energy. T-703: Cisco Unified Communications Manager Open Query Interface Lets Remote Users Obtain Database Contents. August 26, 2011 - 3:45pm. PROBLEM: A vulnerability was reported in Cisco Unified Communications Manager. A remote user can obtain database contents. PLATFORM: Cisco Unified Communications Manager 6.x, 7.x, 8.0, 8.5

  18. Large-Scale Continuous Subgraph Queries on Streams

    SciTech Connect (OSTI)

    Choudhury, Sutanay; Holder, Larry; Chin, George; Feo, John T.

    2011-11-30

    Graph pattern matching involves finding exact or approximate matches for a query subgraph in a larger graph. It has been studied extensively and has strong applications in domains such as computer vision, computational biology, social networks, security and finance. The problem of exact graph pattern matching is often described in terms of subgraph isomorphism which is NP-complete. The exponential growth in streaming data from online social networks, news and video streams and the continual need for situational awareness motivates a solution for finding patterns in streaming updates. This is also the prime driver for the real-time analytics market. Development of incremental algorithms for graph pattern matching on streaming inputs to a continually evolving graph is a nascent area of research. Some of the challenges associated with this problem are the same as found in continuous query (CQ) evaluation on streaming databases. This paper reviews some of the representative work from the exhaustively researched field of CQ systems and identifies important semantics, constraints and architectural features that are also appropriate for HPC systems performing real-time graph analytics. For each of these features we present a brief discussion of the challenge encountered in the database realm, the approach to the solution and state their relevance in a high-performance, streaming graph processing framework.

  19. Towards Optimal Multi-Dimensional Query Processing with BitmapIndices

    SciTech Connect (OSTI)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2005-09-30

    Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. This paper studies strategies for minimizing the access costs for processing multi-dimensional queries using bitmap indices with binning. Innovative features of our algorithm include (a) optimally placing the bin boundaries and (b) dynamically reordering the evaluation of the query terms. In addition, we derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.

  20. V-204: A specially crafted query can cause BIND to terminate...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    affected source distributions may crash with assertion failures triggered in the same fashion. IMPACT: A specially crafted DNS query could cause the DNS service to terminate...

  1. Energy Information, Data, and other Resources | OpenEI

    Open Energy Info (EERE)

    SPARQL: OpenEI's SPARQL endpoint is accessible at /sparql. Sample SPARQL queries are available at resource...

  2. Minimizing I/O Costs of Multi-Dimensional Queries with BitmapIndices

    SciTech Connect (OSTI)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2006-03-30

    Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. A common approach for reducing the size of a bitmap index for high cardinality attributes is to group ranges of values of an attribute into bins and then build a bitmap for each bin rather than a bitmap for each value of the attribute. Binning reduces storage costs; however, results of queries based on bins often require additional filtering to discard false positives, i.e., records in the result that do not satisfy the query constraints. This additional filtering, also known as "candidate checking," requires access to the base data on disk and involves significant I/O costs. This paper studies strategies for minimizing the I/O costs of "candidate checking" for multi-dimensional queries. This is done by determining the number of bins allocated for each dimension and then placing bin boundaries in optimal locations. Our algorithms use knowledge of data distribution and query workload. We derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.
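
    A sketch of the mechanism these papers optimize, using naive equal-width bins rather than the optimal boundary placement they derive: bins entirely inside the query are answered from bitmaps alone, while the boundary bin forces a candidate check against the base data.

        import numpy as np

        def build_binned_index(values, edges):
            # one boolean "bitmap" per bin
            return [(values >= lo) & (values < hi)
                    for lo, hi in zip(edges[:-1], edges[1:])]

        values = np.random.rand(10_000) * 100.0
        edges = np.linspace(0.0, 100.0, 11)         # 10 equal-width bins
        bitmaps = build_binned_index(values, edges)

        threshold = 37.0                            # query: values < 37
        answer = np.zeros(len(values), dtype=bool)
        for (lo, hi), bm in zip(zip(edges[:-1], edges[1:]), bitmaps):
            if hi <= threshold:
                answer |= bm                        # bin fully inside: no base-data I/O
            elif lo < threshold:
                cand = np.nonzero(bm)[0]            # boundary bin: candidates only
                answer[cand[values[cand] < threshold]] = True  # candidate check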

  3. Cyber Graph Queries for Geographically Distributed Data Centers

    SciTech Connect (OSTI)

    Berry, Jonathan W.; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

    2015-05-01

    We present new algorithms for a distributed model for graph computations motivated by limited information sharing we first discussed in [20]. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We give algorithms for s-t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem, finding a subgraph that is larger and denser than expected. For both of the low-communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s-t connectivity, we use known properties. For planted clique, we propose a new property: bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.

  4. Composing Data Parallel Code for a SPARQL Graph Engine

    SciTech Connect (OSTI)

    Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste; Haglin, David J.; Feo, John

    2013-09-08

    Big data analytics processes large amounts of data to extract knowledge from them. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph-based representation provides several benefits, such as the possibility to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 Shared-memory Multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.

  5. U-038: BIND 9 Resolver crashes after logging an error in query.c

    Broader source: Energy.gov [DOE]

    A remote server can cause the target connected client to crash. Organizations across the Internet are reporting crashes interrupting service on BIND 9 nameservers performing recursive queries. Affected servers crash after logging an error in query.c with the following message: "INSIST(! dns_rdataset_isassociated(sigrdataset))" Multiple versions are reported as being affected, including all currently supported release versions of ISC BIND 9. ISC is actively investigating the root cause and working to produce patches which avoid the crash.

  6. U-039: ISC Update: BIND 9 Resolver crashes after logging an error in query.c

    Broader source: Energy.gov [DOE]

    A remote server can cause the target connected client to crash. Organizations across the Internet are reporting crashes interrupting service on BIND 9 nameservers performing recursive queries. Affected servers crash after logging an error in query.c with the following message: "INSIST(! dns_rdataset_isassociated(sigrdataset))" Multiple versions are reported as being affected, including all currently supported release versions of ISC BIND 9. ISC is actively investigating the root cause and working to produce patches which avoid the crash.

  7. An Application of Multivariate Statistical Analysis for Query-Driven Visualization

    SciTech Connect (OSTI)

    Gosink, Luke J.; Garth, Christoph; Anderson, John C.; Bethel, E. Wes; Joy, Kenneth I.

    2010-03-01

    Abstract: Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.

  8. Tool For Editing Structured Query Language Text Within ORACLE Forms Applications

    Energy Science and Technology Software Center (OSTI)

    1991-02-01

    SQTTEXT is an ORACLE SQL*Forms application that allows a programmer to view and edit all the Structured Query Language (SQL) text for a given application on one screen. This application is an outgrowth of the prototyping of an on-line system dictionary for the Worldwide Household Goods Information system for Transportation-Modernization decision support system being prototyped by the Oak Ridge National Laboratory, but it can be applied to all SQL*Forms software development, debugging, and maintenance.

  9. Computer systems and methods for the query and visualization of multidimensional databases

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2006-08-08

    A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.
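
    A toy reading of the claim, with invented fields: tuples retrieved by the database query are routed to panes keyed by the specification's row and column fields.

        from collections import defaultdict

        # Tuples as returned by the database query (illustrative schema).
        tuples = [
            {"region": "West", "year": 2005, "sales": 410},
            {"region": "East", "year": 2005, "sales": 380},
            {"region": "West", "year": 2006, "sales": 455},
        ]

        # Each pane of the visual table is associated with a subset of the tuples.
        panes = defaultdict(list)
        for t in tuples:
            panes[(t["year"], t["region"])].append(t)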

  10. Computer systems and methods for the query and visualization of multidimensional database

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2010-05-11

    A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.

  11. A METHOD FOR ESTIMATING GAS PRESSURE IN 3013 CONTAINERS USING AN ISP DATABASE QUERY

    SciTech Connect (OSTI)

    Friday, G.; Peppers, L. G.; Veirs, D. K.

    2008-07-31

    The U.S. Department of Energy's Integrated Surveillance Program (ISP) is responsible for the storage and surveillance of plutonium-bearing material. During storage, plutonium-bearing material has the potential to generate hydrogen gas from the radiolysis of adsorbed water. The generation of hydrogen gas is a safety concern, especially when a container is breached within a glove box during destructive evaluation. To address this issue, the DOE established a standard (DOE, 2004) that sets the criteria for the stabilization and packaging of material for up to 50 years. The DOE has now packaged most of its excess plutonium for long-term storage in compliance with this standard. As part of this process, it is desirable to know within reasonable certainty the total maximum pressure of hydrogen and other gases within the 3013 container if safety issues are to be addressed and compliance with the DOE standards attained. The principal goal of this investigation is to document the method and query used to estimate total (i.e., hydrogen and other gases) gas pressure within a 3013 container based on the material properties and estimated moisture content contained in the ISP database. Initial attempts to estimate hydrogen gas pressure in 3013 containers were based on G-values (hydrogen gas generation per energy input) derived from small scale samples. These maximum G-values were used to calculate worst-case pressures based on container material weight, assay, wattage, moisture content, container age, and container volume. This paper documents a revised hydrogen pressure calculation that incorporates new surveillance results and includes a component for gases other than hydrogen. The calculation is produced by executing a query of the ISP database. An example of manual mathematical computation from the pressure equation is compared and evaluated with results from the query. Based on the destructive evaluation of 17 containers, the estimated mean absolute pressure was significantly higher (P<.01) than the mean GEST pressure. There was no significant difference (P>.10) between the mean pressures from DR and the calculation. The mean predicted absolute pressure was consistently higher than GEST by an average difference of 57 kPa (8 psi). The mean difference between the estimated pressure and digital radiography was 11 kPa (2 psi). Based on the initial results of destructive evaluation, the pressure query was found to provide a reasonably conservative estimate of the total pressure in 3013 containers whose material contained minimal moisture content.
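
    As a sketch of the ideal-gas arithmetic such a query encodes (all parameter values below are illustrative; the actual G-values, moisture terms, and non-hydrogen components are those documented in the report):

        EV_PER_JOULE = 6.241509e18
        AVOGADRO = 6.02214e23
        R = 8.314                       # J/(mol*K)
        SECONDS_PER_YEAR = 3.156e7

        def h2_pressure_pa(g_value, watts, years, free_volume_m3, temp_k=298.0):
            """Worst-case H2 pressure from radiolysis.
            g_value: molecules of H2 generated per 100 eV of absorbed energy."""
            molecules = (g_value / 100.0) * watts * EV_PER_JOULE * years * SECONDS_PER_YEAR
            moles = molecules / AVOGADRO
            return moles * R * temp_k / free_volume_m3  # P = nRT/V

        # Illustrative numbers only, not values from the ISP database.
        print(h2_pressure_pa(g_value=0.02, watts=5.0, years=10, free_volume_m3=2.2e-3))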

  12. Query and Visualization of extremely large network datasets over the web using Quadtree based KML Regional Network Links

    SciTech Connect (OSTI)

    Dadi, Upendra; Liu, Cheng; Vatsavai, Raju

    2009-01-01

    Geographic data sets are often very large in size. Interactive visualization of such data at all scales is not easy because of the limited resolution of the monitors and the inability of visualization applications to handle the volume of data. This is especially true for large vector datasets. The end user's experience is frequently unsatisfactory when exploring such data over the web using a naive application. Network bandwidth is another contributing factor to the low performance. In this paper, a quadtree-based technique to visualize extremely large spatial network datasets over the web is described. It involves using custom-developed algorithms leveraging a PostGIS database as the data source and Google Earth as the visualization client. This methodology supports both point and range queries along with non-spatial queries. The methodology is demonstrated using a network dataset consisting of several million links, and is based on some of the powerful features of Keyhole Markup Language (KML), an Open Geospatial Consortium (OGC) standard for displaying geospatial data on Earth browsers. One of these features is the notion of network links: using network links, a wide range of geospatial data sources such as geodatabases, static files and geospatial data services can be simultaneously accessed and visualized seamlessly. Using network links combined with the Level of Detail principle, view-based rendering, and intelligent server- and client-side caching, scalability in visualizing extremely large spatial datasets can be achieved.
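
    A sketch of the quadtree-of-NetworkLinks idea: each tile is a KML NetworkLink guarded by a Region and Lod, so Google Earth requests a child tile only when the viewer zooms into it. The URL scheme and pixel threshold are invented.

        def quadtree_links(west, south, east, north, level, url_base):
            """Emit the four child NetworkLinks for one quadtree tile."""
            mx, my = (west + east) / 2.0, (south + north) / 2.0
            quads = [(west, south, mx, my), (mx, south, east, my),
                     (west, my, mx, north), (mx, my, east, north)]
            links = []
            for i, (w, s, e, n) in enumerate(quads):
                links.append(
                    "<NetworkLink><Region>"
                    f"<LatLonAltBox><north>{n}</north><south>{s}</south>"
                    f"<east>{e}</east><west>{w}</west></LatLonAltBox>"
                    "<Lod><minLodPixels>128</minLodPixels></Lod></Region>"
                    f"<Link><href>{url_base}/{level}_{i}.kml</href>"
                    "<viewRefreshMode>onRegion</viewRefreshMode></Link>"
                    "</NetworkLink>")
            return links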

  13. HDF5-FastQuery: Accelerating Complex Queries

    Office of Scientific and Technical Information (OSTI)

    Luke Gosink, John Shalf, Kurt Stockinger, Kesheng Wu, Wes Bethel. Department of Applied Science, University of California at Davis, One Shields Ave, Davis, CA 95616, USA; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA. 1 Introduction. Efficient analysis of large scientific datasets often requires a means to rapidly search and select interesting portions of data based on ad-hoc search criteria. We present our work on integrating an

  14. HDF5-FastQuery: Accelerating Complex Queries

    Office of Scientific and Technical Information (OSTI)

    Luke Gosink [1], John Shalf [2], Kurt Stockinger [2], Kesheng Wu [2], Wes Bethel [2]. [1] Institute for Data Analysis and Visualization, University of California at Davis, One Shields Ave, Davis, CA 95616, USA. [2] Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA. Abstract: Large scale scientific data is often stored in scientific data formats such as FITS, netCDF and HDF. These storage formats are of particular interest to the scientific user com-

  15. Complex Queries | Open Energy Information

    Open Energy Info (EERE)

    Electricity Markets Afghanistan-NREL Mission Afghanistan-NREL Resource Maps and Toolkits China-NREL Cooperation Dominica Island-NREL Cooperation Egypt-NREL Energy Activities...

  16. Example Queries | Open Energy Information

    Open Energy Info (EERE)

    Avoca, New York Avoca, Pennsylvania Avoca, Wisconsin Avocado Heights, California Avon Lake, Ohio Avon Park, Florida Avon, Alabama Avon, Colorado Avon, Connecticut Avon,...

  17. User:Woodjr/Sandbox/Sparql | Open Energy Information

    Open Energy Info (EERE)

    Woodjr/Sandbox/Sparql < User:Woodjr | Sandbox. Software Developed by Companies Founded in California. Retrieved from "http://en.openei.org/w...

  18. User:Woodjr/Sandbox/Sparql2 | Open Energy Information

    Open Energy Info (EERE)

    Sandbox/Sparql2 < User:Woodjr | Sandbox. States of the USA. Retrieved from "http://en.openei.org/w/index.php?title=User:Woodjr/Sandbox...

  19. User:Woodjr/Sandbox/Sparql3 | Open Energy Information

    Open Energy Info (EERE)

    Sparql3 < User:Woodjr | Sandbox. States of the USA which have a geographic "point" defined in DBpedia.

  20. Image subregion querying using color correlograms

    DOE Patents [OSTI]

    Huang, Jing; Kumar, Shanmugasundaram Ravi; Mitra, Mandar; Zhu, Wei-Jing

    2002-01-01

    A color correlogram (10) is a representation expressing the spatial correlation of color and distance between pixels in a stored image. The color correlogram (10) may be used to distinguish objects in an image as well as between images in a plurality of images. By intersecting a color correlogram of an image object with correlograms of images to be searched, those images which contain the objects are identified by the intersection correlogram.

  1. Natural Gas Annual Respondent Query System

    Gasoline and Diesel Fuel Update (EIA)

    (Volumes in Thousand Cubic Feet, Prices in Dollars per Thousand Cubic Feet) Form EIA-176 * User Guide * Definitions, Sources, & Notes Natural Gas Deliveries (2011 - 2014)...

  2. Table Name query? | OpenEI Community

    Open Energy Info (EERE)

    - 06:39. Recent content: Hello-Sorry for the delay in... Use of DynamicAggregationProcessor I submitted a pull...

  3. developer | OpenEI Community

    Open Energy Info (EERE)

    apps, lod, sparql and community will continue to function normally. Additionally, web services that rely on Ask queries (utility rate database API) may have some downtime...

  4. Category:Query Results Templates | Open Energy Information

    Open Energy Info (EERE)

    The following 4 pages are in this category, out of 4 total: Template:DefineVariables, Template:LabelActivities, Template:LabelValuePair, Template:SubPageListHelper. Retrieved...

  5. Querying Allocations Using cbank | Argonne Leadership Computing Facility

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)


  6. Improving Estimation Accuracy of Aggregate Queries on Data Cubes

    SciTech Connect (OSTI)

    Pourabbas, Elaheh; Shoshani, Arie

    2008-08-15

    In this paper, we investigate the problem of estimation of a target database from summary databases derived from a base data cube. We show that such estimates can be derived by choosing a primary database which uses a proxy database to estimate the results. This technique is common in statistics, but an important issue we are addressing is the accuracy of these estimates. Specifically, given multiple primary and multiple proxy databases, that share the same summary measure, the problem is how to select the primary and proxy databases that will generate the most accurate target database estimation possible. We propose an algorithmic approach for determining the steps to select or compute the source databases from multiple summary databases, which makes use of the principles of information entropy. We show that the source databases with the largest number of cells in common provide the more accurate estimates. We prove that this is consistent with maximizing the entropy. We provide some experimental results on the accuracy of the target database estimation in order to verify our results.

  7. Graph Mining Meets the Semantic Web

    SciTech Connect (OSTI)

    Lee, Sangkeun; Sukumar, Sreenivas R; Lim, Seung-Hwan

    2015-01-01

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need through implementation of three popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, and PageRank). We implement these algorithms as SPARQL queries, wrapped within Python scripts. We evaluate the performance of our implementation on six real-world data sets and show that graph mining algorithms (those with a linear-algebra formulation) can indeed be unleashed on data represented as RDF graphs using the SPARQL query interface.
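
    The paper expresses each mining step in SPARQL itself; as a simpler hedged sketch in the same spirit, this pulls the link structure with one SPARQL query and iterates PageRank in Python (the endpoint and the predicate filtering are placeholders):

        import requests
        from collections import defaultdict

        def fetch_edges(endpoint):
            q = "SELECT ?s ?o WHERE { ?s ?p ?o . FILTER isIRI(?o) }"
            r = requests.get(endpoint, params={
                "query": q, "format": "application/sparql-results+json"})
            return [(b["s"]["value"], b["o"]["value"])
                    for b in r.json()["results"]["bindings"]]

        def pagerank(edges, d=0.85, iters=20):
            out = defaultdict(list)
            nodes = set()
            for s, o in edges:
                out[s].append(o)
                nodes.update((s, o))
            rank = {v: 1.0 / len(nodes) for v in nodes}
            for _ in range(iters):
                # dangling-node mass is simply dropped in this toy version
                nxt = {v: (1.0 - d) / len(nodes) for v in nodes}
                for s, targets in out.items():
                    share = d * rank[s] / len(targets)
                    for o in targets:
                        nxt[o] += share
                rank = nxt
            return rank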

  8. Template:LabelValuePair | Open Energy Information

    Open Energy Info (EERE)

    is typically used to display the results of an ask or sparql query in a simple label: value format. It is used by many pages, including the sub pages for country profiles, and is...

  9. Massive-scale RDF Processing Using Compressed Bitmap Indexes

    SciTech Connect (OSTI)

    Madduri, Kamesh; Wu, Kesheng

    2011-05-26

    The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL queries generate complex intermediate join queries; to compute these joins efficiently, we propose a new strategy based on bitmap indexes. We store the RDF data in column-oriented structures as compressed bitmaps along with two dictionaries. This paper makes three new contributions. (i) We present an efficient parallel strategy for parsing the raw RDF data, building dictionaries of unique entities, and creating compressed bitmap indexes of the data. (ii) We utilize the constructed bitmap indexes to efficiently answer SPARQL queries, simplifying the join evaluations. (iii) To quantify the performance impact of using bitmap indexes, we compare our approach to the state-of-the-art triple-store RDF-3X. We find that our bitmap index-based approach to answering queries is up to an order of magnitude faster for a variety of SPARQL queries, on gigascale RDF data sets.
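
    A toy version of the paper's layout, with Python integers standing in for compressed bitmaps: terms are dictionary-encoded, and one bitmap per predicate marks the triples that carry it, so matching a shared predicate reduces to bitwise operations.

        triples = [("ex:a", "ex:knows", "ex:b"),
                   ("ex:b", "ex:knows", "ex:c"),
                   ("ex:a", "ex:likes", "ex:c")]

        dictionary = {}                      # unique term -> integer id
        def encode(term):
            return dictionary.setdefault(term, len(dictionary))

        rows = [(encode(s), encode(p), encode(o)) for s, p, o in triples]

        bitmaps = {}                         # predicate id -> bitmap over triple rows
        for i, (s, p, o) in enumerate(rows):
            bitmaps[p] = bitmaps.get(p, 0) | (1 << i)

        knows = bitmaps[dictionary["ex:knows"]]  # rows matching { ?x ex:knows ?y }
        print(bin(knows))                        # 0b11 -> rows 0 and 1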

  10. Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Lee, Sangkeun; Sukumar, Sreenivas R; Hong, Seokyong; Lim, Seung-Hwan

    2016-01-01

    Graph analysis is now considered a promising technique to discover useful knowledge in data from a new perspective. We envision two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM), which respectively focus on subgraph pattern matching and automatic knowledge discovery in graphs. Moreover, as these two dimensions aim to complementarily solve complex problems, holistic in-situ graph analysis, which covers both OLGAP and GM in a single system, is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result-sets between those systems. Nevertheless, most existing graph analysis systems are only capable of one dimension of graph analysis. In this work, we take an approach to enabling GM capabilities (e.g., PageRank, connected-component analysis, node eccentricity, etc.) in RDF triplestores, which were originally developed to store RDF datasets and provide OLGAP capability. More specifically, to achieve our goal, we implemented six representative graph mining algorithms using SPARQL. The approach makes a wide range of available RDF data sets directly applicable for holistic graph analysis within a single system. For validation of our approach, we evaluate the performance of our implementations with nine real-world datasets and three different computing environments: a laptop computer, an Amazon EC2 instance, and a shared-memory Cray XMT2 URIKA-GD graph-processing appliance. The experimental results show that our implementation can provide promising and scalable performance for real-world graph analysis in all tested environments. The developed software is publicly available in an open-source project that we initiated.

  11. Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis

    SciTech Connect (OSTI)

    Lee, Sangkeun; Sukumar, Sreenivas R; Hong, Seokyong; Lim, Seung-Hwan

    2016-01-01

    Graph analysis is now considered a promising technique to discover useful knowledge in data from a new perspective. We envision two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM), which respectively focus on subgraph pattern matching and automatic knowledge discovery in graphs. Moreover, as these two dimensions aim to complementarily solve complex problems, holistic in-situ graph analysis, which covers both OLGAP and GM in a single system, is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result-sets between those systems. Nevertheless, most existing graph analysis systems are only capable of one dimension of graph analysis. In this work, we take an approach to enabling GM capabilities (e.g., PageRank, connected-component analysis, node eccentricity, etc.) in RDF triplestores, which were originally developed to store RDF datasets and provide OLGAP capability. More specifically, to achieve our goal, we implemented six representative graph mining algorithms using SPARQL. The approach makes a wide range of available RDF data sets directly applicable for holistic graph analysis within a single system. For validation of our approach, we evaluate the performance of our implementations with nine real-world datasets and three different computing environments: a laptop computer, an Amazon EC2 instance, and a shared-memory Cray XMT2 URIKA-GD graph-processing appliance. The experimental results show that our implementation can provide promising and scalable performance for real-world graph analysis in all tested environments. The developed software is publicly available in an open-source project that we initiated.

  12. EAGLE: 'EAGLE'Is an' Algorithmic Graph Library for Exploration

    Energy Science and Technology Software Center (OSTI)

    2015-01-16

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. Today there are no tools to conduct "graph mining" on RDF standard data sets. We address that need through implementation of popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, degree distribution, diversity degree, PageRank, etc.). We implement these algorithms as SPARQL queries, wrapped within Python scripts, and call our software tool EAGLE. In RDF style, EAGLE stands for "EAGLE 'Is an' algorithmic graph library for exploration." EAGLE is like 'MATLAB' for 'Linked Data.'

  13. EAGLE: 'EAGLE'Is an' Algorithmic Graph Library for Exploration

    SciTech Connect (OSTI)

    2015-01-16

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. Today there are no tools to conduct "graph mining" on RDF standard data sets. We address that need through implementation of popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, degree distribution, diversity degree, PageRank, etc.). We implement these algorithms as SPARQL queries, wrapped within Python scripts, and call our software tool EAGLE. In RDF style, EAGLE stands for "EAGLE 'Is an' algorithmic graph library for exploration." EAGLE is like 'MATLAB' for 'Linked Data.'

  14. Computer systems and methods for the query and visualization of multidimensional databases

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2015-11-10

    A computer displays a graphical user interface on its display. The graphical user interface includes a schema information region and a data visualization region. The schema information region includes a plurality of fields of a multi-dimensional database that includes at least one data hierarchy. The data visualization region includes a columns shelf and a rows shelf. The computer detects user actions to associate one or more first fields with the columns shelf and to associate one or more second fields with the rows shelf. The computer generates a visual table in the data visualization region in accordance with the user actions. The visual table includes one or more panes. Each pane has an x-axis defined based on data for the one or more first fields, and each pane has a y-axis defined based on data for the one or more second fields.

  15. Computer systems and methods for the query and visualization of multidimensional databases

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2012-03-20

    In response to a user request, a computer generates a graphical user interface on a computer display. A schema information region of the graphical user interface includes multiple operand names, each operand name associated with one or more fields of a multi-dimensional database. A data visualization region of the graphical user interface includes multiple shelves. Upon detecting a user selection of the operand names and a user request to associate each user-selected operand name with a respective shelf in the data visualization region, the computer generates a visual table in the data visualization region in accordance with the associations between the operand names and the corresponding shelves. The visual table includes a plurality of panes, each pane having at least one axis defined based on data for the fields associated with a respective operand name.

  16. Computer systems and methods for the query and visualization of multidimensional databases

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2011-02-01

    In response to a user request, a computer generates a graphical user interface on a computer display. A schema information region of the graphical user interface includes multiple operand names, each operand name associated with one or more fields of a multi-dimensional database. A data visualization region of the graphical user interface includes multiple shelves. Upon detecting a user selection of the operand names and a user request to associate each user-selected operand name with a respective shelf in the data visualization region, the computer generates a visual table in the data visualization region in accordance with the associations between the operand names and the corresponding shelves. The visual table includes a plurality of panes, each pane having at least one axis defined based on data for the fields associated with a respective operand name.

  17. Computer systems and methods for the query and visualization of multidimensional databases

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L; Hanrahan, Patrick

    2015-03-03

    A computer displays a graphical user interface on its display. The graphical user interface includes a schema information region and a data visualization region. The schema information region includes multiple operand names, each operand corresponding to one or more fields of a multi-dimensional database that includes at least one data hierarchy. The data visualization region includes a columns shelf and a rows shelf. The computer detects user actions to associate one or more first operands with the columns shelf and to associate one or more second operands with the rows shelf. The computer generates a visual table in the data visualization region in accordance with the user actions. The visual table includes one or more panes. Each pane has an x-axis defined based on data for the one or more first operands, and each pane has a y-axis defined based on data for the one or more second operands.

  18. Computer systems and methods for the query and visualization of multidimensional databases

    DOE Patents [OSTI]

    Stolte, Chris; Tang, Diane L; Hanrahan, Patrick

    2014-04-29

    In response to a user request, a computer generates a graphical user interface on a computer display. A schema information region of the graphical user interface includes multiple operand names, each operand name associated with one or more fields of a multi-dimensional database. A data visualization region of the graphical user interface includes multiple shelves. Upon detecting a user selection of the operand names and a user request to associate each user-selected operand name with a respective shelf in the data visualization region, the computer generates a visual table in the data visualization region in accordance with the associations between the operand names and the corresponding shelves. The visual table includes a plurality of panes, each pane having at least one axis defined based on data for the fields associated with a respective operand name.

  19. Accelerating semantic graph databases on commodity clusters

    SciTech Connect (OSTI)

    Morari, Alessandro; Castellana, Vito G.; Haglin, David J.; Feo, John T.; Weaver, Jesse R.; Tumeo, Antonino; Villa, Oreste

    2013-10-06

    We are developing a full software system for accelerating semantic graph databases on commodity clusters that scales to hundreds of nodes while maintaining constant query throughput. Our framework comprises a SPARQL-to-C++ compiler, a library of parallel graph methods and a custom multithreaded runtime layer, which provides a Partitioned Global Address Space (PGAS) programming model with fork/join parallelism and automatic load balancing over commodity clusters. We present preliminary results for the compiler and for the runtime.

  20. Developer | OpenEI Community

    Open Energy Info (EERE)

    Term: ask queries. (Type | Term | Title | Author | Replies | Last Post) Blog entry | ask queries | Multicolor Maps from Compound Queries | Jweers | 16 May 2013 - 14:22...

  1. Developer | OpenEI Community

    Open Energy Info (EERE)

    Term: compound queries. (Type | Term | Title | Author | Replies | Last Post) Blog entry | compound queries | Multicolor Maps from Compound Queries | Jweers | 16 May 2013 - 14:22...

  2. Enabling Graph Appliance for Genome Assembly

    SciTech Connect (OSTI)

    Singh, Rina; Graves, Jeffrey A; Lee, Sangkeun; Sukumar, Sreenivas R; Shankar, Mallikarjun

    2015-01-01

    In recent years, there has been a huge growth in the amount of genomic data available as reads generated from various genome sequencers. The number of reads generated can be huge, ranging from hundreds to billions of nucleotides, each read varying in size. Assembling such large amounts of data is one of the challenging computational problems for both biomedical and data scientists. Most of the genome assemblers developed have used de Bruijn graph techniques. A de Bruijn graph represents a collection of read sequences by billions of vertices and edges, which require large amounts of memory and computational power to store and process. This is the major drawback to de Bruijn graph assembly. Massively parallel, multi-threaded, shared memory systems can be leveraged to overcome some of these issues. The objective of our research is to investigate the feasibility and scalability issues of de Bruijn graph assembly on Cray's Urika-GD system; Urika-GD is a high performance graph appliance with a large shared memory and massively multithreaded custom processor designed for executing SPARQL queries over large-scale RDF data sets. However, to the best of our knowledge, there is no research on representing a de Bruijn graph as an RDF graph or finding Eulerian paths in RDF graphs using SPARQL for potential genome discovery. In this paper, we address the issues involved in representing de Bruijn graphs as RDF graphs and propose an iterative querying approach for finding Eulerian paths in large RDF graphs. We evaluate the performance of our implementation on real world ebola genome datasets and illustrate how genome assembly can be accomplished with Urika-GD using iterative SPARQL queries.
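
    A sketch of the representation question the paper addresses, with an invented predicate URI: k-mers become RDF nodes and (k-1)-base overlaps become edges, after which assembly becomes an Eulerian-path walk (omitted here) issued as iterative SPARQL queries.

        def debruijn_triples(reads, k):
            """RDF-style (subject, predicate, object) edges of a de Bruijn graph."""
            triples = set()
            for read in reads:
                for i in range(len(read) - k):
                    left = read[i:i + k]            # k-mer node
                    right = read[i + 1:i + k + 1]   # successor sharing k-1 bases
                    triples.add((f"kmer:{left}", "ex:overlaps", f"kmer:{right}"))
            return triples

        for t in sorted(debruijn_triples(["ACGTAC", "GTACGG"], k=3)):
            print(t)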

  3. EIA Open Data - Intro - U.S. Energy Information Administration (EIA)

    U.S. Energy Information Administration (EIA) Indexed Site

    The U.S. Energy Information Administration is committed to enhancing the value of its free and open data by making it

  4. EIA Open Data - Excel - U.S. Energy Information Administration (EIA)

    U.S. Energy Information Administration (EIA) Indexed Site

    U.S. Energy Information Administration (EIA) Excel Data Add-In: Download the EIA Data Add-In for Microsoft Excel for Windows By

  5. EIA's Energy in Brief: How is the fuel mix for U.S. electricity generation

    Gasoline and Diesel Fuel Update (EIA)

    The U.S. Energy Information Administration is committed to enhancing the value of its free and open data by making it

  6. Toward a Data Scalable Solution for Facilitating Discovery of Science Resources

    SciTech Connect (OSTI)

    Weaver, Jesse R.; Castellana, Vito G.; Morari, Alessandro; Tumeo, Antonino; Purohit, Sumit; Chappell, Alan R.; Haglin, David J.; Villa, Oreste; Choudhury, Sutanay; Schuchardt, Karen L.; Feo, John T.

    2014-12-31

    Science is increasingly motivated by the need to process larger quantities of data. It is facing severe challenges in data collection, management, and processing, so much so that the computational demands of data scaling are competing with, and in many fields surpassing, the traditional objective of decreasing processing time. Example domains with large datasets include astronomy, biology, genomics, climate/weather, and material sciences. This paper presents a real-world use case in which we wish to answer queries provided by domain scientists in order to facilitate discovery of relevant science resources. The problem is that the metadata for these science resources is very large and is growing quickly, rapidly increasing the need for a data scaling solution. We propose a system SGEM designed for answering graph-based queries over large datasets on cluster architectures, and we report performance results for queries on the current RDESC dataset of nearly 1.4 billion triples, and on the well-known BSBM SPARQL query benchmark.

  7. In-Memory Graph Databases for Web-Scale Data

    SciTech Connect (OSTI)

    Castellana, Vito G.; Morari, Alessandro; Weaver, Jesse R.; Tumeo, Antonino; Haglin, David J.; Villa, Oreste; Feo, John

    2015-03-01

    RDF databases have emerged as one of the most relevant ways of organizing, integrating, and managing exponentially growing, often heterogeneous, and not rigidly structured data for a variety of scientific and commercial fields. In this paper we discuss the solutions integrated in GEMS (Graph database Engine for Multithreaded Systems), a software framework for implementing RDF databases on commodity, distributed-memory high-performance clusters. Unlike the majority of current RDF databases, GEMS has been designed from the ground up to primarily employ graph-based methods. This is reflected in all the layers of its stack. The GEMS framework is composed of: a SPARQL-to-C++ compiler, a library of data structures and related methods to access and modify them, and a custom runtime providing lightweight software multithreading, network message aggregation and a partitioned global address space. We provide an overview of the framework, detailing its components and how they have been closely designed and customized to address issues of graph methods applied to large-scale datasets on clusters. We discuss in detail the principles that enable automatic translation of the queries (expressed in SPARQL, the query language of choice for RDF databases) to graph methods, and identify differences with respect to other RDF databases.

  8. All | OpenEI Community

    Open Energy Info (EERE)

    Term: ask queries. (Type | Term | Title | Author | Replies | Last Post) Blog entry | ask queries | Multicolor Maps from Compound Queries | Jweers | 16 May 2013 -...

  9. Energy Information, Data, and other Resources | OpenEI

    Open Energy Info (EERE)

    provides inline queries in the form of an Ask query, which can also be modified into a web service (see OpenEI REST services documentation). Ask queries can be executed here and...

  10. Sounding Board V.1.0

    Energy Science and Technology Software Center (OSTI)

    2006-10-10

    Sounding Board allows users to query multiple models simultaneously, finding relevant experts, related terms, and historical text related to one's query.

  11. User:Woodjr/Sandbox/Sparql4 | Open Energy Information

    Open Energy Info (EERE)

    2005 1 4,506,411.00 MO 2005 1 8,248,149.00 ND 2005 1 2,760,136.00 NJ 2005 1 116,877.00 OK 2005 1 4,413,489.00 OR 2005 1 3,804,311.00 SC 2005 1 8,712,013.00 VT 2005 1 65,139.00 WA...

  12. Help:External SPARQL integration | Open Energy Information

    Open Energy Info (EERE)

    Integrating with Reegle: OpenEI is engaged in an ongoing linked open data collaboration with Reegle [1]. This page serves to document a few of the...

  13. Geospatial | OpenEI Community

    Open Energy Info (EERE)

    Posts by term: ask queries (1), compound queries (1), data (1), developer (1), geospatial data (1), GIS (1), GIS data (1), Global...

  14. DEX: Increasing

    Office of Scientific and Technical Information (OSTI)

    ... to this approach to visualization as query-driven visual data analysis. Query-driven data analysis methods allow a scientist to define search criteria as a Boolean expression. ...

  15. Microsoft Word - NG_ResQrySys_UsersGuide_Sept2015-FINAL.docx

    Gasoline and Diesel Fuel Update (EIA)

    ... the Query System is a web-based system, no download or installation is necessary. All that is needed to run the Query System is a PC with up-to-date web-browsing software (such ...

  16. OSTI Launches SciTech Connect, Consolidates Information Bridge...

    Office of Scientific and Technical Information (OSTI)

    ... SciTech Connect employs a semantic search technique known as keyword-to-concept mapping. It accepts keyword-based queries and returns concept-mapped queries as in a taxonomy; a ...

  17. OSTI, US Dept of Energy, Office of Scientific and Technical Informatio...

    Office of Scientific and Technical Information (OSTI)

    ... SciTech Connect employs a semantic search technique known as keyword-to-concept mapping. It accepts keyword-based queries and returns concept-mapped queries as in a taxonomy; a ...

  18. Sqlog

    Energy Science and Technology Software Center (OSTI)

    2007-08-22

    The sqlog software implements a system for the creation, querying, and maintenance of a database of SLURM job history.

  19. Search for: All records | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    in ITER. Full Text Available April 2013, Physical Review Letters (November 2012)...

  20. July 24, 2009, Visiting Speakers Program - The Next Generation...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    ... * Labor - Pilots - Mechanics - Air traffic controllers. [Diagram: Investigator, Airlines, Pilots, Regulator, Controllers, Mechanics, Manufacturers.] The System * Regulator(s) Query: ...

  1. Unhappy with internal corporate search? : learn tips and tricks for building a controlled vocabulary ontology.

    SciTech Connect (OSTI)

    Arpin, Bettina Karin Schimanski; Jones, Brian S.; Bemesderfer, Joy; Ralph, Mark E.; Miller, Jennifer L.

    2010-06-01

    Are your employees unhappy with internal corporate search? Frequent complaints include: too many results to sift through; results are unrelated/outdated; employees aren't sure which terms to search for. One way to improve intranet search is to implement a controlled vocabulary ontology. Employing this takes the guesswork out of searching, makes search efficient and precise, educates employees about the lingo used within the corporation, and allows employees to contribute to the corpus of terms. It promotes internal corporate search to rival its superior sibling, internet search. We will cover our experiences, lessons learned, and conclusions from implementing a controlled vocabulary ontology at Sandia National Laboratories. The work focuses on construction of this ontology from the content perspective and the technical perspective. We'll discuss the following: (1) The tool we used to build a polyhierarchical taxonomy; (2) Examples of two methods of indexing the content: traditional 'back of the book' and folksonomy word-mapping; (3) Tips on how to build future search capabilities while building the basic controlled vocabulary; (4) How to implement the controlled vocabulary as an ontology that mimics Google's search suggestions; (5) Making the user experience more interactive and intuitive; and (6) Sorting suggestions based on preferred, alternate and related terms using SPARQL queries. Finally, future improvements will be presented, including permitting end-users to add, edit and remove terms, and filtering on different subject domains.
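
    Item (6) can be illustrated with a small sketch, assuming an in-memory rdflib graph with SKOS labels (the abstract does not name the triple store actually used at Sandia): preferred-label matches are ranked ahead of alternate-label matches via a SPARQL UNION with an explicit rank.

        # Sketch: rank skos:prefLabel suggestions ahead of skos:altLabel ones.
        # The concept IRI and labels are illustrative.
        from rdflib import Graph, Literal, URIRef
        from rdflib.namespace import SKOS

        g = Graph()
        term = URIRef("http://example.org/vocab/laser")
        g.add((term, SKOS.prefLabel, Literal("laser")))
        g.add((term, SKOS.altLabel, Literal("optical maser")))

        q = """
        SELECT ?concept ?label ?rank WHERE {
          { ?concept skos:prefLabel ?label . BIND(1 AS ?rank) }
          UNION
          { ?concept skos:altLabel ?label . BIND(2 AS ?rank) }
        }
        ORDER BY ?rank
        """

        for concept, label, rank in g.query(q, initNs={"skos": SKOS}):
            print(rank, label, concept)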

  2. Estimating Missing Features to Improve Multimedia Information Retrieval

    SciTech Connect (OSTI)

    Bagherjeiran, A; Love, N S; Kamath, C

    2006-09-28

    Retrieval in a multimedia database usually involves combining information from different modalities of data, such as text and images. However, all modalities of the data may not be available to form the query. The retrieval results from such a partial query are often less than satisfactory. In this paper, we present an approach to complete a partial query by estimating the missing features in the query. Our experiments with a database of images and their associated captions show that, with an initial text-only query, our completion method has similar performance to a full query with both image and text features. In addition, when we use relevance feedback, our approach outperforms the results obtained using a full query.
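
    A hedged sketch of the general idea, using simple mean-of-neighbors imputation rather than the paper's actual estimator: the missing image features of a text-only query are estimated from the database items whose text features are most similar.

        # Sketch: complete a partial (text-only) query by estimating its
        # image features from the k text-nearest database items.
        import numpy as np

        rng = np.random.default_rng(0)
        text_db = rng.random((100, 8))    # text features of database items
        image_db = rng.random((100, 16))  # image features of the same items

        def complete_query(text_query, k=5):
            dists = np.linalg.norm(text_db - text_query, axis=1)
            nearest = np.argsort(dists)[:k]
            est_image = image_db[nearest].mean(axis=0)  # imputed features
            return np.concatenate([text_query, est_image])

        full_query = complete_query(rng.random(8))
        print(full_query.shape)  # (24,): text features + estimated image features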

  3. Coal Markets

    U.S. Energy Information Administration (EIA) Indexed Site

    Coal Glossary FAQS Overview Data Coal Data Browser (interactive query tool with charting and mapping) Summary Prices Reserves Consumption Production Stocks Imports, exports ...

  4. U.S. Energy Information Administration (EIA) - Ap

    U.S. Energy Information Administration (EIA) Indexed Site

    Coal Glossary FAQS Overview Data Coal Data Browser (interactive query tool with charting and mapping) Summary Prices Reserves Consumption Production Stocks Imports, exports ...

  5. User:Woodjr/Sandbox/GoogleEarth | Open Energy Information

    Open Energy Info (EERE)

    Demonstration of an experimental "GoogleEarth" result format for ask queries. Based on the Thematic Mapping API....

  6. T-559: Stack-based buffer overflow in oninit in IBM Informix...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    exploit this vulnerability. The specific flaw exists within the oninit process bound to TCP port 9088 when processing the arguments to the USELASTCOMMITTED option in a SQL query....

  7. Category:Tech Potential Properties | Open Energy Information

    Open Energy Info (EERE)


  8. Prueba 3 | Open Energy Information

    Open Energy Info (EERE)

    de Redes (98), Empresas de Energías Renovables [Renewable Energy Companies] (12), Programas y Proyectos [Programs and Projects] (1157). The part "|Programs and Projects" of the query was not understood....

  9. User:GregZiebold/Sector test | Open Energy Information

    Open Energy Info (EERE)

    Query all sector types for Companies: Bioenergy Biofuels Biomass Buildings Carbon Efficiency Geothermal energy Hydro Hydrogen Marine and Hydrokinetic Ocean Renewable Energy...

  10. Form:Marine and Hydrokinetic Technology Project Milestone | Open...

    Open Energy Info (EERE)


  11. Form:Marine and Hydrokinetic Technology Project | Open Energy...

    Open Energy Info (EERE)


  12. Gateway:ECOWAS Clean Energy Gateway | Open Energy Information

    Open Energy Info (EERE)

    Policy Organizations (3), West African Companies (4), West African Programs (76). The part "|Programs and Projects" of the query was not understood. Results might not...

  13. Inline_System

    Energy Science and Technology Software Center (OSTI)

    2010-02-01

    Inline_System replaces a small subset of file query and manipulation commands on computing platforms that do not offer a complete standard POSIX environment.

  14. Search for: All records | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    rhombohedral structure in TbMn2. Full Text Available January 2015, Royal Society of Chemistry...

  15. Energy Events | Open Energy Information

    Open Energy Info (EERE)

    Upcoming Events: You need to have JavaScript enabled to view the interactive timeline. Further results for this query. Event: Preparing...

  16. Properties | Open Energy Information

    Open Energy Info (EERE)


  17. Allocations | Argonne Leadership Computing Facility

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Allocation Management | Determining Allocation Requirements | Querying Allocations Using cbank | Mira/Cetus/Vesta | Cooley | Policies | Documentation Feedback. Please provide feedback to help...

  18. Search for: All records | DOE Data Explorer

    Office of Scientific and Technical Information (OSTI)


  19. Search for: All records | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Barbeau, P.S. (Stanford U., Phys. Dept.); Beauchamp, E. (Laurentian U.); Belov, V. (Moscow, ITEP); et al. Full Text Available March 2013...

  20. SWNVF: database contents

    National Nuclear Security Administration (NNSA)

    Nevada Test Site (NTS). The database complies with protocols of Structured Query Language (SQL), allowing construction of relationships among these data, from...

  1. Search for: All records | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    and memory applications at room temperature and above. Full Text Available October 2014, Wiley...

  2. Template:TATNav | Open Energy Information

    Open Energy Info (EERE)


  3. Template:ResourceLibraryTabs | Open Energy Information

    Open Energy Info (EERE)


  4. Widget:CSC-CSS | Open Energy Information

    Open Energy Info (EERE)


  5. Template:LEDSLACNavs | Open Energy Information

    Open Energy Info (EERE)


  6. Template:LEDSGPFooter | Open Energy Information

    Open Energy Info (EERE)


  7. Template:WebServiceGraphic | Open Energy Information

    Open Energy Info (EERE)


  8. Template:WFSPNav | Open Energy Information

    Open Energy Info (EERE)


  9. Form:RAPID-BestPractices | Open Energy Information

    Open Energy Info (EERE)


  10. Template:Organization | Open Energy Information

    Open Energy Info (EERE)


  11. Widget:ContactFinder | Open Energy Information

    Open Energy Info (EERE)


  12. Widget:MailChimp | Open Energy Information

    Open Energy Info (EERE)


  13. Template:RAPID-Nav | Open Energy Information

    Open Energy Info (EERE)


  14. Form:GeothermalResourceArea | Open Energy Information

    Open Energy Info (EERE)


  15. Template:WindCover | Open Energy Information

    Open Energy Info (EERE)


  16. Template:NEPA CX | Open Energy Information

    Open Energy Info (EERE)


  17. Montana/Wind Resources | Open Energy Information

    Open Energy Info (EERE)


  18. Template:RegulatoryToolkitTabs | Open Energy Information

    Open Energy Info (EERE)


  19. Template:Reflist | Open Energy Information

    Open Energy Info (EERE)


  20. Cultural Resources | Open Energy Information

    Open Energy Info (EERE)


  1. Widget:UtilityRateFinder | Open Energy Information

    Open Energy Info (EERE)


  2. Form:RRSection | Open Energy Information

    Open Energy Info (EERE)


  3. Arizona/Wind Resources | Open Energy Information

    Open Energy Info (EERE)


  4. Search for: All records | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    potential applications in high density nonvolatile storage in the future. October 2015, Wiley...

  5. Template:UnderDevelopment | Open Energy Information

    Open Energy Info (EERE)


  6. Link Alpha O

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    (OSPIP) Office of the Chief Financial Officer (OCFO) Ombudsman - Technology Transfer Innovation and Partnerships Office One Minute 4 HR Onestop: LBLnet host block query...

  7. T-617: BIND RPZ Processing Flaw Lets Remote Users Deny Service

    Broader source: Energy.gov [DOE]

    When a name server is configured with a response policy zone (RPZ), queries for type RRSIG can trigger a server crash.

  8. OCIO Technology Summit: Data Analytics | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    at the Energy Information Administration demonstrated the Electricity Data Browser's influence on data visualization, bringing together maps, query tools, and...

  9. File:02SiteConsiderations (1).pdf | Open Energy Information

    Open Energy Info (EERE)


  10. Collegiate Wind Competition | Open Energy Information

    Open Energy Info (EERE)


  11. NREL: Energy Analysis - Jon Weers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Areas of expertise: Interactive web applications (PHP, jQuery, JavaScript, Python, Mongo, MySQL), Custom visualizations, Project management, User interface design. Primary...

  12. QQACCT

    Energy Science and Technology Software Center (OSTI)

    2015-01-01

    batchacct provides convenient library and command-line access to batch system accounting data for GridEngine and SLURM schedulers. It can be used to perform queries useful for data analysis of the accounting data alone or for integrative analysis in the context of a larger query.

  13. Multidimensional structured data visualization method and apparatus, text visualization method and apparatus, method and apparatus for visualizing and graphically navigating the world wide web, method and apparatus for visualizing hierarchies

    DOE Patents [OSTI]

    Risch, John S.; Dowson, Scott T.

    2012-03-06

    A method of displaying correlations among information objects includes receiving a query against a database; obtaining a query result set; and generating a visualization representing the components of the result set, the visualization including one of a plane and line to represent a data field, nodes representing data values, and links showing correlations among fields and values. Other visualization methods and apparatus are disclosed.

  14. Multidimensional structured data visualization method and apparatus, text visualization method and apparatus, method and apparatus for visualizing and graphically navigating the world wide web, method and apparatus for visualizing hierarchies

    DOE Patents [OSTI]

    Risch, John S.; Dowson, Scott T.; Hart, Michelle L.; Hatley, Wes L.

    2008-05-13

    A method of displaying correlations among information objects comprises receiving a query against a database; obtaining a query result set; and generating a visualization representing the components of the result set, the visualization including one of a plane and line to represent a data field, nodes representing data values, and links showing correlations among fields and values. Other visualization methods and apparatus are disclosed.

  15. Climatepipes: User-friendly data access, data manipulation, data analysis and visualization of community climate models Phase II

    SciTech Connect (OSTI)

    Chaudhary, Aashish

    2015-09-02

    In Phase I, we successfully developed a web-based tool that provides workflow and form-based interfaces for accessing, querying, and visualizing interesting datasets from one or more sources. For Phase II of the project, we have implemented mechanisms for supporting more elaborate and relevant queries.

  16. Efficient Data Management for Knowledge Discovery in Large-Scale Geospatial Imagery Collections

    SciTech Connect (OSTI)

    Baldwin, C; Abdulla, G

    2006-01-24

    We describe the results of our investigation on supporting ad-hoc and continuous queries over data streams. The major problem we address here is how to identify and utilize metadata for smart caching and to support queries over streaming and archived or historical data.

  17. SAPLE: Sandia Advanced Personnel Locator Engine.

    SciTech Connect (OSTI)

    Procopio, Michael J.

    2010-04-01

    We present the Sandia Advanced Personnel Locator Engine (SAPLE) web application, a directory search application for use by Sandia National Laboratories personnel. SAPLE's purpose is to return Sandia personnel 'results' as a function of user search queries, with its mission to make it easier and faster to find people at Sandia. To accomplish this, SAPLE breaks from more traditional directory application approaches by aiming to return the correct set of results while placing minimal constraints on the user's query. Two key features form the core of SAPLE: advanced search query interpretation and inexact string matching. SAPLE's query interpretation permits the user to perform compound queries when typing into a single search field; where able, SAPLE infers the type of field that the user intends to search on based on the value of the search term. SAPLE's inexact string matching feature yields a high-quality ranking of personnel search results even when there are no exact matches to the user's query. This paper explores these two key features, describing in detail the architecture and operation of SAPLE. Finally, an extensive analysis on logged search query data taken from an 11-week sample period is presented.
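
    The inexact-matching feature can be approximated with a few lines of standard-library Python (a simplification for illustration; SAPLE's actual scoring and query interpretation are more sophisticated):

        # Sketch: score directory entries against a possibly misspelled query
        # and rank them. The directory contents are illustrative.
        from difflib import SequenceMatcher

        directory = ["Michael Procopio", "Michelle Hart", "Mark Ralph"]

        def rank(query):
            return sorted(
                ((SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
                 for name in directory),
                reverse=True,
            )

        for score, name in rank("Micheal Procopio"):  # note the misspelling
            print(f"{score:.2f}  {name}")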

  18. EIA Open Data - Doc - U.S. Energy Information Administration (EIA)

    Gasoline and Diesel Fuel Update (EIA)

    API Commands EIA's API uses a modified RESTful architecture, where a separate URI is used for each query command with query string variables, both required and optional, providing input parameters. Two such query string input parameters apply to all commands: api_key: Required. A valid API key is required and may be obtained from Registration. out: Optional. Valid values are "xml" or "json". If missing or any other value, the API call will return JSON formatted output.
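
    A hedged example of the calling pattern described above; the command path and series_id are illustrative stand-ins, so consult the API documentation for the actual commands and parameters.

        # Sketch: one command = one URI; api_key is required, out is optional.
        import requests

        resp = requests.get(
            "http://api.eia.gov/series/",            # illustrative command URI
            params={
                "api_key": "YOUR_API_KEY",           # required
                "series_id": "ELEC.GEN.ALL-US-99.A", # command-specific input
                "out": "json",                       # optional: "xml" or "json"
            },
            timeout=30,
        )
        print(resp.json())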

  19. User:Woodjr/Sandbox/GeoMap | Open Energy Information

    Open Energy Info (EERE)

    Demonstration of an experimental "GeoMap" result format for ask queries. Based on Google's GeoMap Visualization API....

  20. User:Nlangle/Timeline Test | Open Energy Information

    Open Energy Info (EERE)

    enabled to view the interactive timeline. Further results for this query. Timeline entry: Federal Oil and Gas Royalty Simplification and Fairness Act of 1996 (1996-01-01, Year: 1996). Federal...

  1. OSTI, US Dept of Energy, Office of Scientific and Technical Informatio...

    Office of Scientific and Technical Information (OSTI)

    ... Is an offset needed? Does the site's query script need to incorporate parameters that will block old, pre-Harvesting records from coming into OSTI's system again? Who at the site ...

  2. Developer | OpenEI Community

    Open Energy Info (EERE)

    Q & A Feeds American Clean Skies Foundation (1) API (3) APIs (1) Apps (1) ask queries (1) Big Data (1) bug (2) challenge (1) citation (1) citing (1) clean energy (1) cleanweb (2)...

  3. Utility Rate API v2 | OpenEI Community

    Open Energy Info (EERE)

    Hi, I am running into one issue with the API and that is using an ask query to get the data in the embedded multiple instance template fields. This is the last big problem that...

  4. User:Jayhuggins/Test | Open Energy Information

    Open Energy Info (EERE)

    6 Compound Queries 7 Pipe Escape 8 Parser Functions 9 Maps 10 Math 11 Loops 12 External Data 13 Dynamic Functions 14 Category Test 15 Array 16 Number Format 17 UUID 18 InputBox...

  5. OpenEI Community Central | OpenEI Community

    Open Energy Info (EERE)

    (1) acres (1) adoption (1) American Clean Skies Foundation (1) Apps (1) ask queries (1) Big Data (1) biofuel art (1) building (1) building load (1) building load data (1) car (1)...

  6. In-situ sampling of a large-scale particle simulation for interactive...

    Office of Scientific and Technical Information (OSTI)

    The limiting technology in this situation is analogous to the problem in many population surveys: there aren't enough human resources to query a large population. To cope with the ...

  7. Polls | OpenEI Community

    Open Energy Info (EERE)

    May 2012 - 13:48 by Rmckeel * The Utility Rate web service * The Incentive web service * Web services to query by geographic location or shape * Mediawiki Ask examples & tutorials...

  8. Help:SubObjects | Open Energy Information

    Open Energy Info (EERE)

    except the subobject is listed too. If this is not included, it might actually be OK, but it's working with it in - so there you have it. You'll notice an ask query is in...

  9. Solar Power In China | Open Energy Information

    Open Energy Info (EERE)

    This article is a stub. You can help OpenEI by expanding it. Working on an ask query to display all Chinese solar companies. TODO:...

  10. OpenEI Community - Utility+Utility Access Map

    Open Energy Info (EERE)

    the Special Ask page, in the query box enter the following: [[Category:Utility...

  11. Finding Utility Companies Under a Given Utility ID | OpenEI Community

    Open Energy Info (EERE)

    utility company pages under a given utility id. From the Special Ask page, in the query box enter the following: [[Category:Utility Companies]] [[EiaUtilityId::15248]], substituting...

  12. Utility+Utility Access Map | OpenEI Community

    Open Energy Info (EERE)

    utility company pages under a given utility id. From the Special Ask page, in the query box enter the following: [[Category:Utility Companies]] [[EiaUtilityId::15248]], substituting...

  13. How can an external application get access to OpenEI images and...

    Open Energy Info (EERE)

    How can an external application get access to OpenEI images and thumbnails? I'm building an external application in Simile Exhibit. Through an Ask query...

  14. Kanuti Geothermal Area | Open Energy Information

    Open Energy Info (EERE)


  15. HCP Handbook | Open Energy Information

    Open Energy Info (EERE)


  16. maps | OpenEI Community

    Open Energy Info (EERE)

    Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  17. multicolor | OpenEI Community

    Open Energy Info (EERE)

    Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  18. Google maps | OpenEI Community

    Open Energy Info (EERE)

    Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  19. results | OpenEI Community

    Open Energy Info (EERE)

    Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  20. Semantic Mediawiki | OpenEI Community

    Open Energy Info (EERE)

    Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  1. result formats | OpenEI Community

    Open Energy Info (EERE)

    Hi all, Recently, a couple of people on OpenEI have asked me how to do compound (or...

  2. Science.gov 3.0 Launched; Offers Increased Precision Searches...

    Office of Science (SC) Website

    early viewing of results while the database and Web site searches continue in real time. ... A single query searches across 30 databases and 1,800 Web sites. Science.gov allows users ...

  3. Title 43 CFR 3201 Available Lands | Open Energy Information

    Open Energy Info (EERE)


  4. Title 25 USC 323 Rights-of-way for all purposes across any Indian...

    Open Energy Info (EERE)


  5. Alaska - AS 42.45.045 - Renewable Energy Grant Fund and Recommendation...

    Open Energy Info (EERE)


  6. Title 10, Chapter 49 Protection of Navigable Waters and Shorelands...

    Open Energy Info (EERE)


  7. Sulphur Hot Springs Geothermal Area | Open Energy Information

    Open Energy Info (EERE)


  8. Lake City Hot Springs Geothermal Area | Open Energy Information

    Open Energy Info (EERE)


  9. Maazama Well Geothermal Area | Open Energy Information

    Open Energy Info (EERE)


  10. Virgin Islands Wtr&Pwr Auth | Open Energy Information

    Open Energy Info (EERE)


  11. OSTI, US Dept of Energy, Office of Scientific and Technical Informatio...

    Office of Scientific and Technical Information (OSTI)

    A systematic program to increase the speed of federated search would begin with a much ... that is frequently requested yet does not change very often, the database is queried once ...

  12. Facility Data Policy

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    public data can be queried via the my.es.net web portal, and via documented programmatic APIs. Router Utilization and NetFlow data are stored indefinitely, and stored on reliable...

  13. Usage by Job Size Table

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Usage Query Interface. System: All, Hopper, Edison, Cori, Carver, Planck, Matgen, Franklin, Hopper 1, Magellan, Dirac...

  14. Direct-Current Resistivity At Cove Fort Area - Liquid (Combs...

    Open Energy Info (EERE)


  15. OpenEI Community Central | OpenEI Community

    Open Energy Info (EERE)

    OpenEI Community Central > Posts by term. Term: compound queries...

  16. Buildings Energy Data Book

    Buildings Energy Data Book [EERE]

    Explore Survey Data from the Energy Information Administration Follow the links below to two easy-to-use query tools, developed exclusively for this website. With these tools you can explore results from the Commercial Buildings Energy Consumption Survey (CBECS) and the Residential Energy Consumption Survey (RECS). Commercial Buildings Energy Index Use this custom query tool to analyze micro data from CBECS 2003. Residential Buildings Energy Index Use this custom Microsoft Excel pivot table to

  17. ORNL, ACRF Archive: Raymond McCord, Giri Palanisamy, Karen Gibson, W. Christopher Lenhardt

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    ORNL, ACRF Archive: Raymond McCord, Giri Palanisamy, Karen Gibson, W. Christopher Lenhardt. Mission Research: Sean Moore. Existing Functionality: * Multiple interfaces (Data, Catalog, Thumbnail Browsers, and more) * Data Browser supports new and experienced users * Datastream pathway is the most efficient expert interface. Measurement Merges, Conditional Queries, Core Measurements. You tell us??? Future Functionality: * More complex data extraction functions (measurement merges, complex queries) *

  18. Geometric Algorithms for Modeling, Motion, and Animation (GAMMA): Collision Detection Videos from the University of North Carolina GAMMA Research Group

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Physically based modeling simulations depend highly on the physical interaction between objects in a scene. Complex physics engines require fast, accurate, and robust proximity queries to maintain a realistic simulation at interactive rates. We couple our proximity query research with physically based modeling to ensure that our packages provide the capabilities of today's physics engines. [Copied from http://www.cs.unc.edu/~geom/collide/index.shtml]

  19. Porotomo Subtask 3.9 Build FEM Configuration

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Tabrez Ali

    mesh.vtk: self-contained VTK file that contains mesh information and can be directly visualized in ParaView/VisIt. mesh.png: image of the mesh as visualized in ParaView. nodes.csv: nodal coordinates of the mesh in UTM coordinates (m). nodes_rotated.csv: nodal coordinates of the mesh in rotated (X/Y/Z) coordinates (m). cells.csv: connectivity data. query_points.csv: list of points (centroids of cells) that will be used to query the geologic database.

  20. Porotomo Subtask 3.9 Build FEM Configuration

    SciTech Connect (OSTI)

    Tabrez Ali

    2015-06-30

    mesh.vtk: self-contained VTK file that contains mesh information and can be directly visualized in ParaView/VisIt. mesh.png: image of the mesh as visualized in ParaView. nodes.csv: nodal coordinates of the mesh in UTM coordinates (m). nodes_rotated.csv: nodal coordinates of the mesh in rotated (X/Y/Z) coordinates (m). cells.csv: connectivity data. query_points.csv: list of points (centroids of cells) that will be used to query the geologic database.

  1. Efficient binning for bitmap indices on high-cardinality attributes

    SciTech Connect (OSTI)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2004-11-17

    Bitmap indexing is a common technique for indexing high-dimensional data in data warehouses and scientific applications. Though efficient for low-cardinality attributes, query processing can be rather costly for high-cardinality attributes due to the large storage requirements for the bitmap indices. Binning is a common technique for reducing storage costs of bitmap indices. This technique partitions the attribute values into a number of ranges, called bins, and uses bitmap vectors to represent bins (attribute ranges) rather than distinct values. Although binning may reduce storage costs, it may increase the access costs of queries that do not fall on exact bin boundaries (edge bins). For such queries, the original data values associated with edge bins must be accessed in order to check them against the query constraints. In this paper we study the problem of finding optimal locations for the bin boundaries in order to minimize these access costs subject to storage constraints. We propose a dynamic programming algorithm for optimal partitioning of attribute values into bins that takes into account query access patterns as well as data distribution statistics. Mathematical analysis and experiments on real life data sets show that the optimal partitioning achieved by this algorithm can lead to a significant improvement in the access costs of bitmap indexing systems for high-cardinality attributes.
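
    The cost being minimized can be made concrete with a small sketch (not the paper's dynamic programming algorithm): a query endpoint that falls inside a bin forces a candidate check of every raw value in that edge bin, so the expected cost of a candidate binning can be estimated empirically.

        # Sketch: average candidate-check cost per query endpoint for a binning.
        import numpy as np

        values = np.sort(np.random.default_rng(1).normal(size=10_000))

        def edge_bin_cost(boundaries, query_points):
            """Average number of raw values re-checked per query endpoint."""
            value_bins = np.searchsorted(boundaries, values)
            counts = np.bincount(value_bins, minlength=len(boundaries) + 1)
            endpoint_bins = np.searchsorted(boundaries, query_points)
            return counts[endpoint_bins].mean()

        inner = np.linspace(0, 1, 17)[1:-1]          # 15 interior boundaries
        equal_width = values[0] + inner * (values[-1] - values[0])
        equal_freq = np.quantile(values, inner)
        queries = np.random.default_rng(2).normal(size=1_000)

        print("equal-width bins:    ", edge_bin_cost(equal_width, queries))
        print("equal-frequency bins:", edge_bin_cost(equal_freq, queries))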

  2. Method for localizing and isolating an errant process step

    DOE Patents [OSTI]

    Tobin, Jr., Kenneth W.; Karnowski, Thomas P.; Ferrell, Regina K.

    2003-01-01

    A method for localizing and isolating an errant process includes the steps of retrieving from a defect image database a selection of images, each image having image content similar to image content extracted from a query image depicting a defect, and each image in the selection having corresponding defect characterization data. A conditional probability distribution of the defect having occurred in a particular process step is derived from the defect characterization data. A process step is then identified as the highest probable source of the defect according to the derived conditional probability distribution. A method for process step defect identification includes the steps of characterizing anomalies in a product, the anomalies detected by an imaging system. A query image of a product defect is then acquired. A particular characterized anomaly is then correlated with the query image. An errant process step is then associated with the correlated image.
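
    A minimal sketch of the probabilistic step (the process-step labels are illustrative): estimate P(step | defect) from the labels attached to the retrieved similar images and report the most probable source step.

        # Sketch: conditional probability of each process step given the
        # defect, estimated from the retrieved images' characterization data.
        from collections import Counter

        retrieved_steps = ["etch", "etch", "lithography", "etch", "deposition"]

        counts = Counter(retrieved_steps)
        total = sum(counts.values())
        distribution = {step: n / total for step, n in counts.items()}

        errant_step = max(distribution, key=distribution.get)
        print(distribution)                      # e.g., {'etch': 0.6, ...}
        print("most probable source:", errant_step)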

  3. Fast Search for Dynamic Multi-Relational Graphs

    SciTech Connect (OSTI)

    Choudhury, Sutanay; Holder, Larry; Chin, George; Feo, John T.

    2013-06-23

    Acting on time-critical events by processing ever-growing social media or news streams is a major technical challenge. Many of these data sources can be modeled as multi-relational graphs. Continuous queries or techniques to search for rare events that typically arise in monitoring applications have been studied extensively for relational databases. This work is dedicated to answering the question that emerges naturally: how can we efficiently execute a continuous query on a dynamic graph? This paper presents an exact subgraph search algorithm that exploits the temporal characteristics of representative queries for online news or social media monitoring. The algorithm is based on a novel data structure that leverages the structural and semantic characteristics of the underlying multi-relational graph. The paper concludes with extensive experimentation on several real-world datasets that demonstrates the validity of this approach.

  4. Toward a Data Scalable Solution for Facilitating Discovery of Scientific Data Resources

    SciTech Connect (OSTI)

    Chappell, Alan R.; Choudhury, Sutanay; Feo, John T.; Haglin, David J.; Morari, Alessandro; Purohit, Sumit; Schuchardt, Karen L.; Tumeo, Antonino; Weaver, Jesse R.; Villa, Oreste

    2013-11-18

    Science is increasingly motivated by the need to process larger quantities of data. It is facing severe challenges in data collection, management, and processing, so much so that the computational demands of "data scaling" are competing with, and in many fields surpassing, the traditional objective of decreasing processing time. Example domains with large datasets include astronomy, biology, genomics, climate and weather, and material sciences. This paper presents a real-world use case in which we wish to answer queries provided by domain scientists in order to facilitate discovery of relevant science resources. The problem is that the metadata for these science resources is very large and is growing quickly, rapidly increasing the need for a data scaling solution. We propose the use of our SGEM stack -- a system designed for answering graph-based queries over large datasets on cluster architectures -- for answering complex queries over the metadata, and we report early results for our current capability.

  5. Monitoring jobs with qs

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Jobs » Monitoring jobs with qs. qs, developed at NERSC, is an alternative to the SGE-provided qstat for querying queue status. qs provides an enhanced user interface designed to make it easier to see resource requests, utilization, and job position in the queue. qs provides a centralized web service that can be queried using either the provided "qs" client or an HTTP connection to the qs server. qs reports data from a cached copy of the genepool UGE

  6. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Media Contact: Cathey Daniels, 865-576-9539, danielsc@osti.gov. For immediate release, September 15, 2008. Science.gov Releases Version 5.0: More Science for Your Query. Oak Ridge, TN - Science.gov has added valuable content, now offering even more science information and search results for your query. Version 5.0, released today, searches 200 million pages of scientific information and provides links to

  7. GenoGraphics for OpenWindows trademark

    SciTech Connect (OSTI)

    Hagstrom, R.; Overbeek, R.; Price, M.; Zawada, D. ); Michaels, G.S.; Taylor, R. . Div. of Computer Research and Technology); Yoshida, Kaoru )

    1992-04-01

    GenoGraphics is a generic utility for constructing and querying one-dimensional linear plots. It grew out of a request from Dr. Cassandra Smith for a tool to facilitate her genome mapping research, and its development has benefited from a continued collaboration with her. It is written in Sun Microsystems' OpenWindows environment with the BTOL toolkit developed at Argonne National Laboratory. GenoGraphics provides an interactive, intuitive, graphical interface. Its features include viewing multiple maps simultaneously, zooming, and querying by mouse clicking. By expediting plot generation, GenoGraphics gives the scientist more time to analyze data and a novel means for deducing conclusions.

  8. GenoGraphics for OpenWindows{trademark}

    SciTech Connect (OSTI)

    Hagstrom, R.; Overbeek, R.; Price, M.; Zawada, D.; Michaels, G.S.; Taylor, R.; Yoshida, Kaoru

    1992-04-01

    GenoGraphics is a generic utility for constructing and querying one-dimensional linear plots. It grew out of a request from Dr. Cassandra Smith for a tool to facilitate her genome mapping research, and its development has benefited from a continued collaboration with her. It is written in Sun Microsystems' OpenWindows environment with the BTOL toolkit developed at Argonne National Laboratory. GenoGraphics provides an interactive, intuitive, graphical interface. Its features include viewing multiple maps simultaneously, zooming, and querying by mouse clicking. By expediting plot generation, GenoGraphics gives the scientist more time to analyze data and a novel means for deducing conclusions.

  9. Preliminary Results on Uncertainty Quantification for Pattern Analytics

    SciTech Connect (OSTI)

    Stracuzzi, David John; Brost, Randolph; Chen, Maximillian Gene; Malinas, Rebecca; Peterson, Matthew Gregor; Phillips, Cynthia A.; Robinson, David G.; Woodbridge, Diane

    2015-09-01

    This report summarizes preliminary research into uncertainty quantification for pattern analytics within the context of the Pattern Analytics to Support High-Performance Exploitation and Reasoning (PANTHER) project. The primary focus of PANTHER was to make large quantities of remote sensing data searchable by analysts. The work described in this report adds nuance to both the initial data preparation steps and the search process. Search queries are transformed from "does the specified pattern exist in the data?" to "how certain is the system that the returned results match the query?" We show example results for both data processing and search, and discuss a number of possible improvements for each.

  10. Annotated Bibliography for the DEWPOINT project

    SciTech Connect (OSTI)

    Oehmen, Christopher S.

    2009-04-21

    This bibliography covers aspects of the Detection and Early Warning of Proliferation from Online INdicators of Threat (DEWPOINT) project including 1) data management and querying, 2) baseline and advanced methods for classifying free text, and 3) algorithms to achieve the ultimate goal of inferring intent from free text sources. Metrics for assessing the quality and correctness of classification are addressed in the second group. Data management and querying include methods for efficiently storing, indexing, searching, and organizing the data we expect to operate on within the DEWPOINT project.

  11. PCard Data Analysis Tool

    Energy Science and Technology Software Center (OSTI)

    2005-04-01

    The Procurement Card data analysis and monitoring tool enables due-diligence review using predefined user-created queries and reports. The system tracks individual compliance emails. More specifically, the tool: - Helps identify exceptions or questionable and non-compliant purchases, - Creates audit random sample on request, - Allows users to create and run new or ad-hoc queries and reports, - Monitors disputed charges, - Creates predefined Emails to Cardholders requesting documentation and/or clarification, - Tracks audit status, notes, Email status (date sent, response), audit resolution.

  12. An efficient compression scheme for bitmap indices

    SciTech Connect (OSTI)

    Wu, Kesheng; Otoo, Ekow J.; Shoshani, Arie

    2004-04-13

    When using an out-of-core indexing method to answer a query, it is generally assumed that the I/O cost dominates the overall query response time. Because of this, most research on indexing methods concentrates on reducing the sizes of indices. For bitmap indices, compression has been used for this purpose. However, in most cases, operations on these compressed bitmaps, mostly bitwise logical operations such as AND, OR, and NOT, spend more time in CPU than in I/O. To speed up these operations, a number of specialized bitmap compression schemes have been developed, the best known of which is the byte-aligned bitmap code (BBC). They are usually faster in performing logical operations than the general purpose compression schemes, but the time spent in CPU still dominates the total query response time. To reduce the query response time, we designed a CPU-friendly scheme named the word-aligned hybrid (WAH) code. In this paper, we prove that the sizes of WAH compressed bitmap indices are about two words per row for a large range of attributes. This size is smaller than typical sizes of commonly used indices, such as a B-tree. Therefore, WAH compressed indices are appropriate not only for low cardinality attributes but also for high cardinality attributes. In the worst case, the time to operate on compressed bitmaps is proportional to the total size of the bitmaps involved. The total size of the bitmaps required to answer a query on one attribute is proportional to the number of hits. These indicate that WAH compressed bitmap indices are optimal. To verify their effectiveness, we generated bitmap indices for four different datasets and measured the response time of many range queries. Tests confirm that sizes of compressed bitmap indices are indeed smaller than B-tree indices, and query processing with WAH compressed indices is much faster than with BBC compressed indices, projection indices, and B-tree indices. In addition, we also verified that the average query response time is proportional to the index size. This indicates that compressed bitmap indices are efficient for very large datasets.
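
    A didactic sketch of the WAH layout with 32-bit words (structure only, not a production encoder): each literal word carries 31 bitmap bits, while a fill word encodes a run of identical all-0 or all-1 31-bit groups.

        # Sketch: encode a bit-string into a list of 32-bit WAH words.
        def wah_encode(bits):
            groups = [bits[i:i + 31].ljust(31, "0")
                      for i in range(0, len(bits), 31)]
            words, i = [], 0
            while i < len(groups):
                g = groups[i]
                if g in ("0" * 31, "1" * 31):
                    run = 1
                    while i + run < len(groups) and groups[i + run] == g:
                        run += 1
                    # fill word: bit 31 = 1, bit 30 = fill value, low bits = run
                    words.append((1 << 31) | (int(g[0]) << 30) | run)
                    i += run
                else:
                    words.append(int(g, 2))  # literal word: bit 31 = 0
                    i += 1
            return words

        print([hex(w) for w in wah_encode("1" * 62 + "0110" * 10)])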

  13. Search tool plug-in: implements latent topic feedback

    Energy Science and Technology Software Center (OSTI)

    2011-09-23

    IRIS is a search tool plug-in that is used to implement latent topic feedback for enhancing text navigation. It accepts a list of returned documents from an information retrieval system that is generated from keyword search queries. Data is pulled directly from a topic information database and processed by IRIS to determine the most prominent and relevant topics, along with topic-ngrams, associated with the list of returned documents. User selected topics are then used to expand the query and presumably refine the search results.

  14. Method and system for efficiently searching an encoded vector index

    DOE Patents [OSTI]

    Bui, Thuan Quang; Egan, Randy Lynn; Kathmann, Kevin James

    2001-09-04

    Method and system aspects for efficiently searching an encoded vector index are provided. The aspects include the translation of a search query into a candidate bitmap, and the mapping of data from the candidate bitmap into a search result bitmap according to entry values in the encoded vector index. Further, the translation includes the setting of a bit in the candidate bitmap for each entry in a symbol table that corresponds to a candidate of the search query. Also included in the mapping is the identification of a bit value in the candidate bitmap pointed to by an entry in an encoded vector.
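
    A small sketch of the two phases with illustrative data: translate the query into a candidate bitmap over the symbol table, then map each encoded-vector entry through that bitmap to produce the result bitmap.

        # Sketch: search an encoded vector index via a candidate bitmap.
        symbol_table = {"red": 0, "green": 1, "blue": 2}  # value -> code
        evi = [0, 2, 1, 1, 0, 2, 2]                       # one code per row

        def search(predicate):
            # Phase 1: one candidate bit per symbol-table entry
            candidate = [False] * len(symbol_table)
            for value, code in symbol_table.items():
                candidate[code] = predicate(value)
            # Phase 2: map each encoded entry through the candidate bitmap
            return [candidate[code] for code in evi]

        print(search(lambda v: v in ("red", "blue")))
        # -> [True, True, False, False, True, True, True]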

  15. Enterprise Middleware for Scientific Data

    SciTech Connect (OSTI)

    Thomson, Judi; Chappell, Alan R.; Almquist, Justin P.

    2003-02-27

    We describe an enterprise middleware system that integrates, from a user's perspective, data located on disparate data storage devices without imposing additional requirements upon those storage mechanisms. The system provides advanced search capabilities by exploiting a repository of metadata that describes the integrated data. This search mechanism integrates information from a collection of XML metadata documents with diverse schema. Users construct queries using familiar search terms, and the enterprise system uses domain representations and vocabulary mappings to translate the user's query, expanding the search to include other potentially relevant data. The enterprise architecture allows flexibility with respect to domain dependent processing of user data and metadata.
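
    A tiny sketch of the vocabulary-mapping expansion described above (the terms and mappings are illustrative, not the system's actual domain representations):

        # Sketch: expand a user's search terms with domain synonyms before
        # querying the metadata repository.
        vocabulary = {
            "maize": {"corn", "zea mays"},
            "salinity": {"salt concentration"},
        }

        def expand(terms):
            expanded = set(terms)
            for term in terms:
                expanded |= vocabulary.get(term, set())
            return expanded

        print(expand(["maize", "yield"]))
        # -> {'maize', 'corn', 'zea mays', 'yield'}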

  16. Ensemble Data Analysis ENvironment (EDEN)

    SciTech Connect (OSTI)

    Steed, Chad Allen

    2012-08-01

    The EDEN toolkit facilitates exploratory data analysis and visualization of global climate model simulation datasets. EDEN provides an interactive graphical user interface (GUI) that helps the user visually construct dynamic queries of the characteristically large climate datasets using temporal ranges, variable selections, and geographic areas of interest. EDEN reads the selected data into a multivariate visualization panel which features an extended implementation of parallel coordinates plots as well as interactive scatterplots. The user can query data in the visualization panel using mouse gestures to analyze different ranges of data. The visualization panel provides coordinated multiple views whereby selections made in one plot are propagated to the other plots.

  17. Sifting Through a Trillion Electrons

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Sifting Through a Trillion Electrons: Berkeley researchers design strategies for extracting interesting data from massive scientific datasets. June 26, 2012. Linda Vu, lvu@lbl.gov, +1 510 495 2402. [Figure: After querying a dataset of approximately 114,875,956,837 particles for those with Energy values less than 1.5, FastQuery identifies 57,740,614 particles, which are mapped on this plot. Image by Oliver Rubel, Berkeley Lab.] Modern research tools like

  18. List of utility company aliases | OpenEI Community

    Open Energy Info (EERE)

    ntro%5D&p%5Boutro%5D&p%5Bdefault%5D&eqyes to query various fields from the aliases. Hope that helps - feel free to edit your question if that didn't answer everything Rmckeel...

  19. ~tx421.ptx

    U.S. Energy Information Administration (EIA) Indexed Site

    ... a day-to-day percentage change in past prices. But what ... similar to our STEO query system where the user could ... The term that I've used in class all the time, I tend to ...

  20. InterMine Webservices for Phytozome

    SciTech Connect (OSTI)

    Carlson, Joseph; Hayes, David; Goodstein, David; Rokhsar, Daniel

    2014-01-10

    A data warehousing framework for biological information provides a useful infrastructure for providers and users of genomic data. For providers, the infrastructure gives them a consistent mechanism for extracting raw data, while for users, the web services supported by the software allow them to make either simple and common, or complex and unique, queries of the data.

  1. Linked-View Parallel Coordinate Plot Renderer

    Energy Science and Technology Software Center (OSTI)

    2011-06-28

    This software allows multiple linked views for interactive querying via map-based data selection, bar chart analytic overlays, and high dynamic range (HDR) line renderings. The major component of the visualization package is a parallel coordinate renderer with binning, curved layouts, shader-based rendering, and other techniques to allow interactive visualization of multidimensional data.

  2. AISL Development Toolkit

    Energy Science and Technology Software Center (OSTI)

    2012-09-13

    AISLDT is a library of utility functions supporting other AISL software. The code provides various utility functions for Common Lisp, including an object-oriented database, distributed objects, a logic query engine, web content management, chart drawing, packet sniffing, text processing, and various data structures.

  3. A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    SciTech Connect (OSTI)

    Choudhury, Sutanay; Holder, Larry; Chin, George; Agarwal, Khushbu; Feo, John T.

    2015-02-02

    Cyber security is one of the most significant technical challenges of our time. Detecting adversarial activities and preventing theft of intellectual property and customer data is a high priority for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving networks spanning institutional and national boundaries. Many cyber attacks can be described as subgraph patterns, with prominent examples being insider infiltrations (path queries), denial of service (parallel paths), and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphs in a continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a "Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named "Relative Selectivity" that is used to select between different query processing strategies. Our experiments performed on real online news, network traffic streams, and a synthetic social network benchmark demonstrate 10-100x speedups over selectivity-agnostic approaches.

  4. Optimal Chunking of Large Multidimensional Arrays for Data Warehousing

    SciTech Connect (OSTI)

    Otoo, Ekow J; Otoo, Ekow J.; Rotem, Doron; Seshadri, Sridhar

    2008-02-15

    Very large multidimensional arrays are commonly used in data intensive scientific computations as well as on-line analytical processing applications, referred to as MOLAP. The storage organization of such arrays on disks is done by partitioning the large global array into fixed size sub-arrays called chunks or tiles that form the units of data transfer between disk and memory. Typical queries involve the retrieval of sub-arrays in a manner that accesses all chunks that overlap the query results. An important metric of the storage efficiency is the expected number of chunks retrieved over all such queries. The question that immediately arises is "what shapes of array chunks give the minimum expected number of chunks over a query workload?" The problem of optimal chunking was first introduced by Sarawagi and Stonebraker who gave an approximate solution. In this paper we develop exact mathematical models of the problem and provide exact solutions using steepest descent and geometric programming methods. Experimental results, using synthetic and real life workloads, show that our solutions are consistently within 2.0 percent of the true number of chunks retrieved for any number of dimensions. In contrast, the approximate solution of Sarawagi and Stonebraker can deviate considerably from the true result with increasing number of dimensions and also may lead to suboptimal chunk shapes.
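
    The objective being minimized can be made concrete with the classical approximation that a query of extent q_i in dimension i touches about q_i/c_i + 1 chunks of extent c_i, so the expected cost over a workload is the average of the products of those per-dimension terms. The brute-force search below is only a sketch of that model (the paper derives exact solutions analytically); the workload and chunk volume are invented for illustration.

    ```python
    # Sketch of the chunk-shape cost model: expected chunks touched by a query
    # is roughly the product over dimensions of (q_i / c_i + 1); pick the shape
    # of fixed volume minimizing the workload average. Numbers are illustrative.
    from itertools import product
    from math import prod

    def expected_chunks(chunk_shape, workload):
        return sum(
            prod(q / c + 1 for q, c in zip(query, chunk_shape))
            for query in workload
        ) / len(workload)

    def best_chunk_shape(volume, workload, candidates):
        dims = len(workload[0])
        shapes = [s for s in product(candidates, repeat=dims) if prod(s) == volume]
        return min(shapes, key=lambda s: expected_chunks(s, workload))

    workload = [(1000, 10), (10, 1000), (100, 100)]   # mixed row/column queries
    candidates = [2 ** k for k in range(13)]          # chunk extents 1..4096
    shape = best_chunk_shape(4096, workload, candidates)
    print(shape, expected_chunks(shape, workload))
    ```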

  5. Vista Version 0.7

    Energy Science and Technology Software Center (OSTI)

    2005-05-05

    Vista is a database management system tailored to the needs of scientific computing. It provides data storage for Index Sets, topological relationships, parameters, and fields. It provides scoping capabilities for data along with a nice way of managing attribute queries. It is an in-core database that is intended to replace the majority of data structures used in scientific software.

  6. RTOSPlanner v 0.9

    Energy Science and Technology Software Center (OSTI)

    2012-01-05

    RTOSPlanner provides a generic robot motion planning capability that interfaces directly with the SMART kinematics and dynamics engine. It provides rapid setup, synchronization and query routines for driving a robot modelled within SMART and kinApps. It requires the following packages to run: core SMART, core Umbra, Esmart, and kinApps.

  7. Visualization and Analysis in Support of Fusion Science

    SciTech Connect (OSTI)

    Sanderson, Allen R.

    2012-10-01

    This report summarizes the results of the award for “Visualization and Analysis in Support of Fusion Science.” With this award our main efforts have been to develop and deploy visualization and analysis tools in three areas: 1) magnetic field line analysis, 2) query-based visualization, and 3) comparative visualization.

  8. Semantic Space Analyst

    Energy Science and Technology Software Center (OSTI)

    2004-04-15

    The Semantic Space Analyst (SSA) is software for analyzing a text corpus, discovering relationships among terms, and allowing the user to explore that information in different ways. It includes features for displaying and laying out terms and relationships visually, for generating such maps from manual queries, and for discovering differences between corpora. Data can also be exported to Microsoft Excel.

  9. MATRIX AND VECTOR SERVICES

    Energy Science and Technology Software Center (OSTI)

    2001-10-18

    PETRA V2 provides matrix and vector services and the ability to construct, query, and use matrix and vector objects that are used and computed by TRILINOS solvers. It provides all basic matrix and vector operations for solvers in TRILINOS.

  10. Method and system for the diagnosis of disease using retinal image content and an archive of diagnosed human patient data

    DOE Patents [OSTI]

    Tobin, Kenneth W; Karnowski, Thomas P; Chaum, Edward

    2013-08-06

    A method for diagnosing diseases having retinal manifestations including retinal pathologies includes the steps of providing a CBIR system including an archive of stored digital retinal photography images and diagnosed patient data corresponding to the retinal photography images, the stored images each indexed in a CBIR database using a plurality of feature vectors, the feature vectors corresponding to distinct descriptive characteristics of the stored images. A query image of the retina of a patient is obtained. Using image processing, regions or structures in the query image are identified. The regions or structures are then described using the plurality of feature vectors. At least one relevant stored image from the archive based on similarity to the regions or structures is retrieved, and an eye disease or a disease having retinal manifestations in the patient is diagnosed based on the diagnosed patient data associated with the relevant stored image(s).
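
    The retrieval step at the heart of such a CBIR system reduces to nearest-neighbor search over feature vectors. The sketch below shows that step only, with cosine similarity and invented data; it is not the patented method itself.

    ```python
    # Minimal sketch of CBIR retrieval: rank archived feature vectors by cosine
    # similarity to the query vector and return the attached diagnoses.
    import numpy as np

    def retrieve_diagnoses(query_vec, archive_vecs, diagnoses, k=3):
        q = query_vec / np.linalg.norm(query_vec)
        a = archive_vecs / np.linalg.norm(archive_vecs, axis=1, keepdims=True)
        sims = a @ q                       # cosine similarity to each stored image
        top = np.argsort(sims)[::-1][:k]   # indices of the k most similar images
        return [(diagnoses[i], float(sims[i])) for i in top]

    rng = np.random.default_rng(0)         # toy archive of 100 16-d vectors
    archive = rng.random((100, 16))
    labels = [f"diagnosis-{i}" for i in range(100)]
    print(retrieve_diagnoses(rng.random(16), archive, labels))
    ```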

  11. Computing quality scores and uncertainty for approximate pattern matching in geospatial semantic graphs

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Stracuzzi, David John; Brost, Randolph C.; Phillips, Cynthia A.; Robinson, David G.; Wilson, Alyson G.; Woodbridge, Diane M. -K.

    2015-09-26

    Geospatial semantic graphs provide a robust foundation for representing and analyzing remote sensor data. In particular, they support a variety of pattern search operations that capture the spatial and temporal relationships among the objects and events in the data. However, in the presence of large data corpora, even a carefully constructed search query may return a large number of unintended matches. This work considers the problem of calculating a quality score for each match to the query, given that the underlying data are uncertain. As a result, we present a preliminary evaluation of three methods for determining both match quality scores and associated uncertainty bounds, illustrated in the context of an example based on overhead imagery data.

  12. Chunking of Large Multidimensional Arrays

    SciTech Connect (OSTI)

    Rotem, Doron; Otoo, Ekow J.; Seshadri, Sridhar

    2007-02-28

    Data intensive scientific computations as well as on-line analytical processing applications are done on very large datasets that are modeled as k-dimensional arrays. The storage organization of such arrays on disks is done by partitioning the large global array into fixed size hyper-rectangular sub-arrays called chunks or tiles that form the units of data transfer between disk and memory. Typical queries involve the retrieval of sub-arrays in a manner that accesses all chunks that overlap the query results. An important metric of the storage efficiency is the expected number of chunks retrieved over all such queries. The question that immediately arises is "what shapes of array chunks give the minimum expected number of chunks over a query workload?" In this paper we develop two probabilistic mathematical models of the problem and provide exact solutions using steepest descent and geometric programming methods. Experimental results, using synthetic workloads on real life data sets, show that our chunking is much more efficient than the existing approximate solutions.

  13. Provenance management in Swift with implementation details.

    SciTech Connect (OSTI)

    Gadelha, L. M. R; Clifford, B.; Mattoso, M.; Wilde, M.; Foster, I.

    2011-04-01

    The Swift parallel scripting language allows for the specification, execution and analysis of large-scale computations in parallel and distributed environments. It incorporates a data model for recording and querying provenance information. In this article we describe these capabilities and evaluate interoperability with other systems through the use of the Open Provenance Model. We describe Swift's provenance data model and compare it to the Open Provenance Model. We also describe and evaluate activities performed within the Third Provenance Challenge, which consisted of implementing a specific scientific workflow, capturing and recording provenance information of its execution, performing provenance queries, and exchanging provenance information with other systems. Finally, we propose improvements to both the Open Provenance Model and Swift's provenance system.

  14. AmeriFlux Network Data from the ORNL AmeriFlux Website

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    The AmeriFlux network was established in 1996 to provide continuous observations of ecosystem level exchanges of CO2, water, energy and momentum spanning diurnal, synoptic, seasonal, and interannual time scales. It is fed by sites from North America, Central America, and South America. DOE's CDIAC stores and maintains AmeriFlux data, and this web site explains the different levels of data available there, with links to the CDIAC ftp site. A separate web-based data interface is also provided; it allows users to graph, query, and download Level 2 data for up to four sites at a time. Data may be queried by site, measurement period, or parameter. More than 550 site-years of level 2 data are available from AmeriFlux sites through the interface.

  15. Shards v 1.0

    Energy Science and Technology Software Center (OSTI)

    2009-07-28

    Shards is a library of Shared Discretization Tools intended to support development of computer codes for the numerical solution of Partial Differential Equations (PDEs). The library comprises two categories of tools: methods to manage and access information about cell topologies used in mesh-based methods for PDEs, and methods to work with multi-dimensional arrays used to store numerical data in corresponding computer codes. The basic cell topology functionality of Shards includes methods to query adjacencies of subcells, find subcell permutation with respect to a global cell and create user-defined custom cell topologies. The multi-dimensional array part of the package provides specialized compile-time dimension tags, multi-index access methods, rank and dimension queries.

  16. Final Report: Efficient Databases for MPC Microdata

    SciTech Connect (OSTI)

    Michael A. Bender; Martin Farach-Colton; Bradley C. Kuszmaul

    2011-08-31

    The purpose of this grant was to develop the theory and practice of high-performance databases for massive streamed datasets. Over the last three years, we have developed fast indexing technology, that is, technology for rapidly ingesting data and storing that data so that it can be efficiently queried and analyzed. During this project we developed the technology so that high-bandwidth data streams can be indexed and queried efficiently. Our technology has been proven to work on data sets composed of tens of billions of rows when the data stream arrives at over 40,000 rows per second. We achieved these numbers even on a single disk driven by two cores. Our work comprised (1) new write-optimized data structures with better asymptotic complexity than traditional structures, (2) implementation, and (3) benchmarking. We furthermore developed a prototype of TokuFS, a middleware layer that can handle microdata I/O packaged up in an MPI-IO abstraction.

  17. The PANTHER User Experience

    SciTech Connect (OSTI)

    Coram, Jamie L.; Morrow, James D.; Perkins, David Nikolaus

    2015-09-01

    This document describes the PANTHER R&D Application, a proof-of-concept user interface application developed under the PANTHER Grand Challenge LDRD. The purpose of the application is to explore interaction models for graph analytics, drive algorithmic improvements from an end-user point of view, and support demonstration of PANTHER technologies to potential customers. The R&D Application implements a graph-centric interaction model that exposes analysts to the algorithms contained within the GeoGraphy graph analytics library. Users define geospatial-temporal semantic graph queries by constructing search templates based on nodes, edges, and the constraints among them. Users then analyze the results of the queries using both geo-spatial and temporal visualizations. Development of this application has made user experience an explicit driver for project and algorithmic level decisions that will affect how analysts one day make use of PANTHER technologies.

  18. Computing quality scores and uncertainty for approximate pattern matching in geospatial semantic graphs

    SciTech Connect (OSTI)

    Stracuzzi, David John; Brost, Randolph C.; Phillips, Cynthia A.; Robinson, David G.; Wilson, Alyson G.; Woodbridge, Diane M. -K.

    2015-09-26

    Geospatial semantic graphs provide a robust foundation for representing and analyzing remote sensor data. In particular, they support a variety of pattern search operations that capture the spatial and temporal relationships among the objects and events in the data. However, in the presence of large data corpora, even a carefully constructed search query may return a large number of unintended matches. This work considers the problem of calculating a quality score for each match to the query, given that the underlying data are uncertain. As a result, we present a preliminary evaluation of three methods for determining both match quality scores and associated uncertainty bounds, illustrated in the context of an example based on overhead imagery data.

  19. Method for indexing and retrieving manufacturing-specific digital imagery based on image content

    DOE Patents [OSTI]

    Ferrell, Regina K.; Karnowski, Thomas P.; Tobin, Jr., Kenneth W.

    2004-06-15

    A method for indexing and retrieving manufacturing-specific digital images based on image content comprises three steps. First, at least one feature vector can be extracted from a manufacturing-specific digital image stored in an image database. In particular, each extracted feature vector corresponds to a particular characteristic of the manufacturing-specific digital image, for instance, a digital image modality and overall characteristic, a substrate/background characteristic, and an anomaly/defect characteristic. Notably, the extracting step includes generating a defect mask using a detection process. Second, using an unsupervised clustering method, each extracted feature vector can be indexed in a hierarchical search tree. Third, a manufacturing-specific digital image associated with a feature vector stored in the hierarchical search tree can be retrieved, wherein the manufacturing-specific digital image has image content comparably related to the image content of the query image. More particularly, the retrieving step can include two data reductions, the first performed based upon a query vector extracted from a query image. Subsequently, a user can select relevant images resulting from the first data reduction. From the selection, a prototype vector can be calculated, from which a second-level data reduction can be performed. The second-level data reduction can result in a subset of feature vectors comparable to the prototype vector, and further comparable to the query vector. An additional fourth step can include managing the hierarchical search tree by substituting a vector average for several redundant feature vectors encapsulated by nodes in the hierarchical search tree.
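
    The two-level reduction can be sketched as follows; the distances, sizes, and averaging rule here are simplified assumptions, not the claim language.

    ```python
    # Sketch of the two data reductions: first cut by the query vector, then a
    # second cut by a prototype averaged from user-selected relevant results.
    import numpy as np

    def reduce_by(vec, vectors, keep):
        d = np.linalg.norm(vectors - vec, axis=1)
        return np.argsort(d)[:keep]                    # indices of closest vectors

    def two_level_reduction(query_vec, vectors, relevant_idx, keep=20):
        first = reduce_by(query_vec, vectors, keep)    # first-level data reduction
        prototype = vectors[relevant_idx].mean(axis=0) # prototype from user picks
        second = reduce_by(prototype, vectors[first], keep // 2)
        return first[second]                           # final comparable subset
    ```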

  20. Buildings Energy Data Book

    Buildings Energy Data Book [EERE]

    What Is the Buildings Energy Data Book? The Data Book includes statistics on residential and commercial building energy consumption. Data tables contain statistics related to construction, building technologies, energy consumption, and building characteristics. The Building Technologies Program within the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy developed this resource to provide a

  1. DOE Science Showcase - Biofuels in the databases | OSTI, US Dept of

    Office of Scientific and Technical Information (OSTI)

    Energy, Office of Scientific and Technical Information DOE Science Showcase - Biofuels in the databases The new ScienceCinema provides access points where the term biofuels is spoken in DOE multimedia presentations. DOE Green Energy renewable energy portal offers biofuels related research. Science Accelerator returns results for biofuels from DOE resources with just one query: DOE Data Explorer, DOE Information Bridge, Energy Citations Database, Federal R&D Project Summaries. Biofuels in the

  2. National Library of Energy (BETA): the Department of Energy's National

    Office of Scientific and Technical Information (OSTI)

    Resource for Energy Literacy, Innovation and Security - Help: Simple Search, Advanced Search, Search Results, Search Tools, Selecting, Downloading and Printing Results, Emailing Results, Search Tips. A simple search from the homepage will search all of the collections in the application, merge the results and rank them according to how relevant they are to your query. To conduct a Simple Search: Type in your keyword(s), like "deep web technologies" and select Search.

  3. ScienceLab | OSTI, US Dept of Energy, Office of Scientific and Technical

    Office of Scientific and Technical Information (OSTI)

    Information ScienceLab The ScienceLab product has been discontinued. For single-query access to an array of federal science education resources, intended for students, teachers, and parents, including resources from DOE, please visit: http://www.science.gov. Science.gov and USAJOBS often also have information about fellowships and internships. Don't forget to update your bookmarks! For more information on the streamlining of OSTI Products, please read the OSTIblog entitled "OSTI Is

  4. FE0005961_UIllinois | netl.doe.gov

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    RVA: 3-D Visualization and Analysis Software to Support Management of Unconventional Oil and Gas Resources Last Reviewed 12/2/2015 DE-FE0005961 Goal This project will produce a state-of-the-art 3-D visualization and analysis software package targeted for improving development of oil and gas resources. The software [RVA (Reservoir Visualization and Analysis)] will display data, models, and reservoir simulation results and have the ability to jointly visualize and query data from geologic models

  5. System and method for generating a relationship network

    DOE Patents [OSTI]

    Franks, Kasian; Myers, Cornelia A; Podowski, Raf M

    2015-05-05

    A computer-implemented system and process for generating a relationship network is disclosed. The system provides a set of data items to be related and generates variable length data vectors to represent the relationships between the terms within each data item. The system can be used to generate a relationship network for documents, images, or any other type of file. This relationship network can then be queried to discover the relationships between terms within the set of data items.

  6. Queues and Scheduling Policies

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Coal Glossary › FAQS › Overview Data Coal Data Browser (interactive query tool with charting and mapping) Summary Prices Reserves Consumption Production Stocks Imports, exports & distribution Coal-fired electric power plants Transportation costs to electric power sector International All coal data reports Analysis & Projections Major Topics Most popular Consumption Environment Imports & exports Industry characteristics Prices Production Projections Recurring Reserves Stocks All

  7. Sandia National Laboratories: Business Opportunities Website

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Prospective Suppliers What Sandia Looks For In Our Suppliers What Does Sandia Buy? Business Opportunities Website Small Business Working with Sandia Business Opportunities Website The Business Opportunities Website (BOW) is located at the following URL: https://supplierportal.sandia.gov/OA_HTML/snl/AbstractQuery.jsp NOTE: Internet Explorer and Firefox are the preferred browsers. Users may encounter an error when using other browsers to view the above link. Twenty-four hours a day, 7 days a week, and

  8. Coal - U.S. Energy Information Administration (EIA)

    U.S. Energy Information Administration (EIA) Indexed Site

    Coal Glossary › FAQS › Overview Data Coal Data Browser (interactive query tool with charting and mapping) Summary Prices Reserves Consumption Production Stocks Imports, exports & distribution Coal-fired electric power plants Transportation costs to electric power sector International All coal data reports Analysis & Projections Major Topics Most popular Consumption Environment Imports & exports Industry characteristics Prices Production Projections Recurring Reserves Stocks All

  9. Electricity - U.S. Energy Information Administration (EIA)

    U.S. Energy Information Administration (EIA) Indexed Site

    Electricity Glossary › FAQS › Overview Data Electricity Data Browser (interactive query tool with charting & mapping) Summary Sales (consumption), revenue, prices & customers Generation and thermal output Electric power plants generating capacity Consumption of fuels used to generate electricity Receipts of fossil-fuels for electricity generation Average cost of fossil-fuels for electricity generation Fossil-fuel stocks for electricity generation Revenue and expense statistics for...

  10. Quarterly Coal Report - Energy Information Administration

    U.S. Energy Information Administration (EIA) Indexed Site

    Coal Glossary › FAQS › Overview Data Coal Data Browser (interactive query tool with charting and mapping) Summary Prices Reserves Consumption Production Stocks Imports, exports & distribution Coal-fired electric power plants Transportation costs to electric power sector International All coal data reports Analysis & Projections Major Topics Most popular Consumption Environment Imports & exports Industry characteristics Prices Production Projections Recurring Reserves Stocks All

  11. Electric Power Annual 2014 - U.S. Energy Information Administration

    U.S. Energy Information Administration (EIA) Indexed Site

    Electricity Glossary › FAQS › Overview Data Electricity Data Browser (interactive query tool with charting & mapping) Summary Sales (consumption), revenue, prices & customers Generation and thermal output Electric power plants generating capacity Consumption of fuels used to generate electricity Receipts of fossil-fuels for electricity generation Average cost of fossil-fuels for electricity generation Fossil-fuel stocks for electricity generation Revenue and expense statistics for...

  12. Electricity Monthly Update - Energy Information Administration

    U.S. Energy Information Administration (EIA) Indexed Site

    Electricity Glossary › FAQS › Overview Data Electricity Data Browser (interactive query tool with charting & mapping) Summary Sales (consumption), revenue, prices & customers Generation and thermal output Electric power plants generating capacity Consumption of fuels used to generate electricity Receipts of fossil-fuels for electricity generation Average cost of fossil-fuels for electricity generation Fossil-fuel stocks for electricity generation Revenue and expense statistics for...

  13. Worldwide report: Arms control, [July 26, 1986

    SciTech Connect (OSTI)

    1986-07-26

    This report contains translations/transcriptions of articles and/or broadcasts on arms control. Titles include: Soviet Spokesman Explains Far East Arms Cut; Delegation Attends Soviet Naval Exercise; Defense Minister Queried on Military Reductions; Further on Soviet Force Withdrawals from Poland; Criteria of Military-Strategic Parity, Sufficiency; Further on Allegations of CW Materiel Sale to Iran; Reports on Nuclear, Chemical Warheads Denied; and others.

  14. JPRS report: Arms control, [July 11, 1989

    SciTech Connect (OSTI)

    1989-07-11

    This report contains translations/transcriptions of articles and/or broadcasts on arms control. Titles include: Soviet Spokesman Explains Far East Arms Cut; Delegation Attends Soviet Naval Exercise; Defense Minister Queried on Military Reductions; Further on Soviet Force Withdrawals from Poland; Criteria of Military-Strategic Parity, Sufficiency; Further on Allegations of CW Materiel Sale to Iran; Reports on Nuclear, Chemical Warheads Denied; and others.

  15. Worldwide report: Arms control, [19 July 1985

    SciTech Connect (OSTI)

    1985-07-19

    This report contains translations/transcriptions of articles and/or broadcasts on arms control. Titles include: Soviet Spokesman Explains Far East Arms Cut; Delegation attends Soviet Naval Exercise; Defense Minister Queried on Military Reductions; Further on Soviet Force Withdrawals from Poland; Criteria of Military-Strategic Parity, Sufficiency; Further on Allegations of CW Materiel Sale to Iran; Reports on Nuclear, Chemical Warheads Denied; and others.

  16. Application Programming Interface | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Commercial Buildings » Analysis Tools » Building Performance Database » Application Programming Interface Application Programming Interface While the BPD platform offers various browser-based analysis tools, third parties can also access the database through an Application Programming Interface (API). Using the API, users can query the same analytical tools available through the web interface, without compromising the security or anonymity of the database. The API enables the sharing of

  17. System and method for generating a relationship network

    DOE Patents [OSTI]

    Franks, Kasian; Myers, Cornelia A.; Podowski, Raf M.

    2011-07-26

    A computer-implemented system and process for generating a relationship network is disclosed. The system provides a set of data items to be related and generates variable length data vectors to represent the relationships between the terms within each data item. The system can be used to generate a relationship network for documents, images, or any other type of file. This relationship network can then be queried to discover the relationships between terms within the set of data items.

  18. Determining Allocation Requirements | Argonne Leadership Computing Facility

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Allocation Management Determining Allocation Requirements Querying Allocations Using cbank Mira/Cetus/Vesta Cooley Policies Documentation Feedback Please provide feedback to help guide us as we continue to build documentation for our new computing resource. [Feedback Form] Determining Allocation Requirements Estimating CPU-Hours for ALCF Blue Gene/Q Systems When estimating CPU-hours for the ALCF Blue Gene/Q systems, it is important to take into consideration the unique aspects of the Blue Gene

  19. NREL: International Activities - Philippines Wind Resource Maps and Data

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    A map depicting wind resources at 100 meters for the Republic of the Philippines. Additional Resources: Wind Prospector, a web-based GIS application designed to support resource assessment and data exploration associated with wind development. Philippines Wind Viewer Tutorial: Learn how to navigate, display, query and download Philippines data in the Wind Prospector. Philippines Geospatial Toolkit EXE 926.5 MB. Philippines Wind Resource Maps and Data: In 2014, under the Enhancing Capacity for Low

  20. Electric Sales, Revenue, and Average Price 2011 - Energy Information

    U.S. Energy Information Administration (EIA) Indexed Site

    Administration Electricity Glossary › FAQS › Overview Data Electricity Data Browser (interactive query tool with charting & mapping) Summary Sales (consumption), revenue, prices & customers Generation and thermal output Electric power plants generating capacity Consumption of fuels used to generate electricity Receipts of fossil-fuels for electricity generation Average cost of fossil-fuels for electricity generation Fossil-fuel stocks for electricity generation Revenue and

  1. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond data warehouse Topic How to Integrate Anything on the Web by Dr. Walt Warnick 03 Aug, 2011 in Technology Computer Integration OSTI is especially proud of its web integration work whereby we take multiple web pages, documents, and web databases and make them appear to the user as if they were an integrated whole. Once the sources are virtually integrated by OSTI, the virtual collection becomes searchable via a single query. Because

  2. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond integration Topic How to Integrate Anything on the Web by Dr. Walt Warnick 03 Aug, 2011 in Technology Computer Integration OSTI is especially proud of its web integration work whereby we take multiple web pages, documents, and web databases and make them appear to the user as if they were an integrated whole. Once the sources are virtually integrated by OSTI, the virtual collection becomes searchable via a single query. Because

  3. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond technical Topic How to Integrate Anything on the Web by Dr. Walt Warnick 03 Aug, 2011 in Technology Computer Integration OSTI is especially proud of its web integration work whereby we take multiple web pages, documents, and web databases and make them appear to the user as if they were an integrated whole. Once the sources are virtually integrated by OSTI, the virtual collection becomes searchable via a single query. Because

  4. UNITED STATES ATOMIC ENERGY COMMISSION SAC200063~~0oooo Frank K. Pittman, Director, /Division of Waste Management and Trans-

    Office of Legacy Management (LM)

    SAC200063~~0oooo. Frank K. Pittman, Director, Division of Waste Management and Transportation, Headquarters. CONTAMINATED EX-AEC-OWNED OR LEASED FACILITIES. This memorandum responds to your TWX dated October 30, 1973, requesting certain information on the above subject. Unfortunately, some of the documentation necessary to answer your queries is no longer available due to the records disposal program or the agreements prevailing at the time of release or transfer of the facilities. From

  5. Automated Nuclear Data Test Suite

    Energy Science and Technology Software Center (OSTI)

    2013-01-09

    Provides python routines to create a database of test problems in a user-defined directory tree, to query the database using user-defined parameters, to generate a list of test runs, and to automatically run them with user-defined particle transport codes. Includes natural isotope abundance data, and a table of benchmark effective multiplication factors for fast critical assemblies. Does not include input decks, cross-section libraries, or particle transport codes.

  6. Web Support

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Web Support We want to be able to respond promptly to your queries. To expedite our response, please check the specific website or page in question for the name of the appropriate person to contact. Contact our Public Affairs Department for comments about content here. Other comments about the main Lab website, www.lbl.gov, can be directed to webs...@lbl.gov. Also see privacy, security, copyright, and disclaimer information.

  7. Compact Mesh Generator

    Energy Science and Technology Software Center (OSTI)

    2007-02-02

    The CMG is a small, lightweight, structured mesh generation code. It features a simple text input parser that allows setup of various meshes via a small set of text commands. Mesh generation data can be output to text, the silo file format, or the API can be directly queried by applications. It can run serially or in parallel via MPI. The CMG includes the ability to specify various initial conditions on a mesh via mesh tags.

  8. Accounting - What happened with that job?

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Accounting - What happened with that job? On genepool there are three options for accessing information on your past jobs: the genepool completed jobs webpage (genepool only), the UGE provided tool qacct (genepool or phoebe), and the NERSC provided tool qqacct - Query Queue Accounting data (genepool or phoebe). Every time a job is completed - either failed or successful - the UGE batch system writes an entry into its accounting logs. These accounting logs contain a

  9. Allocation Management | Argonne Leadership Computing Facility

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Allocation Management Determining Allocation Requirements Querying Allocations Using cbank Mira/Cetus/Vesta Cooley Policies Documentation Feedback Please provide feedback to help guide us as we continue to build documentation for our new computing resource. [Feedback Form] Allocation Management Allocations require management - balance checks, resource allocation, requesting more time, etc. Checking for an active allocation To determine if there is an active allocation, check Running Jobs. For

  10. Decontamination and Decommisioning Equipment Tracking System

    Energy Science and Technology Software Center (OSTI)

    1994-08-26

    DDETS is a Relational Database Management System (RDBMS) which incorporates 1-D (code 39) and 2-D (PDF417) bar codes into its equipment tracking capabilities. DDETS is compatible with the Reportable Excess Automated Property System (REAPS), and has add, edit, delete and query capabilities for tracking equipment being decontaminated and decommissioned. In addition, bar code technology is utilized in the inventory tracking and shipping of equipment.

  11. Test program element II blanket and shield thermal-hydraulic and thermomechanical testing, experimental facility survey

    SciTech Connect (OSTI)

    Ware, A.G.; Longhurst, G.R.

    1981-12-01

    This report presents results of a survey conducted by EG and G Idaho to determine facilities available to conduct thermal-hydraulic and thermomechanical testing for the Department of Energy Office of Fusion Energy First Wall/Blanket/Shield Engineering Test Program. In response to EG and G queries, twelve organizations (in addition to EG and G and General Atomic) expressed interest in providing experimental facilities. A variety of methods of supplying heat is available.

  12. AmiGO: online access to ontology and annotation data

    SciTech Connect (OSTI)

    Carbon, Seth; Ireland, Amelia; Mungall, Christopher J.; Shu, ShengQiang; Marshall, Brad; Lewis, Suzanna

    2009-01-15

    AmiGO is a web application that allows users to query, browse, and visualize ontologies and related gene product annotation (association) data. AmiGO can be used online at the Gene Ontology (GO) website to access the data provided by the GO Consortium; it can also be downloaded and installed to browse local ontologies and annotations. AmiGO is free open source software developed and maintained by the GO Consortium.

  13. Nuclear Fuel Cycle Reasoner: PNNL FY13 Report

    SciTech Connect (OSTI)

    Hohimer, Ryan E.; Strasburg, Jana D.

    2013-09-30

    In Fiscal Year 2012 (FY12) PNNL implemented a formal reasoning framework and applied it to a specific challenge in nuclear nonproliferation. The Semantic Nonproliferation Analysis Platform (SNAP) was developed as a preliminary graphical user interface to demonstrate the potential power of the underlying semantic technologies to analyze and explore facts and relationships relating to the nuclear fuel cycle (NFC). In Fiscal Year 2013 (FY13) the SNAP demonstration was enhanced with respect to query and navigation usability issues.

  14. Large-Scale Geospatial Indexing for Image-Based Retrieval and Analysis

    SciTech Connect (OSTI)

    Tobin Jr, Kenneth William; Bhaduri, Budhendra L; Bright, Eddie A; Cheriydat, Anil; Karnowski, Thomas Paul; Palathingal, Paul J; Potok, Thomas E; Price, Jeffery R

    2005-12-01

    We describe a method for indexing and retrieving high-resolution image regions in large geospatial data libraries. An automated feature extraction method is used that generates a unique and specific structural description of each segment of a tessellated input image file. These tessellated regions are then merged into similar groups and indexed to provide flexible and varied retrieval in a query-by-example environment.

  15. Annual Coal Distribution Report - Energy Information Administration

    U.S. Energy Information Administration (EIA) Indexed Site

    Coal Glossary › FAQS › Overview Data Coal Data Browser (interactive query tool with charting and mapping) Summary Prices Reserves Consumption Production Stocks Imports, exports & distribution Coal-fired electric power plants Transportation costs to electric power sector International All coal data reports Analysis & Projections Major Topics Most popular Consumption Environment Imports & exports Industry characteristics Prices Production Projections Recurring Reserves Stocks All

  16. Contact DMSE | The Ames Laboratory

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Contact DMSE, Division of Materials Sciences and Engineering. Director: Matthew Kramer, 125 Metals Development, mjkramer@ameslab.gov. Business Manager: Susan Elsner, 126 Metals Development, elsner@ameslab.gov. General Inquiries: Julie Dredla, 125 Metals Development, jdredla@ameslab.gov. Web Queries: Sarah Wiley, 305 TASF, swiley@ameslab.gov. Alisa Sivils, Administrative Specialist II, 515-294-5011, 107 MD: MPC Cost Center Coordination, E-Beam Cost Center Coordination, FWP Budget Oversight. Bev Carstensen, Secretary II

  17. AIA 2030 Commitment Portal | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    AIA 2030 Commitment Portal AIA 2030 Commitment Portal The Design Data Exchange (DDx) lets 2030 Commitment firms track their own projects and report them to the AIA. It also lets firms query the entire 2030 database to learn about the performance of different types of projects. Firms can see their own projects superimposed on those of other firms. Whereas their own projects can be expanded, those of other firms remain anonymized. This research screen focuses on LEED office buildings under 500,000

  18. At the intersection of past and future-The Lab's archives

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    At the intersection of past and future - the Lab's archives (Community Connections, Los Alamos National Laboratory). The Archives staff typically handle about 60 requests a month for everything from Freedom of Information Act queries to calls from journalists and television producers. January 1, 2013.

  19. Temporal Representation in Semantic Graphs

    SciTech Connect (OSTI)

    Levandoski, J J; Abdulla, G M

    2007-08-07

    A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.

  20. Sandia Cognitive Aide V2.0

    Energy Science and Technology Software Center (OSTI)

    2004-04-15

    The Sandia Cognitive Aide (SCA) collects data from personal computer use and uses this information to make suggestions to the user. It records interactions with MS Outlook, MS Word, MS PowerPoint, and the Internet Explorer, indexing email messages, documents, presentations, and web pages accessed. The user can then query the indexed documents from any Windows application. The system also suggests what it believes to be relevant terms to a given query. The software provides facilities for constructing and submitting queries to WWW search engines. This version of the software also enables the user to define different "task contexts" within which the user works. The contexts are defined in terms of related terms. The user can associate documents with these contexts. The contexts can be searched as well as the documents. This software is designed to access and utilize the cognitive model being built by Sandia National Laboratories, org. 15311 and uses the STANLEY text analysis library.

  1. Reliability Availability Serviceability

    Energy Science and Technology Software Center (OSTI)

    2006-09-18

    Our work is aimed at providing a data store for system-level events and presenting a flexible query interface to those events. The work extends the functionality provided by the open source Request Tracker (RT) (http://www.bestpractical.com/rt) project with the Asset Tracker (AT) addon (http://atwiki.chaka.net). We have developed an Event Tracker add-on to RT and an interface for gathering, dispatching, and inserting system events into Event Tracker. Data sources include data from all components of the system. Data is initially sent to a defined set of data filters. The data filters are capable of discarding specified data, throttling input, handling context-sensitive input, passing data through an external shell pipe command, and compressing multiple data entries into a single event. The filters then pass the data on to an event dispatch engine. The dispatcher can print events to the screen as they happen, track them in the database, forward them on, or pass them on to an external command. By collecting all of the data into a single database, we are able to leverage the Query Builder interface supplied by RT to create, save, and restore almost any kind of query imaginable.

  2. Efficient Analysis of Live and Historical Streaming Data and itsApplication to Cybersecurity

    SciTech Connect (OSTI)

    Reiss, Frederick; Stockinger, Kurt; Wu, Kesheng; Shoshani, Arie; Hellerstein, Joseph M.

    2007-04-06

    Applications that query data streams in order to identify trends, patterns, or anomalies can often benefit from comparing the live stream data with archived historical stream data. However, searching this historical data in real time has been considered so far to be prohibitively expensive. One of the main bottlenecks is the update costs of the indices over the archived data. In this paper, we address this problem by using our highly-efficient bitmap indexing technology (called FastBit) and demonstrate that the index update operations are sufficiently efficient for this bottleneck to be removed. We describe our prototype system based on the TelegraphCQ streaming query processor and the FastBit bitmap index. We present a detailed performance evaluation of our system using a complex query workload for analyzing real network traffic data. The combined system uses TelegraphCQ to analyze streams of traffic information and FastBit to correlate current behaviors with historical trends. We demonstrate that our system can simultaneously analyze (1) live streams with high data rates and (2) a large repository of historical stream data.
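
    The core trick - one bit vector per distinct value, so a historical query becomes bitwise OR/AND rather than a scan - can be shown in a few lines. This toy version ignores the compression that makes FastBit practical.

    ```python
    # Toy bitmap index: one bitmap (packed into a Python int) per value; a
    # range query ORs the qualifying bitmaps instead of scanning rows.
    from collections import defaultdict

    def build_bitmap_index(values):
        index = defaultdict(int)
        for row, v in enumerate(values):
            index[v] |= 1 << row           # set this row's bit in v's bitmap
        return index

    def query_range(index, lo, hi):
        bitmap = 0
        for v, bits in index.items():
            if lo <= v <= hi:              # OR together qualifying bitmaps
                bitmap |= bits
        return [r for r in range(bitmap.bit_length()) if bitmap >> r & 1]

    ports = [80, 22, 443, 80, 8080, 22]    # one value per archived flow record
    idx = build_bitmap_index(ports)
    print(query_range(idx, 0, 1023))       # rows whose port is well-known
    ```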

  3. A Metadata-Rich File System

    SciTech Connect (OSTI)

    Ames, S; Gokhale, M B; Maltzahn, C

    2009-01-07

    Despite continual improvements in the performance and reliability of large scale file systems, the management of file system metadata has changed little in the past decade. The mismatch between the size and complexity of large scale data stores and their ability to organize and query their metadata has led to a de facto standard in which raw data is stored in traditional file systems, while related, application-specific metadata is stored in relational databases. This separation of data and metadata requires considerable effort to maintain consistency and can result in complex, slow, and inflexible system operation. To address these problems, we have developed the Quasar File System (QFS), a metadata-rich file system in which files, metadata, and file relationships are all first class objects. In contrast to hierarchical file systems and relational databases, QFS defines a graph data model composed of files and their relationships. QFS includes Quasar, an XPATH-extended query language for searching the file system. Results from our QFS prototype show the effectiveness of this approach. Compared to the de facto standard, the QFS prototype shows superior ingest performance and comparable query performance on user metadata-intensive operations and superior performance on normal file metadata operations.
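
    The graph data model can be pictured with a small sketch in which files are attributed nodes and relationships are typed edges; the traversal below is written in plain networkx, not in the Quasar query syntax itself.

    ```python
    # Illustrative model of files and relationships as first-class graph
    # objects; node/edge attributes and names are invented for the example.
    import networkx as nx

    g = nx.DiGraph()
    g.add_node("run42.h5", kind="dataset", owner="alice")
    g.add_node("plot.png", kind="image", owner="alice")
    g.add_edge("plot.png", "run42.h5", rel="derived-from")

    # "Find every file derived from a dataset owned by alice."
    hits = [src for src, dst, d in g.edges(data=True)
            if d["rel"] == "derived-from"
            and g.nodes[dst]["kind"] == "dataset"
            and g.nodes[dst]["owner"] == "alice"]
    print(hits)                            # ['plot.png']
    ```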

  4. Expediting Scientific Data Analysis with Reorganization of Data

    SciTech Connect (OSTI)

    Byna, Surendra; Wu, Kesheng

    2013-08-19

    Data producers typically optimize the layout of data files to minimize the write time. In most cases, data analysis tasks read these files in access patterns different from the write patterns causing poor read performance. In this paper, we introduce Scientific Data Services (SDS), a framework for bridging the performance gap between writing and reading scientific data. SDS reorganizes data to match the read patterns of analysis tasks and enables transparent data reads from the reorganized data. We implemented a HDF5 Virtual Object Layer (VOL) plugin to redirect the HDF5 dataset read calls to the reorganized data. To demonstrate the effectiveness of SDS, we applied two parallel data organization techniques: a sort-based organization on a plasma physics data and a transpose-based organization on mass spectrometry imaging data. We also extended the HDF5 data access API to allow selection of data based on their values through a query interface, called SDS Query. We evaluated the execution time in accessing various subsets of data through existing HDF5 Read API and SDS Query. We showed that reading the reorganized data using SDS is up to 55X faster than reading the original data.
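
    The value-based selection that SDS Query provides can be approximated directly against h5py for illustration; the file, dataset names, and threshold below are assumptions, and SDS itself would answer this transparently from the reorganized copy rather than by a full read.

    ```python
    # Sketch of a value-based subset read over HDF5 (h5py); names are assumed.
    import h5py
    import numpy as np

    with h5py.File("plasma.h5", "r") as f:
        energy = f["/particles/energy"][...]       # read the filter column
        hot = np.nonzero(energy > 1.0e3)[0]        # rows matching the predicate
        selected = f["/particles/position"][hot]   # index-list read of matches
    print(selected.shape)
    ```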

  5. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    SciTech Connect (OSTI)

    Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had the experimental evidences supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high likelihood candidates for functional genomics in relation to cell wall biosynthesis.
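
    The bait-and-expand step is essentially repeated neighborhood growth over the co-expression network; the sketch below shows that pattern with an invented edge list (the 121 to 548 to 694 growth above corresponds to two such rounds).

    ```python
    # Sketch of bait-gene expansion over a co-expression network (networkx).
    import networkx as nx

    def expand(bait, coexpression_edges, rounds=2):
        g = nx.Graph(coexpression_edges)
        genes = set(bait)
        for _ in range(rounds):            # each round adds network neighbors
            genes |= {n for b in list(genes) if b in g for n in g[b]}
        return genes

    edges = [("CESA7", "CESA8"), ("CESA8", "IRX9"), ("IRX9", "IRX10")]
    print(sorted(expand({"CESA7"}, edges)))
    ```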

  6. Data Management Architectures

    SciTech Connect (OSTI)

    Critchlow, Terence J.; Abdulla, Ghaleb; Becla, Jacek; Kleese van Dam, Kerstin; Lang, Sam; McGuinness, Deborah L.

    2012-10-31

    Data management is the organization of information to support efficient access and analysis. For data intensive computing applications, the speed at which relevant data can be accessed is a limiting factor in terms of the size and complexity of computation that can be performed. Data access speed is impacted by the size of the relevant subset of the data, the complexity of the query used to define it, and the layout of the data relative to the query. As the underlying data sets become increasingly complex, the questions asked of them become more involved as well. For example, geospatial data associated with a city is no longer limited to the map data representing its streets, but now also includes layers identifying utility lines, key points, locations and types of businesses within the city limits, tax information for each land parcel, satellite imagery, and possibly even street-level views. As a result, queries have gone from simple questions, such as "how long is Main Street?", to much more complex questions such as "taking all other factors into consideration, are the property values of houses near parks higher than those under power lines, and if so, by what percentage?" Answering these questions requires a coherent infrastructure, integrating the relevant data into a format optimized for the questions being asked.

  7. Cosmetic Outcomes and Complications Reported by Patients Having Undergone Breast-Conserving Treatment

    SciTech Connect (OSTI)

    Hill-Kayser, Christine E.; Vachani, Carolyn; Hampshire, Margaret K.; Di Lullo, Gloria A.; Metz, James M.

    2012-07-01

    Purpose: Over the past 30 years, much work in treatment of breast cancer has contributed to improvement of cosmetic and functional outcomes. The goal of breast-conservation treatment (BCT) is avoidance of mastectomy through use of lumpectomy and adjuvant radiation. Modern data demonstrate 'excellent' or 'good' cosmesis in >90% of patients treated with BCT. Methods and Materials: Patient-reported data were gathered via a convenience sample frame from breast cancer survivors using a publically available, free, Internet-based tool for creation of survivorship care plans. During use of the tool, breast cancer survivors are queried as to the cosmetic appearance of the treated breast, as well as perceived late effects. All data have been maintained anonymously with internal review board approval. Results: Three hundred fifty-four breast cancer survivors having undergone BCT and voluntarily using this tool were queried with regard to breast cosmesis and perceived late effects. Median diagnosis age was 48 years, and median current age 52 years. 'Excellent' cosmesis was reported by 27% (n = 88), 'Good' by 44% (n = 144), 'Fair' by 24% (n = 81), and 'Poor' by 5% (n = 18). Of the queries posted to survivors after BCT, late effects most commonly reported were cognitive changes (62%); sexual concerns (52%); changes in texture and color of irradiated skin (48%); chronic pain, numbness, or tingling (35%); and loss of flexibility in the irradiated area (30%). Survivors also described osteopenia/osteoporosis (35%), cardiopulmonary problems (12%), and lymphedema (19%). Conclusions: This anonymous tool uses a convenience sample frame to gather patient reported assessments of cosmesis and complications after breast cancer. Among the BCT population, cosmetic assessment by survivors appears less likely to be 'excellent' or 'good' than would be expected, with 30% of BCT survivors reporting 'fair' or 'poor' cosmesis. Patient reported incidence of chronic pain, as well as cognitive and sexual changes, also appears higher than expected.

  8. Streamnet; Northwest Aquatic Information Network, 2002 Annual Report.

    SciTech Connect (OSTI)

    Schmidt, Bruce

    2003-02-07

    A primary focus of the StreamNet project in FY-02 was maintenance and update of ongoing data types. Significant progress was made toward updating data for the primary data categories in the StreamNet regional database. Data updates had been slowed in previous years due to the time required for conversion of georeferencing for most data types from the 1:250,000 scale River Reach Number (RRN) system to the 1:100,000 Longitude-Latitude Identifier (LLID) system. In addition, data relating to Protected Areas and Smolt Density Model results, the last data sets in the StreamNet database still in the 1:250,000 RRN format, were converted this year to the LLID system, making them available through the on-line Web Query System. The Protected Areas data were also made available through an on-line interactive mapping application. All routine project activities continued, including project administration at the full project and cooperating project levels, project management through the StreamNet Steering Committee, maintenance of databases and Internet data delivery systems, and providing data related services to the Northwest Power Planning Council's (NWPPC) Fish and Wildlife Program. As part of system management, a new web server was put in operation, significantly improving speed and reliability of Internet data delivery. The web based data query system was modified to utilize ColdFusion, in preparation for a full conversion to ColdFusion from the custom programming in Delphi. This greatly increased flexibility and the ability to modify query system function, correct errors, and develop new query capabilities. All project participants responded to numerous requests for information (data, maps, technical assistance, etc.) throughout the year. A significant accomplishment this year was resolution of long standing differences in how fish distribution is defined and presented. By focusing strictly on definitions related to current distribution (ignoring potential and historic distribution for the time being), all project participants were able to reach agreement. This now makes it possible to update anadromous distribution and habitat use data and to also include resident fish distribution in the regional database. The cooperating projects plan to begin delivering distribution update information beginning in the first quarter of FY-03.

  9. Notices

    National Nuclear Security Administration (NNSA)

    4908 Federal Register / Vol. 77, No. 173 / Thursday, September 6, 2012 / Notices and 214 of the Commission's Regulations (18 CFR 385.211 and 385.214) on or before 5 p.m. Eastern time on the specified comment date. Protests may be considered, but intervention is necessary to become a party to the proceeding. The filings are accessible in the Commission's eLibrary system by clicking on the links or querying the docket number. eFiling is encouraged. More detailed information relating to filing

  10. DOE Research and Development Accomplishments XML Service

    Office of Scientific and Technical Information (OSTI)

    XML Service This XML service is a mechanism for searching the DOE R&D Accomplishments Database, full-text documents, and Web pages, either through a query string in a browser or via a computer application. It is based upon open standards Web protocols and facilitates communications and collaborations of applications and people. Search results are returned in an XML format. This format can be easily parsed, making it simple to add to a federated search. Specifics about the DOE R&D
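
    A query-string call to such a service looks roughly like the sketch below; the endpoint, parameter names, and element names are placeholders, not the documented interface.

    ```python
    # Generic sketch of querying an XML search service over HTTP; the URL,
    # parameters, and XML element names are hypothetical placeholders.
    import requests
    import xml.etree.ElementTree as ET

    resp = requests.get(
        "https://www.osti.gov/example-xml-service",   # hypothetical endpoint
        params={"query": "fusion energy", "rows": 10},
        timeout=30,
    )
    root = ET.fromstring(resp.content)
    for record in root.iter("record"):                # assumed element name
        print(record.findtext("title"))
    ```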

  11. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond March 1, 2006 Science.gov Alerts Help Track Latest Science Information Oak Ridge, TN - The Science.gov Alert Service has been updated to take advantage of the new Science.gov 3.0 query capabilities. The Alert Service tracks the latest information on your science topics of interest and delivers that information to your desktop e-mail each Monday. The Alert Service is free, and registration is available at the Science.gov home page.

  12. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond 7, 2011 You Provide the Search Term, Green Energy Portal Provides the Concepts New Semantic Search Technology plus Auto-complete Gets You a More Direct Line to Rich Scientific Content When you type "solar power" into a search box, are you looking for information on solar farms, solar radiation, or solar electric power plants? The U.S. Department of Energy (DOE) Green Energy portal can now map your keyword query to

  13. In-Situ Microphysics from the RACORO IOP (Dataset) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    In-Situ Microphysics from the RACORO IOP Citation Details In-Document Search Title: In-Situ Microphysics from the RACORO IOP These files were generated by Greg McFarquhar and Robert Jackson at the University of Illinois. Please contact mcfarq@atmos.uiuc.edu or rjackso2@atmos.uiuc.edu for more information or for assistance in interpreting the content of these files. We highly recommend that anyone wishing to use these files do so in a collaborative endeavor and we welcome queries and

  14. Adding Data Management Services to Parallel File Systems

    SciTech Connect (OSTI)

    Brandt, Scott

    2015-03-04

    The objective of this project, called DAMASC for “Data Management in Scientific Computing”, is to coalesce data management with parallel file system management to present a declarative interface to scientists for managing, querying, and analyzing extremely large data sets efficiently and predictably. Managing extremely large data sets is a key challenge of exascale computing. The overhead, energy, and cost of moving massive volumes of data demand designs where computation is close to storage. In current architectures, compute/analysis clusters access data in a physically separate parallel file system and largely leave it to the scientist to reduce data movement. Over the past decades the high-end computing community has adopted middleware with multiple layers of abstractions and specialized file formats such as NetCDF-4 and HDF5. These abstractions provide a limited set of high-level data processing functions, but have inherent functionality and performance limitations: middleware that provides access to the highly structured contents of scientific data files stored in the (unstructured) file systems can only optimize to the extent that file system interfaces permit, and the highly structured formats of these files often impede native file system performance optimizations. We are developing Damasc, an enhanced high-performance file system with native rich data management services. Damasc will enable efficient queries and updates over files stored in their native byte-stream format while retaining the inherent performance of file system data storage via declarative queries and updates over views of underlying files. Damasc has four key benefits for the development of data-intensive scientific code: (1) applications can use important data-management services, such as declarative queries, views, and provenance tracking, that are currently available only within database systems; (2) the use of these services becomes easier, as they are provided within a familiar file-based ecosystem; (3) common optimizations, e.g., indexing and caching, are readily supported across several file formats, avoiding effort duplication; and (4) performance improves significantly, as data processing is integrated more tightly with data storage. Our key contributions are: SciHadoop, which explores changes to MapReduce assumptions by taking advantage of the semantics of structured data while preserving MapReduce’s failure and resource management; DataMods, which extends common abstractions of parallel file systems so they become programmable, can be extended to natively support a variety of data models, and can be hooked into emerging distributed runtimes such as Stanford’s Legion; and Miso, which combines Hadoop and relational data warehousing to minimize time to insight, taking into account the overhead of ingesting data into data warehousing.
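    The declarative-interface idea can be shown in miniature: the caller states which values it wants from a structured scientific file, and a lower layer decides how to read them. A sketch using h5py, with an invented file and dataset name; Damasc itself pushes this work into the file system, which the sketch does not attempt.

```python
# Toy "declarative query over a file view": the caller names a dataset and a
# predicate; the storage layer decides how to read the bytes. File name,
# dataset path, and threshold are invented; this is not the Damasc API.
import h5py

def select_where(path, dataset, threshold):
    """Return values of `dataset` exceeding `threshold`."""
    with h5py.File(path, "r") as f:
        data = f[dataset][...]   # a real system would push the filter down
    return data[data > threshold]

# hot = select_where("simulation.h5", "/fields/temperature", 1000.0)
```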

  15. Sandia Equation of State Model Library

    Energy Science and Technology Software Center (OSTI)

    2013-08-29

    The software provides a general interface for querying thermodynamic states of material models along with implementation of both general and specific equation of state models. In particular, models are provided for the IAPWS-IF97 and IAPWS95 water standards as well as the associated water standards for viscosity, thermal conductivity, and surface tension. The interface supports implementation of models in a variety of independent variable spaces. Also, model support routines are included that allow for coupling of models and determination and representation of phase boundaries.

  16. Relational Blackboard

    Energy Science and Technology Software Center (OSTI)

    2012-09-11

    The Relational Blackboard (RBB) is an extension of the H2 Relational Database to support discrete events and timeseries data. The original motivation for RBB is as a knowledge base for cognitive systems and simulations. It is useful wherever there is a need for persistent storage of timeseries (i.e. samples of a continuous process generating numerical data) and semantic labels for the data. The RBB is an extension to the H2 Relational Database, which is open-source. RBB is a set of stored procedures for H2 allowing data to be labeled, queried, and resampled.

  17. Yushu

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Yushu Yao, SciDB @ NERSC. Array-like science data is more common than you think. SciDB offers parallel processing without parallel programming. Everything in arrays: locate an element in O(constant) time; arrays can be very sparse; best for machine/simulation-generated structured data; good for metadata too. A query-like language with auto-parallelization allows calculations to be done inside the DB. NERSC SciDB Testbed: partner up with science teams - hold their

  18. AHF: Array-Based Half-Facet Data Structure for Mixed-Dimensional and

    Office of Scientific and Technical Information (OSTI)

    Non-Manifold Meshes (Conference) | SciTech Connect Conference: AHF: Array-Based Half-Facet Data Structure for Mixed-Dimensional and Non-Manifold Meshes Citation Details In-Document Search Title: AHF: Array-Based Half-Facet Data Structure for Mixed-Dimensional and Non-Manifold Meshes We present an Array-based Half-Facet mesh data structure, or AHF, for efficient mesh query and modification operations. The AHF extends the compact array-based half-edge and half-face data structures (T.J.

  19. Towards a Relation Extraction Framework for Cyber-Security Concepts

    SciTech Connect (OSTI)

    Jones, Corinne L; Bridges, Robert A; Huffer, Kelly M; Goodall, John R

    2015-01-01

    In order to assist security analysts in obtaining information pertaining to their network, such as novel vulnerabilities, exploits, or patches, information retrieval methods tailored to the security domain are needed. As labeled text data is scarce and expensive, we follow developments in semi-supervised NLP and implement a bootstrapping algorithm for extracting security entities and their relationships from text. The algorithm requires little input data, specifically, a few relations or patterns (heuristics for identifying relations), and incorporates an active learning component which queries the user on the most important decisions to prevent drifting away from the desired relations. Preliminary testing on a small corpus shows promising results, obtaining a precision of .82.

  20. Co-op

    Energy Science and Technology Software Center (OSTI)

    2007-05-25

    Co-op is primarily middleware software, a runtime system for the support of the Cooperative Parallel Programming model. This model is based on using whole SPMD applications as components in a scalable program, having them treat one another as single objects and communicate via remote method invocation. Also included is some application-level software: (1) a metric space database library for managing data items located in an arbitrary metric space and retrieving them based on nearest-neighbor queries; and (2) a Kriging extrapolation library for use in implementing adaptive sampling for generic multiscale simulations.

  1. Method for gathering and summarizing internet information

    DOE Patents [OSTI]

    Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna

    2010-04-06

    A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
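    The similarity relation that drives the tree display can be approximated in a few lines: weight terms by TF-IDF and score document pairs by cosine similarity. This is a generic sketch of that step, not the patented method, and the example documents are invented.

```python
# Generic TF-IDF cosine similarity between documents, the kind of score that
# could drive a similarity-based tree layout. Example documents are invented;
# this is not the patented gathering/summarizing method itself.
import math
from collections import Counter

def tfidf_vectors(docs):
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
            for doc in docs]

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = math.hypot(*u.values()) * math.hypot(*v.values())
    return dot / norm if norm else 0.0

docs = [d.split() for d in ["solar power data", "wind power data",
                            "nuclear safety report"]]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))   # high: shared 'power' and 'data'
print(cosine(vecs[0], vecs[2]))   # zero: no shared terms
```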

  2. Method for gathering and summarizing internet information

    DOE Patents [OSTI]

    Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna

    2008-01-01

    A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.

  3. System for gathering and summarizing internet information

    DOE Patents [OSTI]

    Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna

    2006-07-04

    A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.

  4. Cost and Quality of Fuels for Electric Plants - Energy Information

    Gasoline and Diesel Fuel Update (EIA)

    Administration

    Electricity Data Browser (interactive query tool with charting & mapping). Data: sales (consumption), revenue, prices & customers; generation and thermal output; electric power plant generating capacity; consumption of fuels used to generate electricity; receipts of fossil fuels for electricity generation; average cost of

  5. Navigating nuclear science: Enhancing analysis through visualization

    SciTech Connect (OSTI)

    Irwin, N.H.; Berkel, J. van; Johnson, D.K.; Wylie, B.N.

    1997-09-01

    Data visualization is an emerging technology with high potential for addressing the information overload problem. This project extends the data visualization work of the Navigating Science project by coupling it with more traditional information retrieval methods. A citation-derived landscape was augmented with documents using a text-based similarity measure to show viability of extension into datasets where citation lists do not exist. Landscapes, showing hills where clusters of similar documents occur, can be navigated, manipulated and queried in this environment. The capabilities of this tool provide users with an intuitive explore-by-navigation method not currently available in today's retrieval systems.

  6. Notices

    Energy Savers [EERE]

    Federal Register / Vol. 80, No. 146 / Thursday, July 30, 2015 / Notices. Description: § 4(d) Rate Filing: Negotiated Rates Filing 7-22-2015 to be effective 8/1/2015. Filed Date: 7/22/15. Accession Number: 20150722-5118. Comments Due: 5 p.m. ET 8/3/15. The filings are accessible in the Commission's eLibrary system by clicking on the links or querying the docket number. Any person desiring to intervene or protest in any of the above proceedings must file in accordance with Rules 211 and 214 of

  7. NREL: Energy Analysis - Nick Langle

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Langle Photo of Nick Langle. Nick Langle is a member of the Data Analysis and Visualization Group in the Strategic Energy Analysis Center. Front-End Web Engineer On staff since July 2010 Phone number: 303-275-3775 E-mail: nicholas.langle@nrel.gov Areas of expertise HTML/CSS/Javascript/jQuery UI/UX Responsive web design Semantic MediaWiki SEO Primary research interests Web design and usability Data visualization and analytics Geospatial analysis Transportation and alternative fuels Industrial

  8. In-Situ Microphysics from the RACORO IOP (Dataset) | Data Explorer

    Office of Scientific and Technical Information (OSTI)

    Data Explorer Search Results In-Situ Microphysics from the RACORO IOP Title: In-Situ Microphysics from the RACORO IOP These files were generated by Greg McFarquhar and Robert Jackson at the University of Illinois. Please contact mcfarq@atmos.uiuc.edu or rjackso2@atmos.uiuc.edu for more information or for assistance in interpreting the content of these files. We highly recommend that anyone wishing to use these files do so in a collaborative endeavor and we welcome queries and opportunities for

  9. Science.gov 3.0 Launched | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Science.gov 3.0 Launched Science.gov 3.0 Launched November 15, 2005 - 2:46pm Addthis Offers Increased Precision Searches of Federal Science Database WASHINGTON, DC - The latest version of Science.gov was launched today allowing more refined queries for searches of federal science databases. While Science.gov 3.0 is available to everyone, these improvements will be especially helpful to scientists and information specialists in their searches. "In these wonderful times for science, the tools

  10. UNITED STATES ATOMIC ENERGY COMMISSION

    Office of Legacy Management (LM)

    lLB"O"L"P"E OPC"AT10*s OCFlCC ..a .0x s.00 ALSUOULIQUL. "6" YLXICO "98s Nov 28 1973, Frank K. Pittmsn, Director, 'Division of Waste Management and Trans- portation, Headquarters CONTAMINATED KK-AEC-OWNED OR LEASED FACILITIES This memorandum responds to your TWK dat.ed October 30, 1973, requesting certain information on the above subject. Unfortunately, same of the documentation necessary to answer your queries is no longer available due to the records

  11. Using Web and Social Media for Influenza Surveillance

    SciTech Connect (OSTI)

    Corley, Courtney D.; Cook, Diane; Mikler, Armin R.; Singh, Karan P.

    2010-01-04

    Analysis of Google influenza-like-illness (ILI) search queries has shown a strongly correlated pattern with Centers for Disease Control and Prevention (CDC) seasonal ILI reporting data. Web and social media provide another resource to detect increases in ILI. This paper evaluates trends in blog posts that discuss influenza. Our key finding is that from 5 October 2008 to 31 January 2009 a high correlation exists between the frequency of posts, containing influenza keywords, per week and CDC influenza-like-illness surveillance data.
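    The reported relationship is simple to compute: a Pearson correlation between weekly counts of influenza-keyword posts and the CDC ILI series. A sketch with invented numbers:

```python
# Pearson correlation between weekly flu-keyword post counts and CDC ILI
# surveillance values. All numbers below are invented placeholders.
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

weekly_posts = [12, 18, 25, 40, 61, 80]      # posts with flu keywords
cdc_ili = [1.1, 1.4, 2.0, 3.2, 4.9, 6.0]     # percent ILI visits
print(round(pearson(weekly_posts, cdc_ili), 3))
```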

  12. StreamWorks - A system for Dynamic Graph Search

    SciTech Connect (OSTI)

    Choudhury, Sutanay; Holder, Larry; Chin, George; Ray, Abhik; Beus, Sherman J.; Feo, John T.

    2013-06-11

    Acting on time-critical events by processing ever growing social media, news or cyber data streams is a major technical challenge. Many of these data sources can be modeled as multi-relational graphs. Mining and searching for subgraph patterns in a continuous setting requires an efficient approach to incremental graph search. The goal of our work is to enable real-time search capabilities for graph databases. This demonstration will present a dynamic graph query system that leverages the structural and semantic characteristics of the underlying multi-relational graph.

  13. Construction of file database management

    SciTech Connect (OSTI)

    MERRILL, KYLE J.

    2000-03-01

    This work created a database for tracking data-analysis files from multiple lab techniques and equipment stored on a central file server. Experimental details appropriate for each file type are pulled from the file header and stored in a searchable database. The database also stores the specific location and directory structure for each data file. Queries can be run on the database by file type, sample type, or other experimental parameters. The database was constructed in Microsoft Access, and Visual Basic was used for extraction of information from the file header.
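    The pattern, header metadata plus file location in one searchable table, is easy to reproduce outside Access; here is a sketch with Python's sqlite3, with invented column names standing in for the original schema.

```python
# Sketch of a searchable catalog of data files: header fields and location go
# into one table, queryable by file type or sample type. Column names are
# invented; the original system used Microsoft Access and Visual Basic.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE datafile (
                    path TEXT PRIMARY KEY,
                    file_type TEXT,
                    sample_type TEXT,
                    acquired TEXT)""")
conn.execute("INSERT INTO datafile VALUES (?, ?, ?, ?)",
             ("/server/xrd/run042.raw", "XRD", "thin film", "2000-02-14"))

rows = conn.execute(
    "SELECT path FROM datafile WHERE file_type=? AND sample_type=?",
    ("XRD", "thin film")).fetchall()
print(rows)
```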

  14. Geospatial Toolkits and Resource Maps for Selected Countries from the National Renewable Energy Laboratory (NREL)

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    NREL developed the Geospatial Toolkit (GsT), a map-based software application that integrates resource data and geographic information systems (GIS) for integrated resource assessment. A variety of agencies within countries, along with global datasets, provided country-specific data. Originally developed in 2005, the Geospatial Toolkit was completely redesigned and re-released in November 2010 to provide a more modern, easier-to-use interface with considerably faster analytical querying capabilities. Toolkits are available for 21 countries and each one can be downloaded separately. The source code for the toolkit is also available. [Taken and edited from http://www.nrel.gov/international/geospatial_toolkits.html]

  15. Feature Based Tolerancing Product Modeling V4.1

    Energy Science and Technology Software Center (OSTI)

    2001-11-30

    FBTol is a component technology in the form of a linkable software library. The purpose of FBTol is to augment the shape of a nominal solid model with an explicit representation of a product’s tolerances and other non-shape attributes. This representation enforces a complete and unambiguous definition of non-shape information, permits an open architecture to dynamically create, modify, delete, and query tolerance information, and incorporates verification and checking algorithms to assure the quality of the tolerance design.

  16. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond Naming the First World Wide Science Gateway by Kristin Bingham on Fri, Nov 16, 2007 In 2005, the idea of creating a global science gateway for the web was conceived at OSTI. It would make the best collections of scientific information from nations around the world act as if they were a single enormous collection. It would be searchable via a single query, and it would be available at no cost to anyone anywhere with web access. In the

  17. OSTI, US Dept of Energy, Office of Scientific and Technical Information |

    Office of Scientific and Technical Information (OSTI)

    Speeding access to science information from DOE and Beyond. Science.gov's Unique Collaboration, by Valerie Allen on Mon, Sep 14, 2009. Science.gov is a one-stop portal for federal government science information. Over 200 million pages of science information from 14 federal agencies may be searched through a single query. How far we have come in the past decade! You may not be aware that Science.gov was developed and is governed by the Science.gov Alliance, a group of science information managers who

  18. USGS Annual Water Data Reports

    SciTech Connect (OSTI)

    2012-04-01

    Water resources data are published annually for use by engineers, scientists, managers, educators, and the general public. These archival products supplement direct access to current and historical water data provided by the National Water Information System (NWIS). Beginning with Water Year 2006, annual water data reports are available as individual electronic Site Data Sheets for the entire Nation for retrieval, download, and localized printing on demand. National distribution includes tabular and map interfaces for search, query, display and download of data. Data provided include extreme and mean discharge rates.

  19. Interoperable PKI Data Distribution in Computational Grids

    SciTech Connect (OSTI)

    Pala, Massimiliano; Cholia, Shreyas; Rea, Scott A.; Smith, Sean W.

    2008-07-25

    One of the most successful working examples of virtual organizations, computational grids need authentication mechanisms that inter-operate across domain boundaries. Public Key Infrastructures (PKIs) provide sufficient flexibility to allow resource managers to securely grant access to their systems in such distributed environments. However, as PKIs grow and services are added to enhance both security and usability, users and applications must struggle to discover available resources, particularly when the Certification Authority (CA) is alien to the relying party. This article presents how to overcome these limitations of the current grid authentication model by integrating the PKI Resource Query Protocol (PRQP) into the Grid Security Infrastructure (GSI).

  20. Automated Feature Generation in Large-Scale Geospatial Libraries for Content-Based Indexing.

    SciTech Connect (OSTI)

    Tobin Jr, Kenneth William; Bhaduri, Budhendra L; Bright, Eddie A; Cheriydat, Anil; Karnowski, Thomas Paul; Palathingal, Paul J; Potok, Thomas E; Price, Jeffery R

    2006-05-01

    We describe a method for indexing and retrieving high-resolution image regions in large geospatial data libraries. An automated feature extraction method is used that generates a unique and specific structural description of each segment of a tessellated input image file. These tessellated regions are then merged into similar groups, or sub-regions, and indexed to provide flexible and varied retrieval in a query-by-example environment. The methods of tessellation, feature extraction, sub-region clustering, indexing, and retrieval are described and demonstrated using a geospatial library representing a 153 km2 region of land in East Tennessee at 0.5 m per pixel resolution.

  1. DOE - NNSA/NFO -- Search

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Search NNSA/NFO Language Options U.S. DOE/NNSA - Nevada Field Office Search this Website To enter a query, type in a few descriptive words in the Find Results boxes and press the Enter key or click the Search button for a list of relevant results. Find Results with all of the words with the exact phrase with at least one of the words Occurrences anywhere in the page in the title of the page in the URL of the page Return results where my terms occur Sort Sort by relevance Sort by date Search

  2. Second Line of Defense Master Spares Catalog

    SciTech Connect (OSTI)

    Henderson, Dale L.; Muller, George; Mercier, Theresa M.; Brigantic, Robert T.; Perkins, Casey J.; Cooley, Scott K.

    2012-11-20

    This catalog is intended to be a comprehensive listing of repair parts, components, kits, and consumable items used on the equipment deployed at SLD sites worldwide. The catalog covers detection, CAS, network, ancillary equipment, and tools. The catalog is backed by a Master Parts Database which is used to generate the standard report views of the catalog. The master parts database is a relational database containing a record for every part in the master parts catalog along with supporting tables for normalizing fields in the records. The database also includes supporting queries, database maintenance forms, and reports.

  3. The Configuration Space Toolkit (C-Space Toolkit or CSTK) Ver. 2.5 beta

    Energy Science and Technology Software Center (OSTI)

    2010-02-24

    The C-Space Toolkit provides a software library that makes it easier to program motion planning, simulation, robotics, and virtual reality codes using the Configuration Space abstraction. Key functionality (1) enables the user to create special representations of movable and stationary rigid geometric objects, and (2) performs fast distance, interference (clash) detection, collision detection, closest-feature-pair, and contact queries in terms of object configuration. Not only can queries be computed at any given point in configuration space, but they can be done exactly over linear-translational path segments and approximately for rotational path segments. Interference detection and distance computations can be done with respect to the Minkowski sum of the original geometry and a piece of convex geometry. The Toolkit takes as raw model input (1) collections of convex polygons that form the boundaries of models and (2) convex polyhedra, cones, cylinders, and discs that are models and model components. Configurations are given in terms of homogeneous transforms. A simple OpenGL-based system for displaying and animating the geometric objects is included in the implementation. This version, 2.5 Beta, incorporates feature additions and enhancements, improvements in algorithms, improved robustness, bug fixes and cleaned-up source code, better compliance with standards and recent programming conventions, changes to the build process for the software, support for more recent hardware and software platforms, and improvements to documentation and source-code comments.

  4. 'Big Data' Collaboration: Exploring, Recording and Sharing Enterprise Knowledge

    SciTech Connect (OSTI)

    Sukumar, Sreenivas R; Ferrell, Regina Kay

    2013-01-01

    As data sources and data size proliferate, knowledge discovery from "Big Data" is starting to pose several challenges. In this paper, we address a specific challenge in the practice of enterprise knowledge management while extracting actionable nuggets from diverse data sources of seemingly-related information. In particular, we address the challenge of archiving knowledge gained through collaboration, dissemination and visualization as part of the data analysis, inference and decision-making lifecycle. We motivate the implementation of an enterprise data-discovery and knowledge recorder tool, called SEEKER, based on a real-world case study. We demonstrate SEEKER capturing schema and data-element relationships, tracking the data elements of value based on the queries and the analytical artifacts that are being created by analysts as they use the data. We show how the tool serves as a digital record of institutional domain knowledge and documentation of the evolution of data elements, queries and schemas over time. As a knowledge management service, a tool like SEEKER saves enterprise resources and time by avoiding analytic silos, expediting the process of multi-source data integration and intelligently documenting discoveries from fellow analysts.

  5. Coherent Image Layout using an Adaptive Visual Vocabulary

    SciTech Connect (OSTI)

    Dillard, Scott E.; Henry, Michael J.; Bohn, Shawn J.; Gosink, Luke J.

    2013-03-06

    When querying a huge image database containing millions of images, the result of the query may still contain many thousands of images that need to be presented to the user. We consider the problem of arranging such a large set of images into a visually coherent layout, one that places similar images next to each other. Image similarity is determined using a bag-of-features model, and the layout is constructed from a hierarchical clustering of the image set by mapping an in-order traversal of the hierarchy tree into a space-filling curve. This layout method provides strong locality guarantees so we are able to quantitatively evaluate performance using standard image retrieval benchmarks. Performance of the bag-of-features method is best when the vocabulary is learned on the image set being clustered. Because learning a large, discriminative vocabulary is a computationally demanding task, we present a novel method for efficiently adapting a generic visual vocabulary to a particular dataset. We evaluate our clustering and vocabulary adaptation methods on a variety of image datasets and show that adapting a generic vocabulary to a particular set of images improves performance on both hierarchical clustering and image retrieval tasks.
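    The layout step can be sketched compactly: cluster the feature vectors hierarchically, then read the leaves off in order so similar images land next to each other. Random vectors below stand in for bag-of-features histograms, and the space-filling-curve mapping is reduced to a 1-D leaf ordering.

```python
# Order images by the leaf sequence of a hierarchical clustering so similar
# images are adjacent. Random vectors stand in for bag-of-features histograms;
# the paper's space-filling-curve step is simplified to this 1-D ordering.
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

rng = np.random.default_rng(0)
features = rng.random((10, 64))      # 10 images, 64-bin feature histograms

Z = linkage(features, method="average")
order = leaves_list(Z)               # in-order traversal of the hierarchy
print("layout order:", order)
```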

  6. Hanford Environmental Information System (HEIS). Volume 1, User's guide

    SciTech Connect (OSTI)

    Not Available

    1994-01-14

    The Hanford Environmental Information System (HEIS) is a consolidated set of automated resources that effectively manage the data gathered during environmental monitoring and restoration of the Hanford Site. HEIS includes an integrated database that provides consistent and current data to all users and promotes sharing of data by the entire user community. HEIS is an information system with an inclusive database. Although the database is the nucleus of the system, HEIS also provides user access software: query-by-form data entry, extraction, and browsing facilities; menu-driven reporting facilities; an ad hoc query facility; and a geographic information system (GIS). These features, with the exception of the GIS, are described in this manual set. Because HEIS contains data from the entire Hanford Site, many varieties of data are included and have been divided into subject areas. Related subject areas comprise several volumes of the manual set. The manual set includes a data dictionary that lists all of the fields in the HEIS database, with their definitions and a cross reference of their locations in the database; definitions of data qualifiers for analytical results; and a mapping between the HEIS software functions and the keyboard keys for each of the supported terminals or terminal emulators.

  7. Rapid Exploitation and Analysis of Documents

    SciTech Connect (OSTI)

    Buttler, D J; Andrzejewski, D; Stevens, K D; Anastasiu, D; Gao, B

    2011-11-28

    Analysts are overwhelmed with information. They have large archives of historical data, both structured and unstructured, and continuous streams of relevant messages and documents that they need to match to current tasks, digest, and incorporate into their analysis. The purpose of the READ project is to develop technologies to make it easier to catalog, classify, and locate relevant information. We approached this task from multiple angles. First, we tackle the issue of processing large quantities of information in reasonable time. Second, we provide mechanisms that allow users to customize their queries based on latent topics exposed from corpus statistics. Third, we assist users in organizing query results, adding localized expert structure over results. Fourth, we use word sense disambiguation techniques to increase the precision of matching user-generated keyword lists with terms and concepts in the corpus. Fifth, we enhance co-occurrence statistics with latent topic attribution, to aid entity relationship discovery. Finally we quantitatively analyze the quality of three popular latent modeling techniques to examine under which circumstances each is useful.

  8. POSet Ontology Categorizer

    Energy Science and Technology Software Center (OSTI)

    2005-03-01

    POSet Ontology Categorizer (POSOC) V1.0. The POSet Ontology Categorizer (POSOC) software package provides tools for creating and mining poset-structured ontologies, such as the Gene Ontology (GO). Given a list of weighted query items (e.g., genes, proteins, and/or phrases) and one or more focus nodes, POSOC determines the ordered set of GO nodes that summarize the query, based on selections of a scoring function, pseudo-distance measure, specificity level, and cluster determination. Pseudo-distance measures provided are minimum chain length, maximum chain length, average of extreme chain lengths, and average of all chain lengths. A low specificity level, such as -1 or 0, results in a general set of clusters; increasing the specificity results in more specific and lighter clusters. POSOC cluster results can be compared against known results by calculations of precision, recall, and f-score for graph neighborhood relationships. This tool has been used in understanding the function of a set of genes, finding similar genes, and annotating new proteins. The POSOC software consists of a set of Java interfaces, classes, and programs that run on Linux or Windows platforms. It incorporates graph classes from OpenJGraph (openjgraph.sourceforge.net).
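    The chain-length pseudo-distances have a direct graph reading: enumerate the chains (directed paths) between two ontology nodes and take the minimum, maximum, or an average of their lengths. A sketch on a toy DAG with networkx; the node names are invented, not real GO terms.

```python
# Chain-length pseudo-distances between two poset nodes, on a toy DAG with
# invented node names (not actual Gene Ontology terms).
import networkx as nx

g = nx.DiGraph([("root", "a"), ("root", "b"), ("a", "c"),
                ("b", "c"), ("a", "d"), ("d", "c")])

chains = [len(p) - 1 for p in nx.all_simple_paths(g, "root", "c")]
print("minimum chain length:", min(chains))                # 2
print("maximum chain length:", max(chains))                # 3
print("average of extremes:", (min(chains) + max(chains)) / 2)
print("average of all chains:", sum(chains) / len(chains))
```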

  9. Predicting and Detecting Emerging Cyberattack Patterns Using StreamWorks

    SciTech Connect (OSTI)

    Chin, George; Choudhury, Sutanay; Feo, John T.; Holder, Larry

    2014-06-30

    The number and sophistication of cyberattacks on industries and governments have dramatically grown in recent years. To counter this movement, new advanced tools and techniques are needed to detect cyberattacks in their early stages such that defensive actions may be taken to avert or mitigate potential damage. From a cybersecurity analysis perspective, detecting cyberattacks may be cast as a problem of identifying patterns in computer network traffic. Logically and intuitively, these patterns may take on the form of a directed graph that conveys how an attack or intrusion propagates through the computers of a network. Such cyberattack graphs could provide cybersecurity analysts with powerful conceptual representations that are natural to express and analyze. We have been researching and developing graph-centric approaches and algorithms for dynamic cyberattack detection. The advanced dynamic graph algorithms we are developing will be packaged into a streaming network analysis framework known as StreamWorks. With StreamWorks, a scientist or analyst may detect and identify precursor events and patterns as they emerge in complex networks. This analysis framework is intended to be used in a dynamic environment where network data is streamed in and is appended to a large-scale dynamic graph. Specific graphical query patterns are decomposed and collected into a graph query library. The individual decomposed subpatterns in the library are continuously and efficiently matched against the dynamic graph as it evolves to identify and detect early, partial subgraph patterns. The scalable emerging subgraph pattern algorithms will match on both structural and semantic network properties.

  10. Compressing bitmap indexes for faster search operations

    SciTech Connect (OSTI)

    Wu, Kesheng; Otoo, Ekow J.; Shoshani, Arie

    2002-04-25

    In this paper, we study the effects of compression on bitmap indexes. The main operations on the bitmaps during query processing are bitwise logical operations such as AND, OR, NOT, etc. Using the general purpose compression schemes, such as gzip, the logical operations on the compressed bitmaps are much slower than on the uncompressed bitmaps. Specialized compression schemes, like the byte-aligned bitmap code (BBC), are usually faster in performing logical operations than the general purpose schemes, but in many cases they are still orders of magnitude slower than the uncompressed scheme. To make the compressed bitmap indexes operate more efficiently, we designed a CPU-friendly scheme which we refer to as the word-aligned hybrid code (WAH). Tests on both synthetic and real application data show that the new scheme significantly outperforms well-known compression schemes at a modest increase in storage space. Compared to BBC, a scheme well-known for its operational efficiency, WAH performs logical operations about 12 times faster and uses only 60 percent more space. Compared to the uncompressed scheme, in most test cases WAH is faster while still using less space. We further verified with additional tests that the improvement in logical operation speed translates to similar improvement in query processing speed.
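    The central idea, operating on compressed bitmaps run-by-run instead of decompressing them, can be illustrated with plain run-length encoding; real WAH packs literal and fill runs into CPU words, which this simplification omits.

```python
# Run-length sketch of ANDing two compressed bitmaps without decompressing
# them bit-by-bit. Real WAH packs runs into word-aligned fill/literal words;
# this simplification keeps only the run-by-run logic.
def rle(bits):
    runs, i = [], 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1
        runs.append((bits[i], j - i))
        i = j
    return runs

def rle_and(a, b):
    out, ai, bi = [], 0, 0
    (av, al), (bv, bl) = a[0], b[0]
    while True:
        step = min(al, bl)
        v = av & bv
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + step)    # merge equal adjacent runs
        else:
            out.append((v, step))
        al -= step
        bl -= step
        if al == 0:
            ai += 1
            if ai == len(a):
                break
            av, al = a[ai]
        if bl == 0:
            bi += 1
            if bi == len(b):
                break
            bv, bl = b[bi]
    return out

x = rle([1, 1, 0, 0, 0, 1, 1, 1])
y = rle([1, 0, 0, 1, 1, 1, 0, 1])
print(rle_and(x, y))    # [(1, 1), (0, 4), (1, 1), (0, 1), (1, 1)]
```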

  11. Multi-Level Bitmap Indexes for Flash Memory Storage

    SciTech Connect (OSTI)

    Wu, Kesheng; Madduri, Kamesh; Canon, Shane

    2010-07-23

    Due to their low access latency, high read speed, and power-efficient operation, flash memory storage devices are rapidly emerging as an attractive alternative to traditional magnetic storage devices. However, tests show that the most efficient indexing methods are not able to take advantage of the flash memory storage devices. In this paper, we present a set of multi-level bitmap indexes that can effectively take advantage of flash storage devices. These indexing methods use coarsely binned indexes to answer queries approximately, and then use finely binned indexes to refine the answers. Our new methods read significantly lower volumes of data at the expense of an increased disk access count, thus taking full advantage of the improved read speed and low access latency of flash devices. To demonstrate the advantage of these new indexes, we measure their performance on a number of storage systems using a standard data warehousing benchmark called the Set Query Benchmark. We observe that multi-level strategies on flash drives are up to 3 times faster than traditional indexing strategies on magnetic disk drives.
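    The coarse-then-fine query strategy reduces to a small amount of logic: bins away from the query boundary answer outright, and only entries in the boundary bin are re-checked at the finer level. A sketch, with raw values standing in for the finer index:

```python
# Binned-index range query with candidate checking: coarse bins answer
# `value < threshold` for certain, and only the boundary bin is refined.
# Raw values stand in here for the finer index level.
import bisect

bin_edges = [0, 10, 20, 30, 40]    # bin i covers [edges[i], edges[i+1])
values = [3, 27, 14, 33, 8, 22, 19, 5]
bins = [bisect.bisect_right(bin_edges, v) - 1 for v in values]

def less_than(threshold):
    cut = bisect.bisect_right(bin_edges, threshold) - 1
    hits = [i for i, b in enumerate(bins) if b < cut]        # certain hits
    boundary = [i for i, b in enumerate(bins) if b == cut]   # candidates
    hits += [i for i in boundary if values[i] < threshold]   # refine
    return sorted(hits)

print(less_than(25))    # [0, 2, 4, 5, 6, 7]
```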

  12. Security Profile Inspector for UNIX Systems

    Energy Science and Technology Software Center (OSTI)

    1995-04-01

    SPI/U3.2 consists of five tools used to assess and report the security posture of computers running the UNIX operating system. The tools are: Access Control Test: A rule-based system which identifies sequential dependencies in UNIX access controls. Binary Authentication Tool: Evaluates the release status of system binaries by comparing a crypto-checksum against provided table entries. Change Detection Tool: Maintains and applies a snapshot of critical system files and attributes for purposes of change detection. Configuration Query Language: Accepts CQL-based scripts (provided) to evaluate queries over the status of system files, configuration of services and many other elements of UNIX system security. Password Security Inspector: Tests for weak or aged passwords. The tools are packaged with a forms-based user interface providing on-line context-sensitive help, job scheduling, parameter management and output report management utilities. Tools may be run independent of the UI.

  13. POSet Ontology Categorizer

    SciTech Connect (OSTI)

    Miniszewski, Sue M.

    2005-03-01

    POSet Ontology Categorizer (POSOC) V1.0. The POSet Ontology Categorizer (POSOC) software package provides tools for creating and mining poset-structured ontologies, such as the Gene Ontology (GO). Given a list of weighted query items (e.g., genes, proteins, and/or phrases) and one or more focus nodes, POSOC determines the ordered set of GO nodes that summarize the query, based on selections of a scoring function, pseudo-distance measure, specificity level, and cluster determination. Pseudo-distance measures provided are minimum chain length, maximum chain length, average of extreme chain lengths, and average of all chain lengths. A low specificity level, such as -1 or 0, results in a general set of clusters; increasing the specificity results in more specific and lighter clusters. POSOC cluster results can be compared against known results by calculations of precision, recall, and f-score for graph neighborhood relationships. This tool has been used in understanding the function of a set of genes, finding similar genes, and annotating new proteins. The POSOC software consists of a set of Java interfaces, classes, and programs that run on Linux or Windows platforms. It incorporates graph classes from OpenJGraph (openjgraph.sourceforge.net).

  14. United States Transuranium and Uranium Registries. Annual report, February 1, 2003 - January 31, 2004

    SciTech Connect (OSTI)

    Alldredge, J. R.; Brumbaugh, T. L.; Ehrhart, Susan M.; Elliston, J. T.; Filipy, R. E.; James, A. C.; Pham, M. V.; Wood, T. G.; Sasser, L. B.

    2004-01-31

    This year was my fourteenth year with the U. S. Transuranium and Uranium Registries (USTUR). How time flies! Since I became the director of the program five years ago, one of my primary goals was to increase the usefulness of the large USTUR database that consists of six tables containing personal information, medical histories, radiation exposure histories, causes of death, and the results of radiochemical analysis of organ samples collected at autopsy. It is essential that a query of one or more of these tables by USTUR researchers or by collaborating researchers provides complete and reliable information. Also, some of the tables (those without personal identifiers) are destined to appear on the USTUR website for the use of the scientific community. I am pleased to report that most of the data in the database have now been verified and formatted for easy query. It is important to note that no data were discarded; copies of the original tables were retained and the original paper documents are still available for further verification of values as needed.

  15. Materials Databases Infrastructure Constructed by First Principles Calculations: A Review

    SciTech Connect (OSTI)

    Lin, Lianshan

    2015-10-13

    First Principles calculations, especially those based on high-throughput Density Functional Theory, have been widely accepted as the major tools in atomic-scale materials design. Emerging supercomputers, along with powerful First Principles calculations, have accumulated hundreds of thousands of crystal and compound records. The exponential growth of computational materials information urges the development of materials databases, which must not only provide storage for the ever-increasing data but also remain efficient in data storage, management, query, presentation, and manipulation. This review covers the most cutting-edge materials databases in materials design and their applications, such as in fuel cells. By comparing the advantages and drawbacks of these high-throughput First Principles materials databases, an optimized computational framework can be identified to fit the needs of fuel cell applications. The further development of high-throughput DFT materials databases, which in essence accelerates materials innovation, is discussed in the summary as well.

  16. Semantic Features for Classifying Referring Search Terms

    SciTech Connect (OSTI)

    May, Chandler J.; Henry, Michael J.; McGrath, Liam R.; Bell, Eric B.; Marshall, Eric J.; Gregory, Michelle L.

    2012-05-11

    When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors' needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests by country of origin. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods, which are susceptible to obfuscation by dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.
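    Extracting the raw input for such a classifier, the search terms inside the referrer field, takes a few lines; the referrer URL below is a made-up example.

```python
# Pull search terms out of an HTTP referrer field and build simple surface
# features. The referrer URL is a made-up example; the paper's point is that
# *semantic* features improve on surface features like these.
from urllib.parse import urlparse, parse_qs

referrer = "https://www.example-search.com/search?q=solar+farm+permits&hl=en"
query = parse_qs(urlparse(referrer).query).get("q", [""])[0]
terms = query.split()
print(terms)    # ['solar', 'farm', 'permits']

features = {
    "num_terms": len(terms),
    "avg_term_len": sum(map(len, terms)) / len(terms),
}
```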

  17. BitPredator: A Discovery Algorithm for BitTorrent Initial Seeders and Peers

    SciTech Connect (OSTI)

    Borges, Raymond; Patton, Robert M; Kettani, Houssain; Masalmah, Yahya

    2011-01-01

    There is a large amount of illegal content being replicated through peer-to-peer (P2P) networks where BitTorrent is dominant; therefore, a framework to profile and police it is needed. The goal of this work is to explore the behavior of initial seeds and highly active peers to develop techniques to correctly identify them. We intend to establish a new methodology and software framework for profiling BitTorrent peers. This involves three steps: crawling torrent indexers for keywords in recently added torrents using Really Simple Syndication protocol (RSS), querying torrent trackers for peer list data and verifying Internet Protocol (IP) addresses from peer lists. We verify IPs using active monitoring methods. Peer behavior is evaluated and modeled using bitfield message responses. We also design a tool to profile worldwide file distribution by mapping IP-to-geolocation and linking to WHOIS server information in Google Earth.

  18. System of and method for transparent management of data objects in containers across distributed heterogenous resources

    DOE Patents [OSTI]

    Moore, Reagan W.; Rajasekar, Arcot; Wan, Michael Y.

    2010-09-21

    A system of and method for maintaining data objects in containers across a network of distributed heterogeneous resources in a manner which is transparent to a client. A client request pertaining to containers is resolved by querying meta data for the container, processing the request through one or more copies of the container maintained on the system, updating the meta data for the container to reflect any changes made to the container as a result of processing the request, and, if a copy of the container has changed, changing the status of the copy to indicate dirty status or synchronizing the copy to one or more other copies that may be present on the system.
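    The copy-management step in the claim reduces to a small state machine: a write to one copy either synchronizes the other copies or marks them dirty for later synchronization. A toy sketch with invented names, not the patented implementation:

```python
# Toy sketch of the replica bookkeeping described in the claim: writing one
# copy of a container either synchronizes the others or flags them dirty.
# Class and method names are invented; not the patented implementation.
class Container:
    def __init__(self, replicas):
        # meta data: one entry per copy, with contents and a dirty flag
        self.meta = {r: {"data": b"", "dirty": False} for r in replicas}

    def write(self, replica, data, synchronize=False):
        self.meta[replica]["data"] = data
        for name, copy in self.meta.items():
            if name == replica:
                continue
            if synchronize:
                copy["data"], copy["dirty"] = data, False
            else:
                copy["dirty"] = True    # stale until synchronized later

c = Container(["site-a", "site-b", "site-c"])
c.write("site-a", b"new records", synchronize=False)
print({name: m["dirty"] for name, m in c.meta.items()})
```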

  19. System of and method for transparent management of data objects in containers across distributed heterogenous resources

    DOE Patents [OSTI]

    Moore, Reagan W.; Rajasekar, Arcot; Wan, Michael Y.

    2004-01-13

    A system of and method for maintaining data objects in containers across a network of distributed heterogeneous resources in a manner which is transparent to a client. A client request pertaining to containers is resolved by querying meta data for the container, processing the request through one or more copies of the container maintained on the system, updating the meta data for the container to reflect any changes made to the container as a result of processing the request, and, if a copy of the container has changed, changing the status of the copy to indicate dirty status or synchronizing the copy to one or more other copies that may be present on the system.

  20. System of and method for transparent management of data objects in containers across distributed heterogenous resources

    DOE Patents [OSTI]

    Moore, Reagan W.; Rajasekar, Arcot; Wan, Michael Y.

    2007-09-11

    A system of and method for maintaining data objects in containers across a network of distributed heterogeneous resources in a manner which is transparent to a client. A client request pertaining to containers is resolved by querying meta data for the container, processing the request through one or more copies of the container maintained on the system, updating the meta data for the container to reflect any changes made to the container as a result of processing the request, and, if a copy of the container has changed, changing the status of the copy to indicate dirty status or synchronizing the copy to one or more other copies that may be present on the system.

  1. Scenario driven data modelling: a method for integrating diverse sources of data and data streams

    DOE Patents [OSTI]

    Brettin, Thomas S.; Cottingham, Robert W.; Griffith, Shelton D.; Quest, Daniel J.

    2015-09-08

    A system and method of integrating diverse sources of data and data streams is presented. The method can include selecting a scenario based on a topic, creating a multi-relational directed graph based on the scenario, identifying and converting resources in accordance with the scenario and updating the multi-directed graph based on the resources, identifying data feeds in accordance with the scenario and updating the multi-directed graph based on the data feeds, identifying analytical routines in accordance with the scenario and updating the multi-directed graph using the analytical routines and identifying data outputs in accordance with the scenario and defining queries to produce the data outputs from the multi-directed graph.
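    The multi-relational directed graph at the heart of the method maps naturally onto an edge-labeled digraph; here is a toy sketch with networkx, where the scenario, resources, and relation labels are all invented.

```python
# Toy multi-relational directed graph for a scenario, plus one data-output
# query over it. Scenario, node names, and relation labels are invented.
import networkx as nx

g = nx.MultiDiGraph()
g.add_edge("outbreak-report", "region-x", key="mentions")
g.add_edge("sensor-feed-1", "region-x", key="covers")
g.add_edge("analysis-routine", "sensor-feed-1", key="consumes")

# Which resources relate to region-x, and through which relation?
for src, dst, relation in g.in_edges("region-x", keys=True):
    print(f"{src} --{relation}--> {dst}")
```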

  2. Thematic World Wide Web Visualization System

    Energy Science and Technology Software Center (OSTI)

    1996-10-10

    WebTheme is a system designed to facilitate world wide web information access and retrieval through visualization. It consists of two principal pieces: a WebTheme Server, which allows users to enter a query and automatically harvest and process information of interest, and a WebTheme browser, which allows users to work with both Galaxies and Themescape visualizations of their data within a JAVA-capable world wide web browser. WebTheme is an Internet solution, meaning that access to the server and the resulting visualizations can all be performed through the use of a WWW browser. This allows users to access and interact with SPIRE (Spatial Paradigm for Information Retrieval and Exploration) based visualizations through a web browser regardless of what computer platforms they are running on. WebTheme is specifically designed to create databases by harvesting and processing WWW home pages available on the Internet.

  3. An organizational survey of the Pittsburgh Energy Technology Center

    SciTech Connect (OSTI)

    Stock, D.A.; Shurberg, D.A.; Haber, S.B.

    1991-09-01

    An Organizational Survey (OS) was administered at the Pittsburgh Energy Technology Center (PETC) that queried employees on the subjects of organizational culture, various aspects of communications, employee commitment, work group cohesion, coordination of work, environmental, safety, and health concerns, hazardous nature of work, safety, and overall job satisfaction. The purpose of the OS is to measure in a quantitative and objective way the notion of "culture"; that is, the values, attitudes, and beliefs of the individuals working within the organization. In addition, through the OS, a broad sample of individuals can be reached that would probably not be interviewed or observed during the course of a typical assessment. The OS also provides a descriptive profile of the organization at one point in time that can then be compared to a profile taken at a different point in time to assess changes in the culture of the organization.

  4. An organizational survey of the Pittsburgh Energy Technology Center

    SciTech Connect (OSTI)

    Stock, D.A.; Shurberg, D.A.; Haber, S.B.

    1991-09-01

    An Organizational Survey (OS) was administered at the Pittsburgh Energy Technology Center (PETC) that queried employees on the subjects of organizational culture, various aspects of communications, employee commitment, work group cohesion, coordination of work, environmental, safety, and health concerns, hazardous nature of work, safety, and overall job satisfaction. The purpose of the OS is to measure in a quantitative and objective way the notion of "culture"; that is, the values, attitudes, and beliefs of the individuals working within the organization. In addition, through the OS, a broad sample of individuals can be reached that would probably not be interviewed or observed during the course of a typical assessment. The OS also provides a descriptive profile of the organization at one point in time that can then be compared to a profile taken at a different point in time to assess changes in the culture of the organization.

  5. Methods for modeling impact-induced reactivity changes in small reactors.

    SciTech Connect (OSTI)

    Tallman, Tyler N.; Radel, Tracy E.; Smith, Jeffrey A.; Villa, Daniel L.; Smith, Brandon M.; Radel, Ross F.; Lipinski, Ronald J.; Wilson, Paul Philip Hood

    2010-10-01

    This paper describes techniques for determining impact deformation and the subsequent reactivity change for a space reactor impacting the ground following a potential launch accident or for large fuel bundles in a shipping container following an accident. This technique could be used to determine the margin of subcriticality for such potential accidents. Specifically, the approach couples a finite element continuum mechanics model (Pronto3D or Presto) with a neutronics code (MCNP). DAGMC, developed at the University of Wisconsin-Madison, is used to enable MCNP geometric queries to be performed using Pronto3D output. This paper summarizes what has been done historically for reactor launch analysis, describes the impact criticality analysis methodology, and presents preliminary results using representative reactor designs.

  6. Unified Parallel Software

    Energy Science and Technology Software Center (OSTI)

    2003-12-01

    UPS (Unified Parallel Software) is a collection of software tools (libraries, scripts, executables) that assist in parallel programming. This consists of: o libups.a C/Fortran callable routines for message passing (utilities written on top of MPI) and file IO (utilities written on top of HDF). o libuserd-HDF.so EnSight user-defined reader for visualizing data files written with UPS File IO. o ups_libuserd_query, ups_libuserd_prep.pl, ups_libuserd_script.pl Executables/scripts to get information from data files and to simplify the use of EnSight on those data files. o ups_io_rm/ups_io_cp Manipulate data files written with UPS File IO. These tools are portable to a wide variety of Unix platforms.

  7. Generative inspection process planner for integrated production

    SciTech Connect (OSTI)

    Brown, C.W. . Kansas City Div.); Gyorog, D.A. . Dept. of Mechanical Engineering)

    1990-04-01

    This work describes the design prototype development of a generative process planning system for dimensional inspection. The system, IPPEX (Inspection Process Planning EXpert), is a rule-based expert system for integrated production. Using an advanced product modeler, relational databases, and artificial intelligence techniques, IPPEX generates the process plan and part program for the dimensional inspection of products using CMMs. Through an application interface, the IPPEX system software accesses product definitions from the product modeler. The modeler is a solid geometric modeler coupled with a dimension and tolerance modeler. Resource data regarding the machines, probes, and fixtures are queried from databases. IPPEX represents inspection process knowledge as production rules and incorporates an embedded inference engine to perform decision making. The IPPEX system, its functional architecture, system architecture, system approach, product modeling environment, inspection features, inspection knowledge, hierarchical planning strategy, user interface formats, and other fundamental issues related to inspection planning and part programming for CMMs are described. 27 refs., 16 figs., 4 tabs.

  8. Simrank: Rapid and sensitive general-purpose k-mer search tool

    SciTech Connect (OSTI)

    DeSantis, T.Z.; Keller, K.; Karaoz, U.; Alekseyenko, A.V; Singh, N.N.S.; Brodie, E.L; Pei, Z.; Andersen, G.L; Larsen, N.

    2011-04-01

    Terabyte-scale collections of string-encoded data are expected from consortia efforts such as the Human Microbiome Project (http://nihroadmap.nih.gov/hmp). Intra- and inter-project data similarity searches are enabled by rapid k-mer matching strategies. Software applications for sequence database partitioning, guide tree estimation, molecular classification and alignment acceleration have benefited from embedded k-mer searches as sub-routines. However, a rapid, general-purpose, open-source, flexible, stand-alone k-mer tool has not been available. Here we present a stand-alone utility, Simrank, which allows users to rapidly identify database strings the most similar to query strings. Performance testing of Simrank and related tools against DNA, RNA, protein and human-language datasets found Simrank 10X to 928X faster depending on the dataset. Simrank provides molecular ecologists with a high-throughput, open source choice for comparing large sequence sets to find similarity.
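    The underlying operation, scoring database strings by the fraction of the query's k-mers they contain, fits in a few lines; this shows the general k-mer-matching idea, not Simrank's optimized internals.

```python
# Rank database sequences by shared k-mer fraction with a query sequence.
# General k-mer matching idea only; not Simrank's optimized implementation.
def kmers(s, k=4):
    return {s[i:i + k] for i in range(len(s) - k + 1)}

def rank(query, database, k=4):
    q = kmers(query, k)
    scores = {name: len(q & kmers(seq, k)) / len(q)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

db = {"seq1": "ACGTACGTGGTT", "seq2": "TTTTCCCCAAAA"}
print(rank("ACGTACGTACGT", db))    # seq1 scores 1.0, seq2 scores 0.0
```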

  9. High-performance Computing Applied to Semantic Databases

    SciTech Connect (OSTI)

    Goodman, Eric L.; Jimenez, Edward; Mizell, David W.; al-Saffar, Sinan; Adolf, Robert D.; Haglin, David J.

    2011-06-02

    To date, the application of high-performance computing resources to Semantic Web data has largely focused on commodity hardware and distributed memory platforms. In this paper we make the case that more specialized hardware can offer superior scaling and close to an order of magnitude improvement in performance. In particular we examine the Cray XMT. Its key characteristics, a large, global shared-memory, and processors with a memory-latency tolerant design, offer an environment conducive to programming for the Semantic Web and have engendered results that far surpass current state of the art. We examine three fundamental pieces requisite for a fully functioning semantic database: dictionary encoding, RDFS inference, and query processing. We show scaling up to 512 processors (the largest configuration we had available), and the ability to process 20 billion triples completely in-memory.
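
    Of the three pieces named above, dictionary encoding is the most self-contained: every distinct RDF term is interned to an integer ID so triples can be stored and joined as compact integer tuples. The sketch below is a minimal single-threaded illustration with invented sample triples, not the Cray XMT implementation.

```python
# Minimal dictionary encoding for RDF triples: terms become integer IDs.

class Dictionary:
    def __init__(self):
        self.term_to_id = {}
        self.id_to_term = []

    def encode(self, term):
        """Intern a term, assigning a new ID on first sight."""
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]

    def decode(self, term_id):
        return self.id_to_term[term_id]

d = Dictionary()
triples = [("<ex:alice>", "<foaf:knows>", "<ex:bob>"),
           ("<ex:bob>", "<foaf:knows>", "<ex:carol>")]
encoded = [tuple(d.encode(t) for t in triple) for triple in triples]
print(encoded)                  # [(0, 1, 2), (2, 1, 3)]
print(d.decode(encoded[0][0]))  # <ex:alice>
```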

  10. Frontier: High Performance Database Access Using Standard Web Components in a Scalable Multi-Tier Architecture

    SciTech Connect (OSTI)

    Kosyakov, S.; Kowalkowski, J.; Litvintsev, D.; Lueking, L.; Paterno, M.; White, S.P.; Autio, Lauri; Blumenfeld, B.; Maksimovic, P.; Mathis, M. (Johns Hopkins U.)

    2004-09-01

    A high performance system has been assembled using standard web components to deliver database information to a large number of broadly distributed clients. The CDF Experiment at Fermilab is establishing processing centers around the world, imposing a high demand on their database repository. For delivering read-only data, such as calibrations, trigger information, and run conditions data, we have abstracted the interface that clients use to retrieve data objects. A middle tier is deployed that translates client requests into database specific queries and returns the data to the client as XML datagrams. The database connection management, request translation, and data encoding are accomplished in servlets running under Tomcat. Squid Proxy caching layers are deployed near the Tomcat servers, as well as close to the clients, to significantly reduce the load on the database and provide a scalable deployment model. Details of the system's construction and use are presented, including its architecture, design, interfaces, administration, performance measurements, and deployment plan.
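
    As a rough sketch of the middle-tier translation described here (abstract client request in, database query out, results returned as an XML datagram), the following substitutes sqlite3 and ElementTree for the production Tomcat/servlet stack; the table name and handler are invented.

```python
# Toy middle tier: translate an object request into a query, answer in XML.
import sqlite3
from xml.etree import ElementTree as ET

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE calibration (channel INTEGER, gain REAL)")
db.executemany("INSERT INTO calibration VALUES (?, ?)", [(1, 0.98), (2, 1.02)])

def handle_request(object_name):
    # Demo only: a real tier would validate object_name, not interpolate it.
    rows = db.execute(f"SELECT channel, gain FROM {object_name}").fetchall()
    root = ET.Element("datagram", {"object": object_name})
    for channel, gain in rows:
        ET.SubElement(root, "row", {"channel": str(channel), "gain": str(gain)})
    return ET.tostring(root, encoding="unicode")

print(handle_request("calibration"))
```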

  11. High-performance computing applied to semantic databases.

    SciTech Connect (OSTI)

    al-Saffar, Sinan; Jimenez, Edward Steven, Jr.; Adolf, Robert; Haglin, David; Goodman, Eric L.; Mizell, David

    2010-12-01

    To date, the application of high-performance computing resources to Semantic Web data has largely focused on commodity hardware and distributed memory platforms. In this paper we make the case that more specialized hardware can offer superior scaling and close to an order of magnitude improvement in performance. In particular we examine the Cray XMT. Its key characteristics, a large, global shared-memory, and processors with a memory-latency tolerant design, offer an environment conducive to programming for the Semantic Web and have engendered results that far surpass current state of the art. We examine three fundamental pieces requisite for a fully functioning semantic database: dictionary encoding, RDFS inference, and query processing. We show scaling up to 512 processors (the largest configuration we had available), and the ability to process 20 billion triples completely in-memory.

  12. Computer systems and methods for visualizing data

    DOE Patents [OSTI]

    Stolte, Chris; Hanrahan, Patrick

    2010-07-13

    A method for forming a visual plot using a hierarchical structure of a dataset. The dataset comprises a measure and a dimension. The dimension consists of a plurality of levels. The plurality of levels form a dimension hierarchy. The visual plot is constructed based on a specification. A first level from the plurality of levels is represented by a first component of the visual plot. A second level from the plurality of levels is represented by a second component of the visual plot. The dataset is queried to retrieve data in accordance with the specification. The data includes all or a portion of the dimension and all or a portion of the measure. The visual plot is populated with the retrieved data in accordance with the specification.
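
    A hedged sketch of the query-and-populate step the claim describes, with pandas standing in for the patented system: the measure is retrieved at two levels of a dimension hierarchy so each level can drive one component of the plot. Column names and values are invented.

```python
# Aggregate a measure at two levels of a dimension hierarchy (year -> quarter).
import pandas as pd

data = pd.DataFrame({
    "year":    [2009, 2009, 2009, 2010, 2010],
    "quarter": ["Q1", "Q2", "Q3", "Q1", "Q2"],
    "sales":   [120, 135, 150, 160, 170],
})

by_year = data.groupby("year")["sales"].sum()                  # first component
by_quarter = data.groupby(["year", "quarter"])["sales"].sum()  # second component
print(by_year)
print(by_quarter)
```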

  13. Computer systems and methods for visualizing data

    DOE Patents [OSTI]

    Stolte, Chris; Hanrahan, Patrick

    2013-01-29

    A method for forming a visual plot using a hierarchical structure of a dataset. The dataset comprises a measure and a dimension. The dimension consists of a plurality of levels. The plurality of levels form a dimension hierarchy. The visual plot is constructed based on a specification. A first level from the plurality of levels is represented by a first component of the visual plot. A second level from the plurality of levels is represented by a second component of the visual plot. The dataset is queried to retrieve data in accordance with the specification. The data includes all or a portion of the dimension and all or a portion of the measure. The visual plot is populated with the retrieved data in accordance with the specification.

  14. NGNP Risk Management Database: A Model for Managing Risk

    SciTech Connect (OSTI)

    John Collins

    2009-09-01

    To facilitate the implementation of the Risk Management Plan, the Next Generation Nuclear Plant (NGNP) Project has developed and employed an analytical software tool called the NGNP Risk Management System (RMS). A relational database developed in Microsoft Access, the RMS provides conventional database utility including data maintenance, archiving, configuration control, and query ability. Additionally, the tool's design provides a number of unique capabilities specifically designed to facilitate the development and execution of activities outlined in the Risk Management Plan. Specifically, the RMS provides the capability to establish the risk baseline, document and analyze the risk reduction plan, track the current risk reduction status, organize risks by reference configuration system, subsystem, and component (SSC) and Area, and increase the level of NGNP decision making.

  15. Byna-NERSC-ASCR-2017.pptx

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Requirements for Scientific Data Management. Suren Byna, Scientific Data Management Group, Computational Research Division, Lawrence Berkeley Lab. NERSC ASCR Requirements for 2017, January 15, 2014. LBNL projects (m1248 repo; Arie Shoshani, Suren Byna, Alex Sim, John Wu): searching scientific data; FastBit and FastQuery; Scientific Data Services (SDS) framework; transparent data reorganization for better data access; redirection of data read calls for reorganized data.

  16. Model Components of the Certification Framework for Geologic Carbon Sequestration Risk Assessment

    SciTech Connect (OSTI)

    Oldenburg, Curtis M.; Bryant, Steven L.; Nicot, Jean-Philippe; Kumar, Navanit; Zhang, Yingqi; Jordan, Preston; Pan, Lehua; Granvold, Patrick; Chow, Fotini K.

    2009-06-01

    We have developed a framework for assessing the leakage risk of geologic carbon sequestration sites. This framework, known as the Certification Framework (CF), emphasizes wells and faults as the primary potential leakage conduits. Vulnerable resources are grouped into compartments, and impacts due to leakage are quantified by the leakage flux or concentrations that could potentially occur in compartments under various scenarios. The CF utilizes several model components to simulate leakage scenarios. One model component is a catalog of results of reservoir simulations that can be queried to estimate plume travel distances and times, rather than requiring CF users to run new reservoir simulations for each case. Other model components developed for the CF and described here include fault characterization using fault-population statistics; fault connection probability using fuzzy rules; well-flow modeling with a drift-flux model implemented in TOUGH2; and atmospheric dense-gas dispersion using a mesoscale weather prediction code.

  17. Event heap: a coordination infrastructure for dynamic heterogeneous application interactions in ubiquitous computing environments

    DOE Patents [OSTI]

    Johanson, Bradley E.; Fox, Armando; Winograd, Terry A.; Hanrahan, Patrick M.

    2010-04-20

    An efficient and adaptive middleware infrastructure called the Event Heap system dynamically coordinates application interactions and communications in a ubiquitous computing environment, e.g., an interactive workspace, having heterogeneous software applications running on various machines and devices across different platforms. Applications exchange events via the Event Heap. Each event is characterized by a set of unordered, named fields. Events are routed by matching certain attributes in the fields. The source and target versions of each field are automatically set when an event is posted or used as a template. The Event Heap system implements a unique combination of features, both intrinsic to tuplespaces and specific to the Event Heap, including content based addressing, support for routing patterns, standard routing fields, limited data persistence, query persistence/registration, transparent communication, self-description, flexible typing, logical/physical centralization, portable client API, at most once per source first-in-first-out ordering, and modular restartability.
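
    The event model described (unordered named fields, routing by matching field values against templates) is essentially a tuplespace. The sketch below is a minimal, non-distributed illustration with invented field names, not the Event Heap API.

```python
# Toy tuplespace-style event heap: post events, retrieve by template match.
WILDCARD = object()

class EventHeap:
    def __init__(self):
        self.events = []

    def post(self, event):
        self.events.append(dict(event))

    def match(self, template):
        """Return the first event whose fields match the template."""
        for event in self.events:
            if all(v is WILDCARD or event.get(k) == v
                   for k, v in template.items()):
                return event
        return None

heap = EventHeap()
heap.post({"type": "ButtonPress", "source": "touchpanel-1", "x": 10})
print(heap.match({"type": "ButtonPress", "source": WILDCARD}))
```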

  18. Supporting Mutual Understanding in a Visual Dialogue Between Analyst and Computer

    SciTech Connect (OSTI)

    Chappell, Alan R.; Cowell, Andrew J.; Thurman, David A.; Thomson, Judi R.

    2004-09-20

    The Knowledge Associates for Novel Intelligence (KANI) project is developing a system of automated associates to actively support and participate in the information analysis task. The primary goal of KANI is to use automatically extracted information in a reasoning system that draws on the strengths of both a human analyst and automated reasoning. The interface between the two agents is a key element in achieving this goal. The KANI interface seeks to support a visual dialogue with mixed-initiative manipulation of information and reasoning components. To be successful, the interface must achieve mutual understanding between the analyst and KANI of the other's actions. Toward this mutual understanding, KANI allows the analyst to work at multiple levels of abstraction over the reasoning process, links the information presented across these levels to make use of interaction context, and provides querying facilities to allow exploration and explanation.

  19. An integrated computer modeling environment for regional land use, air quality, and transportation planning

    SciTech Connect (OSTI)

    Hanley, C.J.; Marshall, N.L.

    1997-04-01

    The Land Use, Air Quality, and Transportation Integrated Modeling Environment (LATIME) represents an integrated approach to computer modeling and simulation of land use allocation, travel demand, and mobile source emissions for the Albuquerque, New Mexico, area. This environment provides predictive capability combined with a graphical and geographical interface. The graphical interface shows the causal relationships between data and policy scenarios and supports alternative model formulations. Scenarios are launched from within a Geographic Information System (GIS), and data produced by each model component at each time step within a simulation is stored in the GIS. A menu-driven query system is utilized to review link-based results and regional and area-wide results. These results can also be compared across time or between alternative land use scenarios. Using this environment, policies can be developed and implemented based on comparative analysis, rather than on single-step future projections. 16 refs., 3 figs., 2 tabs.

  20. Ensemble Data Analysis ENvironment (EDEN)

    Energy Science and Technology Software Center (OSTI)

    2012-08-01

    The EDEN toolkit facilitates exploratory data analysis and visualization of global climate model simulation datasets. EDEN provides an interactive graphical user interface (GUI) that helps the user visually construct dynamic queries of the characteristically large climate datasets using temporal ranges, variable selections, and geographic areas of interest. EDEN reads the selected data into a multivariate visualization panel which features an extended implementation of parallel coordinates plots as well as interactive scatterplots. The user can query data in the visualization panel using mouse gestures to analyze different ranges of data. The visualization panel provides coordinated multiple views whereby selections made in one plot are propagated to the other plots.
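
    The "dynamic query" step amounts to filtering a large table by a temporal range, a variable selection, and a geographic box before visualization. A minimal pandas sketch, with invented column names and values:

```python
import pandas as pd

df = pd.DataFrame({
    "year": [1990, 1995, 2000, 2005],
    "lat":  [35.0, 36.5, 60.0, 36.0],
    "lon":  [-84.0, -84.2, -150.0, -84.1],
    "temperature":   [14.1, 14.3, -2.0, 14.8],
    "precipitation": [1.2, 1.1, 0.4, 1.3],
})

selected_vars = ["year", "temperature", "precipitation"]  # variable selection
mask = (df["year"].between(1990, 2005)      # temporal range
        & df["lat"].between(30, 40)         # geographic area of interest
        & df["lon"].between(-90, -80))
print(df.loc[mask, selected_vars])
```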

  1. FastBit: Interactively Searching Massive Data

    SciTech Connect (OSTI)

    Wu, Kesheng; Ahern, Sean; Bethel, E. Wes; Chen, Jacqueline; Childs, Hank; Cormier-Michel, Estelle; Geddes, Cameron; Gu, Junmin; Hagen, Hans; Hamann, Bernd; Koegler, Wendy; Lauret, Jerome; Meredith, Jeremy; Messmer, Peter; Otoo, Ekow; Perevoztchikov, Victor; Poskanzer, Arthur; Prabhat; Rubel, Oliver; Shoshani, Arie; Sim, Alexander; Stockinger, Kurt; Weber, Gunther; Zhang, Wei-Ming

    2009-06-23

    As scientific instruments and computer simulations produce more and more data, the task of locating the essential information to gain insight becomes increasingly difficult. FastBit is an efficient software tool to address this challenge. In this article, we present a summary of the key underlying technologies, namely bitmap compression, encoding, and binning. Together these techniques enable FastBit to answer structured (SQL) queries orders of magnitude faster than popular database systems. To illustrate how FastBit is used in applications, we present three examples involving a high-energy physics experiment, a combustion simulation, and an accelerator simulation. In each case, FastBit significantly reduces the response time and enables interactive exploration on terabytes of data.
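
    A toy bitmap index conveys the core idea (one bitmap per distinct value, queries answered with bitwise operations); FastBit's actual contributions of compression, encoding, and binning are omitted here, and the columns are invented.

```python
# Toy bitmap index: Python ints serve as bitmaps of row ids.

def build_bitmaps(column):
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def rows_of(bitmap):
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]

species = ["H2", "O2", "H2", "OH", "O2"]
region  = ["A",  "A",  "B",  "B",  "A"]
sp_ix, rg_ix = build_bitmaps(species), build_bitmaps(region)

# SELECT rows WHERE species = 'O2' AND region = 'A'
print(rows_of(sp_ix["O2"] & rg_ix["A"]))  # [1, 4]
```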

  2. High Performance Multivariate Visual Data Exploration for Extremely Large Data

    SciTech Connect (OSTI)

    Rubel, Oliver; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes; Prabhat

    2008-08-22

    One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates both for visual information display and as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.

  3. CPTAC Assay Portal: a repository of targeted proteomic assays

    SciTech Connect (OSTI)

    Whiteaker, Jeffrey R.; Halusa, Goran; Hoofnagle, Andrew N.; Sharma, Vagisha; MacLean, Brendan; Yan, Ping; Wrobel, John; Kennedy, Jacob; Mani, DR; Zimmerman, Lisa J.; Meyer, Matthew R.; Mesri, Mehdi; Rodriguez, Henry; Abbateillo, Susan E.; Boja, Emily; Carr, Steven A.; Chan, Daniel W.; Chen, Xian; Chen, Jing; Davies, Sherri; Ellis, Matthew; Fenyo, David; Hiltket, Tara; Ketchum, Karen; Kinsinger, Christopher; Kuhn, Eric; Liebler, Daniel; Lin, De; Liu, Tao; Loss, Michael; MacCoss, Michael; Qian, Weijun; Rivers, Robert; Rodland, Karin D.; Ruggles, Kelly; Scott, Mitchell; Smith, Richard D.; Thomas, Stefani N.; Townsend, Reid; Whiteley, Gordon; Wu, Chaochao; Zhang, Hui; Zhang, Zhen; Paulovich, Amanda G.

    2014-06-27

    To address these issues, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) of the National Cancer Institute (NCI) has launched an Assay Portal (http://assays.cancer.gov) to serve as a public repository of well-characterized, quantitative, MS-based, targeted proteomic assays. The purpose of the CPTAC Assay Portal is to facilitate widespread adoption of targeted MS assays by disseminating SOPs, reagents, and assay characterization data for highly characterized assays. A primary aim of the NCI-supported portal is to bring together clinicians or biologists and analytical chemists to answer hypothesis-driven questions using targeted, MS-based assays. Assay content is easily accessed through queries and filters, enabling investigators to find assays for proteins relevant to their areas of interest. Detailed characterization data are available for each assay, enabling researchers to evaluate assay performance prior to launching the assay in their own laboratory.

  4. RTDB: A memory resident real-time object database

    SciTech Connect (OSTI)

    Jerzy M. Nogiec; Eugene Desavouret

    2003-06-04

    RTDB is a fast, memory-resident object database with built-in support for distribution. It constitutes an attractive alternative for architecting real-time solutions with multiple, possibly distributed, processes or agents sharing data. RTDB offers both direct and navigational access to stored objects, with local and remote random access by object identifiers, and immediate direct access via object indices. The database supports transparent access to objects stored in multiple collaborating dispersed databases and includes a built-in cache mechanism that allows for keeping local copies of remote objects, with specifiable invalidation deadlines. Additional features of RTDB include a trigger mechanism on objects that allows for issuing events or activating handlers when objects are accessed or modified, and a very fast, attribute-based search/query mechanism. The overall architecture and an application of RTDB in a control and monitoring system are presented.
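
    A rough sketch of the attribute-based query path such a store provides (RTDB itself is a distributed C++ system; this toy only mirrors the idea, with invented object names):

```python
# In-memory object store with an (attribute, value) -> object-id index.
from collections import defaultdict

class ObjectStore:
    def __init__(self):
        self.objects = {}              # object id -> attribute dict
        self.index = defaultdict(set)  # (attr, value) -> set of object ids

    def put(self, oid, **attrs):
        self.objects[oid] = attrs
        for key, value in attrs.items():
            self.index[(key, value)].add(oid)

    def query(self, **attrs):
        """Ids of objects matching every given attribute value."""
        sets = [self.index[item] for item in attrs.items()]
        return set.intersection(*sets) if sets else set()

store = ObjectStore()
store.put("magnet-7", status="ramping", sector=3)
store.put("magnet-8", status="steady", sector=3)
print(store.query(sector=3, status="ramping"))  # {'magnet-7'}
```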

  5. System and method for anomaly detection

    DOE Patents [OSTI]

    Scherrer, Chad

    2010-06-15

    A system and method for detecting one or more anomalies in a plurality of observations is provided. In one illustrative embodiment, the observations are real-time network observations collected from a stream of network traffic. The method includes performing a discrete decomposition of the observations, and introducing derived variables to increase storage and query efficiencies. A mathematical model, such as a conditional independence model, is then generated from the formatted data. The formatted data is also used to construct frequency tables which maintain an accurate count of specific variable occurrence as indicated by the model generation process. The formatted data is then applied to the mathematical model to generate scored data. The scored data is then analyzed to detect anomalies.
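
    To make the frequency-table scoring concrete, here is a minimal sketch that fits per-variable frequency tables and scores new observations by negative log-likelihood. The patent describes a conditional independence model; this toy assumes full independence to stay short, and the network-style data are invented.

```python
import math
from collections import Counter

def fit(observations):
    """Build one frequency table (Counter) per variable position."""
    tables = [Counter() for _ in observations[0]]
    for obs in observations:
        for table, value in zip(tables, obs):
            table[value] += 1
    return tables, len(observations)

def score(obs, tables, n):
    """Negative log-likelihood under independence; higher = more anomalous."""
    s = 0.0
    for table, value in zip(tables, obs):
        p = (table[value] + 1) / (n + len(table) + 1)  # add-one smoothing
        s -= math.log(p)
    return s

train = [("tcp", 80), ("tcp", 80), ("tcp", 443), ("udp", 53)]
tables, n = fit(train)
print(score(("tcp", 80), tables, n))     # low score: common observation
print(score(("icmp", 9999), tables, n))  # high score: unseen values
```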

  6. Needle Federated Search Engine

    Energy Science and Technology Software Center (OSTI)

    2009-12-01

    The Idaho National Laboratory (INL) has combined a number of technologies, tools, and resources to accomplish a new means of federating search results. The resulting product is a search engine called Needle, an open-source-based tool that the INL uses internally for researching across a wide variety of information repositories. Needle has a flexible search interface that allows end users to point at any available data source. A user can select multiple sources such as commercial databases (Web of Science, Engineering Index), external resources (WorldCat, Google Scholar), and internal corporate resources (email, document management system, library collections) in a single interface with one search query. In the future, INL hopes to offer this open-source engine to the public. This session will outline the development processes for making Needle's search interface and simplifying the federation of internal and external data sources.
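
    The core of such a federated engine is fanning one query out to several sources concurrently and merging the results. A minimal sketch, with stand-in source functions in place of Needle's real connectors:

```python
from concurrent.futures import ThreadPoolExecutor

def search_library(query):
    return [f"library hit for {query!r}"]

def search_email(query):
    return [f"email hit for {query!r}"]

def federated_search(query, sources):
    """Send one query to every source in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        result_lists = pool.map(lambda s: s(query), sources)
    return [hit for hits in result_lists for hit in hits]

print(federated_search("carbon sequestration", [search_library, search_email]))
```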

  7. Tensor Algebra Library for NVidia Graphics Processing Units

    Energy Science and Technology Software Center (OSTI)

    2015-03-16

    This is a general-purpose math library implementing basic tensor algebra operations (tensor contractions, tensor products, tensor additions, etc.) on NVidia GPU accelerators, asynchronously with respect to the CPU host. It supports simultaneous use of multiple NVidia GPUs. Each asynchronous API function returns a handle which can later be used for querying the completion of the corresponding tensor algebra operation on a specific GPU. The tensors participating in a particular tensor operation are assumed to be stored in local RAM of a node or GPU RAM. The main research area where this library can be utilized is quantum many-body theory (e.g., in electronic structure theory).
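
    For a sense of the operations involved, here is a basic tensor contraction with NumPy standing in for the GPU kernels; the library's handle-based asynchronous API is only suggested in the closing comment, and the shapes are arbitrary.

```python
import numpy as np

a = np.random.rand(4, 5, 6)  # rank-3 tensor
b = np.random.rand(6, 5, 3)  # rank-3 tensor

# Contract the shared dimensions: c[i, l] = sum over j, k of a[i, j, k] * b[k, j, l]
c = np.einsum("ijk,kjl->il", a, b)
print(c.shape)  # (4, 3)

# In the library described, such a call would instead return immediately with
# a handle, and completion on a specific GPU would be queried via that handle.
```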

  8. Webinar: Demonstration of NREL’s BioEnergy Atlas Tools

    Broader source: Energy.gov [DOE]

    The National Renewable Energy Laboratory (NREL) will host a free webinar on December 16 demonstrating how to use the BioEnergy Atlas tools. The U.S. Department of Energy’s Bioenergy Technologies Office funded the BioEnergy Atlas tools, which include the BioFuels and BioPower Atlases. These tools are designed as first-pass visualization tools that allow users to view many bioenergy and related datasets in Google Maps. Users can query and download map data and view incentives and state energy data, as well as select an area on the map for estimated biofuels or biopower production potential. The webinar will review the data source and date of bioenergy data layers. The NREL team will show users how to view and download data behind the map, how to view state energy data and incentives, and how to view and edit potential biofuel or biopower production in a geographical location.

  9. MeSh ToolKit v1.2

    Energy Science and Technology Software Center (OSTI)

    2004-05-15

    MSTK or Mesh Toolkit is a mesh framework that allows users to represent, manipulate and query unstructured 3D arbitrary-topology meshes in a general manner without the need to code their own data structures. MSTK is a flexible framework in that it allows (or will eventually allow) a wide variety of underlying representations for the mesh while maintaining a common interface. It allows users to choose from different mesh representations either at initialization or during program execution so that the optimal data structures are used for the particular algorithm. The interaction of users and applications with MSTK is through a functional interface that acts as though the mesh always contains vertices, edges, faces and regions and maintains connectivity between all these entities.

  10. Storing files in a parallel computing system using list-based index to identify replica files

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Zhang, Zhenhua; Grider, Gary

    2015-07-21

    Improved techniques are provided for storing files in a parallel computing system using a list-based index to identify file replicas. A file and at least one replica of the file are stored in one or more storage nodes of the parallel computing system. An index for the file comprises at least one list comprising a pointer to a storage location of the file and a storage location of the at least one replica of the file. The file comprises one or more of a complete file and one or more sub-files. The index may also comprise a checksum value for one or more of the file and the replica(s) of the file. The checksum value can be evaluated to validate the file and/or the file replica(s). A query can be processed using the list.
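
    A minimal sketch of the claimed index structure: a per-file list holds the primary location and replica locations, plus a checksum used to validate whichever copy a query returns. The paths and storage layout below are invented.

```python
import hashlib

index = {}

def register(name, locations, data):
    """Record primary + replica locations and a checksum for a file."""
    index[name] = {
        "locations": list(locations),  # primary first, then replicas
        "checksum": hashlib.sha256(data).hexdigest(),
    }

def lookup(name, data_at):
    """Return the first location whose contents match the stored checksum."""
    entry = index[name]
    for loc in entry["locations"]:
        if hashlib.sha256(data_at[loc]).hexdigest() == entry["checksum"]:
            return loc
    return None

data_at = {"/node1/f": b"payload", "/node2/f.replica": b"payload"}
register("f", ["/node1/f", "/node2/f.replica"], b"payload")
print(lookup("f", data_at))  # /node1/f
```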

  11. Multivariate Data EXplorer (MDX)

    Energy Science and Technology Software Center (OSTI)

    2012-08-01

    The MDX toolkit facilitates exploratory data analysis and visualization of multivariate datasets. MDX provides an interactive graphical user interface to load, explore, and modify multivariate datasets stored in tabular forms. MDX uses an extended version of the parallel coordinates plot and scatterplots to represent the data. The user can perform rapid visual queries using mouse gestures in the visualization panels to select rows or columns of interest. The visualization panel provides coordinated multiple views whereby selections made in one plot are propagated to the other plots. Users can also export selected data or reconfigure the visualization panel to explore relationships between columns and rows in the data.

  12. LDRD final report : first application of geospatial semantic graphs to SAR image data.

    SciTech Connect (OSTI)

    Brost, Randolph C.; McLendon, William Clarence

    2013-01-01

    Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, and initial results applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information, which is designed to support simultaneous spatial and temporal search queries. We also report a preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.

  13. NetState

    Energy Science and Technology Software Center (OSTI)

    2005-09-01

    NetState is a distributed network monitoring system. It uses passive sensors to develop status information on a target network. Two major features provided by NetState are version and port tracking. Version tracking maintains information about software and operating system versions. Port tracking identifies information about active TCP and UDP ports. Multiple NetState sniffers can be deployed, one at each entry point of the target network. The sniffers monitor network traffic, then send the information to the NetState server. The information is stored in a centralized database which can then be accessed via standard SQL database queries or a web-based GUI for further analysis and display.

  14. geryon v. 0.1

    Energy Science and Technology Software Center (OSTI)

    2010-04-28

    Geryon is intended to be a simple library for managing the CUDA Runtime, CUDA Driver, and OpenCL APIs with a consistent interface: * Change from one API to another by simply changing the namespace * Use multiple APIs in the same code * Lightweight (only include files, no build required) * Manage device query and selection * Simple vector and matrix containers * Simple routines for data copy and type casting * Simple routines for data I/O * Simple classes for managing device timing * Simple classes for managing kernel compilation and execution. The primary application is to facilitate writing a single code that can be compiled using the CUDA Runtime API, the CUDA Driver API, or OpenCL.

  15. ACTIVE LEARNING TO OVERCOME SAMPLE SELECTION BIAS: APPLICATION TO PHOTOMETRIC VARIABLE STAR CLASSIFICATION

    SciTech Connect (OSTI)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Berian James, J.; Brink, Henrik; Long, James P.; Rice, John

    2012-01-10

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL, in which the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up, is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and the Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.
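
    The AL loop the paper advocates is easy to sketch: repeatedly train, find the unlabeled example the model is least certain about, and query it for a label. The sketch below uses margin-based uncertainty sampling with scikit-learn on synthetic data; the paper's actual models and light-curve features are not reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # stand-in "oracle" labels

# Small (and in practice biased) starting set containing both classes.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
unlabeled = [i for i in range(len(y)) if i not in set(labeled)]

for _ in range(20):
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[unlabeled])
    margin = np.abs(proba[:, 0] - proba[:, 1])  # small margin = uncertain
    pick = unlabeled[int(np.argmin(margin))]    # example to query
    labeled.append(pick)                        # oracle supplies its label
    unlabeled.remove(pick)

print("accuracy on full pool:", clf.score(X, y))
```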

  16. SU-E-T-357: Semi-Automated Knowledge-Based Radiation Therapy (KBRT) Planning for Head-And-Neck Cancer (HNC): Can KBRT Plans Achieve Better Results Than Manual Planning?

    SciTech Connect (OSTI)

    Lutzky, C; Grzetic, S; Lo, J; Das, S

    2014-06-01

    Purpose: Knowledge-Based Radiation Therapy (KBRT) treatment planning can be used to semi-automatically generate IMRT plans for new patients using constraints derived from previously manually-planned, geometrically similar patients. We investigate whether KBRT plans can achieve greater dose sparing than manual plans using optimized, organ-specific constraint weighting factors. Methods: KBRT planning of HNC radiotherapy cases geometrically matched each new (query) case to one of the 105 clinically approved plans in our database. The dose distribution of the planned match was morphed to fit the query's geometry. Dose-volume constraints extracted from the morphed dose distribution were used to run the IMRT optimization with no user input. In the first version, all constraints were multiplied by a weighting factor of 0.7. The weighting factors were then systematically optimized (in order of OARs with increasing separation from the target) to maximize sparing to each OAR without compromising other OARs. The optimized, second version plans were compared against the first version plans and the clinically approved plans for 45 unilateral/bilateral target cases using the dose metrics: mean, median and maximum (brainstem and cord) doses. Results: Compared to the first version, the second version significantly reduced mean/median contralateral parotid doses (>2Gy) for bilateral cases. Other changes between the two versions were not clinically meaningful. Compared to the original clinical plans, both bilateral and unilateral plans in the second version had lower average dose metrics for 5 of the 6 OARs. Compared to the original plans, the second version achieved dose sparing that was at least as good for all OARs and better for the ipsilateral parotid (bilateral) and oral cavity (bilateral/unilateral). Differences in planning target volume coverage metrics were not clinically significant. Conclusion: HNC-KBRT planning generated IMRT plans with at least equivalent dose sparing to manually generated plans; greater dose sparing was achieved in selected OARs.

  17. Midcontinent Interactive Digital Carbon Atlas and Relational Database (MIDCARB)

    SciTech Connect (OSTI)

    Timothy R. Carr; Scott W. White

    2002-06-01

    This annual report describes progress of the project entitled 'Midcontinent Interactive Digital Carbon Atlas and Relational Database (MIDCARB)'. This project, funded by the Department of Energy, is a cooperative project that assembles a consortium of five states (Indiana, Illinois, Kansas, Kentucky and Ohio) to construct an online distributed Relational Database Management System (RDBMS) and Geographic Information System (GIS) covering aspects of carbon dioxide geologic sequestration (http://www.midcarb.org). The system links the five states in the consortium into a coordinated regional database system consisting of datasets useful to industry, regulators and the public. The project is working to provide advanced distributed computing solutions to link database servers across the five states into a single system where data is maintained at the local level but is accessed through a single Web portal and can be queried, assembled, analyzed and displayed. Each individual state has strengths in data gathering, data manipulation and data display, including GIS mapping, custom application development, web development, and database design. Sharing of expertise provides the critical mass of technical expertise to improve CO2 databases and data access in all states. This project improves the flow of data across servers in the five states and increases the amount and quality of available digital data. The MIDCARB project is developing improved online tools to display and analyze CO2 sequestration data in real time. The system links together data from sources, sinks and transportation within a spatial database that can be queried online. Visualization of high quality and current data can assist decision makers by providing access to common sets of high quality data in a consistent manner.

  18. Sequence modelling and an extensible data model for genomic database

    SciTech Connect (OSTI)

    Li, Peter Wei-Der (Lawrence Berkeley Lab., CA)

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences, and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the 'Extensible Object Model', to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  19. Sequence modelling and an extensible data model for genomic database

    SciTech Connect (OSTI)

    Li, Peter Wei-Der

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences, and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the 'Extensible Object Model', to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  20. An Ontology Design Pattern for Surface Water Features

    SciTech Connect (OSTI)

    Sinha, Gaurav; Mark, David; Kolas, Dave; Varanka, Dalia; Romero, Boleslo E; Feng, Chen-Chieh; Usery, Lynn; Liebermann, Joshua; Sorokine, Alexandre

    2014-01-01

    Surface water is a primary concept of human experience, but concepts are captured in cultures and languages in many different ways. Still, many commonalities can be found due to the physical basis of many of the properties and categories. An abstract ontology of surface water features based only on the physical properties of landscape features has the best potential for serving as a foundational domain ontology. It can then be used to systematically incorporate concepts that are specific to a culture, language, or scientific domain. The Surface Water ontology design pattern was developed both for domain knowledge distillation and to serve as a conceptual building block for more complex surface water ontologies. A fundamental distinction is made in this ontology between landscape features that act as containers (e.g., stream channels, basins) and the bodies of water (e.g., rivers, lakes) that occupy those containers. The semantics of concave (container) landforms are specified in a Dry module and the semantics of contained bodies of water in a Wet module. The pattern is implemented in OWL, but Description Logic axioms and a detailed explanation are provided. The OWL ontology will be an important contribution to Semantic Web vocabulary for annotating surface water feature datasets. A discussion about why there is a need to complement the pattern with other ontologies, especially the previously developed Surface Network pattern, is also provided. Finally, the practical value of the pattern in semantic querying of surface water datasets is illustrated through a few queries and annotated geospatial datasets.
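
    As a flavor of the semantic querying mentioned at the end, here is a hedged rdflib sketch; the namespace, class names, and the occupies property are invented stand-ins for the pattern's actual Dry/Wet module terms.

```python
import rdflib

g = rdflib.Graph()
g.parse(data="""
    @prefix sw: <http://example.org/surfacewater#> .
    sw:channel1 a sw:Channel .
    sw:river1 a sw:River ; sw:occupies sw:channel1 .
""", format="turtle")

# Which water bodies occupy which container landforms?
query = """
PREFIX sw: <http://example.org/surfacewater#>
SELECT ?body ?container WHERE {
  ?body a sw:River ; sw:occupies ?container .
  ?container a sw:Channel .
}
"""
for body, container in g.query(query):
    print(body, "occupies", container)
```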

  1. SU-E-T-544: A Radiation Oncology-Specific Multi-Institutional Federated Database: Initial Implementation

    SciTech Connect (OSTI)

    Hendrickson, K; Phillips, M; Fishburn, M; Evans, K; Banerian, S; Mayr, N; Wong, J; McNutt, T; Moore, J; Robertson, S

    2014-06-01

    Purpose: To implement a common database structure and user-friendly web-browser based data collection tools across several medical institutions to better support evidence-based clinical decision making and comparative effectiveness research through shared outcomes data. Methods: A consortium of four academic medical centers agreed to implement a federated database, known as Oncospace. Initial implementation has addressed issues of differences between institutions in workflow and in the types and breadth of structured information captured. This requires coordination of data collection from departmental oncology information systems (OIS), treatment planning systems, and hospital electronic medical records in order to include as much as possible of the multi-disciplinary clinical data associated with a patient's care. Results: The original database schema was well-designed and required only minor changes to meet institution-specific data requirements. Mobile browser interfaces for data entry and review for both the OIS and the Oncospace database were tailored for the workflow of individual institutions. Federation of database queries, the ultimate goal of the project, was tested using artificial patient data. The tests serve as proof-of-principle that the system as a whole, from data collection and entry to providing responses to research queries of the federated database, was viable. Resolution of inter-institutional use of patient data for research is not yet complete. Conclusions: The migration from unstructured data, mainly in the form of notes and documents, to searchable, structured data is difficult. Making the transition requires cooperation of many groups within the department and can be greatly facilitated by using the structured data to improve clinical processes and workflow. The original database schema design is critical to providing enough flexibility for multi-institutional use to improve each institution's ability to study outcomes, determine best practices, and support research. The project has demonstrated the feasibility of deploying a federated database environment for research purposes to multiple institutions.

  2. HPC Analytics Support. Requirements for Uncertainty Quantification Benchmarks

    SciTech Connect (OSTI)

    Paulson, Patrick R.; Purohit, Sumit; Rodriguez, Luke R.

    2015-05-01

    This report outlines techniques for extending benchmark generation products so they support uncertainty quantification by benchmarked systems. We describe how uncertainty quantification requirements can be presented to candidate analytical tools supporting SPARQL. We describe benchmark data sets for evaluating uncertainty quantification, as well as an approach for using our benchmark generator to produce such data sets.

  3. National Carbon Sequestration Database and Geographic Information System (NatCarb)

    SciTech Connect (OSTI)

    Kenneth Nelson; Timothy Carr

    2009-03-31

    This annual and final report describes the results of the multi-year project entitled 'NATional CARBon Sequestration Database and Geographic Information System (NatCarb)' (http://www.natcarb.org). The original project assembled a consortium of five states (Indiana, Illinois, Kansas, Kentucky and Ohio) in the midcontinent of the United States (MIDCARB) to construct an online distributed Relational Database Management System (RDBMS) and Geographic Information System (GIS) covering aspects of carbon dioxide (CO2) geologic sequestration. The NatCarb system built on the technology developed in the initial MIDCARB effort. The NatCarb project linked the GIS information of the Regional Carbon Sequestration Partnerships (RCSPs) into a coordinated regional database system consisting of datasets useful to industry, regulators and the public. The project includes access to national databases and GIS layers maintained by the NatCarb group (e.g., brine geochemistry) and publicly accessible servers (e.g., USGS and Geography Network) in a single system where data are maintained and enhanced at the local level, but are accessed and assembled through a single Web portal to facilitate query, assembly, analysis and display. This project improves the flow of data across servers and increases the amount and quality of available digital data. The purpose of NatCarb is to provide a national view of the carbon capture and storage potential in the U.S. and Canada. The digital spatial database allows users to estimate the amount of CO2 emitted by sources (such as power plants, refineries and other fossil-fuel-consuming industries) in relation to geologic formations that can provide safe, secure storage sites over long periods of time. The NatCarb project worked to provide all stakeholders with improved online tools for the display and analysis of CO2 capture and storage data through a single website portal (http://www.natcarb.org/). While the external project is ending, NatCarb will continue as an internal US Department of Energy National Energy Technology Laboratory (NETL) project with the continued cooperation of personnel at both West Virginia University and the Kansas Geological Survey. The successor project will continue to organize and enhance the information about CO2 sources and to develop the technology needed to access, query, analyze, display, and distribute natural resource data critical to carbon management. Data are generated, maintained and enhanced locally at the RCSP level, or at the national level in specialized data warehouses, and assembled, accessed, and analyzed in real time through a single geoportal. To address the broader needs of a spectrum of users, from high-end technical queries to the general public, NatCarb will be moving to an improved and simplified display for the general public using readily available web tools such as Google Earth and Google Maps. The goal is for NatCarb to expand in terms of technology and areal coverage and to remain the premier functional demonstration of distributed data-management systems that cross the boundaries between institutions and geographic areas, forming the foundation of a functioning carbon cyber-infrastructure. NatCarb provides access to first-order information to evaluate the costs, economic potential and societal issues of CO2 capture and storage, including public perception and regulatory aspects.

  4. Improving the Availability and Delivery of Critical Information for Tight Gas Resource Development in the Appalachian Basin

    SciTech Connect (OSTI)

    Mary Behling; Susan Pool; Douglas Patchen; John Harper

    2008-12-31

    To encourage, facilitate and accelerate the development of tight gas reservoirs in the Appalachian basin, the geological surveys in Pennsylvania and West Virginia collected widely dispersed data on five gas plays and formatted these data into a large database that can be accessed by individual well or by play. The database and delivery system that were developed can be applied to any of the 30 gas plays that have been defined in the basin, but for this project, data compilation was restricted to the following: the Mississippian-Devonian Berea/Murrysville sandstone play and the Upper Devonian Venango, Bradford and Elk sandstone plays in Pennsylvania and West Virginia; and the 'Clinton'/Medina sandstone play in northwestern Pennsylvania. In addition, some data were collected on the Tuscarora Sandstone play in West Virginia, which is the lateral equivalent of the Medina Sandstone in Pennsylvania. Modern geophysical logs are the most common and cost-effective tools for evaluating reservoirs. Therefore, all of the well logs in the libraries of the two surveys from wells that had penetrated the key plays were scanned, generating nearly 75,000 scanned e-log files from more than 40,000 wells. A standard file-naming convention for scanned logs was developed, which includes the well API number, log curve type(s) scanned, and the availability of log analyses or half-scale logs. In addition to well logs, other types of documents were scanned, including core data (descriptions, analyses, porosity-permeability cross-plots), figures from relevant chapters of the Atlas of Major Appalachian Gas Plays, selected figures from survey publications, and information from unpublished reports and student theses and dissertations. Monthly and annual production data from 1979 to 2007 for West Virginia wells in these plays are available as well. The final database also includes digitized logs from more than 800 wells, sample descriptions from more than 550 wells, more than 600 digital photos in 1-foot intervals from 11 cores, and approximately 260 references for these plays. A primary objective of the research was to make data and information available free to producers through an on-line data delivery model designed for public access on the Internet. The web-based application that was developed utilizes ESRI's ArcIMS GIS software to deliver both well-based and play-based data that are searchable through user-originated queries, and allows interactive regional geographic and geologic mapping that is play-based. System tools help users develop their customized spatial queries. A link also has been provided to the West Virginia Geological Survey's 'pipeline' system for accessing all available well-specific data for more than 140,000 wells in West Virginia. However, only well-specific queries by API number are permitted at this time. The comprehensive project web site (http://www.wvgs.wvnet.edu/atg) resides on West Virginia Geological Survey's servers and links are provided from the Pennsylvania Geological Survey and Appalachian Oil and Natural Gas Research Consortium web sites.

  5. Genomic insights into the evolution of hybrid isoprenoid biosynthetic gene clusters in the MAR4 marine streptomycete clade

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Gallagher, Kelley A.; Jensen, Paul R.

    2015-11-17

    Background: Considerable advances have been made in our understanding of the molecular genetics of secondary metabolite biosynthesis. Coupled with increased access to genome sequence data, new insight can be gained into the diversity and distributions of secondary metabolite biosynthetic gene clusters and the evolutionary processes that generate them. Here we examine the distribution of gene clusters predicted to encode the biosynthesis of a structurally diverse class of molecules called hybrid isoprenoids (HIs) in the genus Streptomyces. These compounds are derived from a mixed biosynthetic origin that is characterized by the incorporation of a terpene moiety onto a variety of chemical scaffolds and include many potent antibiotic and cytotoxic agents. Results: One hundred and twenty Streptomyces genomes were searched for HI biosynthetic gene clusters using ABBA prenyltransferases (PTases) as queries. These enzymes are responsible for a key step in HI biosynthesis. The strains included 12 that belong to the 'MAR4' clade, a largely marine-derived lineage linked to the production of diverse HI secondary metabolites. We found ABBA PTase homologs in all of the MAR4 genomes, which averaged five copies per strain, compared with 21% of the non-MAR4 genomes, which averaged one copy per strain. Phylogenetic analyses suggest that MAR4 PTase diversity has arisen by a combination of horizontal gene transfer and gene duplication. Furthermore, there is evidence that HI gene cluster diversity is generated by the horizontal exchange of orthologous PTases among clusters. Many putative HI gene clusters have not been linked to their secondary metabolic products, suggesting that MAR4 strains will yield additional new compounds in this structure class. Finally, we confirm that the mevalonate pathway is not always present in genomes that contain HI gene clusters and thus is not a reliable query for identifying strains with the potential to produce HI secondary metabolites. In conclusion, we found that marine-derived MAR4 streptomycetes possess a relatively high genetic potential for HI biosynthesis. The combination of horizontal gene transfer, duplication, and rearrangement indicates that complex evolutionary processes account for the high level of HI gene cluster diversity in these bacteria, the products of which may provide a yet to be defined adaptation to the marine environment.

  6. National Computational Infrastructure for Lattice Gauge Theory SciDAC-2 Closeout Report

    SciTech Connect (OSTI)

    Bapty, Theodore; Dubey, Abhishek

    2013-07-18

    As part of the reliability project work, researchers from Vanderbilt University, Fermi National Laboratory and Illinois Institute of Technology developed a real-time, fault-tolerant cluster monitoring framework. The goal of the scientific workflow project is to investigate and develop domain-specific workflow tools for LQCD to help effectively orchestrate, in parallel, computational campaigns consisting of many loosely-coupled batch processing jobs. Major requirements for an LQCD workflow system include: a system to manage input metadata, e.g. physics parameters such as masses; a system to manage and permit the reuse of templates describing workflows; a system to capture data provenance information; a system to manage produced data; a means of monitoring workflow progress and status; a means of resuming or extending a stopped workflow; and fault tolerance features to enhance the reliability of running workflows. In summary, these achievements are reported: • Implemented a software system to manage parameters. This includes a parameter set language based on a superset of the JSON data-interchange format, parsers in multiple languages (C++, Python, Ruby), and a web-based interface tool. It also includes a templating system that can produce input text for LQCD applications like MILC. • Implemented a monitoring sensor framework in software that is in production on the Fermilab USQCD facility. This includes equipment health, process accounting, MPI/QMP process tracking, and batch system (Torque) job monitoring. All sensor data are available from databases, and various query tools can be used to extract common data patterns and perform ad hoc searches. Common batch system queries such as job status are available in command line tools and are used in actual workflow-based production by a subset of Fermilab users. • Developed a formal state machine model for scientific workflow and reliability systems. This includes the use of Vanderbilt's Generic Modeling Environment (GME) tool for code generation for the production of user APIs, code stubs, testing harnesses, and model correctness verification. It is used for creating wrappers around LQCD applications so that they can be integrated into existing workflow systems such as Kepler. • Implemented a database system for tracking the state of nodes and jobs managed by the Torque batch systems used at Fermilab. This robust system and various canned queries are used for many tasks, including monitoring the health of the clusters, managing allocated projects, producing accounting reports, and troubleshooting nodes and jobs.

  7. NATIONAL CARBON SEQUESTRATION DATABASE AND GEOGRAPHIC INFORMATION SYSTEM (NATCARB) FORMER TITLE-MIDCONTINENT INTERACTIVE DIGITAL CARBON ATLAS AND RELATIONAL DATABASE (MIDCARB)

    SciTech Connect (OSTI)

    Timothy R. Carr

    2004-07-16

    This annual report describes progress in the third year of the three-year project entitled 'Midcontinent Interactive Digital Carbon Atlas and Relational Database (MIDCARB)'. The project assembled a consortium of five states (Indiana, Illinois, Kansas, Kentucky and Ohio) to construct an online distributed Relational Database Management System (RDBMS) and Geographic Information System (GIS) covering aspects of carbon dioxide (CO2) geologic sequestration (http://www.midcarb.org). The system links the five states in the consortium into a coordinated regional database system consisting of datasets useful to industry, regulators and the public. The project has been extended and expanded as a 'NATional CARBon Sequestration Database and Geographic Information System (NATCARB)' to provide national coverage across the Regional CO2 Partnerships, which currently cover 40 states (http://www.natcarb.org). Advanced distributed computing solutions link database servers across the five states and other publicly accessible servers (e.g., USGS) into a single system where data is maintained and enhanced at the local level but is accessed and assembled through a single Web portal and can be queried, assembled, analyzed and displayed. This project has improved the flow of data across servers and increased the amount and quality of available digital data. The online tools used in the project have improved in stability and speed in order to provide real-time display and analysis of CO2 sequestration data. The move away from direct database access to web access through eXtensible Markup Language (XML) has increased stability and security while decreasing management overhead. The MIDCARB viewer has been simplified to provide improved display and organization of the more than 125 layers and data tables that have been generated as part of the project. The MIDCARB project is a functional demonstration of distributed management of data systems that cross the boundaries between institutions and geographic areas. The MIDCARB system addresses CO2 sequestration and other natural resource issues from sources, sinks and transportation within a spatial database that can be queried online. Visualization of high quality and current data can assist decision makers by providing access to common sets of high quality data in a consistent manner.

  8. Measuring the Interestingness of Articles in a Limited User Environment

    SciTech Connect (OSTI)

    Pon, R K

    2008-10-06

    Search engines, such as Google, assign scores to news articles based on their relevancy to a query. However, not all relevant articles for the query may be interesting to a user. For example, if the article is old or yields little new information, the article would be uninteresting. Relevancy scores do not take into account what makes an article interesting, which varies from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, in a limited user environment there are not enough users to make collaborative filtering effective. A general framework, called iScore, is presented for defining and measuring the 'interestingness' of articles, incorporating user feedback. iScore addresses various aspects of what makes an article interesting, such as topic relevancy, uniqueness, freshness, source reputation, and writing style. It employs various methods to measure these features and uses a classifier operating on these features to recommend articles. The basic iScore configuration is shown to improve recommendation results by as much as 20%. In addition to the basic iScore features, additional features are presented to address the deficiencies of existing feature extractors, such as one that tracks multiple topics, called MTT, and a version of the Rocchio algorithm that learns its parameters online as it processes documents, called eRocchio. The inclusion of MTT and eRocchio in iScore is shown to improve iScore recommendation results by as much as 3.1% and 5.6%, respectively. Additionally, in the TREC11 Adaptive Filter Task, eRocchio is shown to be 10% better than the best filter in the last run of the task. In addition to these two major topic relevancy measures, other features are also introduced that employ language models, phrases, clustering, and changes in topics to improve recommendation results. These additional features are shown to improve recommendation results by iScore by up to 14%. Because the reasons an article is interesting vary from user to user, an online feature selection method for naive Bayes is also introduced. Online feature selection can improve recommendation results in iScore by up to 18.9%. In summary, iScore in its best configuration can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated in the news recommendation task using three datasets from Yahoo! News, actual users, and Digg. iScore and its components are also evaluated in the TREC Adaptive Filter task using the Reuters RCV1 corpus.
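
    eRocchio's distinguishing trait is that it learns its parameters online as documents stream past. The sketch below is a generic online Rocchio-style scorer built on that same idea, not the paper's exact formulation; the update weights and vector dimension are illustrative only.

```python
import numpy as np

class OnlineRocchio:
    """Score articles against a prototype of 'interesting' content and
    update the prototype online from user feedback."""

    def __init__(self, dim, keep=0.9, step=0.1):  # illustrative weights
        self.prototype = np.zeros(dim)
        self.keep, self.step = keep, step

    def score(self, doc):
        denom = np.linalg.norm(self.prototype) * np.linalg.norm(doc)
        return 0.0 if denom == 0 else float(self.prototype @ doc / denom)

    def feedback(self, doc, interesting):
        # Move the prototype toward interesting articles, away from others.
        sign = 1.0 if interesting else -1.0
        self.prototype = self.keep * self.prototype + sign * self.step * doc

# Document vectors would come from TF-IDF or a topic model in practice.
model = OnlineRocchio(dim=3)
model.feedback(np.array([1.0, 0.0, 0.5]), interesting=True)
print(model.score(np.array([0.9, 0.1, 0.4])))
```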

  9. DOE SBIR Phase II Final Report: Distributed Relevance Ranking in Heterogeneous Document Collections

    SciTech Connect (OSTI)

    Abe Lederman

    2007-01-08

    This report contains the comprehensive summary of the work performed on the SBIR Phase II project (Distributed Relevance Ranking in Heterogeneous Document Collections) at Deep Web Technologies (http://www.deepwebtech.com). We have successfully completed all of the tasks defined in our SBIR Proposal work plan (see Table 1 - Phase II Tasks Status). The project was completed on schedule and we have successfully deployed an initial production release of the software architecture at DOE-OSTI for the Science.gov Alliance's search portal (http://www.science.gov). We have implemented a set of grid services that supports the extraction, filtering, aggregation, and presentation of search results from numerous heterogeneous document collections. Illustration 3 depicts the services required to perform QuickRank filtering of content as defined in our architecture documentation. Functionality that has been implemented is indicated by the services highlighted in green. We have successfully tested our implementation in a multi-node grid deployment both within the Deep Web Technologies offices and in a heterogeneous, geographically distributed grid environment. We have performed a series of load tests in which we successfully simulated 100 concurrent users submitting search requests to the system. This testing was performed on deployments of one-, two-, and three-node grids with services distributed in a number of different configurations. The preliminary results from these tests indicate that our architecture will scale well across multi-node grid deployments, but more work will be needed, beyond the scope of this project, to perform the testing and experimentation required to determine scalability and resiliency requirements. We are pleased to report that a production-quality version (1.4) of the Science.gov Alliance's search portal based on our grid architecture was released in June of 2006. This demonstration portal is currently available at http://science.gov/search30. The portal allows the user to select from a number of collections grouped by category and enter a query expression (see Illustration 1 - Science.gov 3.0 Search Page). After the user clicks search, a results page is displayed that provides a list of results from the selected collections, ordered by relevance based on the query expression the user provided. Our grid-based solution to deep web search and document ranking has already gained attention within DOE, other government agencies, and a Fortune 50 company. We are committed to the continued development of grid-based solutions to large-scale data access, filtering, and presentation problems within the domain of Information Retrieval and the more general categories of content management, data mining, and data analysis.

  10. Geospatial Analysis and Technical Assistance for Power Plant Siting Interagency

    SciTech Connect (OSTI)

    Neher, L A

    2002-03-07

    The focus of this contract (in the summer and fall of 2001) was originally to help the California Energy Commission (CEC) locate and evaluate potential sites for electric power generation facilities and to assist the CEC in addressing areas of congestion on transmission lines and natural gas supply line corridors. Subsequent events have reduced the immediate urgency, although not the ultimate need, for such analyses. Software technology for deploying interactive geographic information systems (GIS) accessible over the Internet has developed to the point that it is now practical to develop and publish GIS web sites that have substantial viewing, movement, query, and even map-making capabilities. As part of a separate project not funded by the CEC, the GIS Center at LLNL, on an experimental basis, has developed a web site to explore the technical difficulties as well as the interest in such a web site by agencies and others concerned with energy research. This exploratory effort offers the potential of developing an interactive GIS web site for use by the CEC for energy research, policy analysis, site evaluation, and permit and regulatory matters. To help ground the geospatial capabilities in the realistic requirements and needs of the CEC staff, the CEC requested that the GIS Center conduct interviews of several CEC staff persons to establish their current and envisioned use of spatial data and requirements for geospatial analyses. This survey will help define a web-accessible central GIS database for the CEC, which will augment the well-received work of the CEC Cartography Unit. Individuals within each siting discipline have been contacted and their responses to three question areas have been summarized. The web-based geospatial data and analytical tools developed within this project will be available to CEC staff for initial area studies, queries, and informal, small-format maps. The system is not designed for fine cartography or for large-format posters such as those the Cartographic Unit is excellent at producing for public meetings. Nor is it designed for the specialized geospatial analyses that the Cartographic Unit maintains a deservedly excellent reputation for producing. The web-based system could be used by the Cartographic Unit staff, in support of CEC technical and policy staff, to respond during meetings to questions posed by senior management or the public.

  11. Assessment of US shipbuilding current capability to build a commercial OTEC platform and a cold water pipe

    SciTech Connect (OSTI)

    Komelasky, M. C.

    1980-03-01

    Lowry and Hoffman Associates Inc. (LHA) performed for ORI an analysis of the shipbuilding requirements for constructing an OTEC plant and the available shipyard assets which could fulfill these requirements. In addition, several shipyards were queried concerning their attitudes towards OTEC. In assessing the shipbuilding requirements for an OTEC plant, four different platform configurations were studied and four different designs of the cold water pipe (CWP) were examined. The platforms were: a concrete ship design proposed by Lockheed; concrete spar designs with internal heat exchangers (IHE) (Rosenblatt) and external heat exchangers (XHE) (Lockheed); and a steel ship design proposed by Gibbs and Cox. The types of materials examined for CWP construction were: steel, fiber-reinforced plastic (FRP), elastomer, and concrete. The report is organized into three major discussion areas. All the construction requirements are synthesized for the four platforms and CWPs, and general comments are made concerning their availability in the US. Specific shipbuilders' facilities are reviewed for their applicability to building an OTEC plant, and an assessment of the shipyards' general interest in the OTEC program is presented, providing insight into their near-term commercial outlook. The method of determining this interest will depend largely on a risk analysis of the OTEC system. Also included are the factors which may make up this analysis and a methodology to ascertain the risk. In the appendices, various shipyard specifications are presented, shipyard assessment matrices are given, graphs of various shipyard economic outlooks are provided, and definitions of the risk factors are listed. (WHK)

  12. A SOAP Web Service for accessing MODIS land product subsets

    SciTech Connect (OSTI)

    SanthanaVannan, Suresh K; Cook, Robert B; Pan, Jerry Yun; Wilson, Bruce E

    2011-01-01

    Remote sensing data from satellites have provided valuable information on the state of the earth for several decades. Since March 2000, the Moderate Resolution Imaging Spectroradiometer (MODIS) sensors on board NASA's Terra and Aqua satellites have been providing estimates of several land parameters useful in understanding earth system processes at global, continental, and regional scales. However, the HDF-EOS file format, the specialized software needed to process the HDF-EOS files, the data volume, and the high spatial and temporal resolution of MODIS data make it difficult for users wanting to extract small but valuable amounts of information from the MODIS record. To overcome this usability issue, the NASA-funded Distributed Active Archive Center (DAAC) for Biogeochemical Dynamics at Oak Ridge National Laboratory (ORNL) developed a Web service that provides subsets of MODIS land products using the Simple Object Access Protocol (SOAP). The ORNL DAAC MODIS subsetting Web service is a unique way of serving satellite data that exploits an established and popular Internet protocol to allow users access to massive amounts of remote sensing data. The Web service provides MODIS land product subsets up to 201 x 201 km in a non-proprietary comma-delimited text file format. Users can programmatically query the Web service to extract MODIS land parameters for real-time data integration into models and decision support tools, or connect to workflow software. Information regarding the MODIS SOAP subsetting Web service is available on the World Wide Web (WWW) at http://daac.ornl.gov/modiswebservice.
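
    Because the service speaks standard SOAP, any generic SOAP client can call it. Below is a sketch using the zeep Python library; the WSDL location, operation name, and arguments are assumptions for illustration only, so the service page above should be consulted for the real interface.

```python
from zeep import Client

# Assumed WSDL location and operation; consult the service page at
# http://daac.ornl.gov/modiswebservice for the actual interface.
WSDL = "http://daac.ornl.gov/cgi-bin/MODIS/modiswebservice.wsdl"  # assumed

client = Client(WSDL)

# Assumed operation and arguments: a small spatial subset of one band
# of one MODIS product around a point, for a date range.
subset = client.service.getsubset(
    latitude=35.96, longitude=-84.29,
    product="MOD13Q1", band="250m_16_days_NDVI",
    startdate="A2010001", enddate="A2010032",
    kmAboveBelow=1, kmLeftRight=1,
)
print(subset)  # comma-delimited rows, per the description above
```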

  13. Mesh infrastructure for coupled multiprocess geophysical simulations

    SciTech Connect (OSTI)

    Garimella, Rao V.; Perkins, William A.; Buksas, Mike W.; Berndt, Markus; Lipnikov, Konstantin; Coon, Ethan; Moulton, John D.; Painter, Scott L.

    2014-01-01

    We have developed a sophisticated mesh infrastructure capability to support large-scale multiphysics simulations such as subsurface flow and reactive contaminant transport at storage sites, as well as the analysis of the effects of a warming climate on the terrestrial Arctic. These simulations involve a wide range of coupled processes including overland flow, subsurface flow, freezing and thawing of ice-rich soil, accumulation, redistribution and melting of snow, biogeochemical processes involving plant matter, and, finally, microtopography evolution due to melting and degradation of ice wedges below the surface. In addition to supporting the usual topological and geometric queries about the mesh, the mesh infrastructure adds capabilities such as identifying columnar structures in the mesh, deforming the mesh subject to constraints, and simultaneously using meshes of different dimensionality for subsurface and surface processes. The generic mesh interface is capable of using three different open source mesh frameworks (MSTK, MOAB and STKmesh) under the hood, allowing the developers to directly compare them and choose the one that is best suited for the application's needs. We demonstrate the results of some simulations using these capabilities as well as present a comparison of the performance of the different mesh frameworks.

  14. Feature-based Analysis of Plasma-based Particle Acceleration Data

    SciTech Connect (OSTI)

    Ruebel, Oliver; Geddes, Cameron G.R.; Chen, Min; Cormier-Michel, Estelle; Bethel, E. Wes

    2013-07-05

    Plasma-based particle accelerators can produce and sustain thousands of times stronger acceleration fields than conventional particle accelerators, providing a potential solution to the problem of the growing size and cost of conventional particle accelerators. To facilitate scientific knowledge discovery from the ever-growing collections of accelerator simulation data generated by accelerator physicists to investigate next-generation plasma-based particle accelerator designs, we describe a novel approach for automatic detection and classification of particle beams and beam substructures due to temporal differences in the acceleration process, here called acceleration features. The automatic feature detection, in combination with a novel visualization tool for fast, intuitive, query-based exploration of acceleration features, enables an effective top-down data exploration process, starting from a high-level, feature-based view down to the level of individual particles. We describe the application of our analysis in practice to analyze simulations of single pulse and dual and triple colliding pulse accelerator designs; to study the formation and evolution of particle beams; to compare substructures of a beam; and to investigate transverse particle loss.

  15. A Run-Time Verification Framework for Smart Grid Applications Implemented on Simulation Frameworks

    SciTech Connect (OSTI)

    Ciraci, Selim; Sozer, Hasan; Tekinerdogan, Bedir

    2013-05-18

    Smart grid applications are implemented and tested with simulation frameworks, as the developers usually do not have access to large sensor networks to be used as a test bed. The developers are forced to map the implementation onto these frameworks, which results in a deviation between the architecture and the code. In turn, this deviation makes it hard to verify behavioral constraints that are described at the architectural level. We have developed the ConArch toolset to support the automated verification of architecture-level behavioral constraints. A key feature of ConArch is its programmable mapping from the architecture to the implementation. Here, developers implement queries to identify the points in the target program that correspond to architectural interactions. ConArch generates run-time observers that monitor the flow of execution between these points and verifies whether this flow conforms to the behavioral constraints. We illustrate how the programmable mappings can be exploited for verifying behavioral constraints of a smart grid application that is implemented with two simulation frameworks.
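
    The observer idea, monitoring the flow of execution between mapped program points and checking it against architecture-level constraints, can be sketched compactly. The snippet below is not ConArch's API; the decorator and the allowed-transition table are invented to show the mechanism.

```python
import functools

# Invented architecture-level state machine: which interaction may
# follow which. ConArch would generate this from the model instead.
ALLOWED = {("sense", "aggregate"), ("aggregate", "control")}
_last = [None]  # most recently observed architectural point

def observe(point):
    """Mark a function as an architectural interaction point and check
    every transition into it against the allowed set."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            prev = _last[0]
            if prev is not None and (prev, point) not in ALLOWED:
                raise RuntimeError(f"constraint violation: {prev} -> {point}")
            _last[0] = point
            return fn(*args, **kwargs)
        return inner
    return wrap

@observe("sense")
def read_meter():
    return 42

@observe("aggregate")
def aggregate(value):
    return [value]

@observe("control")
def actuate(values):
    print("control action on", values)

actuate(aggregate(read_meter()))  # conforms; reordering would raise
```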

  16. Integration of remote sensing and geographic information systems for Great Lakes water quality monitoring

    SciTech Connect (OSTI)

    Lathrop, R.G. Jr.

    1988-01-01

    The utility of three operational satellite remote sensing systems, namely the Landsat Thematic Mapper (TM), the SPOT High Resolution Visible (HRV) sensors, and the NOAA Advanced Very High Resolution Radiometer (AVHRR), was evaluated as a means of estimating water quality and surface temperature. Empirical calibration through linear regression techniques was used to relate near-simultaneously acquired satellite radiance/reflectance data and water quality observations obtained in Green Bay and the nearshore waters of Lake Michigan. Four dates of TM and one date each of SPOT and AVHRR imagery/surface reference data were acquired and analyzed. Highly significant relationships were identified between the TM and SPOT data and Secchi disk depth, nephelometric turbidity, chlorophyll a, total suspended solids (TSS), absorbance, and surface temperature (TM only). The AVHRR data were not analyzed independently but were used for comparison with the TM data. Calibrated water quality image maps were input to a PC-based raster GIS package, EPPL7. Pattern interpretation and spatial analysis techniques were used to document the circulation dynamics and model mixing processes in Green Bay. A GIS facilitates the retrieval, query, and spatial analysis of mapped information and provides the framework for an integrated operational monitoring system for the Great Lakes.
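
    The empirical calibration step is ordinary least-squares regression of coincident surface measurements on sensor radiance. A minimal sketch with made-up numbers follows (an actual calibration would regress each water quality parameter on one or more TM band radiances from the field campaigns).

```python
import numpy as np

# Made-up coincident observations: TM band radiance vs. measured
# chlorophyll a (ug/L); a real calibration uses field campaign data.
radiance = np.array([12.1, 14.8, 17.2, 19.9, 23.4])
chl_a = np.array([2.0, 4.1, 6.3, 8.2, 11.0])

# Fit chl_a = a * radiance + b by least squares.
a, b = np.polyfit(radiance, chl_a, deg=1)

# Apply the calibration to every pixel of a (tiny, made-up) scene to
# produce a calibrated water quality image map.
scene = np.array([[13.0, 18.5], [21.0, 15.2]])
chl_map = a * scene + b
print(np.round(chl_map, 2))
```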

  17. Finding Text Information in the Ocean of Electronic Documents

    SciTech Connect (OSTI)

    Medvick, Patricia A.; Calapristi, Augustin J.

    2003-02-05

    Information management in natural resources has become an overwhelming task. A massive amount of electronic documents and data is now available for creating informed decisions. The problem is finding the relevant information to support the decision-making process. Determining gaps in knowledge in order to propose new studies or to determine which proposals to fund for maximum potential is a time-consuming and difficult task. Additionally, available data stores are increasing in complexity; they now may include not only text and numerical data, but also images, sounds, and video recordings. Information visualization specialists at Pacific Northwest National Laboratory (PNNL) have software tools for exploring electronic data stores and for discovering and exploiting relationships within data sets. These provide capabilities for unstructured text explorations, the use of data signatures (a compact format for the essence of a set of scientific data) for visualization (Wong et al. 2000), visualizations for multiple query results (Havre et al. 2001), and others (http://www.pnl.gov/infoviz). We will focus on IN-SPIRE, an MS Windows version of PNNL’s SPIRE (Spatial Paradigm for Information Retrieval and Exploration). IN-SPIRE was developed to help information analysts find and discover information in huge masses of text documents.

  18. Arctic & Offshore Technical Data System

    Energy Science and Technology Software Center (OSTI)

    1990-07-01

    AORIS is a computerized information system to assist the technology and planning community in the development of Arctic oil and gas resources. In general, AORIS is geographically dependent and, where possible, site specific. The main topics are sea ice, geotechnology, oceanography, meteorology, and Arctic engineering, as they relate to such offshore oil and gas activities as exploration, production, storage, and transportation. AORIS consists of a directory component that identifies 85 Arctic energy-related databases and tells how to access them; a bibliographic/management information system or bibliographic component containing over 8,000 references and abstracts on Arctic energy-related research; and a scientific and engineering information system, or data component, containing over 800 data sets, in both tabular and graphical formats, on sea ice characteristics taken from the bibliographic citations. AORIS also contains much of the so-called grey literature, i.e., data and/or locations of Arctic data collected, but never published. The three components are linked so the user may easily move from one component to another. A generic information system is provided to allow users to create their own information systems. The generic programs have the same query and updating features as AORIS, except that there is no directory component.

  19. Global disease monitoring and forecasting with Wikipedia

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Generous, Nicholas; Fairchild, Geoffrey; Deshpande, Alina; Del Valle, Sara Y.; Priedhorsky, Reid; Salathé, Marcel

    2014-11-13

    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data, such as social media and search queries, are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with r² up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art.

  20. A Scalable Monitoring for the CMS Filter Farm Based on Elasticsearch

    SciTech Connect (OSTI)

    Andre, J.M.; et al.

    2015-12-23

    A flexible monitoring system has been designed for the CMS File-based Filter Farm making use of modern data mining and analytics components. All the metadata and monitoring information concerning data flow and execution of the HLT are generated locally in the form of small documents using the JSON encoding. These documents are indexed into a hierarchy of elasticsearch (es) clusters along with process and system log information. Elasticsearch is a search server based on Apache Lucene. It provides a distributed, multitenant-capable search and aggregation engine. Since es is schema-free, any new information can be added seamlessly and the unstructured information can be queried in non-predetermined ways. The leaf es clusters consist of the very same nodes that form the Filter Farm, thus providing natural horizontal scaling. A separate "central" es cluster is used to collect and index aggregated information. The fine-grained information, all the way down to individual processes, remains available in the leaf clusters. The central es cluster provides quasi-real-time high-level monitoring information to any kind of client. Historical data can be retrieved to analyse past problems or correlate them with external information. We discuss the design and performance of this system in the context of the CMS DAQ commissioning for LHC Run 2.
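
    The indexing pattern described, small schema-free JSON documents pushed into es and queried in non-predetermined ways, maps directly onto the elasticsearch client API. Below is a sketch with the official Python client; the index name and document fields are invented, and the keyword-argument style follows the 8.x client (older clients take a single body= argument).

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Invented monitoring document in the spirit of the HLT metadata stream.
doc = {
    "host": "fu-c2e34-12",
    "run": 273158,
    "stream": "PhysicsEGamma",
    "events_accepted": 1184,
    "timestamp": "2015-06-01T12:34:56",
}
es.index(index="hlt-monitoring", document=doc)

# Ad hoc, non-predetermined query: sum accepted events per stream.
resp = es.search(
    index="hlt-monitoring",
    query={"term": {"run": 273158}},
    aggs={"per_stream": {
        "terms": {"field": "stream.keyword"},
        "aggs": {"events": {"sum": {"field": "events_accepted"}}},
    }},
)
print(resp["aggregations"]["per_stream"]["buckets"])
```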

  1. The Environmental Assessment Management modification of CADET

    Energy Science and Technology Software Center (OSTI)

    1996-05-01

    The original CADET system (finalized in September 1995 as version 1.3) is a data collection and transfer system developed for the Headquarters Air Force Space Command (HQAFSPC) Environmental Compliance Assessment and Management Program (ECAMP). The system was designed as a tool for ECAMP evaluators to use to enter compliance-related data while in the field and to subsequently store, modify, sort, query, and print the data and to electronically transfer the data into the Air Force's Work Information Management System Environmental Subsystem (WIMSES). The original CADET system was designed to match the database structure of the WIMSES ECAMP module that came on-line in 1992. In June 1995, the Department of Defense issued The Environmental Assessment Management (TEAM) Guide and the ECAMP Supplement to the TEAM Guide. These included changes to the type and amount of data collected during an ECAMP assessment. The WIMSES database structure was not modified to match the TEAM Guide; however, the need for collecting and storing the ECAMP data remained. The HQAFSPC decided to modify the CADET system to incorporate the changes specified in the ECAMP Supplement and to convert the system from simply a data entry and transfer tool to a data entry and storage system to manage ECAMP findings in lieu of the WIMSES ECAMP module. The revised software is designated as version 2.0 and nicknamed TEAM CADET to distinguish it from the original CADET system.

  2. Open Research Challenges with Big Data - A Data Scientist's Perspective

    SciTech Connect (OSTI)

    Sukumar, Sreenivas R

    2015-01-01

    In this paper, we discuss data-driven discovery challenges of the Big Data era. We observe that recent innovations in being able to collect, access, organize, integrate, and query massive amounts of data from a wide variety of data sources have brought statistical data mining and machine learning under more scrutiny and evaluation for gleaning insights from the data than ever before. In that context, we pose and debate the question: are data mining algorithms scaling with the ability to store and compute? If yes, how? If not, why not? We survey recent developments in the state of the art to discuss emerging and outstanding challenges in the design and implementation of machine learning algorithms at scale. We leverage experience from real-world Big Data knowledge discovery projects across the domains of national security, healthcare, and manufacturing to suggest our efforts be focused along the following axes: (i) the data science challenge: designing scalable and flexible computational architectures for machine learning (beyond just data retrieval); (ii) the science-of-data challenge: the ability to understand characteristics of data before applying machine learning algorithms and tools; and (iii) the scalable predictive functions challenge: the ability to construct, learn, and infer with increasing sample size, dimensionality, and categories of labels. We conclude with a discussion of opportunities and directions for future research.

  3. Concept of Operations for Collaboration and Discovery from Big Data Across Enterprise Data Warehouses

    SciTech Connect (OSTI)

    Olama, Mohammed M; Nutaro, James J; Sukumar, Sreenivas R; McNair, Wade

    2013-01-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns, and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety, and complexity of enterprise data keep increasing, next-generation analysts are facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering are newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, among many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.

  4. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    SciTech Connect (OSTI)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with a high degree of sequence similarity in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  5. Method and apparatus for biological sequence comparison

    DOE Patents [OSTI]

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
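
    The heart of the patent is a cheap filter that bounds alignment quality so that full comparison runs only against promising regions of the known sequences. Below is a loose, simplified sketch of that filtering concept (a k-mer prefilter over overlapping query blocks), not the patented algorithm itself; all thresholds are invented.

```python
def kmer_index(seq, k):
    """Index every k-mer of a known sequence by its positions."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    return index

def candidate_regions(query, known, k=4, min_hits=3, block=16):
    """Cheap filter: keep only regions of the known sequence that share
    enough k-mers with some overlapping block of the query."""
    index = kmer_index(known, k)
    regions = set()
    for start in range(0, max(1, len(query) - block + 1), block // 2):
        blk = query[start:start + block]
        hits = [p for i in range(len(blk) - k + 1)
                for p in index.get(blk[i:i + k], [])]
        if len(hits) >= min_hits:
            regions.update(p // block for p in hits)
    return sorted(regions)

# Only the surviving regions would be passed to a full, expensive
# alignment against the query.
print(candidate_regions("ACGTACGTGGCC", "TTTTACGTACGTGGCCTTTT"))
```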

  6. Web-Based Geographic Information System Tool for Accessing Hanford Site Environmental Data

    SciTech Connect (OSTI)

    Triplett, Mark B.; Seiple, Timothy E.; Watson, David J.; Charboneau, Briant L.; Morse, John G.

    2014-11-15

    Data volume, complexity, and access issues pose severe challenges for analysts, regulators, and stakeholders attempting to efficiently use legacy data to support decision making at the U.S. Department of Energy's (DOE) Hanford Site. DOE has partnered with the Pacific Northwest National Laboratory (PNNL) on the PHOENIX (PNNL-Hanford Online Environmental Information System) project, which seeks to address data access, transparency, and integration challenges at Hanford to provide effective decision support. PHOENIX is a family of spatially-enabled web applications providing quick access to decades of valuable scientific data and insight through intuitive query, visualization, and analysis tools. PHOENIX realizes broad, public accessibility by relying only on ubiquitous web browsers, eliminating the need for specialized software. It accommodates a wide range of users with intuitive user interfaces that require little or no training to quickly obtain and visualize data. Currently, PHOENIX is actively hosting three applications focused on groundwater monitoring, groundwater clean-up performance reporting, and in-tank monitoring. PHOENIX-based applications are being used to streamline investigative and analytical processes at Hanford, saving time and money. More importantly, by integrating previously isolated datasets and developing relevant visualization and analysis tools, PHOENIX applications are enabling DOE to discover new correlations hidden in legacy data, allowing it to more effectively address complex issues at Hanford.

  7. Advanced cryogenics for cutting tools. Final report

    SciTech Connect (OSTI)

    Lazarus, L.J.

    1996-10-01

    The purpose of the investigation was to determine if cryogenic treatment improved the life and cost effectiveness of perishable cutting tools over other treatments or coatings. Test results showed that in five of the seven perishable cutting tools tested there was no improvement in tool life. The other two tools showed a small gain in tool life, but not as much as when switching manufacturers of the cutting tool. The following conclusions were drawn from this study: (1) titanium nitride coatings are more effective than cryogenic treatment in increasing the life of perishable cutting tools made from all cutting tool materials, (2) cryogenic treatment may increase tool life if the cutting tool was improperly heat treated during its original manufacture, and (3) cryogenic treatment was only effective on tools made from less sophisticated high-speed tool steels. As part of a recent detailed investigation, four cutting tool manufacturers and two cutting tool laboratories were queried, and none could supply any data to substantiate cryogenic treatment of perishable cutting tools.

  8. From Question Answering to Visual Exploration

    SciTech Connect (OSTI)

    McColgin, Dave W.; Gregory, Michelle L.; Hetzler, Elizabeth G.; Turner, Alan E.

    2006-08-11

    Research in Question Answering has focused on the quality of information retrieval or extraction, using the metrics of precision and recall to judge success; these metrics drive toward finding the specific best answer(s) and best support a lookup type of search. They do not address the opportunity that users' natural language questions present for exploratory interactions. In this paper, we present an integrated Question Answering environment that combines a visual analytics tool for unstructured text and a state-of-the-art query expansion tool designed to complement the cognitive processes associated with an information analyst's workflow. Analysts are seldom looking for factoid answers to simple questions; their information needs are much more complex in that they may be interested in patterns of answers over time, conflicting information, and even related non-answer data may be critical to learning about a problem or reaching prudent conclusions. In our visual analytics tool, questions result in a comprehensive answer space that allows users to explore the variety within the answers and spot related information in the rest of the data. The exploratory nature of the dialog between the user and this system requires tailored evaluation methods that better address the evolving user goals and counter cognitive biases inherent to exploratory search tasks.

  9. Streaming data analytics via message passing with application to graph algorithms

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Plimpton, Steven J.; Shead, Tim

    2014-05-06

    The need to process streaming data, which arrives continuously at high-volume in real-time, arises in a variety of contexts including data produced by experiments, collections of environmental or network sensors, and running simulations. Streaming data can also be formulated as queries or transactions which operate on a large dynamic data store, e.g. a distributed database. We describe a lightweight, portable framework named PHISH which enables a set of independent processes to compute on a stream of data in a distributed-memory parallel manner. Datums are routed between processes in patterns defined by the application. PHISH can run on top of either message-passing via MPI or sockets via ZMQ. The former means streaming computations can be run on any parallel machine which supports MPI; the latter allows them to run on a heterogeneous, geographically dispersed network of machines. We illustrate how PHISH can support streaming MapReduce operations, and describe streaming versions of three algorithms for large, sparse graph analytics: triangle enumeration, subgraph isomorphism matching, and connected component finding. Lastly, we also provide benchmark timings for MPI versus socket performance of several kernel operations useful in streaming algorithms.
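
    Of the three graph algorithms, connected component finding is the easiest to picture in streaming form: a union-find structure absorbs edges one at a time, so component queries can be answered at any point in the stream. Below is a minimal single-process sketch of that kernel (PHISH itself would shard the stream across communicating processes).

```python
class UnionFind:
    """Merge components incrementally as edges stream in."""

    def __init__(self):
        self.parent = {}

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb

uf = UnionFind()
edge_stream = [(1, 2), (3, 4), (2, 3), (5, 6)]  # arrives incrementally
for a, b in edge_stream:
    uf.union(a, b)
    # A component query could be answered here, mid-stream.
print(uf.find(1) == uf.find(4), uf.find(1) == uf.find(5))  # True False
```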

  10. Method and apparatus for biological sequence comparison

    DOE Patents [OSTI]

    Marr, Thomas G.; Chang, William I-Wei

    1997-01-01

    A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.

  11. The risk assessment information system

    SciTech Connect (OSTI)

    Kerr, S.B.; Bonczek, R.R.; McGinn, C.W.; Land, M.L.; Bloom, L.D.; Sample, B.E.; Dolislager, F.G.

    1998-06-01

    In an effort to provide service-oriented environmental risk assessment expertise, the Department of Energy (DOE) Center for Risk Excellence (CRE) and DOE Oak Ridge Operations Office (ORO) are sponsoring Oak Ridge National Laboratory (ORNL) to develop a web-based system for disseminating risk tools and information to its users. This system, the Risk Assessment Information System (RAIS), was initially developed to support the site-specific needs of the DOE-ORO Environmental Restoration Risk Assessment Program. With support from the CRE, the system is currently being expanded to benefit all DOE risk information users and can be tailored to meet site-specific needs. Taking advantage of searchable and executable databases, menu-driven queries, and data downloads, using the latest World Wide Web technologies, the RAIS offers essential tools that are used throughout the risk assessment process, from project scoping to implementation. The RAIS tools can be located directly at http://risk.lsd.ornl.gov/homepage/rap_tool.htm or through the CRE's homepage at http://www.doe.gov/riskcenter/home.html.

  12. Global prevalence and distribution of genes and microorganisms involved in mercury methylation

    SciTech Connect (OSTI)

    Podar, Mircea; Gilmour, C C; Brandt, Craig C; Bullock, Allyson L; Brown, Steven D; Crable, Bryan R; Palumbo, Anthony Vito; Somenahally, Anil C; Elias, Dwayne A

    2015-01-01

    Mercury methylation produces the neurotoxic, highly bioaccumulative methylmercury (MeHg). Recent identification of the methylation genes (hgcAB) provides the foundation for broadly evaluating microbial Hg-methylation potential in nature without making explicit rate measurements. We queried hgcAB diversity and distribution in all available microbial metagenomes, encompassing most environments. The genes were found in nearly all anaerobic, but not in aerobic, environments, including oxygenated layers of the open ocean. Critically, hgcAB was effectively absent in ~1500 human microbiomes, suggesting a low risk of endogenous MeHg production. New potential methylation habitats were identified, including invertebrate guts, thawing permafrost, coastal dead zones, soils, sediments, and extreme environments, suggesting multiple routes for MeHg entry into food webs. Several new taxonomic groups potentially capable of Hg-methylation emerged, including lineages having no cultured representatives. We begin to address long-standing evolutionary questions about Hg-methylation and ancient carbon fixation mechanisms while generating a new global view of Hg-methylation potential.

  13. Compact Graph Representations and Parallel Connectivity Algorithms for Massive Dynamic Network Analysis

    SciTech Connect (OSTI)

    Madduri, Kamesh; Bader, David A.

    2009-02-15

    Graph-theoretic abstractions are extensively used to analyze massive data sets. Temporal data streams from socioeconomic interactions, social networking web sites, communication traffic, and scientific computing can be intuitively modeled as graphs. We present the first study of novel high-performance combinatorial techniques for analyzing large-scale information networks, encapsulating dynamic interaction data in the order of billions of entities. We present new data structures to represent dynamic interaction networks, and discuss algorithms for processing parallel insertions and deletions of edges in small-world networks. With these new approaches, we achieve an average performance rate of 25 million structural updates per second and a parallel speedup of nearly 28 on a 64-way Sun UltraSPARC T2 multicore processor, for insertions and deletions to a small-world network of 33.5 million vertices and 268 million edges. We also design parallel implementations of fundamental dynamic graph kernels related to connectivity and centrality queries. Our implementations are freely distributed as part of the open-source SNAP (Small-world Network Analysis and Partitioning) complex network analysis framework.

  14. Streaming data analytics via message passing with application to graph algorithms

    SciTech Connect (OSTI)

    Plimpton, Steven J.; Shead, Tim

    2014-05-06

    The need to process streaming data, which arrives continuously at high-volume in real-time, arises in a variety of contexts including data produced by experiments, collections of environmental or network sensors, and running simulations. Streaming data can also be formulated as queries or transactions which operate on a large dynamic data store, e.g. a distributed database. We describe a lightweight, portable framework named PHISH which enables a set of independent processes to compute on a stream of data in a distributed-memory parallel manner. Datums are routed between processes in patterns defined by the application. PHISH can run on top of either message-passing via MPI or sockets via ZMQ. The former means streaming computations can be run on any parallel machine which supports MPI; the latter allows them to run on a heterogeneous, geographically dispersed network of machines. We illustrate how PHISH can support streaming MapReduce operations, and describe streaming versions of three algorithms for large, sparse graph analytics: triangle enumeration, subgraph isomorphism matching, and connected component finding. Lastly, we also provide benchmark timings for MPI versus socket performance of several kernel operations useful in streaming algorithms.

  15. Model Investigation of Temperature and Concentration Dependent Luminescence of Erbium-doped Tellurite Glasses

    SciTech Connect (OSTI)

    Ghoshal, S. K.; Sahar, M. R.; Rohani, M. S.; Tewari, H. S.

    2011-11-22

    Improving the up-conversion efficiency is the key issue in tellurite glasses. The quantum efficiency, radiative transition rate, and lifetimes of excited states are greatly influenced by the optical properties of the host material, the ligand field, multiphonon relaxation processes, impurities, temperature, and the concentration of erbium ions. We develop a comprehensive 4-level model to examine the radiative and nonradiative (NR) decay processes for the green (⁴S₃/₂ → ⁴I₁₅/₂) and red (⁴F₉/₂ → ⁴I₁₅/₂) emission over a temperature range of 10-340 K and a concentration range of 0.1-4.5 mol.%. Concentration-dependent enhancement and thermal quenching of the up-conversion efficiency are investigated using the derived rate equations. These features are attributed to NR energy transfer processes, trapped-impurity effects, and thermally assisted hopping. The unusual nature of the temperature- and concentration-dependent quenching effects for green and red emission invites further investigation. It is further suggested that, to achieve higher infrared-to-visible up-conversion efficiency in tellurite glasses, the NR channels for energy and charge transfer by phonon- and impurity-mediated processes have to be minimized. Our results on pump-power-dependent emission intensity, quantum efficiency, luminescence intensity, radiative lifetimes, and transition probabilities are in conformity with other experimental findings.
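
    A 4-level rate-equation model of this kind reduces to a small ODE system for the level populations. The sketch below shows only the structure, with entirely made-up rate constants; in the paper's model the rates carry the temperature and concentration dependence through the NR channels discussed above.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Made-up rate constants (s^-1): pump R, radiative decays A, NR decays W.
# The values below only make the structure visible.
R, A30, A20, W32, W21, W10 = 1e3, 2e3, 1e3, 3e3, 2e3, 5e3

def rhs(t, n):
    n0, n1, n2, n3 = n  # ground, intermediate, red- and green-emitting
    dn3 = R * n0 - (A30 + W32) * n3       # pumped green-emitting level
    dn2 = W32 * n3 - (A20 + W21) * n2     # red-emitting level
    dn1 = W21 * n2 - W10 * n1             # intermediate level
    dn0 = -R * n0 + A30 * n3 + A20 * n2 + W10 * n1
    return [dn0, dn1, dn2, dn3]

sol = solve_ivp(rhs, (0.0, 5e-3), [1.0, 0.0, 0.0, 0.0])
n0, n1, n2, n3 = sol.y[:, -1]
print(f"green/red intensity ratio ~ {A30 * n3 / (A20 * n2):.2f}")
```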

  16. Quick start user's guide for the PATH/AWARE decision support system.

    SciTech Connect (OSTI)

    Knowlton, Robert G.; Melton, Brad Joseph; Einfeld, Wayne; Tucker, Mark D; Franco, David Oliver; Yang, Lynn I.

    2013-06-01

    The Prioritization Analysis Tool for All-Hazards/Analyzer for Wide Area Restoration Effectiveness (PATH/AWARE) software system, developed by Sandia National Laboratories, is a comprehensive decision support tool designed to analyze situational awareness, as well as response and recovery actions, following a wide-area release of chemical, biological or radiological materials. The system provides the capability to prioritize critical infrastructure assets and services for restoration. It also provides a capability to assess resource needs (e.g., number of sampling teams, laboratory capacity, decontamination units, etc.), timelines for consequence management activities, and costs. PATH/AWARE is a comprehensive tool set with a considerable amount of database information managed through a Microsoft SQL (Structured Query Language) database engine, a Geographical Information System (GIS) engine that provides comprehensive mapping capabilities, and comprehensive decision logic to carry out the functional aspects of the tool set. This document covers the basic installation and operation of the PATH/AWARE tool in order to give the user enough information to start using the tool. A companion user's manual, covering the PATH/AWARE functionality in greater detail, is under development.

  17. NATIONAL GEODATABASE OF TIDAL STREAM POWER RESOURCE IN USA

    SciTech Connect (OSTI)

    Smith, Brennan T; Neary, Vincent S; Stewart, Kevin M

    2012-01-01

    A geodatabase of tidal constituents is developed to present the regional assessment of tidal stream power resource in the USA. Tidal currents are numerically modeled with the Regional Ocean Modeling System (ROMS) and calibrated with the available measurements of tidal current speeds and water level surfaces. The performance of the numerical model in predicting the tidal currents and water levels is assessed by an independent validation. The geodatabase is published on a public domain via a spatial database engine with interactive tools to select, query, and download the data. Regions with a maximum average kinetic power density exceeding 500 W/m² (corresponding to a current speed of ~1 m/s), a total surface area larger than 0.5 km², and a depth greater than 5 m are defined as hotspots and documented. The regional assessment indicates that the state of Alaska (AK) has the largest number of locations with considerably high kinetic power density, followed by Maine (ME), Washington (WA), Oregon (OR), California (CA), New Hampshire (NH), Massachusetts (MA), New York (NY), New Jersey (NJ), North and South Carolina (NC, SC), Georgia (GA), and Florida (FL).
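
    The 500 W/m² hotspot threshold follows from the kinetic power density of a current, P = ½ρv³ per unit swept area: in seawater at roughly 1 m/s this evaluates to about 500 W/m², matching the parenthetical above. A two-line check:

```python
rho = 1025.0  # seawater density, kg/m^3
v = 1.0       # current speed, m/s
print(0.5 * rho * v**3)  # -> 512.5 W/m^2, i.e. ~500 W/m^2 at ~1 m/s
```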

  18. Visualization Gallery from the Computational Research Division at Lawrence Berkeley National Laboratory

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    This excellent collection of visualization vignettes highlights research work done by the LBNL/NERSC Visualization Group and its collaborators from 1993 to the present. Images lead to technical explanations and project details, helping users to branch out to other related sources. Titles of the projects provide clues both to the imaging focus of the research and the scientific discipline for which the visualizations are intended. Only a few of the many titles/images/projects are listed here: 1) Hybrid Parallelism for Volume Rendering at Large Scale: Analysis of Laser Wakefield Particle Acceleration Data; 2) Visualization of Microearthquake Data from Enhanced Geothermal Systems; 3) PointCloudXplore: Visualization and Analysis of 3D Gene Expression Data; 4) Visualization of Quantum Monte-Carlo simulations; 5) Global Cloud Resolving Models; 6) Visualization of large-scale GFDL/NOAA climate simulations; 7) Direct Numerical Simulation of Turbulent Flame Quenching by Fine Water Droplets; 8) Visualization of Magneto-rotational instability and turbulent angular momentum transport; 9) Sunfall: Visual Analytics for Astrophysics; 10) Fast Contour Descriptor Algorithm for Supernova Image Classification; 11) Supernova Recognition Using Support Vector Machines; 12) High Performance Visualization - Query-Driven Network Traffic Analysis; 13) Life Sciences: Cell Division of Caulobacter Crescentus; 14) Electron Cloud Simulations.

  19. In-Situ Microphysics from the RACORO IOP

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    McFarquhar, Greg

    2013-11-08

    These files were generated by Greg McFarquhar and Robert Jackson at the University of Illinois. Please contact mcfarq@atmos.uiuc.edu or rjackso2@atmos.uiuc.edu for more information or for assistance in interpreting the content of these files. We highly recommend that anyone wishing to use these files do so in a collaborative endeavor, and we welcome queries and opportunities for collaboration. There are caveats associated with the use of the data which are difficult to thoroughly document, and not all products for all time periods have been thoroughly examined. This is a value-added data set of the best estimates of cloud microphysical parameters derived using data collected by the cloud microphysical probes installed on the Center for Interdisciplinary Remotely-Piloted Aircraft Studies (CIRPAS) Twin Otter during RACORO. These files contain best estimates of liquid size distributions N(D) in terms of droplet diameter D, liquid water content LWC, extinction of liquid drops beta, effective radius of cloud drops (re), total number concentration of droplets NT, and radar reflectivity factor Z, at 1-second resolution.

  20. Automatic Fault Characterization via Abnormality-Enhanced Classification

    SciTech Connect (OSTI)

    Bronevetsky, G; Laguna, I; de Supinski, B R

    2010-12-20

    Enterprise and high-performance computing systems are growing extremely large and complex, employing hundreds to hundreds of thousands of processors and software/hardware stacks built by many people across many organizations. As the growing scale of these machines increases the frequency of faults, system complexity makes these faults difficult to detect and to diagnose. Current system management techniques, which focus primarily on efficient data access and query mechanisms, require system administrators to examine the behavior of various system services manually. Growing system complexity is making this manual process unmanageable: administrators require more effective management tools that can detect faults and help to identify their root causes. System administrators need timely notification when a fault is manifested that includes the type of fault, the time period in which it occurred and the processor on which it originated. Statistical modeling approaches can accurately characterize system behavior. However, the complex effects of system faults make these tools difficult to apply effectively. This paper investigates the application of classification and clustering algorithms to fault detection and characterization. We show experimentally that naively applying these methods achieves poor accuracy. Further, we design novel techniques that combine classification algorithms with information on the abnormality of application behavior to improve detection and characterization accuracy. Our experiments demonstrate that these techniques can detect and characterize faults with 65% accuracy, compared to just 5% accuracy for naive approaches.

  1. Low-Level Waste Forum notes and summary reports for 1994. Volume 9, Number 4, July 1994

    SciTech Connect (OSTI)

    1994-07-01

    This issue includes the following articles: Federal Facility Compliance Act Task Force forms mixed waste workgroup; Illinois Department of Nuclear Safety considers construction of centralized storage facility; Midwest Commission agrees on capacity limit, advisory committee; EPA responds to California site developer's queries regarding application of air pollutant standards; county-level disqualification site screening of Pennsylvania complete; Texas Compact legislation introduced in US Senate; Generators ask court to rule in their favor on surcharge rebates lawsuit; Vermont authority and Battelle settle wetlands dispute; Eighth Circuit affirms decision in Nebraska community consent lawsuit; Nebraska court dismisses action filed by Boyd County local monitoring committee; NC authority, Chem-Nuclear, and Stowe exonerated; Senator Johnson introduces legislation to transfer Ward Valley site; Representative Dingell writes to Clinton regarding disposal of low-level radioactive waste; NAS committee on California site convenes; NRC to improve public petition process; NRC releases draft proposed rule on criteria for decontamination and closure of NRC-licensed facilities; and EPA names first environmental justice federal advisory council.

  2. Final project report

    SciTech Connect (OSTI)

    Nitin S. Baliga and Leroy Hood

    2008-11-12

    The proposed overarching goal for this project was the following: data integration, simulation, and visualization will facilitate metabolic and regulatory network prediction, exploration, and formulation of hypotheses. We stated three specific aims to achieve the overarching goal of this project: (1) Integration of multiple levels of information such as mRNA and protein levels, predicted protein-protein interactions/associations, and gene function will enable construction of models describing environmental response and dynamic behavior. (2) Flexible tools for network inference will accelerate our understanding of biological systems. (3) Flexible exploration and queries of model hypotheses will provide focus and reveal novel dependencies. The underlying philosophy of these aims is that an iterative cycle of experiments, experimental design, and verification will lead to a comprehensive and predictive model that will shed light on systems-level mechanisms involved in responses elicited by living systems upon sensing a change in their environment. In the previous year's report we demonstrated considerable progress in the development of data standards, regulatory network inference, and data visualization and exploration. We are pleased to report that several manuscripts describing these procedures have been published in top international peer-reviewed journals including Genome Biology, PNAS, and Cell. The abstracts of these manuscripts are given and summarize our accomplishments in this project.

  3. RAG-3D: A search tool for RNA 3D substructures

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar

    2015-08-24

    Addressing the many challenges in RNA structure/function prediction requires characterizing RNA's modular architectural units. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.
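
    The core matching step, testing whether a small query graph occurs as a substructure of a larger catalogued graph, can be illustrated with a generic graph library. A minimal sketch using networkx follows; the toy graphs are stand-ins, not RAG-3D's actual representations.

    import networkx as nx
    from networkx.algorithms import isomorphism

    query = nx.Graph([(1, 2), (2, 3)])                   # three connected structural elements
    target = nx.Graph([(1, 2), (2, 3), (3, 4), (2, 5)])  # a larger catalogued graph

    gm = isomorphism.GraphMatcher(target, query)
    print(gm.subgraph_is_isomorphic())  # True if the query occurs as a substructure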

  4. Computational Cell Environment: A Problem Solving Environment for integrating diverse biological data

    SciTech Connect (OSTI)

    Klicker, Kyle R.; Singhal, Mudita; Stephan, Eric G.; Trease, Lynn L.; Gracio, Deborah K.

    2004-06-22

    Biologists and bioinformaticists face the ever-increasing challenge of managing large datasets queried from diverse data sources. Genomics and proteomics databases such as the National Center for Biotechnology (NCBI), Kyoto Encyclopedia of Genes and Genomes (KEGG), and the European Molecular Biology Laboratory (EMBL) are becoming the standard biological data department stores that biologists visit on a regular basis to obtain the supplies necessary for conducting their research. However, much of the data that biologists retrieve from these databases needs to be further managed and organized in a meaningful way so that the researcher can focus on the problem that they are trying to investigate and share their data and findings with other researchers. We are working towards developing a problem-solving environment called the Computational Cell Environment (CCE) that provides connectivity to these diverse data stores and provides data retrieval, management, and analysis through all aspects of biological study. In this paper we discuss the system and database design of CCE. We also outline a few problems encountered at various stages of its development and the design decisions taken to resolve them.

  5. Parallel Environment for the Creation of Stochastics 1.0

    Energy Science and Technology Software Center (OSTI)

    2011-01-06

    PECOS is a computational library for creating and manipulating realizations of stochastic quantities, including scalar uncertain variables, random fields, and stochastic processes. It offers a unified interface to univariate and multivariate polynomial approximations using either orthogonal or interpolation polynomials; numerical integration drivers for Latin hypercube sampling, quadrature, cubature, and sparse grids; and fast Fourier transforms using third party libraries. The PECOS core also offers statistical utilities and transformations between various representations of stochastic uncertainty. PECOS provides a C++ API through which users can generate and transform realizations of stochastic quantities. It is currently used by Sandia's DAKOTA, Stokhos, and Encore software packages for uncertainty quantification and verification. PECOS generates random sample sets and multi-dimensional integration grids, typically used in forward propagation of scalar uncertainty in computational models (uncertainty quantification (UQ)). PECOS also generates samples of random fields (RFs) and stochastic processes (SPs) from a set of user-defined power spectral densities (PSDs). The RF/SP may be either Gaussian or non-Gaussian and either stationary or nonstationary, and the resulting sample is intended for run-time query by parallel finite element simulation codes. Finally, PECOS supports nonlinear transformations of random variables via the Nataf transformation and extensions.
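
    As a concrete illustration of one sampling capability named above, a Latin hypercube sample set can be generated with SciPy's quasi-Monte Carlo module. This is a stand-in for illustration only, not PECOS's C++ API; the dimension and variable bounds are invented.

    from scipy.stats import qmc

    sampler = qmc.LatinHypercube(d=3, seed=0)   # three uncertain input variables
    unit_samples = sampler.random(n=100)        # 100 points in [0, 1)^3
    # Map the unit hypercube onto invented physical bounds for each variable.
    samples = qmc.scale(unit_samples, [0.0, -1.0, 10.0], [1.0, 1.0, 20.0])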

  6. Materials Databases Infrastructure Constructed by First Principles Calculations: A Review

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Lin, Lianshan

    2015-10-13

    First Principles calculations, especially those based on high-throughput Density Functional Theory, have been widely accepted as major tools in atomic-scale materials design. Emerging supercomputers, along with powerful First Principles calculations, have accumulated hundreds of thousands of crystal and compound records. The exponential growth of computational materials information drives the development of materials databases, which must not only provide storage for the steadily increasing data but also remain efficient in data storage, management, query, presentation, and manipulation. This review covers the most cutting-edge materials databases in materials design and their notable applications, such as in fuel cells. By comparing the advantages and drawbacks of these high-throughput First Principles materials databases, the computational framework best suited to the needs of fuel cell applications can be identified. The further development of high-throughput DFT materials databases, which in essence accelerates materials innovation, is discussed in the summary as well.

  7. Common Geometry Module

    Energy Science and Technology Software Center (OSTI)

    2005-01-01

    The Common Geometry Module (CGM) is a code library which provides geometry functionality used for mesh generation and other applications. This functionality includes that commonly found in solid modeling engines, like geometry creation, query and modification; CGM also includes capabilities not commonly found in solid modeling engines, like geometry decomposition tools and support for shared material interfaces. CGM is built upon the ACIS solid modeling engine, but also includes geometry capability developed beside and on top of ACIS. CGM can be used as-is to provide geometry functionality for codes needing this capability. However, CGM can also be extended using derived classes in C++, allowing the geometric model to serve as the basis for other applications, for example mesh generation. CGM is supported on Sun Solaris, SGI, HP, IBM, DEC, Linux and Windows NT platforms. CGM also includes support for loading ACIS models on parallel computers, using MPI-based communication. Future plans for CGM are to port it to different solid modeling engines, including Pro/Engineer or SolidWorks. CGM is being released into the public domain under an LGPL license; the ACIS-based engine is available to ACIS licensees on request.

  8. RAG-3D: A search tool for RNA 3D substructures

    SciTech Connect (OSTI)

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar

    2015-08-24

    Addressing the many challenges in RNA structure/function prediction requires characterizing RNA's modular architectural units. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.

  9. Global prevalence and distribution of genes and microorganisms involved in mercury methylation

    SciTech Connect (OSTI)

    Podar, Mircea; Gilmour, C. C.; Brandt, Craig C.; Soren, Allyson; Brown, Steven D.; Crable, Bryan R.; Palumbo, Anthony Vito; Somenahally, Anil C.; Elias, Dwayne A.

    2015-01-01

    Mercury methylation produces the neurotoxic, highly bioaccumulative methylmercury (MeHg). Recent identification of the methylation genes (hgcAB) provides the foundation for broadly evaluating microbial Hg-methylation potential in nature without making explicit rate measurements. We first queried hgcAB diversity and distribution in all available microbial metagenomes, encompassing most environments. The genes were found in nearly all anaerobic, but not in aerobic, environments including oxygenated layers of the open ocean. Critically, hgcAB was effectively absent in ~1500 human microbiomes, suggesting a low risk of endogenous MeHg production. New potential methylation habitats were identified, including invertebrate guts, thawing permafrost, coastal dead zones, soils, sediments, and extreme environments, suggesting multiple routes for MeHg entry into food webs. Several new taxonomic groups potentially capable of Hg-methylation emerged, including lineages having no cultured representatives. We then began to address long-standing evolutionary questions about Hg-methylation and ancient carbon fixation mechanisms while generating a new global view of Hg-methylation potential.

  10. Global disease monitoring and forecasting with Wikipedia

    SciTech Connect (OSTI)

    Generous, Nicholas; Fairchild, Geoffrey; Deshpande, Alina; Del Valle, Sara Y.; Priedhorsky, Reid; Salathé, Marcel

    2014-11-13

    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data, such as social media and search queries, are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with r² up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art.
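
    The modeling idea is simple to sketch: an ordinary linear model mapping article access counts to official case counts, with r² measuring fit quality. A hypothetical example using scikit-learn on synthetic data follows; the article counts and coefficients are invented.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    views = rng.poisson(1000, size=(52, 5)).astype(float)  # weekly views of 5 selected articles
    cases = views @ np.array([0.01, 0.02, 0.0, 0.005, 0.0]) + rng.normal(0, 5, 52)

    model = LinearRegression().fit(views, cases)
    print(model.score(views, cases))  # r² of the fit, the statistic cited above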

  11. The Wyodak-Anderson coal assessment, Powder River Basin, Wyoming and Montana -- An ArcView project

    SciTech Connect (OSTI)

    Flores, R.M.; Gunther, G.; Ochs, A.; Ellis, M.E.; Stricker, G.D.; Bader, L.R.

    1998-12-31

    In 1997, more than 305 million short tons of clean and compliant coal were produced from the Wyodak-Anderson and associated coal beds and zones of the Paleocene Fort Union Formation in the Powder River Basin, Wyoming and Montana. To date, all coal produced from the Wyodak-Anderson, which averages 0.47 percent sulfur and 6.44 percent ash, has met regulatory compliance standards. Twenty-eight percent of the total US coal production in 1997 was from the Wyodak-Anderson coal. Based on the current consumption rates and forecast by the Energy Information Administration (1996), the Wyodak-Anderson coal is projected to produce 413 million short tons by the year 2016. In addition, this coal deposit as well as other Fort Union coals have recently been targeted for exploration and development of methane gas. New US Geological Survey (USGS) digital products could provide valuable assistance in future mining and gas development in the Powder River Basin. An interactive format, with querying tools, using ArcView software will display the digital products of the resource assessment of Wyodak-Anderson coal, a part of the USGS National Coal Resource Assessment of the Powder River Basin. This ArcView project includes coverages of the data point distribution; land use; surface and subsurface ownerships; coal geology, stratigraphy, quality and geochemistry; and preliminary coal resource calculations. These coverages are displayed as map views, cross sections, tables, and charts.

  12. GROK

    Energy Science and Technology Software Center (OSTI)

    2006-02-24

    GROK is a web-based Internet Protocol (IP) search tool designed to help the user find and analyze network sessions in close to real time (5 minutes). It relies on the output generated by a packet capture and session summary tool called BAG. The bag program runs on a Linux system and continuously generates 5-minute full packet capture (LIBPCAP) files, Internet session summary files, and interface statistic files, round-robin, over a period limited to the amount of disc storage available to the system. In the LANL case, an 8 terabyte file system accommodates seven days of data (most of the time). Summary information, such as the top 20 outgoing and incoming network services (such as www/tcp or 161/udp), along with network interface statistics which indicate the health of the capture system, is plotted every 5 minutes for display by the GROK web server. The GROK home page presents the analyst with a set of search criteria used to query the information being collected by the bag program. Since the information ultimately resides in "pcap" files, other pcap-aware programs such as bro, ethereal, nosehair, smacq, snort, and tcpdump have been incorporated into GROK's web interface. Clickable documentation is available for each search criterion.
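
    The kind of pcap-level query that GROK front-ends can be approximated with a generic packet library. A hypothetical sketch using scapy follows; the capture file name is a placeholder, and this is not GROK's own code.

    from scapy.all import IP, TCP, rdpcap

    packets = rdpcap("capture-5min.pcap")  # one of the 5-minute capture files
    web = [p for p in packets
           if IP in p and TCP in p and 80 in (p[TCP].sport, p[TCP].dport)]
    for p in web[:10]:
        print(p[IP].src, "->", p[IP].dst, p[TCP].dport)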

  13. Mercury-metadata data management system

    Energy Science and Technology Software Center (OSTI)

    2008-01-03

    Mercury is a federated metadata harvesting, search and retrieval tool based on both open source software and software developed at Oak Ridge National Laboratory. It was originally developed for NASA, USGS, and DOE. A major new version of Mercury (version 3.0) was developed during 2007 and released in early 2008. This Mercury 3.0 version provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, faceted search, support for RSS delivery of search results, and ready customization to meet the needs of the multiple projects which use Mercury. For the end users, Mercury provides a single portal to very quickly search for data and information contained in disparate data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data.
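
    A fielded search against such a centralized index might look like the sketch below, assuming a Solr-style HTTP endpoint. The URL, field names, and spatial filter syntax are placeholders, not Mercury's actual schema.

    import requests

    params = {
        "q": "keyword:carbon AND source:ORNL",  # fielded query (hypothetical fields)
        "fq": "bbox:[35,-85 TO 36,-84]",        # hypothetical spatial filter
        "wt": "json",
        "rows": 20,
    }
    resp = requests.get("https://example.org/mercury/select", params=params, timeout=30)
    for doc in resp.json()["response"]["docs"]:
        print(doc.get("title"))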

  14. Intelligent Object-Oriented GIS Engine W/dynamic Coupling to Modeled Objects

    Energy Science and Technology Software Center (OSTI)

    1997-02-12

    The GEOVIEWER is an intelligent object-oriented Geographic Information System (GIS) engine that provides not only a spatially-optimized object representation, but also direct linkage to the underlying object, its data and behaviors. Tools are incorporated to perform tasks involving typical GIS functionality, data ingestion, linkage to external models, and integration with other application frameworks. The GEOVIEWER module was designed to provide GIS functionality to create, query, view, and manipulate software objects within a selected area under investigation in a simulation system. Many of these objects are not stored in a format conducive to efficient GIS usage. Their dynamic nature, complexity, and the sheer number of possible entity classes preclude effective integration with traditional GIS technologies due to the loosely coupled nature of their data representations. The primary difference between GEOVIEWER and standard GIS packages is that standard GIS packages offer static views of geospatial data while GEOVIEWER can be dynamically coupled to models and/or applications producing data and, therefore, display changes in geometry, attributes or behavior as they occur in the simulation.

  15. Medical and Transmission Vector Vocabulary Alignment with Schema.org

    SciTech Connect (OSTI)

    Smith, William P.; Chappell, Alan R.; Corley, Courtney D.

    2015-04-21

    Available biomedical ontologies and knowledge bases currently lack formal and standards-based interconnections between disease, disease vector, and drug treatment vocabularies. The PNNL Medical Linked Dataset (PNNL-MLD) addresses this gap. This paper describes the PNNL-MLD, which provides a unified vocabulary and dataset of drug, disease, side effect, and vector transmission background information. Currently, the PNNL-MLD combines and curates data from the following research projects: DrugBank, DailyMed, Diseasome, DisGeNet, Wikipedia Infobox, Sider, and PharmGKB. The main outcomes of this effort are a dataset aligned to Schema.org, including a parsing framework, and extensible hooks ready for integration with selected medical ontologies. The PNNL-MLD enables researchers to query distinct datasets more quickly and easily. Future extensions to the PNNL-MLD will include Traditional Chinese Medicine, broader interlinks across genetic structures, a larger thesaurus of synonyms and hypernyms, explicit coding of diseases and drugs across research systems, and incorporating vector-borne transmission vocabularies.
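
    A researcher might query a Schema.org-aligned linked dataset such as this one with a standard RDF library. A minimal sketch using rdflib follows; the file name and the specific predicates are assumptions for illustration.

    import rdflib

    g = rdflib.Graph()
    g.parse("pnnl-mld.ttl", format="turtle")  # hypothetical local dump of the dataset

    q = """
    PREFIX schema: <http://schema.org/>
    SELECT ?drug ?effect WHERE {
        ?drug a schema:Drug ;
              schema:adverseOutcome ?effect .
    } LIMIT 10
    """
    for drug, effect in g.query(q):
        print(drug, effect)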

  16. Understanding the Complexities of Subnational Incentives in Supporting a National Market for Distributed Photovoltaics

    SciTech Connect (OSTI)

    Bush, B.; Doris, E.; Getman, D.

    2014-09-01

    Subnational policies pertaining to photovoltaic (PV) systems have increased in volume in recent years, and federal incentives are set to be phased out over the next few years. Understanding how subnational policies function within and across jurisdictions, thereby impacting PV market development, informs policy decision making. This report was developed for subnational policy-makers and researchers in order to aid analysis of the function of PV system incentives within the emerging PV deployment market. The analysis presented is based on a 'logic engine,' a database tool using existing state, utility, and local incentives allowing users to see the interrelationships between PV system incentives and parameters, such as geographic location, technology specifications, and financial factors. Depending on how it is queried, the database can yield insights into which combinations of incentives are available and most advantageous to the PV system owner or developer under particular circumstances. This is useful both for individual system developers to identify the most advantageous incentive packages that they qualify for and for researchers and policymakers to better understand the patchwork of incentives nationwide as well as how they drive the market.
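
    The kind of question such a database answers, namely which incentives apply for a given location and technology, reduces to relational queries. A toy sketch with sqlite3 follows; the schema and rows are invented and do not reflect the report's actual database.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE incentive (name TEXT, state TEXT, tech TEXT, max_kw REAL)")
    con.executemany("INSERT INTO incentive VALUES (?, ?, ?, ?)", [
        ("State rebate", "CO", "PV", 10.0),
        ("Utility rebate", "CO", "PV", 25.0),
        ("Solar thermal grant", "CO", "SWH", 5.0),
    ])
    # Which incentives could an 8 kW PV system in Colorado combine?
    rows = con.execute("SELECT name, max_kw FROM incentive "
                       "WHERE state = ? AND tech = ? AND max_kw >= ?",
                       ("CO", "PV", 8.0)).fetchall()
    print(rows)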

  17. Arctic & Offshore Technical Data System

    Energy Science and Technology Software Center (OSTI)

    1990-07-01

    AORIS is a computerized information system to assist the technology and planning community in the development of Arctic oil and gas resources. In general, AORIS is geographically dependent and, where possible, site specific. The main topics are sea ice, geotechnology, oceanography, meteorology, and Arctic engineering, as they relate to such offshore oil and gas activities as exploration, production, storage, and transportation. AORIS consists of a directory component that identifies 85 Arctic energy-related databases and tells how to access them; a bibliographic/management information system, or bibliographic component, containing over 8,000 references and abstracts on Arctic energy-related research; and a scientific and engineering information system, or data component, containing over 800 data sets, in both tabular and graphical formats, on sea ice characteristics taken from the bibliographic citations. AORIS also contains much of the so-called grey literature, i.e., data and/or locations of Arctic data collected, but never published. The three components are linked so the user may easily move from one component to another. A generic information system is provided to allow users to create their own information systems. The generic programs have the same query and updating features as AORIS, except that there is no directory component.

  18. Battery Life Estimator (BLE) Data Analysis Software v. 1.2

    Energy Science and Technology Software Center (OSTI)

    2010-02-24

    The purpose of this software is to estimate the useable life of rechargeable batteries (e.g., lithium-ion). The software employs a generalized statistical approach to model cell data in the context of accelerated aging experiments. The cell performance is modeled in two parts. The first part consists of a deterministic degradation model which models the average cell behavior. The second part relates to the statistical variation in performance of the cells (error model). Experimental data from an accelerated aging experiment will be input from an Excel worksheet. The software will then query the user for a specific model form (within the generalized model framework). Model parameters will be estimated by the software using various statistical methodologies. Average cell life will be predicted using the estimated model parameters. The uncertainty in the estimated cell life will also be computed using bootstrap simulations. This software can be used in several modes: 1) fit only, 2) fit and simulation, and 3) simulation only.
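
    The two-part approach, a deterministic degradation model plus a statistical error model, can be sketched generically. A hypothetical example with SciPy follows, using an invented capacity-fade model, synthetic data, and an assumed end of life at 80% of initial capacity.

    import numpy as np
    from scipy.optimize import curve_fit

    def fade(t, a, b):
        return 1.0 - a * t**b  # relative capacity vs. aging time (invented model form)

    t = np.arange(1, 40, dtype=float)
    rng = np.random.default_rng(2)
    cap = fade(t, 0.004, 1.1) + rng.normal(0, 0.005, t.size)  # synthetic measurements

    (a, b), _ = curve_fit(fade, t, cap, p0=(0.001, 1.0))
    resid = cap - fade(t, a, b)

    lives = []
    for _ in range(200):  # bootstrap the error model
        boot = fade(t, a, b) + rng.choice(resid, size=t.size, replace=True)
        (ab, bb), _ = curve_fit(fade, t, boot, p0=(a, b))
        lives.append((0.2 / ab) ** (1.0 / bb))  # time at which capacity reaches 80%
    print(np.percentile(lives, [5, 50, 95]))    # uncertainty band on predicted life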

  19. Agent-based method for distributed clustering of textual information

    DOE Patents [OSTI]

    Potok, Thomas E. [Oak Ridge, TN; Reed, Joel W. [Knoxville, TN; Elmore, Mark T. [Oak Ridge, TN; Treadwell, Jim N. [Louisville, TN

    2010-09-28

    A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term and identifying the documents found in the search, and displays the documents in a clustering display (80) of similarity so as to indicate similarity of the documents to each other.
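
    The central operation the agents perform, computing a vector for a new document and scoring its similarity against documents already under management, can be sketched with standard text tools. An illustrative example using scikit-learn follows; the texts and the routing rule are invented and much simplified relative to the patented method.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    corpus = ["grid fault detection report", "rna graph search tool",
              "coal resource assessment maps"]
    vec = TfidfVectorizer().fit(corpus)
    clusters = vec.transform(corpus)  # one representative vector per cluster agent

    new_doc = vec.transform(["fault report for compute grid"])
    scores = cosine_similarity(new_doc, clusters)[0]
    print(scores.argmax(), scores)  # route the document to the most similar cluster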

  20. Distributed Merge Trees

    SciTech Connect (OSTI)

    Morozov, Dmitriy; Weber, Gunther

    2013-01-08

    Improved simulations and sensors are producing datasets whose increasing complexity exhausts our ability to visualize and comprehend them directly. To cope with this problem, we can detect and extract significant features in the data and use them as the basis for subsequent analysis. Topological methods are valuable in this context because they provide robust and general feature definitions. As the growth of serial computational power has stalled, data analysis is becoming increasingly dependent on massively parallel machines. To satisfy the computational demand created by complex datasets, algorithms need to effectively utilize these computer architectures. The main strength of topological methods, their emphasis on global information, turns into an obstacle during parallelization. We present two approaches to alleviate this problem. We develop a distributed representation of the merge tree that avoids computing the global tree on a single processor and lets us parallelize subsequent queries. To account for the increasing number of cores per processor, we develop a new data structure that lets us take advantage of multiple shared-memory cores to parallelize the work on a single node. Finally, we present experiments that illustrate the strengths of our approach as well as help identify future challenges.

  1. Data Intensive Architecture for Scalable Cyber Analytics

    SciTech Connect (OSTI)

    Olsen, Bryan K.; Johnson, John R.; Critchlow, Terence J.

    2011-12-19

    Cyber analysts are tasked with the identification and mitigation of network exploits and threats. These compromises are difficult to identify due to the characteristics of cyber communication, the volume of traffic, and the duration of possible attack. In this paper, we describe a prototype implementation designed to provide cyber analysts an environment where they can interactively explore a month's worth of cyber security data. This prototype utilized On-Line Analytical Processing (OLAP) techniques to present a data cube to the analysts. The cube provides a summary of the data, allowing trends to be easily identified as well as the ability to easily pull up the original records comprising an event of interest. The cube was built using SQL Server Analysis Services (SSAS), with the interface to the cube provided by Tableau. This software infrastructure was supported by a novel hardware architecture comprising a Netezza TwinFin for the underlying data warehouse and a cube server with a FusionIO drive hosting the data cube. We evaluated this environment on a month's worth of artificial, but realistic, data using multiple queries provided by our cyber analysts. As our results indicate, OLAP technology has progressed to the point where it is in a unique position to provide novel insights to cyber analysts, as long as it is supported by an appropriate data intensive architecture.
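
    The data-cube pattern itself is easy to show at toy scale: summarize records along chosen dimensions, then drill back down to the raw rows behind a cell of interest. A sketch with pandas follows, using invented flow records rather than the SSAS/Netezza stack described above.

    import pandas as pd

    flows = pd.DataFrame({
        "day":   ["mon", "mon", "tue", "tue", "tue"],
        "proto": ["tcp", "udp", "tcp", "tcp", "udp"],
        "bytes": [1200, 300, 5000, 800, 150],
    })
    cube = flows.pivot_table(index="day", columns="proto",
                             values="bytes", aggfunc="sum", fill_value=0)
    print(cube)  # the summary view the analyst scans for trends
    print(flows[(flows.day == "tue") & (flows.proto == "tcp")])  # drill-down to raw records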

  2. Xgrid admin guide

    SciTech Connect (OSTI)

    Strauss, Charlie E M

    2010-01-01

    Xgrid, with a capital X, is the name for Apple's grid computing system. With a lower-case x, xgrid is the name of the command line utility that clients can use, among other ways, to submit jobs to a controller. An Xgrid divides into three logical components: Agent, Controller and Client. Client computers submit jobs (a set of tasks) they want run to a Controller computer. The Controller queues the Client jobs and distributes tasks to Agent computers. Agent computers run the tasks and report their output and status back to the controller, where it is stored until deleted by the Client. The Clients can asynchronously query the controller about the status of a job and the results. Any OSX computer can be any of these. A single mac can be more than one: it's possible to be Agent, Controller and Client at the same time. There is one Controller per Grid. Clients can submit jobs to Controllers of different grids. Agents can work for more than one grid. Xgrid's setup has a pleasantly small palette of choices. The first two decisions to make are the kind of authentication & authorization to use and whether a shared file system is needed. A shared file system that all the agents can access can be very beneficial for many computing problems, but it is not appropriate for every network.

  3. Open-Source GIS

    SciTech Connect (OSTI)

    Vatsavai, Raju; Burk, Thomas E; Lime, Steve

    2012-01-01

    The components making up an Open Source GIS are explained in this chapter. A map server (Sect. 30.1) can broadly be defined as a software platform for dynamically generating spatially referenced digital map products. The University of Minnesota MapServer (UMN MapServer) is one such system. Its basic features are visualization, overlay, and query. Section 30.2 names and explains many of the geospatial open source libraries, such as GDAL and OGR. The other libraries are FDO, JTS, GEOS, JCS, MetaCRS, and GPSBabel. The application examples include derived GIS-software and data format conversions. Quantum GIS, its origin, and its applications are explained in detail in Sect. 30.3. The features include a rich GUI, attribute tables, vector symbols, labeling, editing functions, projections, georeferencing, GPS support, analysis, and Web Map Server functionality. Future developments will address mobile applications, 3-D, and multithreading. The origins of PostgreSQL are outlined and PostGIS is discussed in detail in Sect. 30.4. It extends PostgreSQL by implementing the Simple Feature standard. Section 30.5 details the most important open source licenses such as the GPL, the LGPL, the MIT License, and the BSD License, as well as the role of the Creative Commons.

  4. Orchestrating Distributed Resource Ensembles for Petascale Science

    SciTech Connect (OSTI)

    Baldin, Ilya; Mandal, Anirban; Ruth, Paul; Yufeng, Xin

    2014-04-24

    Distributed, data-intensive computational science applications of interest to DOE scientific communities move large amounts of data for experiment data management, distributed analysis steps, remote visualization, and accessing scientific instruments. These applications need to orchestrate ensembles of resources from multiple resource pools and interconnect them with high-capacity multi-layered networks across multiple domains. It is highly desirable that mechanisms are designed that provide this type of resource provisioning capability to a broad class of applications. It is also important to have coherent monitoring capabilities for such complex distributed environments. In this project, we addressed these problems by designing an abstract API, enabled by novel semantic resource descriptions, for provisioning complex and heterogeneous resources from multiple providers using their native provisioning mechanisms and control planes: computational, storage, and multi-layered high-speed network domains. We used an extensible resource representation based on semantic web technologies to afford maximum flexibility to applications in specifying their needs. We evaluated the effectiveness of provisioning using representative data-intensive applications. We also developed mechanisms for providing feedback about resource performance to the application, to enable closed-loop feedback control and dynamic adjustments to resource allocations (elasticity). This was enabled through development of a novel persistent query framework that consumes disparate sources of monitoring data, including perfSONAR, and provides scalable distribution of asynchronous notifications.

  5. Field Trial of a Low-Cost, Distributed Plug Load Monitoring System

    SciTech Connect (OSTI)

    Auchter, B.; Cautley, D.; Ahl, D.; Earle, L.; Jin, X.

    2014-03-01

    Researchers have struggled to inventory and characterize the energy use profiles of the ever-growing category of so-called miscellaneous electric loads (MELs) because plug-load monitoring is cost-prohibitive to the researcher and intrusive to the homeowner. However, these data represent a crucial missing link to understanding how homes use energy. Detailed energy use profiles would enable the nascent automated home energy management (AHEM) industry to develop effective control algorithms that target consumer electronics and other plug loads. If utility and other efficiency programs are to incent AHEM devices, they need large-scale datasets that provide statistically meaningful justification of their investments by quantifying the aggregate energy savings achievable. To address this need, NREL researchers investigated a variety of plug-load measuring devices available commercially and tested them in the laboratory to identify the most promising candidates for field applications. This report centers around the lessons learned from a field validation of one proof-of-concept system, called Smartenit (formerly SimpleHomeNet). The system was evaluated based on the rate of successful data queries, reliability over a period of days to weeks, and accuracy. This system offers good overall performance when deployed with up to 10 end nodes in a residential environment, although deployment with more nodes and in a commercial environment is much less robust. NREL concludes that the current system is useful in selected field research projects, with the recommendation that system behavior is observed over time.

  6. Secure Information Sharing

    Energy Science and Technology Software Center (OSTI)

    2005-09-09

    We are developing a peer-to-peer system to support secure, location-independent information sharing in the scientific community. Once complete, this system will allow seamless and secure sharing of information between multiple collaborators. The owners of information will be able to control how the information is stored, managed, and shared. In addition, users will have faster access to information updates within a collaboration. Groups collaborating on scientific experiments have a need to share information and data. This information and data is often represented in the form of files and database entries. In a typical scientific collaboration, there are many different locations where data would naturally be stored. This makes it difficult for collaborators to find and access the information they need. Our goal is to create a lightweight file-sharing system that makes it easy for collaborators to find and use the data they need. This system must be easy to use, easy to administer, and secure. Our information-sharing tool uses group communication, in particular the InterGroup protocols, to reliably deliver each query to all of the current participants in a scalable manner, without having to discover all of their identities. We will use the Secure Group Layer (SGL) and Akenti to provide security to the participants of our environment. SGL will provide confidentiality, integrity, authenticity, and authorization enforcement for the InterGroup protocols, and Akenti will provide access control to other resources.

  7. Mercury Metadata Toolset

    Energy Science and Technology Software Center (OSTI)

    2009-09-08

    Mercury is a federated metadata harvesting, search and retrieval tool based on both open source software and software developed at Oak Ridge National Laboratory. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. A major new version of Mercury (version 3.0) was developed during 2007 and released in early 2008. This Mercury 3.0 version provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, faceted search, support for RSS delivery of search results, and ready customization to meet the needs of the multiple projects which use Mercury. For the end users, Mercury provides a single portal to very quickly search for data and information contained in disparate data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data.

  8. Mesh infrastructure for coupled multiprocess geophysical simulations

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Garimella, Rao V.; Perkins, William A.; Buksas, Mike W.; Berndt, Markus; Lipnikov, Konstantin; Coon, Ethan; Moulton, John D.; Painter, Scott L.

    2014-01-01

    We have developed a sophisticated mesh infrastructure capability to support large scale multiphysics simulations such as subsurface flow and reactive contaminant transport at storage sites as well as the analysis of the effects of a warming climate on the terrestrial Arctic. These simulations involve a wide range of coupled processes including overland flow, subsurface flow, freezing and thawing of ice rich soil, accumulation, redistribution and melting of snow, biogeochemical processes involving plant matter and finally, microtopography evolution due to melting and degradation of ice wedges below the surface. In addition to supporting the usual topological and geometric queries about the mesh, the mesh infrastructure adds capabilities such as identifying columnar structures in the mesh, enabling deforming of the mesh subject to constraints and enabling the simultaneous use of meshes of different dimensionality for subsurface and surface processes. The generic mesh interface is capable of using three different open source mesh frameworks (MSTK, MOAB and STKmesh) under the hood, allowing the developers to directly compare them and choose one that is best suited for the application's needs. We demonstrate the results of some simulations using these capabilities as well as present a comparison of the performance of the different mesh frameworks.

  9. Implementation of a laboratory information management system for environmental regulatory analyses

    SciTech Connect (OSTI)

    Spencer, W.A.; Aiken, H.B.; Spatz, T.L.; Miles, W.F.; Griffin, J.C.

    1993-09-07

    The Savannah River Technology Center created a second instance of its ORACLE based PEN LIMS to support site Environmental Restoration projects. The first instance of the database had been optimized for R&D support and did not implement the rigorous sample tracking, verification, and holding times needed to support regulatory commitments. Much of the R&D instance was transferable, such as the work control functions for backlog reports, work assignment sheets, and hazard communication support. A major enhancement of the regulatory LIMS was the addition of features to support a "standardized" electronic data format for environmental data reporting. The electronic format, called "AN92", was developed by the site environmental monitoring organization and applies to both onsite and offsite environmental analytical contracts. This format incorporates EPA CLP data validation codes as well as detailed holding time and analytical result reporting requirements. The authors support this format by using special SQL queries to the database. The data is then automatically transferred to the environmental databases for trending and geological mapping.

  10. In-Situ Microphysics from the RACORO IOP

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    McFarquhar, Greg

    These files were generated by Greg McFarquhar and Robert Jackson at the University of Illinois. Please contact mcfarq@atmos.uiuc.edu or rjackso2@atmos.uiuc.edu for more information or for assistance in interpreting the content of these files. We highly recommend that anyone wishing to use these files do so in a collaborative endeavor and we welcome queries and opportunities for collaboration. There are caveats associated with the use of the data which are difficult to thoroughly document and not all products for all time periods have been thoroughly examined. This is a value added data set of the best estimate of cloud microphysical parameters derived using data collected by the cloud microphysical probes installed on the Center for Interdisciplinary Remotely-Piloted Aircraft Studies (CIRPAS) Twin Otter during RACORO. These files contain best estimates of liquid size distributions N(D) in terms of droplet diameter D, liquid water content LWC, extinction of liquid drops beta, effective radius of cloud drops (re), total number concentration of droplets NT, and radar reflectivity factor Z at 1 second resolution.

  11. Limited-memory adaptive snapshot selection for proper orthogonal decomposition

    SciTech Connect (OSTI)

    Oxberry, Geoffrey M.; Kostova-Vassilevska, Tanya; Arrighi, Bill; Chand, Kyle

    2015-04-02

    Reduced order models are useful for accelerating simulations in many-query contexts, such as optimization, uncertainty quantification, and sensitivity analysis. However, offline training of reduced order models can have prohibitively expensive memory and floating-point operation costs in high-performance computing applications, where memory per core is limited. To overcome this limitation for proper orthogonal decomposition, we propose a novel adaptive selection method for snapshots in time that limits offline training costs by selecting snapshots according to an error control mechanism similar to that found in adaptive time-stepping ordinary differential equation solvers. The error estimator used in this work is related to theory bounding the approximation error in time of proper orthogonal decomposition-based reduced order models, and memory usage is minimized by computing the singular value decomposition using a single-pass incremental algorithm. Results for a viscous Burgers' test problem demonstrate convergence in the limit as the algorithm error tolerances go to zero; in this limit, the full order model is recovered to within discretization error. The resulting method can be used on supercomputers to generate proper orthogonal decomposition-based reduced order models, or as a subroutine within hyperreduction algorithms that require taking snapshots in time, or within greedy algorithms for sampling parameter space.
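
    The selection idea can be sketched in a few lines: keep a new snapshot only when the current proper orthogonal decomposition basis fails to reproduce it to tolerance, much like the step-acceptance test in an adaptive ODE solver. The sketch below uses a batch SVD and synthetic snapshots rather than the paper's single-pass incremental algorithm.

    import numpy as np

    def pod_basis(snaps, cutoff=1e-10):
        u, s, _ = np.linalg.svd(np.column_stack(snaps), full_matrices=False)
        return u[:, s > cutoff * s[0]]  # retain directions with significant energy

    rng = np.random.default_rng(3)
    kept = [rng.normal(size=100)]  # the first snapshot is always kept
    for _ in range(50):
        snap = kept[0] + 0.1 * rng.normal(size=100)  # hypothetical new state vector
        basis = pod_basis(kept)
        err = np.linalg.norm(snap - basis @ (basis.T @ snap)) / np.linalg.norm(snap)
        if err > 0.05:  # error-control test: keep only poorly represented snapshots
            kept.append(snap)
    print(len(kept), "snapshots retained")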

  12. SU-E-J-16: A Review of the Magnitude of Patient Imaging Shifts in Relation to Departmental Policy Changes

    SciTech Connect (OSTI)

    O'Connor, M; Sansourekidou, P

    2014-06-01

    Purpose: To evaluate how changes in imaging policy affect the magnitude of shifts applied to patients. Methods: In June 2012, the department's imaging policy was altered to require that any shifts derived from imaging throughout the course of treatment be considered systematic only after they were validated with two data points consistent in the same direction. Multiple additions and clarifications to the imaging policy were implemented throughout the course of the data collection, but they were mostly of an administrative nature. Entered shifts were documented in MOSAIQ (Elekta AB) through the localization offset. The MOSAIQ database was queried to identify a possible trend. A total of 25,670 entries were analyzed, including four linear accelerators with a combination of MV planar, kV planar and kV three-dimensional imaging. The monthly average of the magnitude of the shift vector was used. Plan-relative offsets were excluded. During the evaluated period of time, one of the satellite facilities acquired and implemented Vision RT (AlignRT Inc). Results: After the new policy was implemented, the shift variance and standard deviation decreased. The decrease is linear with time elapsed. Vision RT implementation at one satellite facility reduced the number of overall shifts, specifically for breast patients. Conclusion: Changes in imaging policy have a significant effect on the magnitude of shifts applied to patients. Requiring two consistent data points before treating a shift as systematic decreased the overall magnitude of the shifts applied to patients.

  13. Threatened and Endangered Species Evaluation for Operating Commercial Nuclear Power Generating Plants

    SciTech Connect (OSTI)

    Sackschewsky, Michael R.

    2004-01-15

    The Endangered Species Act (ESA) of 1973 requires that federal agencies ensure that any action authorized, funded, or carried out under their jurisdiction is not likely to jeopardize the continued existence of any threatened or endangered (T&E) species or result in the destruction or adverse modification of critical habitats for such species. The issuance and maintenance of a federal license, such as a construction permit or operating license issued by the U.S. Nuclear Regulatory Commission (NRC) for a commercial nuclear power generating facility is a federal action under the jurisdiction of a federal agency, and is therefore subject to the provisions of the ESA. The Office of Nuclear Reactor Regulation (NRR) staff have performed appropriate assessments of potential impacts to threatened or endangered species, and consulted with appropriate agencies with regard to protection of such species in authorizing the construction, operation, and relicensing of nuclear power generating facilities. However, the assessments and consultations concerning many facilities were performed during the 1970's or early 1980's, and have not been re-evaluated in detail or updated since those initial evaluations. A review of potential Endangered Species Act issues at licensed nuclear power facilities was completed in 1997. In that review 484 different ESA-listed species were identified as potentially occurring near one or more of the 75 facility sites that were examined. An update of the previous T&E species evaluation at this time is desired because, during the intervening 6 years: nearly 200 species have been added to the ESA list, critical habitats have been designated for many of the listed species, and significantly more information is available online, allowing for more efficient high-level evaluations of potential species presence near sites and the potential operation impacts. The updated evaluation included searching the NRC's ADAMS database to find any documents related to T&E species take, consultations, and evaluations of potential effects of operation on T&E species. This search recovered a total of approximately 100 documents from 13 sites. Sites that were in the relicensing or decommissioning processes were excluded from the ADAMS search. In general the ADAMS search did not reveal any serious deficiencies or compliance problems. The most notable finds were reports of takes of green sea turtles at Diablo Canyon. While these events were reported to both the NRC and to NOAA Fisheries, no record of interaction between the two federal agencies was found. Species potentially present at each site were determined via querying the Geographical, Environmental, and Siting Information System (GEn&SIS) database developed for the NRC by Lawrence Livermore National Laboratory. The results of these queries were compared against the 1997 review, and in the cases of sites that were in the relicensing process, with the results of those site specific evaluations. A total of 452 T&E species were identified as potentially occurring near one or more of the operating commercial nuclear power generating plants. Information about each of these species was gathered to support an assessment of the probability of occurrence at each of the reactor sites. Based on the assessments of which species are potentially affected at each site, and the information gathered through the ADAMS search, each site was assigned a priority value for follow-up evaluations. 
The priority listing did not include any sites that had entered the relicensing process, those where the licensee has indicated that they intend to enter the relicensing process before the end of 2005, or those that have entered the decommissioning process. Of the 39 remaining sites, those that were identified as the highest priority for follow-on evaluations are: Diablo Canyon, San Onofre, Crystal River, Harris, and Vogtle, followed by South Texas, Palo Verde, Salem, and Cooper.

  14. SU-D-BRD-04: A Logical Organizational Approach to Clinical Information Management

    SciTech Connect (OSTI)

    Shao, W; Kupelian, P; Wang, J; Low, D; Ruan, D

    2014-06-01

    Purpose: To develop a clinical information management system (CIMS) that collects and organizes physician inputs logically and supports analysis functionality. Methods: In a conventional electronic medical record system (EMR), the document manager component stores data in a pool of standalone .docx or .pdf files. The lack of a content-based logical organization makes cross-checking, reference or automatic inheritance of information challenging. We have developed an information-oriented clinical record system that addresses this shortcoming. In CIMS, a parent library predefines a set of available questions along with the data types of their expected answers. The creation of a questionnaire template is achieved by selecting questions from this parent library to form a virtual group. Instances of the same data field in different documents are linked by their question identifier. This design allows for flexible data sharing and inheritance among various forms using a longitudinal lineage of data indexed according to the modification time stamps of the documents. CIMS is designed with a web portal to facilitate querying, data entry and modification, aggregate report generation, and data adjudication. The current implementation addresses diagnostic data, medical history, vital signs, and various quantities in consult notes and treatment summaries. Results: CIMS is currently storing treatment summary information of over 1,000 patients who have received treatment at UCLA Radiation Oncology between March 1, 2013 and January 31, 2014. We are in the process of incorporating a DICOM-RT dosimetry parser and patient reporting applications into CIMS, as well as continuing to define document templates to support additional forms. Conclusion: We devised an alternative storage paradigm that improves the accuracy and organizational structure of clinical information.

  15. Genetic and Pharmacological Inhibition of PDK1 in Cancer Cells: Characterization of a Selective Allosteric Kinase Inhibitor

    SciTech Connect (OSTI)

    Nagashima, Kumiko; Shumway, Stuart D.; Sathyanarayanan, Sriram; Chen, Albert H.; Dolinski, Brian; Xu, Youyuan; Keilhack, Heike; Nguyen, Thi; Wiznerowicz, Maciej; Li, Lixia; Lutterbach, Bart A.; Chi, An; Paweletz, Cloud; Allison, Timothy; Yan, Youwei; Munshi, Sanjeev K.; Klippel, Anke; Kraus, Manfred; Bobkova, Ekaterina V.; Deshmukh, Sujal; Xu, Zangwei; Mueller, Uwe; Szewczak, Alexander A.; Pan, Bo-Sheng; Richon, Victoria; Pollock, Roy; Blume-Jensen, Peter; Northrup, Alan; Andersen, Jannik N.

    2013-11-20

    Phosphoinositide-dependent kinase 1 (PDK1) is a critical activator of multiple prosurvival and oncogenic protein kinases and has garnered considerable interest as an oncology drug target. Despite progress characterizing PDK1 as a therapeutic target, pharmacological support is lacking due to the prevalence of nonspecific inhibitors. Here, we benchmark literature and newly developed inhibitors and conduct parallel genetic and pharmacological queries into PDK1 function in cancer cells. Through kinase selectivity profiling and x-ray crystallographic studies, we identify an exquisitely selective PDK1 inhibitor (compound 7) that uniquely binds to the inactive kinase conformation (DFG-out). In contrast to compounds 1-5, which are classical ATP-competitive kinase inhibitors (DFG-in), compound 7 specifically inhibits cellular PDK1 T-loop phosphorylation (Ser-241), supporting its unique binding mode. Interfering with PDK1 activity has minimal antiproliferative effect on cells growing as plastic-attached monolayer cultures (i.e. standard tissue culture conditions) despite reduced phosphorylation of AKT, RSK, and S6RP. However, selective PDK1 inhibition impairs anchorage-independent growth, invasion, and cancer cell migration. Compound 7 inhibits colony formation in a subset of cancer cell lines (four of 10) and primary xenograft tumor lines (nine of 57). RNAi-mediated knockdown corroborates the PDK1 dependence in cell lines and identifies candidate biomarkers of drug response. In summary, our profiling studies define a uniquely selective and cell-potent PDK1 inhibitor, and the convergence of genetic and pharmacological phenotypes supports a role of PDK1 in tumorigenesis in the context of three-dimensional in vitro culture systems.

  16. PARTNERWORKSV2.0

    Energy Science and Technology Software Center (OSTI)

    2000-04-12

    PartnerWorks Ver. 2.0 uses MAPI and OLE to tightly integrate the information system into the user's desktop. The applications are mail and fax enabled, and data can be linked or exported to and from all popular desktop applications at the push of a button. PartnerWorks converts financial data from Project Year to Fiscal Year. PartnerWorks also makes use of off-the-shelf software (Microsoft NT) encryption. PartnerWorks Ver. 2.0 automates the management of laboratory agreements with industry. PartnerWorks is a three-tier client server system. It uses MS SQL Server (480 tables) as the central data repository for agreement information and document objects. The front-end applications consist of various MS Access applications. Data in remote systems is queried live or imported on a fixed schedule depending on the data type and volatility. All data is accessed via ODBC. This multi-front-end, multi-back-end design provides the end user with the illusion that all data exists in his or her custom application. PartnerWorks manages: the laboratory-partner agreement life cycle, including contract and funding details (16 agreement-specific modules); pending laboratory partnership agreements; partner details for existing and potential partners; potential agreement sources and partnership opportunities; automation and warehousing of agreement documentation (document warehousing module); automated, standardized email communication for agreements; enforcement of business rules and workflow (action tracking module); automated reporting, including demand print and scheduled delivery (reporting module); and marketing of intellectual property and past successes (Web module).

  17. The Role of Postoperative Radiation Therapy in the Treatment of Meningeal Hemangiopericytoma: Experience From the SEER Database

    SciTech Connect (OSTI)

    Stessin, Alexander M.; Sison, Cristina; Nieto, Jaime; Raifu, Muri; Li, Baoqing

    2013-03-01

    Purpose: The aim of this study was to examine the effect of postoperative radiation therapy (RT) on cause-specific survival in patients with meningeal hemangiopericytomas. Methods and Materials: The Surveillance, Epidemiology, and End Results database from 1990-2008 was queried for cases of surgically resected central nervous system hemangiopericytoma. Patient demographics, tumor location, and extent of resection were included in the analysis as covariates. The Kaplan-Meier product-limit method was used to analyze cause-specific survival. A Cox proportional hazards regression analysis was conducted to determine which factors were associated with cause-specific survival. Results: The mean follow-up time is 7.9 years (95 months). There were 76 patients included in the analysis; of these, 38 (50%) underwent gross total resection (GTR), whereas the other half underwent subtotal resection (STR). Postoperative RT was administered to 42% (16/38) of the patients in the GTR group and 50% (19/38) in the STR group. The 1-year, 10-year, and 20-year cause-specific survival rates were 99%, 75%, and 43%, respectively. On multivariate analysis, postoperative RT was associated with significantly better survival (HR = 0.269, 95% CI 0.084-0.862; P=.027), in particular for patients who underwent STR (HR = 0.088, 95% CI: 0.015-0.528; P<.008). Conclusions: In the absence of large prospective trials, current clinical decision-making for hemangiopericytoma is mostly based on retrospective data. We recommend that postoperative RT be considered after subtotal resection for patients who could tolerate it. Based on the current literature, the practical approach is to deliver limited-field RT to doses of 50-60 Gy while respecting normal tissue tolerance. Further investigations are clearly needed to determine the optimal therapeutic strategy.
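
    The survival analysis described above follows a standard pattern: fit a Cox proportional hazards model with treatment and extent-of-resection covariates and read off the hazard ratios. A sketch using the lifelines package on synthetic data follows; the column names are invented and the numbers are not SEER data.

    import pandas as pd
    from lifelines import CoxPHFitter

    df = pd.DataFrame({
        "years": [2.0, 5.5, 8.1, 1.2, 9.0, 4.3, 3.7, 6.6, 0.8, 7.4],
        "died":  [1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
        "rt":    [0, 1, 1, 0, 1, 0, 0, 1, 0, 1],  # postoperative radiation therapy
        "gtr":   [1, 0, 1, 0, 1, 1, 0, 1, 0, 0],  # gross total vs. subtotal resection
    })
    cph = CoxPHFitter().fit(df, duration_col="years", event_col="died")
    cph.print_summary()  # hazard ratios analogous to those reported above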

  18. Case study: An environmental database management system for the auto-body painting process

    SciTech Connect (OSTI)

    Shepard, S.; Souten, D.

    1996-12-31

    The auto-body painting process is subject to numerous environmental regulations, including those directed toward hazardous waste, water pollution prevention, workplace safety, and air pollution. Each environmental regulatory compliance area requires extensive record keeping and reporting of information. Incomplete or untimely reporting and record keeping can result in significant adverse actions by regulatory agencies. Additionally, good data record keeping allows management to have better internal knowledge of plant operations with respect to environmental concerns. The record keeping and reporting prior to the development of the database management system described here were performed using spreadsheets. Although spreadsheets are useful for conducting numerical calculations and plots, they are inflexible to the addition and deletion of different materials (such as paint colors) from year to year. They are clumsy with large amounts of data, and they do not have the querying capabilities of a database. In light of the ever-changing reporting requirements of different regulatory agencies, reporting and tracking of emissions data using spreadsheets rapidly becomes extremely difficult. This paper describes the design and implementation of the air pollution portion of an environmental database management system, starting with one model year's worth of spreadsheet data. The design consisted of converting all the relevant data into the database format (including coefficients for calculations within the spreadsheets), formulating a relational model for the data, and designing the user interface. The program implementation was done in Microsoft Access 2.0. The database design, program features, and the project successes and difficulties we faced are presented, along with example outputs.
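
    The core conversion step, moving spreadsheet rows into a relational table that can be queried, can be sketched with Python's built-in csv and sqlite3 modules (the original system used Microsoft Access 2.0; the schema and file name here are invented for illustration):

      # Load one model year's paint data into a table, then run the kind of
      # ad-hoc aggregate query that is awkward in a spreadsheet.
      import csv
      import sqlite3

      conn = sqlite3.connect("emissions.db")
      conn.execute("""CREATE TABLE IF NOT EXISTS paint_usage (
          model_year INTEGER, color TEXT, gallons REAL, voc_lbs_per_gal REAL)""")

      with open("paint_1996.csv") as f:  # hypothetical spreadsheet export
          rows = [(int(r["year"]), r["color"], float(r["gal"]), float(r["voc"]))
                  for r in csv.DictReader(f)]
      conn.executemany("INSERT INTO paint_usage VALUES (?, ?, ?, ?)", rows)
      conn.commit()

      # VOC emissions by paint color across the model year
      for color, voc in conn.execute(
              "SELECT color, SUM(gallons * voc_lbs_per_gal) "
              "FROM paint_usage GROUP BY color"):
          print(color, voc)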

  19. Agua Caliente Wind/Solar Project at Whitewater Ranch

    SciTech Connect (OSTI)

    Hooks, Todd; Stewart, Royce

    2014-12-16

    Agua Caliente Band of Cahuilla Indians (ACBCI) was awarded a grant by the Department of Energy (DOE) to study the feasibility of a wind and/or solar renewable energy project at the Whitewater Ranch (WWR) property of ACBCI. Red Mountain Energy Partners (RMEP) was engaged to conduct the study. The ACBCI tribal lands in the Coachella Valley have very rich renewable energy resources. The tribe has undertaken several studies to more fully understand the options available to it should it move forward with one or more renewable energy projects. With respect to the resources, the WWR property clearly has excellent wind and solar resources. The DOE National Renewable Energy Laboratory (NREL) has continued to upgrade and refine its library of resource maps; the newer, more precise maps quantify the resources as among the best in the world. The wind and solar technology available for deployment is also improving, with costs falling to the point of being at or below those of fossil fuels. Technologies for energy storage and microgrids are likewise improving quickly and present additional ways to increase the wind and/or solar energy retained for later use, with the network management flexibility to provide power to the appropriate locations when needed. As a result, renewable resources continue to gain market share. The transition to renewables as the major power resource will take some time, as the conversion is complex and can have negative impacts if not managed well. While the economics of wind and solar systems continue to improve, the robustness of the WWR site was validated by repeated queries from developers seeking to place wind and/or solar projects there. The robust resources and improving technologies point toward the WWR land as a renewable energy site. The business case, however, is not so clear, especially when the potential investment portfolio for ACBCI has several very beneficial and profitable alternatives.

  20. Scalla: Structured Cluster Architecture for Low Latency Access

    SciTech Connect (OSTI)

    Hanushevsky, Andrew; Wang, Daniel L.; /SLAC

    2012-03-20

    Scalla is a distributed low-latency file access system that incorporates novel techniques to minimize latency and maximize scalability over a large distributed system with a distributed namespace. Scalla's techniques have been shown to be effective in nearly a decade of service for the high-energy physics community using commodity hardware and interconnects. We describe the two components used in Scalla that are instrumental in its ability to provide low-latency, fault-tolerant name resolution and load distribution, and that enable its use as a high-throughput, low-latency communication layer in the Qserv system, the Large Synoptic Survey Telescope's (LSST's) prototype astronomical query system. Scalla arguably exceeded its three main design objectives: low latency, scaling, and recoverability. In retrospect, these objectives were met using a simple but effective design. Low latency was achieved by uniformly using linear- or constant-time algorithms in all high-use paths, avoiding locks whenever possible, and using compact data structures to maximize memory caching efficiency. Scaling was achieved by architecting the system as a 64-ary tree: nodes can be added easily, and because each added level multiplies the number of reachable nodes by 64, searches stay shallow even as the cluster grows exponentially. Recoverability is inherent in that no permanent state information is maintained; whatever state information is needed can be quickly constructed or reconstructed in real time. This allows dynamic changes in a cluster of servers with little impact on overall performance or usability. Today, Scalla is being deployed in environments and for uses that were never conceived in 2001. This speaks well for the system's adaptability, but the underlying reason is that the system can meet its three fundamental objectives at the same time.
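
    The scaling argument is just the geometry of a 64-ary tree: each level multiplies the number of reachable servers by 64, so the number of hops needed to resolve a name grows only logarithmically with cluster size. A quick check:

      # Levels of a 64-ary tree needed to span clusters of various sizes.
      import math

      for nodes in (64, 4_096, 262_144):
          levels = math.ceil(math.log(nodes, 64))
          print(f"{nodes:>7} servers -> {levels} level(s) of search")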

  1. Assessing Adverse Events of Postprostatectomy Radiation Therapy for Prostate Cancer: Evaluation of Outcomes in the Regione Emilia-Romagna, Italy

    SciTech Connect (OSTI)

    Showalter, Timothy N.; Hegarty, Sarah E.; Rabinowitz, Carol; Maio, Vittorio; Hyslop, Terry; Dicker, Adam P.; Louis, Daniel Z.

    2015-03-15

    Purpose: Although the likelihood of radiation-related adverse events influences treatment decisions regarding radiation therapy after prostatectomy for eligible patients, the data available to inform decisions are limited. This study was designed to evaluate the genitourinary, gastrointestinal, and sexual adverse events associated with postprostatectomy radiation therapy and to assess the influence of radiation timing on the risk of adverse events. Methods: The Regione Emilia-Romagna Italian Longitudinal Health Care Utilization Database was queried to identify a cohort of men who received radical prostatectomy for prostate cancer during 2003 to 2009, including patients who received postprostatectomy radiation therapy. Patients with prior radiation therapy were excluded. Outcome measures were genitourinary, gastrointestinal, and sexual adverse events after prostatectomy. Rates of adverse events were compared between the cohorts who did and did not receive postoperative radiation therapy. Multivariable Cox proportional hazards models were developed for each class of adverse events, including models with radiation therapy as a time-varying covariate. Results: A total of 9876 men were included in the analyses: 2176 (22%) who received radiation therapy and 7700 (78%) treated with prostatectomy alone. In multivariable Cox proportional hazards models, the additional exposure to radiation therapy after prostatectomy was associated with increased rates of gastrointestinal (rate ratio [RR] 1.81; 95% confidence interval [CI] 1.44-2.27; P<.001) and urinary nonincontinence events (RR 1.83; 95% CI 1.83-2.80; P<.001) but not urinary incontinence events or erectile dysfunction. The addition of the time from prostatectomy to radiation therapy interaction term was not significant for any of the adverse event outcomes (P>.1 for all outcomes). Conclusion: Radiation therapy after prostatectomy is associated with an increase in gastrointestinal and genitourinary adverse events. However, the timing of radiation therapy did not influence the risk of radiation therapy–associated adverse events in this cohort, which contradicts the commonly held clinical tenet that delaying radiation therapy reduces the risk of adverse events.

  2. SU-D-BRD-02: A Web-Based Image Processing and Plan Evaluation Platform (WIPPEP) for Future Cloud-Based Radiotherapy

    SciTech Connect (OSTI)

    Chai, X; Liu, L; Xing, L

    2014-06-01

    Purpose: Visualization and processing of medical images and radiation treatment plan evaluation have traditionally been constrained to local workstations with limited computation power and limited ability for data sharing and software updates. We present a web-based image processing and plan evaluation platform (WIPPEP) for radiotherapy applications with high efficiency, ubiquitous web access, and real-time data sharing. Methods: This software platform consists of three parts: web server, image server, and computation server. The independent servers communicate with one another through HTTP requests. The web server is the key component: it provides visualizations and the user interface through front-end web browsers and relays information to the backend to process user requests. The image server serves as a PACS system. The computation server performs the actual image processing and dose calculation. The web server backend is developed using Java Servlets, and the frontend is developed using HTML5, Javascript, and jQuery. The image server is based on the open-source DCM4CHEE PACS system. The computation server can be written in any programming language as long as it can send/receive HTTP requests. Our computation server was implemented in Delphi, Python, and PHP, which can process data directly or via a C++ program DLL. Results: This software platform is running on a 32-core CPU server virtually hosting the web server, image server, and computation servers separately. Users can visit our internal website with the Chrome browser, select a specific patient, visualize images and RT structures belonging to that patient, and perform image segmentation on the Delphi computation server and Monte Carlo dose calculation on the Python or PHP computation server. Conclusion: We have developed a web-based image processing and plan evaluation platform prototype for radiotherapy. This system clearly demonstrates the feasibility of performing image processing and plan evaluation through a web browser and exhibits potential for future cloud-based radiotherapy.
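
    A minimal sketch of the HTTP hand-off this architecture implies, with the web server backend relaying a user request to a computation server (the host, route, and JSON payload are invented for illustration and are not WIPPEP's actual API):

      # Relay an image-segmentation request to a computation server over HTTP.
      import requests

      resp = requests.post(
          "http://compute.example.org/segment",  # hypothetical endpoint
          json={"patient_id": "P001", "series_uid": "1.2.3.4"},
          timeout=60,
      )
      resp.raise_for_status()
      print(resp.json())  # e.g., a handle to the resulting contours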

  3. Continuous Security and Configuration Monitoring of HPC Clusters

    SciTech Connect (OSTI)

    Garcia-Lomeli, H. D.; Bertsch, A. D.; Fox, D. M.

    2015-05-08

    Continuous security and configuration monitoring of information systems has been a time-consuming and laborious task for system administrators at the High Performance Computing (HPC) center. Prior to this project, system administrators had to manually check the settings of thousands of nodes, which required a significant number of hours, rendering the old process ineffective and inefficient. This paper explains the application of Splunk Enterprise, a software agent, and a reporting tool in the development of a user application interface to track and report on critical system updates and the security compliance status of HPC clusters. In conjunction with other configuration management systems, the reporting tool is intended to provide system administrators with continuous situational awareness of the compliance state of information systems. Our approach consisted of the development, testing, and deployment of an agent to collect arbitrary information across a massively distributed computing center and organize that information into a human-readable format. Using Splunk Enterprise, this raw data was then gathered into a central repository and indexed for search, analysis, and correlation. Following acquisition and accumulation, the reporting tool generated and presented actionable information by filtering the data according to command line parameters passed at run time. Preliminary data showed results for over six thousand nodes. Further research and expansion of this tool could lead to the development of a series of agents to gather and report critical system parameters. However, in order to make use of the flexibility and resourcefulness of the reporting tool, the agent must conform to specifications set forth in this paper. This project has simplified the way system administrators gather, analyze, and report on the configuration and security state of HPC clusters, maintaining ongoing situational awareness. Rather than querying each cluster independently, compliance checking can be managed from one central location.

  4. Adjuvant Radiation Therapy Treatment Time Impacts Overall Survival in Gastric Cancer

    SciTech Connect (OSTI)

    McMillan, Matthew T.; Ojerholm, Eric; Roses, Robert E.; Plastaras, John P.; Metz, James M.; Mamtani, Ronac; Stripp, Diana; Ben-Josef, Edgar; Datta, Jashodeep

    2015-10-01

    Purpose: Prolonged radiation therapy treatment time (RTT) is associated with worse survival in several tumor types. This study investigated whether delays during adjuvant radiation therapy impact overall survival (OS) in gastric cancer. Methods and Materials: The National Cancer Data Base was queried for patients with resected gastric cancer who received adjuvant radiation therapy with National Comprehensive Cancer Network-recommended doses (45 or 50.4 Gy) between 1998 and 2006. RTT was classified as standard (45 Gy: 33-36 days; 50.4 Gy: 38-41 days) or prolonged (45 Gy: >36 days; 50.4 Gy: >41 days). Cox proportional hazards models evaluated the association between the following factors and OS: RTT, interval from surgery to radiation therapy initiation, interval from surgery to radiation therapy completion, radiation therapy dose, demographic/pathologic and operative factors, and other elements of adjuvant multimodality therapy. Results: Of 1591 patients, RTT was delayed in 732 (46%). Factors associated with prolonged RTT were non-private health insurance (OR 1.3, P=.005) and treatment at non-academic facilities (OR 1.2, P=.045). Median OS and 5-year actuarial survival were significantly worse in patients with prolonged RTT compared with standard RTT (36 vs 51 months, P=.001; 39% vs 47%, P=.005); OS worsened with each cumulative week of delay (P<.0004). On multivariable analysis, prolonged RTT was associated with inferior OS (hazard ratio 1.2, P=.002); the intervals from surgery to radiation therapy initiation or completion were not. Prolonged RTT was particularly detrimental in patients with node positivity, inadequate nodal staging (<15 nodes examined), and those undergoing a cycle of chemotherapy before chemoradiation therapy. Conclusions: Delays during adjuvant radiation therapy appear to negatively impact survival in gastric cancer. Efforts to limit cumulative interruptions to <7 days should be considered.
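
    The RTT classification rule quoted above is easy to make concrete; a sketch assuming exactly the thresholds stated in the abstract (how courses shorter than the standard window were handled is not specified there):

      # Classify radiation therapy treatment time (RTT) per the abstract's rule.
      def classify_rtt(dose_gy: float, treatment_days: int) -> str:
          if dose_gy == 45.0:
              return "standard" if 33 <= treatment_days <= 36 else "prolonged"
          if dose_gy == 50.4:
              return "standard" if 38 <= treatment_days <= 41 else "prolonged"
          raise ValueError("rule defined only for 45 Gy and 50.4 Gy courses")

      print(classify_rtt(45.0, 38))   # prolonged
      print(classify_rtt(50.4, 40))   # standard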

  5. AGR-2 Data Qualification Report for ATR Cycles 149B, 150A, 150B, 151A, and 151B

    SciTech Connect (OSTI)

    Michael L. Abbott; Binh T. Pham

    2012-06-01

    This report provides the data qualification status of AGR-2 fuel irradiation experimental data from Advanced Test Reactor (ATR) cycles 149B, 150A, 150B, 151A, and 151B, as recorded in the Nuclear Data Management and Analysis System (NDMAS). The AGR-2 data streams addressed include thermocouple temperatures, sweep gas data (flow rate, pressure, and moisture content), and fission product monitoring system (FPMS) data for each of the six capsules in the experiment. A total of 3,307,500 5-minute thermocouple and sweep gas data records were received and processed by NDMAS for this period. There are no AGR-2 data for cycle 150A because the experiment was removed from the reactor. Of these data, 82.2% were determined to be Qualified based on NDMAS accuracy testing and data validity assessment. There were 450,557 Failed temperature records due to thermocouple failures, and 138,528 Failed gas flow records due to gas flow cross-talk and leakage problems that occurred in the capsules after cycle 150A. For FPMS data, NDMAS received and processed preliminary release rate and release-to-birth rate ratio (R/B) data for the first three reactor cycles (149B, 150B, and 151B). These data consist of 45,983 release rate records and 45,235 R/B records for the 12 radionuclides reported. The qualification status of these FPMS data has been set to In Process until receipt of QA-approved data generator reports. All of the above data have been processed and tested using a SAS-based enterprise application software system, stored in a secure Structured Query Language database, and made available on the NDMAS Web portal (http://ndmas.inl.gov) for both internal and external VHTR project participants.
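
    The 82.2% qualification figure can be reproduced from the record counts given above, assuming the Failed thermocouple and gas-flow records are the only non-Qualified records in the period:

      # Back-of-the-envelope check of the qualified-data percentage.
      total_records   = 3_307_500
      failed_temp     = 450_557
      failed_gas_flow = 138_528

      qualified = total_records - failed_temp - failed_gas_flow
      print(f"{qualified / total_records:.1%}")  # -> 82.2%, matching the report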

  6. Evaluating the Potential of Commercial GIS for Accelerator Configuration Management

    SciTech Connect (OSTI)

    T.L. Larrieu; Y.R. Roblin; K. White; R. Slominski

    2005-10-10

    The Geographic Information System (GIS) is a tool used by industries needing to track information about spatially distributed assets. A water utility, for example, must know not only the precise location of each pipe and pump, but also the respective pressure rating and flow rate of each. In many ways, an accelerator such as CEBAF (Continuous Electron Beam Accelerator Facility) can be viewed as an "electron utility": where the water utility uses pipes and pumps, the "electron utility" uses magnets and RF cavities. At Jefferson Lab we are exploring the possibility of implementing ESRI's ArcGIS as the framework for building an all-encompassing accelerator configuration database that integrates location, configuration, maintenance, and connectivity details of all hardware and software. The possibilities of doing so are intriguing. From the GIS, software such as the model server could always extract the most up-to-date layout information maintained by the Survey & Alignment group for lattice modeling. The Mechanical Engineering department could use ArcGIS tools to generate CAD drawings of machine segments from the same database. Ultimately, the greatest benefit of the GIS implementation could be to liberate operators and engineers from the limitations of the current system-by-system view of machine configuration and allow a more integrated regional approach. The commercial GIS package provides a rich set of tools for database connectivity, versioning, distributed editing, importing and exporting, and graphical analysis and querying, and therefore obviates the need for much custom development. However, formidable challenges to implementation exist, and these challenges are not only technical and manpower issues, but also organizational ones. The GIS approach would crosscut organizational boundaries and require departments, which heretofore have had free rein to manage their own data, to cede some control and agree to a centralized framework.

  7. Modeling of fluidized-bed combustion of coal: Phase II, final reports. Volume VII. FBC Data-Base-Management System (FBC-DBMS) users manual

    SciTech Connect (OSTI)

    Louis, J.F.; Tung, S.E.

    1980-10-01

    The primary goal of the Fluidized Bed Combustor Data Base (FBCDB) is to establish a data repository for the express use of designers and research personnel involved in FBC development. FBCDB is implemented on MIT's 370/168 computer, using the Model 204 Data Base Management System (DBMS) developed by Computer Corporation of America. The DBMS is software that provides an efficient way of storing, retrieving, updating, and manipulating data using an English-like query language. The primary content of FBCDB is a collection of data points defined by the values of a number of specific FBC variables. A user may interactively access the data base from a computer terminal at any location; retrieve, examine, and manipulate the data; and produce tables or graphs of the results. More than 20 program segments are currently available in the M204 User Language to simplify the user interface for FBC design and research personnel. However, many complex and advanced retrieval and application programs remain to be written for this purpose. Although there are currently 71 entries and about 2000 groups stored in the system, this is only an intermediate portion of our selection; the usefulness of the system at the present time is therefore limited. This version of FBCDB will be released on a limited scale to obtain review and comments. The document is intended as a reference guide to the use of FBCDB. It has been structured to introduce the user to the basics of FBCDB, summarize what the available segments in FBCDB can do, and give detailed information on the operation of FBCDB. This document represents a preliminary draft of a Users Manual; the draft will be updated when the data base system becomes fully implemented. Any suggestions as to how this manual may be improved will be appreciated.

  8. Modeling of fluidized-bed combustion of coal: Phase II, final reports. Volume VI. FBC-Data Base-Management-System (FBC-DBMS) development

    SciTech Connect (OSTI)

    Louis, J.F.; Tung, S.E.

    1980-10-01

    The primary goal of the Fluidized Bed Combustor Data Base (FBCDB), situated in MIT's Energy Laboratory, is to establish a data repository for the express use of designers and research personnel involved in FBC development. A DBMS is software that provides an efficient way of storing, retrieving, updating, and manipulating data using an English-like query language. It is anticipated that FBCDB will play an active and direct role in the development of FBC technology as well as in its commercial application. After some in-house experience and a careful, extensive review of commercially available database systems, it was determined that the Model 204 DBMS by Computer Corporation of America was best suited to our needs. The setup of a prototype in-house database also allowed us to investigate and fully understand the particular problems involved in coordinating FBC development with a DBMS. Various difficult aspects were encountered and solutions were sought. For instance, we found it necessary to rename variables to avoid repetition and to increase the usefulness of our database; hence, we designed a classification system in which variables are classified by category to achieve standardization of variable names. The primary content of FBCDB is a collection of data points defined by the values of a number of specific FBC variables. A user may interactively access the database from a computer terminal at any location; retrieve, examine, and manipulate the data; and produce tables or graphs of the results.

  9. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    SciTech Connect (OSTI)

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
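
    As an illustration of the position weight matrices used to summarize binding affinities (the aligned binding-site sequences below are made up, and the pseudocount and uniform background are common conventions rather than this database's documented choices):

      # Build a log-odds position weight matrix from aligned binding sites.
      import numpy as np

      sites = ["TTGACA", "TTGATA", "TTGACA", "TAGACA"]  # hypothetical sites
      alphabet = "ACGT"

      counts = np.zeros((4, len(sites[0])))
      for s in sites:
          for j, base in enumerate(s):
              counts[alphabet.index(base), j] += 1

      freqs = (counts + 0.25) / (len(sites) + 1.0)  # small pseudocount
      pwm = np.log2(freqs / 0.25)                   # log-odds vs uniform background
      print(pwm.round(2))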

  10. Supporting large-scale computational science

    SciTech Connect (OSTI)

    Musick, R., LLNL

    1998-02-19

    Business needs have driven the development of commercial database systems since their inception. As a result, there has been a strong focus on supporting many users, minimizing the potential corruption or loss of data, and maximizing performance metrics like transactions per second, or TPC-C and TPC-D results. It turns out that these optimizations have little to do with the needs of the scientific community, and in particular have little impact on improving the management and use of large-scale high-dimensional data. At the same time, there is an unanswered need in the scientific community for many of the benefits offered by a robust DBMS. For example, tying an ad-hoc query language such as SQL together with a visualization toolkit would be a powerful enhancement to current capabilities. Unfortunately, there has been little emphasis or discussion in the VLDB community on this mismatch over the last decade. The goal of the paper is to identify the specific issues that need to be resolved before large-scale scientific applications can make use of DBMS products. This topic is addressed in the context of an evaluation of commercial DBMS technology applied to the exploration of data generated by the Department of Energy's Accelerated Strategic Computing Initiative (ASCI). The paper describes the data being generated for ASCI as well as current capabilities for interacting with and exploring this data. The attraction of applying standard DBMS technology to this domain is discussed, as well as the technical and business issues that currently make this an infeasible solution.

  11. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Daily, Jeffrey A.

    2016-02-10

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. As a result, a faster intra-sequence pairwise alignment implementation is described and benchmarked. Using a 375-residue query sequence, a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 12-core processor system, the highest reported for an implementation based on Farrar's 'striped' approach. When using only a single thread, parasail was 1.7 times faster than Rognes's SWIPE. For many score matrices, parasail is faster than BLAST. The software library is designed for 64-bit Linux, OS X, or Windows on processors with SSE2, SSE41, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. In conclusion, applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.
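
    To make the "cell updates" that GCUPS counts concrete, here is a tiny pure-Python Smith-Waterman scorer; it illustrates the inner recurrence that parasail vectorizes, not parasail's API, and real workloads should call the C library:

      # Each (i, j) iteration below is one cell update; parasail performs
      # billions of these per second using SIMD instructions.
      def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
          H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
          best = 0
          for i in range(1, len(a) + 1):
              for j in range(1, len(b) + 1):
                  diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
                  H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
                  best = max(best, H[i][j])
          return best

      print(smith_waterman("HEAGAWGHEE", "PAWHEAE"))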

  12. SU-E-P-07: Evaluation of Productivity Systems for Radiation Therapy

    SciTech Connect (OSTI)

    Ramsey, C; Usynin, A [Thompson Cancer Survival Center, Knoxville, TN (United States)]

    2014-06-01

    Purpose: Health systems throughout the United States are under increased financial pressure to reduce operating cost. As a result, productivity models developed by third-party consultants are being used to match staff to treatment volumes. The purpose of this study was to critically evaluate productivity systems for radiation oncology. Methods: Staffing efficiency was evaluated using multiple productivity models. The first model evaluated staffing levels using equal weighting of procedure codes and hours worked. A second productivity model was developed using hours worked by job class and relative value units for each procedure code. A third model was developed using measured procedure times extracted from the electronic medical record, which tracks the wait and treatment times of each patient for each treatment fraction. A MatLab program was developed to query and analyze the daily treatment data. A model was then created to determine any theoretical gains in treatment productivity. Results: Productivity was evaluated for six radiation therapy departments operating nine linear accelerators and delivering over 40,000 treatment fractions per year. Third-party productivity models that do not take into consideration the unique nature of radiation therapy can be counterproductive. For example, other outpatient departments can compress their daily schedule to decrease hours worked. This approach was tested using the treatment schedule evaluation tool developed as part of this study. It was determined that the maximum possible savings from treatment schedule compression was $32,000 per year per linac. All annual cost savings would be lost if only two patients per year chose to be treated elsewhere because of limited or restricted appointment times. Conclusion: The use of productivity models in radiation therapy can easily result in a loss of treatment revenue that is greater than any potential cost savings from reduced staff hours.

  13. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.

  14. Practice Patterns of Radiotherapy in Cervical Cancer Among Member Groups of the Gynecologic Cancer Intergroup (GCIG)

    SciTech Connect (OSTI)

    Gaffney, David K. E-mail: david.gaffney@hci.utah.edu; Du Bois, Andreas; Narayan, Kailash; Reed, Nick; Toita, Takafumi; Pignata, Sandro; Blake, Peter; Portelance, Lorraine; Sadoyze, Azmat; Poetter, Richard; Colombo, Alessandro; Randall, Marcus; Mirza, Mansoor R.; Trimble, Edward L.

    2007-06-01

    Purpose: The aim of this study was to describe radiotherapeutic practice of the treatment of cervical cancer in member groups of the Gynecologic Cancer Intergroup (GCIG). Methods and Materials: A survey was developed and distributed to the members of the GCIG focusing on details of radiotherapy practice. Different scenarios were queried including advanced cervical cancer, postoperative patients, and para-aortic-positive lymph node cases. Items focused on indications for radiation therapy, radiation fields, dose, use of chemotherapy, brachytherapy and others. The cooperative groups from North America were compared with the other groups to evaluate potential differences in radiotherapy doses. Results: A total of 39 surveys were returned from 13 different cooperative groups. For the treatment of advanced cervical cancer, external beam pelvic doses and total doses to point A were 47 ± 3.5 Gy (mean ± SD) and 79.1 ± 7.9 Gy, respectively. Point A doses were not different between the North American cooperative groups compared with the others (p = 0.103). All groups used concomitant chemotherapy, with 30 of 36 respondents using weekly cisplatin. Of 33 respondents, 31 intervened for a low hemoglobin level. For a para-aortic field, the upper border was most commonly (15 of 24) at the T12-L1 interspace. Maintenance chemotherapy (after radiotherapy) was not performed by 68% of respondents. For vaginal brachytherapy after hysterectomy, 23 groups performed HDR brachytherapy and four groups used LDR brachytherapy. In the use of brachytherapy, there was no uniformity in dose prescription. Conclusions: Radiotherapy practices among member groups of the GCIG are similar in terms of both doses and use of chemotherapy.

  15. GenomeVista

    Energy Science and Technology Software Center (OSTI)

    2002-11-04

    Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome and identify regions of possible homology for a query sequence. These regions are post-processed to find the best candidates, which are then globally aligned using the AVID global alignment program. In the last step, conserved non-coding segments are identified using VISTA. Our methods are fast, and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs; the use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.

  16. Field Trial of a Low-Cost, Distributed Plug Load Monitoring System

    SciTech Connect (OSTI)

    Auchter, B.; Cautley, D.; Ahl, D.; Earle, L.; Jin, X.

    2014-03-01

    Researchers have struggled to inventory and characterize the energy use profiles of the ever-growing category of so-called miscellaneous electric loads (MELs) because plug-load monitoring is cost-prohibitive to the researcher and intrusive to the homeowner. However, these data represent a crucial missing link to our understanding of how homes use energy, and we cannot control what we do not understand. Detailed energy use profiles would enable the nascent automated home energy management (AHEM) industry to develop effective control algorithms that target consumer electronics and other plug loads. If utility and other efficiency programs are to incent AHEM devices, they need large-scale datasets that provide statistically meaningful justification of their investments by quantifying the aggregate energy savings achievable. To address this need, we have investigated a variety of plug-load measuring devices available commercially and tested them in the laboratory to identify the most promising candidates for field applications. The scope of this report centers around the lessons learned from a field validation of one proof-of-concept system, called Smartenit (formerly SimpleHomeNet). The system was evaluated based on the rate of successful data queries, reliability over a period of days to weeks, and accuracy. This system offers good overall performance when deployed with up to ten end nodes in a residential environment, although deployment with more nodes and in a commercial environment is much less robust. We conclude that the current system is useful in selected field research projects, with the recommendation that system behavior is observed over time.

  17. Historical Trends in the Use of Radiation Therapy for Pediatric Cancers: 1973-2008

    SciTech Connect (OSTI)

    Jairam, Vikram; Roberts, Kenneth B.; Yale Cancer Center, New Haven, Connecticut; Cancer Outcomes, Public Policy, and Effectiveness Research Center at Yale, New Haven, Connecticut; Yu, James B.

    2013-03-01

    Purpose: This study was undertaken to assess historical trends in the use of radiation therapy (RT) for pediatric cancers over the past 4 decades. Methods: The National Cancer Institute's Surveillance, Epidemiology, and End Results database of the 9 original tumor registries (SEER-9) was queried to identify patients aged 0 to 19 years with acute lymphoblastic leukemia, acute myeloid leukemia, bone and joint cancer, cancer of the brain and nervous system, Hodgkin lymphoma, neuroblastoma, non-Hodgkin lymphoma, soft tissue cancer, Wilms tumor, or retinoblastoma from 1973 to 2008. Patients were grouped into 4-year time epochs. The number and percentage of patients who received RT as part of their initial treatment were calculated per epoch by each diagnosis group from 1973 to 2008. Results: RT use for acute lymphoblastic leukemia, non-Hodgkin lymphoma, and retinoblastoma declined sharply from 57%, 57%, and 30% in 1973 to 1976 to 11%, 15%, and 2%, respectively, in 2005 to 2008. Similarly, smaller declines in RT use were also seen in brain cancer (70%-39%), bone cancer (41%-21%), Wilms tumor (75%-53%), and neuroblastoma (60%-25%). RT use curves for Wilms tumor and neuroblastoma were nonlinear with nadirs in 1993 to 1996 at 39% and 19%, respectively. There were minimal changes in RT use for Hodgkin lymphoma, soft tissue cancer, or acute myeloid leukemia, roughly stable at 72%, 40%, and 11%, respectively. Almost all patients treated with RT were given external beam RT exclusively. However, from 1985 to 2008, treatments involving brachytherapy, radioisotopes, or combination therapy increased in frequency, comprising 1.8%, 4.6%, and 11.9% of RT treatments in brain cancer, soft tissue cancer, and retinoblastoma, respectively. Conclusions: The use of RT is declining over time in 7 of 10 pediatric cancer categories. A limitation of this study is potential under-ascertainment of RT use in the SEER-9 database, including delayed use of RT.

  18. Adaptable Computing Environment/Self-Assembling Software

    Energy Science and Technology Software Center (OSTI)

    2007-09-25

    Complex software applications are difficult to learn to use and to remember how to use. Further, the user has no control over the functionality available in a given application. The software we use can be created and modified only by a relatively small group of elite, highly skilled artisans known as programmers. "Normal users" are powerless to create and modify software themselves, because the tools for software development, designed by and for programmers, are a barrier to entry. This software, when completed, will be a user-adaptable computing environment in which the user is really in control of his/her own software, able to adapt the system, make new parts of the system interactive, and even modify the behavior of the system itself. Some key features of the basic environment that have been implemented are (a) books in bookcases, where all data is stored; (b) context-sensitive compass menus (compass, because the buttons are located in compass directions relative to the mouse cursor position); (c) importing tabular data and displaying it in a book; (d) light-weight table querying/sorting; (e) a Reach&Get capability (sort of a "smart" copy/paste that prevents the user from copying invalid data); and (f) a LogBook that automatically logs all user actions that change data or the system itself. To bootstrap toward full end-user adaptability, we implemented a set of development tools. With the development tools, compass menus can be made and customized.

  19. SpArcFiRe: Scalable automated detection of spiral galaxy arm segments

    SciTech Connect (OSTI)

    Davis, Darren R.; Hayes, Wayne B. E-mail: whayes@uci.edu

    2014-08-01

    Given an approximately centered image of a spiral galaxy, we describe an entirely automated method that finds, centers, and sizes the galaxy (possibly masking nearby stars and other objects if necessary in order to isolate the galaxy itself) and then automatically extracts structural information about the spiral arms. For each arm segment found, we list the pixels in that segment, allowing image analysis on a per-arm-segment basis. We also perform a least-squares fit of a logarithmic spiral arc to the pixels in that segment, giving per-arc parameters such as the pitch angle, arm segment length, location, etc. The algorithm takes about one minute per galaxy and can easily be scaled using parallelism. We have run it on all ~644,000 Sloan objects that are larger than 40 pixels across and classified as 'galaxies'. We find a very good correlation between our quantitative description of a spiral structure and the qualitative description provided by Galaxy Zoo humans. Our objective, quantitative measures of structure demonstrate the difficulty in defining exactly what constitutes a spiral 'arm', leading us to prefer the term 'arm segment'. We find that pitch angle often varies significantly segment-to-segment in a single spiral galaxy, making it difficult to define the pitch angle for a single galaxy. We demonstrate how our new database of arm segments can be queried to find galaxies satisfying specific quantitative visual criteria. For example, even though our code does not explicitly find rings, a good surrogate is to look for galaxies having one long, low-pitch-angle arm, which is how our code views ring galaxies. SpArcFiRe is available at http://sparcfire.ics.uci.edu.
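
    The logarithmic-spiral fit is linear in log-space, which reduces the pitch-angle estimate to a one-line regression; a sketch with synthetic points (SpArcFiRe's own fitting code surely differs in detail):

      # Fit r = a * exp(b * theta) by regressing log r on theta;
      # the pitch angle of a logarithmic spiral is arctan(b).
      import numpy as np

      theta = np.linspace(0.0, 4 * np.pi, 200)
      r = 5.0 * np.exp(0.18 * theta)              # synthetic arm-segment pixels

      b, log_a = np.polyfit(theta, np.log(r), 1)  # log r = b*theta + log a
      pitch_deg = np.degrees(np.arctan(b))
      print(f"a={np.exp(log_a):.2f}  b={b:.3f}  pitch angle={pitch_deg:.1f} deg")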

  20. Scaling Semantic Graph Databases in Size and Performance

    SciTech Connect (OSTI)

    Morari, Alessandro; Castellana, Vito G.; Villa, Oreste; Tumeo, Antonino; Weaver, Jesse R.; Haglin, David J.; Choudhury, Sutanay; Feo, John T.

    2014-08-06

    In this paper we present SGEM, a full software system for accelerating large-scale semantic graph databases on commodity clusters. Unlike current approaches, SGEM addresses semantic graph databases by employing graph methods at all levels of the stack. On one hand, this allows exploiting the space efficiency of graph data structures and the inherent parallelism of graph algorithms; these features adapt well to the increasing system memory and core counts of modern commodity clusters. On the other hand, these systems are optimized for regular computation and batched data transfers, while graph methods usually are irregular and generate fine-grained data accesses with poor spatial and temporal locality. Our framework comprises a SPARQL-to-data-parallel-C compiler, a library of parallel graph methods, and a custom, multithreaded runtime system. We introduce our stack, motivate its advantages with respect to other solutions, and show how we solved the challenges posed by irregular behaviors. We present the results of our software stack on the Berlin SPARQL benchmarks with datasets up to 10 billion triples (a triple corresponds to a graph edge), demonstrating scaling in dataset size and in performance as more nodes are added to the cluster.
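
    For readers unfamiliar with the workload, the sketch below shows the kind of SPARQL query such a system serves, issued from Python with the SPARQLWrapper package; the endpoint URL and predicate are placeholders, not part of SGEM:

      # Run a simple SELECT against a SPARQL endpoint and print the bindings.
      from SPARQLWrapper import SPARQLWrapper, JSON

      sparql = SPARQLWrapper("http://example.org/sparql")  # hypothetical endpoint
      sparql.setQuery("""
          SELECT ?s ?o
          WHERE { ?s <http://example.org/linksTo> ?o }
          LIMIT 10
      """)
      sparql.setReturnFormat(JSON)

      for row in sparql.query().convert()["results"]["bindings"]:
          print(row["s"]["value"], "->", row["o"]["value"])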

  1. Visual Sample Plan (VSP) - FIELDS Integration

    SciTech Connect (OSTI)

    Pulsipher, Brent A.; Wilson, John E.; Gilbert, Richard O.; Hassig, Nancy L.; Carlson, Deborah K.; Bing-Canar, John; Cooper, Brian; Roth, Chuck

    2003-04-19

    Two software packages, VSP 2.1 and FIELDS 3.5, are being used by environmental scientists to plan the number and type of samples required to meet project objectives, display those samples on maps, query a database of past sample results, produce spatial models of the data, and analyze the data in order to arrive at defensible decisions. VSP 2.0 is an interactive tool to calculate optimal sample size and optimal sample location based on user goals, risk tolerance, and variability in the environment and in lab methods. FIELDS 3.0 is a set of tools to explore the sample results in a variety of ways to make defensible decisions with quantified levels of risk and uncertainty. However, FIELDS 3.0 has only a small sample design module. VSP 2.0, on the other hand, has over 20 sampling goals, allowing the user to input site-specific assumptions such as non-normality of sample results, separate variability between field and laboratory measurements, make two-sample comparisons, perform confidence interval estimation, use sequential search sampling methods, and much more. Over 1,000 copies of VSP are in use today. FIELDS is used in nine of the ten U.S. EPA regions, by state regulatory agencies, and most recently by several international countries. Both software packages have been peer-reviewed, enjoy broad usage, and have been accepted by regulatory agencies as well as site project managers as key tools to help collect data and make environmental cleanup decisions. Recently, the two software packages were integrated, allowing the user to take advantage of the many design options of VSP and the analysis and modeling options of FIELDS. The transition between the two is simple for the user: VSP can be called from within FIELDS, automatically passing a map to VSP and automatically retrieving sample locations and design information when the user returns to FIELDS. This paper will describe the integration, give a demonstration of the integrated package, and give users download instructions and software requirements for running the integrated package.
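
    As a flavor of the sample-size arithmetic VSP automates (VSP's actual formulas vary with the sampling goal; the one-sample, one-sided mean test below is just one textbook case):

      # n = ((z_alpha + z_beta) * sigma / delta)^2, rounded up.
      from scipy.stats import norm

      def sample_size(sigma, delta, alpha=0.05, beta=0.10):
          z = norm.ppf(1 - alpha) + norm.ppf(1 - beta)
          return int((z * sigma / delta) ** 2) + 1

      # Samples needed to detect a mean shift of 1 unit when sigma = 2
      print(sample_size(sigma=2.0, delta=1.0))  # -> 35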

  2. Final technical report for: Insertional Mutagenesis of Brachypodium distachyon DE-AI02-07ER64452

    SciTech Connect (OSTI)

    John, Vogel P.

    2015-10-29

    Several bioenergy grasses are poised to become a major source of energy in the United States. Despite their increasing importance, we know little about the basic biology underlying the traits that control the utility of grasses as energy crops. Better knowledge of grass biology (e.g. identification of the genes that control cell wall composition, plant architecture, cell size, cell division, reproduction, nutrient uptake, carbon flux, etc.) could be used to design rational strategies for crop improvement and shorten the time required to domesticate these species. The use of an appropriate model system is an efficient way to gain this knowledge. Brachypodium distachyon is a small annual grass with all the attributes needed to be a modern model organism, including simple growth requirements, fast generation time, small stature, small genome size, and self-fertility. These attributes led to the recommendation in the DOE's "Breaking the Biological Barriers to Cellulosic Ethanol: A Joint Research Agenda" report that B. distachyon be developed and used as a model for energy crops to accelerate their domestication. Strategic investments (e.g. genome sequencing) in B. distachyon by the DOE are now bearing fruit, and B. distachyon is being used as a model grass by hundreds of laboratories worldwide. Sequence-indexed insertional mutants are an extremely powerful tool for both forward and reverse genetics: they allow researchers to order mutants in any gene tagged in the collection by simply emailing a request. The goal of this project was to create a collection of sequence-indexed insertional mutants (T-DNA lines) for the model grass Brachypodium distachyon in order to facilitate research by the scientific community. During the course of this grant we created a collection of 23,649 B. distachyon T-DNA lines and identified 26,112 unique insertion sites. The collection can be queried through the project website (http://jgi.doe.gov/our-science/science-programs/plant-genomics/brachypodium/brachypodium-t-dna-collection/) and through the Phytozome genome browser (http://phytozome.jgi.doe.gov/pz/portal.html). The collection has been heavily utilized by the research community and, as of October 23, 2015, 223 orders for 12,069 seed packets have been filled. In addition to creating this resource, we also optimized methods for transformation and sequencing of DNA flanking insertion sites.

  3. Management of Male Breast Cancer in the United States: A Surveillance, Epidemiology and End Results Analysis

    SciTech Connect (OSTI)

    Fields, Emma C.; DeWitt, Peter; Fisher, Christine M.; Rabinovitch, Rachel

    2013-11-15

    Purpose: To analyze the stage-specific management of male breast cancer (MBC) with surgery and radiation therapy (RT) and relate them to outcomes and to female breast cancer (FBC). Methods and Materials: The Surveillance, Epidemiology, and End Results database was queried for all primary invasive MBC and FBC diagnosed from 1973 to 2008. Analyzable data included age, race, registry, grade, stage, estrogen and progesterone receptor status, type of surgery, and use of RT. Stage was defined as localized (LocD): confined to the breast; regional (RegD): involving skin, chest wall, and/or regional lymph nodes; and distant: M1. The primary endpoint was cause-specific survival (CSS). Results: A total of 4276 cases of MBC and 718,587 cases of FBC were identified. Male breast cancer constituted 0.6% of all breast cancer. Comparing MBC with FBC, mastectomy (M) was used in 87.4% versus 38.3%, and breast-conserving surgery in 12.6% versus 52.6% (P<10^-4). For males with LocD, CSS was not significantly different for the 4.6% treated with lumpectomy/RT versus the 70% treated with M alone (hazard ratio [HR] 1.33; 95% confidence interval [CI] 0.49-3.61; P=.57). Postmastectomy RT was delivered in 33% of males with RegD and was not associated with an improvement in CSS (HR 1.11; 95% CI 0.88-1.41; P=.37). There was a significant increase in the use of postmastectomy RT in MBC over time: 24.3%, 27.2%, and 36.8% for 1973-1987, 1988-1997, and 1998-2008, respectively (P<.0001). Cause-specific survival for MBC has improved: the largest significant change was identified for men diagnosed in 1998-2008 compared with 1973-1987 (HR 0.73; 95% CI 0.60-0.88; P=.0004). Conclusions: Surgical management of MBC is dramatically different from that of FBC. The majority of males with LocD receive M despite equivalent CSS with lumpectomy/RT. Postmastectomy RT is greatly underutilized in MBC with RegD, although a CSS benefit was not demonstrated. Outcomes for MBC are improving, attributable to improved therapy and its use in this unscreened population.

  4. Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

    SciTech Connect (OSTI)

    Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; Vondervoot, Peter J.I. van de; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristen F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; Dijck, Piet W.M. van; Hofmann, Gerald; Lasure, Linda L.; Magnusson, Jon K.; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; Ooyen, Albert J.J. van; Panther, Kathyrn S.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hen; Tsang, Adrian; Brink, Johannes M. van den; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Kubicek, Christian P.; Martinez, Diego; Peij, Noel N.M.E. van; Roubos, Johannes A.; Nielsen, Jens

    2011-04-28

    The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes relevant to glucoamylase A production, such as tRNA-synthases and protein transporters. Our results and datasets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. [Supplemental materials (10 figures, three text documents, and 16 tables) have been made available. The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no ACJE00000000. The updated sequence for A. niger CBS 513.88 is available from EMBL under acc. no AM269948-AM270415. The sequence data from the phylogeny study has been submitted to NCBI (GU296686-296739). Microarray data from this study has been submitted to GEO as series GSE10983. Access for reviewers is possible through: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi token GSE10983.] The dsmM_ANIGERa_coll511030F library and platform information is deposited at GEO under number GPL6758.

  5. Assessing Risk and Driving Risk Mitigation for First-of-a-Kind Advanced Reactors

    SciTech Connect (OSTI)

    John W. Collins

    2011-09-01

    Planning and decision making amidst programmatic and technological risks represent significant challenges for projects. This presentation addresses the four-step risk-assessment process needed to determine a clear path forward to mature needed technology and to design, license, and construct advanced nuclear power plants, including Small Modular Reactors (SMRs), which have never been built before. This four-step process has been carefully applied to the Next Generation Nuclear Plant (NGNP).

    STEP 1 - Risk Identification: Risks are identified, collected, and categorized as technical risks, programmatic risks, and project risks, each of which results in cost and schedule impacts if realized. These include risks arising from the use of technologies not previously demonstrated in a relevant application; normal and accident scenarios the SMR could experience, including events that cause the disablement of engineered safety features (typically documented in Phenomena Identification Ranking Tables (PIRT) produced with the Nuclear Regulatory Commission); and design needs which must be addressed to further detail the design. Product: a project risk register contained in a database with sorting, presentation, rollup, and risk work-off functionality similar to the NGNP Risk Management System.

    STEP 2 - Risk Quantification: The risks contained in the risk register are then scored for probability of occurrence and severity of consequence, if realized. The scoring methodology is established and the basis for the scoring is well documented. Product: a quantified project risk register with a documented basis for scoring.

    STEP 3 - Risk Handling Strategy: Risks are mitigated by applying a systematic approach to maturing the technology through research and development, modeling, testing, and design. A Technology Readiness Assessment is performed to determine baseline Technology Readiness Levels (TRLs). Tasks needed to mature the technology are developed and documented in a roadmap. Product: a risk handling strategy.

    STEP 4 - Residual Risk Work-off: The risk handling strategy is entered into the Project Risk Allocation Tool (PRAT) to analyze each task for its ability to reduce risk; the result is risk-informed task prioritization. The risk handling strategy is captured in the Risk Management System, a relational database that provides conventional database utility, including data maintenance, archiving, configuration control, and query capability. The tool's Hierarchy Tree allows visualization and analysis of complex relationships between risks, risk mitigation tasks, design needs, and PIRTs. Product: the Project Risk Allocation Tool and Risk Management System, which depict the project plan to reduce risk and current progress in doing so.
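
    As a purely illustrative sketch of Steps 2 and 4, the Python snippet below scores each register entry as probability x consequence and sorts the register so the highest-exposure risks are worked off first; the 1-5 scales and example risks are assumptions, not NGNP data.

        # Hypothetical risk-register scoring and prioritization sketch.
        from dataclasses import dataclass

        @dataclass
        class Risk:
            name: str
            probability: int  # 1 (remote) .. 5 (near certain) -- assumed scale
            consequence: int  # 1 (negligible) .. 5 (severe impact) -- assumed scale

            @property
            def score(self) -> int:
                # A common convention: risk exposure = probability x consequence.
                return self.probability * self.consequence

        register = [
            Risk("Unproven fuel performance at temperature", 4, 5),
            Risk("Licensing schedule slip", 3, 4),
            Risk("Vendor component delivery delay", 2, 3),
        ]

        # Work risks off highest-exposure first, as a PRAT-like tool might.
        for r in sorted(register, key=lambda r: r.score, reverse=True):
            print(f"{r.score:>2}  {r.name}")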

  6. Support of US CLIVAR Project Office 2012

    SciTech Connect (OSTI)

    Cummings, Donna

    2013-11-21

    SUBJECT: CLOSEOUT OF AWARD NO. DE-SC0008494. FINAL REPORT: SUPPORT OF US CLIVAR PROJECT OFFICE 2012, UNIVERSITY CORPORATION FOR ATMOSPHERIC RESEARCH. The Director of JOSS supervised the U.S. CLIVAR Project Office Director and helped direct the office to enhance the goals and objectives of the U.S. CLIVAR Project and budget. The Financial Manager of JOSS worked to complete proposals, monitor compliance with award requirements and funding limitations, and ensure the U.S. CLIVAR Project Office complied with UCAR policies and procedures. The Project Coordinator administered the funding for the U.S. CLIVAR Project Office and was responsible for coordinating special projects that required additional support from JOSS technical staff. These projects included activities such as website updates, technology upgrades, production of printed reports, and development of graphic elements like logos. The Web Developer worked on both web development and graphics; the work consisted of the following: maintaining the site, including installing updates to the Drupal CMS (Content Management System); creating new templates for webpages and styling them with CSS and JavaScript/jQuery code; fixing the styling of webpages that the content contributor/manager (Jenn Mays) created and had trouble with; creating new web forms for abstract uploading, subscriptions, and meeting registrations; creating four webpages for the "ASP: Key Uncertainties in the Global Carbon-Cycle" meeting; and developing a document review form, instruction webpages, a login redirect, and a dynamic table with form submissions for the US CLIVAR SSC Science Plan Document Review. The review was open to the public from June 12, 2013 until July 10, 2013; during this time, user accounts created by the public had to be checked daily to delete any spam accounts. Graphics work included preparing images for general use on webpages, webpage banners, and meeting name badges; creating a US CLIVAR letterhead; and redesigning the US AMOC logo. The System Administrator worked on the migration of the US CLIVAR site from the USGCRP office to UCAR in Boulder. This was done to increase the general speed of the site and to allow the web developer to work in it more efficiently. Main tasks were to archive the old site, create a new development site for the web developer, and move the web address to the new website when development was finished. There are no patents or equipment related to this proposal.

  7. AGR-2 Data Qualification Report for ATR Cycles 151B-2, 152A, 152B, 153A, 153B and 154A

    SciTech Connect (OSTI)

    Binh T. Pham; Jeffrey J. Einerson

    2013-09-01

    This report documents the data qualification status of AGR-2 fuel irradiation experimental data from Advanced Test Reactor (ATR) Cycles 152A, 152B, 153A, 153B, and 154A, as recorded in the Nuclear Data Management and Analysis System (NDMAS). The AGR-2 data streams addressed include thermocouple (TC) temperatures, sweep gas data (flow rate, pressure, and moisture content), and fission product monitoring system (FPMS) data for each of the six capsules in the experiment. A total of 13,400,520 one-minute instantaneous TC and sweep gas data records were received and processed by NDMAS for this period. Of these data, 8,911,791 records (66.5% of the total) were determined to be Qualified based on NDMAS accuracy testing and data validity assessment. For temperature, there were 4,266,081 records (74% of the total TC data) that were Failed due to TC instrument failures. For sweep gas flows, there were 222,648 gas flow records (2.91% of the flow data) that were Failed. The inlet gas flow failures due to gas flow cross-talk and leakage problems that occurred after Cycle 150A were corrected by using the same gas mixture in all six capsules and the Leadout. For FPMS data, NDMAS received and processed preliminary release rate and release-to-birth rate ratio (R/B) data for three reactor cycles (Cycles 149B, 150B, and 151A). These data consist of 45,983 release rate records and 45,235 R/B records for the 12 radionuclides reported. The qualification status of these FPMS data has been set to In Process until receipt of Quality Assurance-approved data generator reports. All of the above data have been processed and tested using a SAS-based enterprise application software system, stored in a secure Structured Query Language database, made available on the NDMAS Web portal (http://ndmas.inl.gov), and approved by the INL STIM for release to both internal and appropriate external Very High Temperature Reactor Program participants.
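
    The following hypothetical sketch illustrates the flavor of such accuracy and validity testing; the column name, plausibility band, and flatline rule are invented for illustration and are not the actual NDMAS test criteria.

        # Hypothetical qualification pass over one-minute TC records (pandas).
        import pandas as pd

        def qualify_tc_records(df: pd.DataFrame) -> pd.DataFrame:
            """Flag each thermocouple record as Qualified or Failed."""
            out = df.copy()
            plausible = out["temp_C"].between(0.0, 1600.0)  # assumed accuracy band
            flatlined = out["temp_C"].diff().eq(0)          # repeated value hints at a dead TC
            out["status"] = (plausible & ~flatlined).map(
                {True: "Qualified", False: "Failed"})
            return out

        records = pd.DataFrame({"temp_C": [1005.2, 1004.8, 1004.8, None, 2100.0]})
        checked = qualify_tc_records(records)
        print(checked)
        print((checked["status"] == "Qualified").mean())    # fraction Qualified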

  8. Cross-language information retrieval using PARAFAC2.

    SciTech Connect (OSTI)

    Bader, Brett William; Chew, Peter; Abdelali, Ahmed; Kolda, Tamara Gibson

    2007-05-01

    A standard approach to cross-language information retrieval (CLIR) uses Latent Semantic Analysis (LSA) in conjunction with a multilingual parallel aligned corpus. This approach has been shown to be successful in identifying similar documents across languages - or more precisely, retrieving the most similar document in one language to a query in another language. However, the approach has severe drawbacks when applied to a related task, that of clustering documents 'language-independently', so that documents about similar topics end up closest to one another in the semantic space regardless of their language. The problem is that documents are generally more similar to other documents in the same language than they are to documents in a different language, but on the same topic. As a result, when using multilingual LSA, documents will in practice cluster by language, not by topic. We propose a novel application of PARAFAC2 (which is a variant of PARAFAC, a multi-way generalization of the singular value decomposition [SVD]) to overcome this problem. Instead of forming a single multilingual term-by-document matrix which, under LSA, is subjected to SVD, we form an irregular three-way array, each slice of which is a separate term-by-document matrix for a single language in the parallel corpus. The goal is to compute an SVD for each language such that V (the matrix of right singular vectors) is the same across all languages. Effectively, PARAFAC2 imposes the constraint, not present in standard LSA, that the 'concepts' in all documents in the parallel corpus are the same regardless of language. Intuitively, this constraint makes sense, since the whole purpose of using a parallel corpus is that exactly the same concepts are expressed in the translations. We tested this approach by comparing the performance of PARAFAC2 with standard LSA in solving a particular CLIR problem. From our results, we conclude that PARAFAC2 offers a very promising alternative to LSA not only for multilingual document clustering, but also for solving other problems in cross-language information retrieval.
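
    The LSA baseline that PARAFAC2 is compared against can be sketched in a few lines of numpy: per-language term-by-document matrices for the parallel corpus are stacked (the documents, being translations, share columns) and factored with a single SVD; PARAFAC2 instead factors each language slice separately while constraining the right factors to agree. The matrices below are random stand-ins, not a real corpus.

        # Multilingual LSA baseline on toy data (numpy only).
        import numpy as np

        rng = np.random.default_rng(0)
        n_docs, k = 50, 10                    # parallel documents, latent concepts
        X_en = rng.random((300, n_docs))      # English term-by-document matrix
        X_es = rng.random((280, n_docs))      # Spanish term-by-document matrix

        X = np.vstack([X_en, X_es])           # shared columns: doc j = its translation
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        docs_k = Vt[:k].T                     # documents in the shared concept space

        def fold_in(q: np.ndarray) -> np.ndarray:
            """Project a query (zero-padded to the full vocabulary) into concept space."""
            return q @ U[:, :k] / s[:k]

        q = np.zeros(X.shape[0])
        q[:300] = X_en[:, 0]                  # an English-only query
        qk = fold_in(q)
        sims = docs_k @ qk / (np.linalg.norm(docs_k, axis=1) * np.linalg.norm(qk))
        print(int(np.argmax(sims)))           # most similar document, any language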

  9. The Impact of Adjuvant Radiation Therapy for High-Grade Gliomas by Histology in the United States Population

    SciTech Connect (OSTI)

    Rusthoven, Chad G.; Carlson, Julie A.; Waxweiler, Timothy V.; Dally, Miranda J.; Barón, Anna E.; Yeh, Norman; Gaspar, Laurie E.; Liu, Arthur K.; Ney, Douglas E.; Damek, Denise M.; Lillehei, Kevin O.; Kavanagh, Brian D.

    2014-11-15

    Purpose: To compare the survival impact of adjuvant external beam radiation therapy (RT) for malignant gliomas of glioblastoma (GBM), anaplastic astrocytoma (AA), anaplastic oligodendroglioma (AO), and mixed anaplastic oligoastrocytoma (AOA) histology. Methods and Materials: The Surveillance, Epidemiology, and End Results (SEER) database was queried from 1998 to 2007 for patients aged ≥18 years with high-grade gliomas managed with upfront surgical resection, treated with and without adjuvant RT. Results: The primary analysis totaled 14,461 patients, with 12,115 cases of GBM (83.8%), 1312 AA (9.1%), 718 AO (4.9%), and 316 AOA (2.2%). On univariate analyses, adjuvant RT was associated with significantly improved overall survival (OS) for GBMs (2-year OS, 17% vs 7%, p<.001), AAs (5-year OS, 38% vs 24%, p<.001), and AOAs (5-year OS, 55% vs 44%, p=.026). No significant differences in OS were observed for AOs (5-year OS, with RT 50% vs 56% without RT, p=.277). In multivariate Cox proportional hazards models accounting for extent of resection, age, sex, race, year, marital status, and tumor registry, RT was associated with significantly improved OS for both GBMs (HR, 0.52; 95% CI, 0.50-0.55; P<.001) and AAs (HR, 0.57; 95% CI, 0.48-0.68; P<.001) but only a trend toward improved OS for AOAs (HR, 0.70; 95% CI, 0.45-1.09; P=.110). Due to the observation of nonproportional hazards, Cox regressions were not performed for AOs. A significant interaction was observed between the survival impact of RT and histology overall (interaction P<.001) and in a model limited to the anaplastic (WHO grade 3) histologies (interaction P=.024), characterizing histology as a significant predictive factor for the impact of RT. Subgroup analyses demonstrated greater hazard reductions with RT among patients older than median age for both GBMs and AAs (all interaction P≤.001). No significant interactions were observed between RT and extent of resection. Identical patterns of significance were observed for cause-specific survival and OS across analyses. Conclusions: In this large population-based cohort, glioma histology represented a significant predictor for the survival impact of RT. Adjuvant RT was associated with improved survival for AAs, with benefits comparable to those observed for GBMs over the same 10-year interval. No survival advantage was observed with adjuvant RT for AOs.
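
    A multivariate Cox model of the kind described can be fit with the lifelines library; the toy data frame below is an invented stand-in for the SEER extract (column names and values are assumptions, not study data).

        # Cox proportional hazards sketch with lifelines on synthetic data.
        import pandas as pd
        from lifelines import CoxPHFitter

        df = pd.DataFrame({
            "months":   [10, 24, 7, 60, 15, 36, 5, 48],   # follow-up time
            "died":     [1, 1, 1, 0, 1, 0, 1, 0],          # event indicator
            "rt":       [0, 1, 1, 1, 0, 1, 0, 1],          # adjuvant RT received
            "age":      [67, 54, 71, 45, 62, 50, 75, 58],
            "resected": [1, 1, 0, 1, 0, 1, 0, 1],          # gross total resection
        })

        cph = CoxPHFitter()
        cph.fit(df, duration_col="months", event_col="died")
        # exp(coef) is the hazard ratio for each covariate.
        print(cph.summary[["coef", "exp(coef)", "p"]])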

  10. Linking Automated Data Analysis and Visualization with Applications in Developmental Biology and High-Energy Physics

    SciTech Connect (OSTI)

    Ruebel, Oliver

    2009-12-01

    Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the growing number of data dimensions and data objects presents tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data, and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable for the first time measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework has been integrated with MATLAB and the visualization, making advanced analysis tools accessible to biologists and enabling bioinformatics researchers to directly integrate their analyses with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics. To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges, this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams, enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.

  11. camclnt

    Energy Science and Technology Software Center (OSTI)

    1998-07-07

    Camclnt is a Java application that provides a graphical user interface for controlling the video devices used in a videoconference. These devices can be located anywhere on the Internet (e.g., in a colocated conference room or at a geographically remote site). The user who is watching the video can request camera pan, tilt, zoom, or picture-in-picture motions and can switch between camera views. Camclnt is meant to be run in conjunction with the device server (devserv). A camera control language was designed and implemented for communication with the device server. Network communication is via unicast UDP and IP multicast. The operation is as follows. A user can launch camclnt with the name of the host running the devserv or, if camclnt is running on a multicast-enabled computer, camclnt will multicast a query. All the device servers that are connected to this multicast channel will reply with their host names, IP addresses, and the devices they support. After receiving these replies, camclnt will configure its main window. The user can then select the devserv to control, select the devices to control (when there are multiple devices), and point-and-click to control the video devices. Requests are sent to the devserv to which the camclnt is connected, and after the request has been carried out, devserv multicasts a reply in the form of a status message. Camclnt has been enhanced with the Java Media Framework (JMF) for receiving and presenting the video. Therefore users can use camclnt to view (but not transmit) video, or they can run an external video tool such as vic. In either case, the actual video is displayed in a different window from that used to request device motion. Camclnt can optionally be used in conjunction with the CIF communication library and/or the Akenti authentication and authorization system. These features should be enabled only if they are enabled on the devserv to which camclnt will connect.
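
    The multicast discovery step can be sketched in a few lines of Python (shown here rather than Java for brevity); the group address, port, and query text are illustrative assumptions, not camclnt's actual camera control language.

        # Hypothetical multicast discovery: send a query to a group and
        # collect replies from any device servers listening, until timeout.
        import socket

        GROUP, PORT = "239.255.42.42", 5004   # assumed multicast channel

        def discover_devservs(timeout: float = 2.0) -> list[tuple[str, bytes]]:
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
            sock.settimeout(timeout)
            sock.sendto(b"QUERY devices", (GROUP, PORT))
            replies = []
            try:
                while True:                   # gather replies until the timeout fires
                    data, addr = sock.recvfrom(4096)
                    replies.append((addr[0], data))
            except socket.timeout:
                pass
            finally:
                sock.close()
            return replies

        for host, info in discover_devservs():
            print(host, info.decode(errors="replace"))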

  12. The Influence of Radiation Modality and Lymph Node Dissection on Survival in Early-Stage Endometrial Cancer

    SciTech Connect (OSTI)

    Chino, Junzo P., E-mail: junzo.chino@duke.edu [Department of Radiation Oncology, Duke University Medical Center, Durham, NC (United States); Jones, Ellen [Department of Radiation Oncology, University of North Carolina, Chapel Hill, NC (United States); Berchuck, Andrew; Secord, Angeles Alvarez; Havrilesky, Laura J. [Division of Gynecologic Oncology, Duke University Medical Center, Durham, NC (United States)

    2012-04-01

    Background: The appropriate uses of lymph node dissection (LND) and adjuvant radiation therapy (RT) for Stage I endometrial cancer are controversial. We explored the impact of specific RT modalities (whole pelvic RT [WPRT], vaginal brachytherapy [VB]) and LND status on survival. Materials and Methods: The Surveillance Epidemiology and End Results dataset was queried for all surgically treated International Federation of Gynecology and Obstetrics (FIGO) Stage I endometrial cancers; subjects were stratified into low, intermediate and high risk cohorts using modifications of Gynecologic Oncology Group (GOG) protocol 99 and PORTEC (Postoperative Radiation Therapy in Endometrial Cancer) trial criteria. Five-year overall survival was estimated, and comparisons were performed via the log-rank test. Results: A total of 56,360 patients were identified: 70.4% low, 26.2% intermediate, and 3.4% high risk. A total of 41.6% underwent LND and 17.6% adjuvant RT. In low-risk disease, LND was associated with higher survival (93.7 LND vs. 92.7% no LND, p < 0.001), whereas RT was not (91.6% RT vs. 92.9% no RT, p = 0.23). In intermediate-risk disease, LND (82.1% LND vs. 76.5% no LND, p < 0.001) and RT (80.6% RT vs. 74.9% no RT, p < 0.001) were associated with higher survival without differences between RT modalities. In high-risk disease, LND (68.8% LND vs. 54.1% no LND, p < 0.001) and RT (66.9% RT vs. 57.2% no RT, p < 0.001) were associated with increased survival; if LND was not performed, VB alone was inferior to WPRT (p = 0.01). Conclusion: Both WPRT and VB alone are associated with increased survival in the intermediate-risk group. In the high-risk group, in the absence of LND, only WPRT is associated with increased survival. LND was also associated with increased survival.
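
    A Kaplan-Meier/log-rank comparison of the sort reported here can be sketched with the lifelines library; the durations and event flags below are invented, not SEER values.

        # Kaplan-Meier estimate and log-rank test on toy survival data.
        from lifelines import KaplanMeierFitter
        from lifelines.statistics import logrank_test

        t_rt, e_rt = [12, 30, 60, 45, 9, 60], [1, 1, 0, 1, 1, 0]   # months, events
        t_no, e_no = [8, 20, 33, 14, 60, 25], [1, 1, 1, 1, 0, 1]

        km = KaplanMeierFitter()
        km.fit(t_rt, event_observed=e_rt, label="RT")
        print(km.predict(60))      # estimated 5-year survival for the RT arm

        result = logrank_test(t_rt, t_no,
                              event_observed_A=e_rt, event_observed_B=e_no)
        print(result.p_value)      # analogous to the abstract's log-rank p-values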

  13. MicroRNAs expression in ox-LDL treated HUVECs: MiR-365 modulates apoptosis and Bcl-2 expression

    SciTech Connect (OSTI)

    Qin, Bing; Xiao, Bo; Liang, Desheng; Xia, Jian; Li, Ye; Yang, Huan

    2011-06-24

    Highlights: → We evaluated the role of miRNAs in ox-LDL induced apoptosis in ECs. → We found 4 up-regulated and 11 down-regulated miRNAs in apoptotic ECs. → Target genes of the dysregulated miRNAs regulate ECs apoptosis and atherosclerosis. → MiR-365 promotes ECs apoptosis via suppressing Bcl-2 expression. → MiR-365 inhibitor alleviates ECs apoptosis induced by ox-LDL. -- Abstract: Endothelial cells (ECs) apoptosis induced by oxidized low-density lipoprotein (ox-LDL) is thought to play a critical role in atherosclerosis. MicroRNAs (miRNAs) are a class of noncoding RNAs that posttranscriptionally regulate the expression of genes involved in diverse cell functions, including differentiation, growth, proliferation, and apoptosis. However, whether miRNAs are associated with ox-LDL induced apoptosis and their effect on ECs is still unknown. Therefore, this study evaluated potential miRNAs and their involvement in ECs apoptosis in response to ox-LDL stimulation. Microarray and qRT-PCR analysis performed on human umbilical vein endothelial cells (HUVECs) exposed to ox-LDL identified 15 differentially expressed (4 up- and 11 down-regulated) miRNAs. Web-based query tools were utilized to predict the target genes of the differentially expressed miRNAs, and the potential target genes were classified into different function categories with the gene ontology (GO) term and KEGG pathway annotation. In particular, bioinformatics analysis suggested that anti-apoptotic protein B-cell CLL/lymphoma 2 (Bcl-2) is a target gene of miR-365, an apoptomiR up-regulated by ox-LDL stimulation in HUVECs. We further showed that transfection of miR-365 inhibitor partly restored Bcl-2 expression at both mRNA and protein levels, leading to a reduction of ox-LDL-mediated apoptosis in HUVECs. Taken together, our findings indicate that miRNAs participate in ox-LDL-mediated apoptosis in HUVECs. MiR-365 potentiates ox-LDL-induced ECs apoptosis by regulating the expression of Bcl-2, suggesting potential novel therapeutic targets for atherosclerosis.

  14. Machine Learning for Big Data: A Study to Understand Limits at Scale

    SciTech Connect (OSTI)

    Sukumar, Sreenivas R.; Del-Castillo-Negrete, Carlos Emilio

    2015-12-21

    This report aims to empirically understand the limits of machine learning when applied to Big Data. We observe that recent innovations in being able to collect, access, organize, integrate, and query massive amounts of data from a wide variety of data sources have brought statistical data mining and machine learning under more scrutiny, evaluation, and application for gleaning insights from the data than ever before. Much is expected from algorithms without understanding their limitations at scale while dealing with massive datasets. In that context, we pose and address the following questions: How does a machine learning algorithm perform on measures such as accuracy and execution time with increasing sample size and feature dimensionality? Does training with more samples guarantee better accuracy? How many features should be computed for a given problem? Do more features guarantee better accuracy? Is the effort to derive and calculate more features and to train on larger samples worthwhile? As problems become more complex and traditional binary classification algorithms are replaced with multi-task, multi-class categorization algorithms, do parallel learners perform better? What happens to the accuracy of the learning algorithm when trained to categorize multiple classes within the same feature space? Towards finding answers to these questions, we describe the design of an empirical study and present the results. We conclude with the following observations: (i) accuracy of the learning algorithm increases with increasing sample size but saturates at a point, beyond which more samples do not contribute to better accuracy/learning; (ii) the richness of the feature space dictates performance, both accuracy and training time; (iii) increased dimensionality is often reflected in better performance (higher accuracy in spite of longer training times), but the improvements are not commensurate with the effort required for feature computation and training; (iv) accuracy of the learning algorithms drops significantly with multi-class learners training on the same feature matrix; and (v) learning algorithms perform well when categories in labeled data are independent (i.e., no relationship or hierarchy exists among categories).
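
    The sample-size question can be probed directly with a learning curve; the sketch below (synthetic data, an arbitrary classifier as an assumption) performs the accuracy-versus-training-size measurement the study describes, which typically rises and then saturates.

        # Learning-curve measurement with scikit-learn on synthetic data.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import learning_curve

        X, y = make_classification(n_samples=5000, n_features=40,
                                   n_informative=10, random_state=0)
        sizes, train_sc, test_sc = learning_curve(
            LogisticRegression(max_iter=1000), X, y,
            train_sizes=np.linspace(0.05, 1.0, 8), cv=5)

        # Cross-validated accuracy as a function of training-set size.
        for n, acc in zip(sizes, test_sc.mean(axis=1)):
            print(f"{n:>5} samples -> CV accuracy {acc:.3f}")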

  15. Parallel In Situ Indexing for Data-intensive Computing

    SciTech Connect (OSTI)

    Kim, Jinoh; Abbasi, Hasan; Chacon, Luis; Docan, Ciprian; Klasky, Scott; Liu, Qing; Podhorszki, Norbert; Shoshani, Arie; Wu, Kesheng

    2011-09-09

    As computing power increases exponentially, vast amounts of data are created by many scientific research activities. However, the bandwidth for storing the data to disks and reading the data from disks has been improving at a much slower pace. These two trends produce an ever-widening data access gap. Our work brings together two distinct technologies to address this data access issue: indexing and in situ processing. From decades of database research literature, we know that indexing is an effective way to address the data access issue, particularly for accessing a relatively small fraction of data records. As data sets increase in size, more and more analysts need to use selective data access, which makes indexing even more important for improving data access. The challenge is that most implementations of indexing technology are embedded in large database management systems (DBMS), but most scientific datasets are not managed by any DBMS. In this work, we choose to include indexes with the scientific data instead of requiring the data to be loaded into a DBMS. We use compressed bitmap indexes from the FastBit software, which are known to be highly effective for query-intensive workloads common to scientific data analysis. To use the indexes, we need to build them first. The index building procedure needs to access the whole data set and may also require a significant amount of compute time. In this work, we adapt in situ processing technology to generate the indexes, thus removing the need to read data from disks and allowing the indexes to be built in parallel. The in situ data processing system used is ADIOS, a middleware for high-performance I/O. Our experimental results show that the indexes can improve the data access time up to 200 times depending on the fraction of data selected, and that using an in situ data processing system can effectively reduce the time needed to create the indexes, up to 10 times with our in situ technique when using identical parallel settings.
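
    The core idea of a bitmap index can be sketched without FastBit: one bit vector per value bin, with a range query answered by OR-ing candidate bitmaps instead of scanning the data. FastBit additionally compresses the bitmaps (e.g., with word-aligned hybrid encoding); the binning scheme below is an illustrative assumption.

        # Uncompressed bitmap index over one numeric column (numpy).
        import numpy as np

        values = np.array([3, 7, 1, 9, 4, 7, 2, 8, 5, 6])  # one dataset column
        bins = np.arange(0, 11, 2)                          # bins: [0,2), [2,4), ...

        bin_ids = np.digitize(values, bins) - 1
        bitmaps = {b: bin_ids == b for b in range(len(bins) - 1)}  # bit vector per bin

        def range_query(lo_bin: int, hi_bin: int) -> np.ndarray:
            """Return row indices whose value falls in bins lo_bin..hi_bin."""
            mask = np.zeros(len(values), dtype=bool)
            for b in range(lo_bin, hi_bin + 1):
                mask |= bitmaps[b]            # OR the candidate bitmaps
            return np.flatnonzero(mask)

        print(range_query(2, 3))              # rows with values in [4, 8)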

  16. Integrated Genome-Based Studies of Shewanella Ecophysiology

    SciTech Connect (OSTI)

    Margrethe H. Serres

    2012-06-29

    Shewanella oneidensis MR-1 is a motile, facultative γ-Proteobacterium with remarkable respiratory versatility; it can utilize a range of organic and inorganic compounds as terminal electron acceptors for anaerobic metabolism. The ability to effectively reduce nitrate, S(0), polyvalent metals, and radionuclides has established MR-1 as an important model dissimilatory metal-reducing microorganism for genome-based investigations of biogeochemical transformation of metals and radionuclides that are of concern to the U.S. Department of Energy (DOE) sites nationwide. Metal-reducing bacteria such as Shewanella also have a highly developed capacity for extracellular transfer of respiratory electrons to solid phase Fe and Mn oxides as well as directly to anode surfaces in microbial fuel cells. More broadly, Shewanellae are recognized free-living microorganisms and members of microbial communities involved in the decomposition of organic matter and the cycling of elements in aquatic and sedimentary systems. To function and compete in environments that are subject to spatial and temporal environmental change, Shewanella must be able to sense and respond to such changes and therefore require relatively robust sensing and regulation systems. The overall goal of this project is to apply the tools of genomics, leveraging the availability of genome sequence for 18 additional strains of Shewanella, to better understand the ecophysiology and speciation of respiratory-versatile members of this important genus. To understand these systems we propose to use genome-based approaches to investigate Shewanella as a system of integrated networks; first describing key cellular subsystems - those involved in signal transduction, regulation, and metabolism - then building towards understanding the function of whole cells and, eventually, cells within populations. As a general approach, this project will employ complementary "top-down" - bioinformatics-based genome functional predictions, high-throughput expression analyses, and functional genomics - and "bottom-up" approaches to uncover key genes as well as metabolic and regulatory networks. The "bottom-up" component employs more traditional approaches, including genetics, physiology, and biochemistry, to test or verify predictions. This information will ultimately be linked to analyses of signal transduction and transcriptional regulatory systems and used to develop a linked model that will contribute to understanding the ecophysiology of Shewanella in redox-stratified environments. A central component of this effort is the development of a data and knowledge integration environment that will allow investigators to query across the individual research domains, link to analysis applications, visualize data in a cell systems context, and produce new knowledge, while minimizing the effort, time, and complexity to participating institutions.

  17. Biogenic iron oxyhydroxide formation at mid-ocean ridge hydrothermal vents: Juan de Fuca Ridge

    SciTech Connect (OSTI)

    Toner, Brandy M.; Santelli, Cara M.; Marcus, Matthew A.; Wirth, Richard; Chan, Clara S.; McCollom, Thomas; Bach, Wolfgang; Edwards, Katrina J.

    2008-05-22

    Here we examine Fe speciation within Fe-encrusted biofilms formed during 2-month seafloor incubations of sulfide mineral assemblages at the Main Endeavor Segment of the Juan de Fuca Ridge. The biofilms were distributed heterogeneously across the surface of the incubated sulfide and composed primarily of particles with a twisted stalk morphology resembling those produced by some aerobic Fe-oxidizing microorganisms. Our objectives were to determine the form of biofilm-associated Fe, and identify the sulfide minerals associated with microbial growth. We used micro-focused synchrotron-radiation X-ray fluorescence mapping (μXRF), X-ray absorption spectroscopy (μEXAFS), and X-ray diffraction (μXRD) in conjunction with focused ion beam (FIB) sectioning and high-resolution transmission electron microscopy (HRTEM). The chemical and mineralogical composition of an Fe-encrusted biofilm was queried at different spatial scales, and the spatial relationship between primary sulfide and secondary oxyhydroxide minerals was resolved. The Fe-encrusted biofilms formed preferentially at pyrrhotite-rich (Fe1-xS, 0 ≤ x ≤ 0.2) regions of the incubated chimney sulfide. At the nanometer spatial scale, particles within the biofilm exhibiting lattice fringing and diffraction patterns consistent with 2-line ferrihydrite were identified infrequently. At the micron spatial scale, Fe μEXAFS spectroscopy and μXRD measurements indicate that the dominant form of biofilm Fe is a short-range ordered Fe oxyhydroxide characterized by pervasive edge-sharing FeO6 octahedral linkages. Double corner-sharing FeO6 linkages, which are common to the Fe oxyhydroxide mineral structures of 2-line ferrihydrite, 6-line ferrihydrite, and goethite, were not detected in the biogenic iron oxyhydroxide (BIO). The suspended development of the BIO mineral structure is consistent with Fe(III) hydrolysis and polymerization in the presence of high concentrations of Fe-complexing ligands. We hypothesize that microbiologically produced Fe-complexing ligands may play critical roles in both the delivery of Fe(II) to oxidases, and the limited Fe(III) oxyhydroxide crystallinity observed within the biofilm. Our research provides insight into the structure and formation of naturally occurring, microbiologically produced Fe oxyhydroxide minerals in the deep sea. We describe the initiation of microbial seafloor weathering, and the morphological and mineralogical signals that result from that process. Our observations provide a starting point from which progressively older and more extensively weathered seafloor sulfide minerals may be examined, with the ultimate goal of improved interpretation of ancient microbial processes and associated biological signatures.

  18. Distributed Data Integration Infrastructure

    SciTech Connect (OSTI)

    Critchlow, T; Ludaescher, B; Vouk, M; Pu, C

    2003-02-24

    The Internet is becoming the preferred method for disseminating scientific data from a variety of disciplines. This can result in information overload on the part of the scientists, who are unable to query all of the relevant sources, even if they know where to find them, what they contain, how to interact with them, and how to interpret the results. A related issue is that keeping up with current trends in information technology often taxes the end-user's expertise and time. Thus, instead of benefiting from this information-rich environment, scientists become experts on a small number of sources and technologies, use them almost exclusively, and develop a resistance to innovations that can enhance their productivity. Enabling information-based scientific advances, in domains such as functional genomics, requires fully utilizing all available information and the latest technologies. In order to address this problem we are developing an end-user-centric, domain-sensitive, workflow-based infrastructure, shown in Figure 1, that will allow scientists to design complex scientific workflows that reflect the data manipulation required to perform their research without an undue burden. We are taking a three-tiered approach to designing this infrastructure, utilizing (1) abstract workflow definition, construction, and automatic deployment, (2) complex agent-based workflow execution, and (3) automatic wrapper generation. In order to construct a workflow, the scientist defines an abstract workflow (AWF) in terminology (semantics and context) that is familiar to him/her. This AWF includes all of the data transformations, selections, and analyses required by the scientist, but does not necessarily specify particular data sources. This abstract workflow is then compiled into an executable workflow (EWF, in our case XPDL) that is then evaluated and executed by the workflow engine. This EWF contains references to specific data sources and interfaces capable of performing the desired actions. In order to provide access to the largest number of resources possible, our lowest level utilizes automatic wrapper generation techniques to create information and data wrappers capable of interacting with the complex interfaces typical in scientific analysis. The remainder of this document outlines our work in these three areas, the impact our work has made, and our plans for the future.
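
    The AWF-to-EWF compilation step can be sketched as binding abstract, domain-vocabulary operations to concrete wrapped resources; everything named below is a hypothetical illustration, not the project's actual XPDL machinery.

        # Toy AWF -> EWF compilation: abstract steps become executable closures.
        from dataclasses import dataclass
        from typing import Callable

        @dataclass
        class AbstractStep:
            operation: str      # domain terminology, e.g. "fetch-sequences"
            params: dict

        # Registry mapping abstract operations to concrete, wrapped resources.
        REGISTRY: dict[str, Callable[[dict], object]] = {
            "fetch-sequences": lambda p: f"rows from {p['source']}",
            "filter":          lambda p: f"kept records where {p['expr']}",
        }

        def compile_workflow(awf: list[AbstractStep]) -> list[Callable[[], object]]:
            """Bind each abstract step to an executable task (the EWF)."""
            return [lambda s=s: REGISTRY[s.operation](s.params) for s in awf]

        awf = [AbstractStep("fetch-sequences", {"source": "genbank-mirror"}),
               AbstractStep("filter", {"expr": "length > 500"})]
        for task in compile_workflow(awf):
            print(task())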

  19. Utilization of the St. Peter Sandstone in the Illinois Basin for CO2 Sequestration

    SciTech Connect (OSTI)

    Will, Robert; Smith, Valerie; Leetaru, Hannes

    2014-09-30

    This project is part of a larger project co-funded by the United States Department of Energy (US DOE) under cooperative agreement DE-FE0002068 from 12/08/2009 through 9/30/2014. The study is to evaluate the potential of formations within the Cambro-Ordovician strata above the Mt. Simon Sandstone as potential targets for carbon dioxide (CO2) sequestration in the Illinois and Michigan Basins. This report evaluates the potential injectivity of the Ordovician St. Peter Sandstone. The evaluation of this formation was accomplished using wireline data, core data, pressure data, and seismic data acquired through funding in this project as well as existing data from two additional, separately funded projects: the US DOE funded Illinois Basin Decatur Project (IBDP) being conducted by the Midwest Geological Sequestration Consortium (MGSC) in Macon County, Illinois, and the Illinois Industrial Carbon Capture and Sequestration (ICCS) Project funded through the American Recovery and Reinvestment Act (ARRA), which received a phase two award from DOE. This study addresses the question of whether or not the St. Peter Sandstone may serve as a suitable target for CO2 sequestration at locations within the Illinois Basin where it lies at greater depths (below the underground source of drinking water (USDW)) than at the IBDP site. The work performed included numerous improvements to the existing St. Peter reservoir model created in 2010. Model size and spatial resolution were increased, resulting in a threefold increase in the number of model cells. Seismic data were utilized to inform spatial porosity distribution, and an extensive core database was used to develop porosity-permeability relationships. The analysis involved a Base Model representative of the St. Peter at in-situ conditions, followed by the creation of two hypothetical models at in-situ + 1,000 feet (ft.) (300 m) and in-situ + 2,000 ft. (600 m) depths through systematic depth-dependent adjustment of the Base Model parameters. Properties for the depth-shifted models were based on a porosity versus depth relationship extracted from the core database, followed by application of the porosity-permeability relationship. Each of the three resulting models was used as input to dynamic simulations with the single-well injection target of 3.2 million tons per annum (MTPA) for 30 years, using an appropriate fracture-gradient-based bottomhole pressure limit for each injection level. Modeling results are presented in terms of well bottomhole pressure (BHP), injection rate profiles, and three-dimensional (3D) saturation and differential pressure volumes at selected simulation times. Results suggest that the target CO2 injection rate of 3.2 MTPA may be achieved in the St. Peter Sandstone at in-situ conditions and at the in-situ +1,000 ft. (300 m) depth using a single injector well. In the latter case the target injection rate is achieved after a ramp-up period which is caused by multi-phase flow effects and thus subject to increased modeling uncertainty. Results confirm that the target rate may not be achieved at the in-situ +2,000 ft. (600 m) level even with multiple wells. These new modeling results for the in-situ case are more optimistic than previous modeling results. This difference is attributed to the difference in methods and data used to develop model permeability distributions. Recommendations for further work include restriction of modeling activity to the in-situ +1,000 ft. (300 m) and shallower depth interval, sensitivity and uncertainty analysis, and refinement of porosity and permeability estimates through depth- and area-selective querying of the available core database. It is also suggested that further modeling efforts include scope for evaluating project performance in terms of metrics directly related to the Environmental Protection Agency (EPA) Class VI permit requirements for the area of review (AoR) definition and post-injection site closure monitoring.
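
    One common way to build the porosity-permeability transform mentioned above is a linear regression of log-permeability against core porosity; the sketch below uses invented core points and is not the project's calibrated relationship.

        # Log-linear porosity-permeability transform from (invented) core data.
        import numpy as np

        phi = np.array([0.08, 0.11, 0.14, 0.17, 0.20, 0.23])   # core porosity (frac)
        k_md = np.array([0.5, 2.0, 9.0, 40.0, 150.0, 600.0])    # permeability (mD)

        slope, intercept = np.polyfit(phi, np.log10(k_md), 1)

        def perm_from_phi(porosity: float) -> float:
            """Predict permeability (mD) from porosity via the fitted transform."""
            return 10 ** (slope * porosity + intercept)

        # A depth-shifted model reduces porosity; the transform then yields the
        # correspondingly lower permeability for the deeper, more compacted rock.
        print(round(perm_from_phi(0.12), 1))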

  20. Algorithms and architectures for high performance analysis of semantic graphs.

    SciTech Connect (OSTI)

    Hendrickson, Bruce Alan

    2005-09-01

    Semantic graphs offer one promising avenue for intelligence analysis in homeland security. They provide a mechanism for describing a wide variety of relationships between entities of potential interest. The vertices are nouns of various types, e.g. people, organizations, events, etc. Edges in the graph represent different types of relationships between entities, e.g. 'is friends with', 'belongs-to', etc. Semantic graphs offer a number of potential advantages as a knowledge representation system. They allow information of different kinds, and collected in differing ways, to be combined in a seamless manner. A semantic graph is a very compressed representation of some of the relationship information. It has been reported that the semantic graph can be two orders of magnitude smaller than the processed intelligence data. This allows for much larger portions of the data universe to be resident in computer memory. Many intelligence queries that are relevant to the terrorist threat are naturally expressed in the language of semantic graphs. One example is the search for 'interesting' relationships between two individuals or between an individual and an event, which can be phrased as a search for short paths in the graph. Another example is the search for an analyst-specified threat pattern, which can be cast as an instance of subgraph isomorphism. It is important to note that many kinds of analysis are not relationship based, so these are not good candidates for semantic graphs. Thus, a semantic graph should always be used in conjunction with traditional knowledge representation and interface methods. Operations that involve looking for chains of relationships (e.g. friend of a friend) are not efficiently executable in a traditional relational database. However, the semantic graph can be thought of as a pre-join of the database, and it is ideally suited for these kinds of operations. Researchers at Sandia National Laboratories are working to facilitate semantic graph analysis. Since intelligence datasets can be extremely large, the focus of this work is on the use of parallel computers. We have been working to develop scalable parallel algorithms that will be at the core of a semantic graph analysis infrastructure. Our work has involved two different thrusts, corresponding to two different computer architectures. The first architecture of interest is distributed memory, message passing computers. These machines are ubiquitous and affordable, but they are challenging targets for graph algorithms. Much of our distributed-memory work to date has been collaborative with researchers at Lawrence Livermore National Laboratory and has focused on finding short paths on distributed memory parallel machines. Our implementation on 32K processors of BlueGene/Light finds shortest paths between two specified vertices in just over a second for random graphs with 4 billion vertices.
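
    The short-path query described above reduces, in its simplest serial form, to breadth-first search over an adjacency list; the toy graph below stands in for a semantic graph and is unrelated to the parallel BlueGene implementation.

        # Shortest relationship chain between two entities via BFS.
        from collections import deque

        graph = {
            "alice": ["acme corp", "bob"],
            "bob": ["rally"],
            "acme corp": ["rally"],
            "rally": [],
        }

        def shortest_path(src: str, dst: str) -> list[str] | None:
            parent = {src: None}
            queue = deque([src])
            while queue:
                v = queue.popleft()
                if v == dst:              # reconstruct the relationship chain
                    path = []
                    while v is not None:
                        path.append(v)
                        v = parent[v]
                    return path[::-1]
                for w in graph.get(v, []):
                    if w not in parent:
                        parent[w] = v
                        queue.append(w)
            return None                   # no relationship chain exists

        print(shortest_path("alice", "rally"))   # ['alice', 'acme corp', 'rally']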

  1. Phase 1 Development Report for the SESSA Toolkit.

    SciTech Connect (OSTI)

    Knowlton, Robert G.; Melton, Brad J; Anderson, Robert J.

    2014-09-01

    The Site Exploitation System for Situational Awareness (SESSA) toolkit, developed by Sandia National Laboratories (SNL), is a comprehensive decision support system for crime scene data acquisition and Sensitive Site Exploitation (SSE). SESSA is an outgrowth of another SNL-developed decision support system, the Building Restoration Operations Optimization Model (BROOM), a hardware/software solution for data acquisition, data management, and data analysis. SESSA was designed to meet forensic crime scene needs as defined by the DoD's Military Criminal Investigation Organization (MCIO). SESSA is a very comprehensive toolkit with a considerable amount of database information managed through a Microsoft SQL (Structured Query Language) database engine, a Geographical Information System (GIS) engine that provides comprehensive mapping capabilities, and an intuitive Graphical User Interface (GUI). An electronic sketch pad module is included. The system also has the ability to efficiently generate the forms necessary for forensic crime scene investigations (e.g., evidence submittal, laboratory requests, and scene notes). SESSA allows the user to capture photos on site, and can read and generate barcode labels that limit transcription errors. SESSA runs on PC computers running Windows 7, but is optimized for touch-screen tablet computers running Windows for ease of use at crime scenes and on SSE deployments. A prototype system for 3-dimensional (3D) mapping and measurements was also developed to complement the SESSA software. The mapping system employs a visual/depth sensor that captures data to create 3D visualizations of an interior space and to make distance measurements with centimeter-level accuracy. Output of this 3D Model Builder module provides a virtual 3D "walk-through" of a crime scene. The 3D mapping system is much less expensive and easier to use than competitive systems. This document covers the basic installation and operation of the SESSA toolkit in order to give the user enough information to start using the toolkit. SESSA is currently a prototype system and this documentation covers the initial release of the toolkit. Funding for SESSA was provided by the Department of Defense (DoD), Assistant Secretary of Defense for Research and Engineering (ASD(R&E)) Rapid Fielding (RF) organization. The project was managed by the Defense Forensic Science Center (DFSC), formerly known as the U.S. Army Criminal Investigation Laboratory (USACIL). ACKNOWLEDGEMENTS: The authors wish to acknowledge the funding support from the DoD ASD(R&E) Rapid Fielding organization and the project management of the DFSC. Special thanks to Mr. Garold Warner of DFSC, who served as the Project Manager. Individuals who worked on the design, functional attributes, algorithm development, system architecture, and software programming include Robert Knowlton, Brad Melton, Robert Anderson, and Wendy Amai.

  2. ETDEWEB versus the World-Wide-Web: a specific database/web comparison

    SciTech Connect (OSTI)

    Cutler, D.

    2010-06-28

    A study was performed comparing user search results from the specialized scientific database on energy-related information, ETDEWEB, with search results from the internet search engines Google and Google Scholar. The primary objective of the study was to determine if ETDEWEB (the Energy Technology Data Exchange – World Energy Base) continues to bring the user search results that are not being found by Google and Google Scholar. As a multilateral information exchange initiative, ETDE’s member countries and partners contribute cost- and task-sharing resources to build the largest database of energy-related information in the world. As of early 2010, the ETDEWEB database has 4.3 million citations to world-wide energy literature. One of ETDEWEB’s strengths is its focused scientific content and direct access to full text for its grey literature (over 300,000 documents in PDF available for viewing from the ETDE site and over a million additional links to where the documents can be found at research organizations and major publishers globally). Google and Google Scholar are well-known for the wide breadth of the information they search, with Google bringing in news, factual and opinion-related information, and Google Scholar also emphasizing scientific content across many disciplines. The analysis compared the results of 15 energy-related queries performed on all three systems using identical words/phrases. A variety of subjects was chosen, although the topics were mostly in renewable energy areas due to broad international interest. Over 40,000 search result records from the three sources were evaluated. The study concluded that ETDEWEB is a significant resource to energy experts for discovering relevant energy information. For the 15 topics in this study, ETDEWEB was shown to bring the user unique results not shown by Google or Google Scholar 86.7% of the time. Much was learned from the study beyond just metric comparisons. Observations about the strengths of each system and factors impacting the search results are also shared along with background information and summary tables of the results. If a user knows a very specific title of a document, all three systems are helpful in finding the user a source for the document. But if the user is looking to discover relevant documents on a specific topic, each of the three systems will bring back a considerable volume of data, but quite different in focus. Google is certainly a highly-used and valuable tool to find significant ‘non-specialist’ information, and Google Scholar does help the user focus on scientific disciplines. But if a user’s interest is scientific and energy-specific, ETDEWEB continues to hold a strong position in the energy research, technology and development (RTD) information field and adds considerable value in knowledge discovery. (auth)
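
    The uniqueness metric at the heart of the comparison is, in essence, a set difference; the sketch below computes, for one query, the fraction of ETDEWEB results returned by neither Google nor Google Scholar (the result sets are placeholders, not study data).

        # Per-query uniqueness of ETDEWEB results relative to the two engines.
        etdeweb = {"doc1", "doc2", "doc3", "doc4", "doc5"}
        google = {"doc2", "web1", "web2"}
        scholar = {"doc3", "paper1"}

        unique = etdeweb - (google | scholar)   # results only ETDEWEB returned
        print(f"{len(unique) / len(etdeweb):.1%} unique to ETDEWEB")  # 60.0% here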

  3. NM WAIDS: A PRODUCED WATER QUALITY AND INFRASTRUCTURE GIS DATABASE FOR NEW MEXICO OIL PRODUCERS

    SciTech Connect (OSTI)

    Martha Cather; Robert Lee; Ibrahim Gundiler; Andrew Sung

    2003-09-24

    The New Mexico Water and Infrastructure Data System (NM WAIDS) seeks to alleviate a number of produced water-related issues in southeast New Mexico. The project calls for the design and implementation of a Geographical Information System (GIS) and integral tools that will provide operators and regulators with necessary data and useful information to help them make management and regulatory decisions. The major components of this system are: (1) Databases on produced water quality, cultural and groundwater data, oil pipeline and infrastructure data, and corrosion information. (2) A web site capable of displaying produced water and infrastructure data in a GIS or accessing some of the data by text-based queries. (3) A fuzzy logic-based site risk assessment tool that can be used to assess the seriousness of a spill of produced water. (4) A corrosion management toolkit that will provide operators with data and information on produced waters that will aid them in deciding how to address corrosion issues. The various parts of NM WAIDS will be integrated into a website with a user-friendly interface that will provide access to previously difficult-to-obtain data and information. Primary attention during the first six months of this project was focused on creating the water quality databases for produced water and surface water, along with collection of corrosion information and building parts of the corrosion toolkit. Work on the project to date includes: (1) Creation of a water quality database for produced water analyses. The database was compiled from a variety of sources and currently has over 7,000 entries for New Mexico. (2) Creation of a web-based data entry system for the water quality database. This system allows a user to view, enter, or edit data from a web page rather than having to directly access the database. (3) Creation of a semi-automated data capturing system for use with standard water quality analysis forms. This system improves the accuracy and speed of water quality data entry. (4) Acquisition of groundwater data from the New Mexico State Engineer's office, including chloride content and TDS (Total Dissolved Solids) for over 30,000 data points in southeast New Mexico. (5) Creation of a scale prediction tool with a web-based interface that uses two common scaling indices to predict the likelihood of scaling. This prediction tool can run either from user-input data or from samples the user selects from the water analysis database. (6) Creation of depth-to-groundwater maps for the study area. (7) Analysis of water quality data by formation. (8) Continuation of efforts to collect produced water quality information from operators in the southeast New Mexico area. (9) Qualitative assessment of produced water from various formations regarding corrosivity. (10) Efforts at corrosion education in the region through operator visits. Future work on this project will include: (1) Development of an integrated web and GIS interface for all the information collected in this effort. (2) Continued development of a fuzzy logic spill risk assessment tool that was initially developed prior to this project. Improvements will include addition of parameters found to be significant in determining the impact of a brine spill at a specific site. (3) Compilation of both hard copy and online corrosion toolkit material.
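
    A scale prediction tool of the kind described typically evaluates indices such as the Langelier Saturation Index (LSI), where LSI > 0 suggests a CaCO3 scaling tendency; the sketch below uses the common Carrier approximation, and whether NM WAIDS implements this exact index is an assumption.

        # Langelier Saturation Index via the common Carrier approximation.
        from math import log10

        def langelier_index(ph: float, temp_c: float, tds_mg_l: float,
                            ca_hardness_mg_l: float, alkalinity_mg_l: float) -> float:
            a = (log10(tds_mg_l) - 1) / 10
            b = -13.12 * log10(temp_c + 273) + 34.55
            c = log10(ca_hardness_mg_l) - 0.4   # Ca hardness as CaCO3 (mg/L)
            d = log10(alkalinity_mg_l)          # alkalinity as CaCO3 (mg/L)
            ph_saturation = (9.3 + a + b) - (c + d)
            return ph - ph_saturation           # > 0 suggests scaling tendency

        # Example: a hot, high-TDS produced water sample (invented values).
        print(round(langelier_index(7.5, 25.0, 40000.0, 1200.0, 300.0), 2))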