National Library of Energy BETA

Sample records for kilowatt-hour parallel generation

  1. NREL Finds Up to 6-cent per Kilowatt-Hour Extra Value with Concentrated Solar Power

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    The greater the penetration of renewables in California, the greater the value of CSP with thermal storage capacity. June 9, 2014. Concentrating Solar Power (CSP) projects would add additional value of 5 or 6 cents per kilowatt hour to utility-scale solar energy in California where 33 percent renewables will be mandated in six years, a new report by the Energy Department's National

  2. Communication Graph Generator for Parallel Programs

    Energy Science and Technology Software Center (OSTI)

    2014-04-08

    Graphator is a collection of relatively simple sequential programs that generate communication graphs/matrices for commonly occurring patterns in parallel programs. Currently, there is support for five communication patterns: two-dimensional 4-point stencil, four-dimensional 8-point stencil, all-to-alls over sub-communicators, random near-neighbor communication, and near-neighbor communication.
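
    As an illustration of what a communication graph for one of these patterns looks like, the sketch below builds the directed rank-to-rank edges for a two-dimensional 4-point stencil. It is not Graphator's code or interface: the function names, the row-major rank numbering, and the `periodic` option are assumptions made for this example.

```python
def stencil_2d_edges(nx, ny, periodic=False):
    """Directed (src, dst) rank pairs for a 2-D 4-point stencil.

    Ranks are numbered row-major on an nx-by-ny grid (an assumption
    of this sketch, not something the record specifies).
    """
    edges = set()
    for x in range(nx):
        for y in range(ny):
            src = x * ny + y
            # Each rank talks to its four axis-aligned neighbors.
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                px, py = x + dx, y + dy
                if periodic:
                    px %= nx
                    py %= ny
                elif not (0 <= px < nx and 0 <= py < ny):
                    continue  # no neighbor beyond a non-periodic boundary
                dst = px * ny + py
                if dst != src:
                    edges.add((src, dst))
    return edges


def to_matrix(edges, nranks):
    """Dense 0/1 communication matrix: m[src][dst] = 1 if src sends to dst."""
    m = [[0] * nranks for _ in range(nranks)]
    for src, dst in edges:
        m[src][dst] = 1
    return m


if __name__ == "__main__":
    for row in to_matrix(stencil_2d_edges(3, 3), 9):
        print(" ".join(str(v) for v in row))
```

    The resulting matrix is symmetric, since every stencil exchange is two-way; a weighted variant would store message counts or bytes instead of 0/1.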

  3. Massively parallel mesh generation for physics codes

    SciTech Connect (OSTI)

    Hardin, D.D.

    1996-06-01

    Massively parallel processors (MPPs) will soon enable realistic 3-D physical modeling of complex objects and systems. Work is planned or presently underway to port many of LLNL's physical modeling codes to MPPs. LLNL's DSI3D electromagnetics code already can solve 40+ million zone problems on the 256 processor Meiko. However, the author lacks the software necessary to generate and manipulate the large meshes needed to model many complicated 3-D geometries. State-of-the-art commercial mesh generators run on workstations and have a practical limit of several hundred thousand elements. In the foreseeable future MPPs will solve problems with a billion mesh elements. The objective of the Parallel Mesh Generation (PMESH) Project is to develop a unique mesh generation system that can construct large 3-D meshes (up to a billion elements) on MPPs. Such a capability will remove a critical roadblock to unleashing the power of MPPs for physical analysis and will put LLNL at the forefront of mesh generation technology. PMESH will "front-end" a variety of LLNL 3-D physics codes, including those in the areas of electromagnetics, structural mechanics, thermal analysis, and hydrodynamics. The DSI3D and DYNA3D codes are already running on MPPs. The primary goal of the PMESH project is to provide the robust generation of large meshes for complicated 3-D geometries through the appropriate distribution of the generation task between the user's workstation and the MPP. Secondary goals are to support the unique features of LLNL physics codes (e.g., unusual elements) and to minimize the user effort required to generate different meshes for the same geometry. PMESH's capabilities are essential because mesh generation is presently a major limiting factor in simulating larger and more complex 3-D geometries. PMESH will significantly enhance LLNL's capabilities in physical simulation by advancing the state-of-the-art in large mesh generation by 2 to 3 orders of magnitude.

  4. SPRNG Scalable Parallel Random Number Generator Library

    Energy Science and Technology Software Center (OSTI)

    2010-03-16

    This revision corrects some errors in SPRNG 1. Users of newer SPRNG versions can obtain the corrected files and build their version with it. This version also improves the scalability of some of the application-based tests in the SPRNG test suite. It also includes an interface to a parallel Mersenne Twister, so that if users install the Mersenne Twister, then they can test this generator with the SPRNG test suite and also use some SPRNG features with that generator.

  5. Building the Next Generation of Parallel Applications: Co-Design...

    Office of Scientific and Technical Information (OSTI)

    Title: Building the Next Generation of Parallel Applications: Co-Design Opportunities and ...

  6. Building the Next Generation of Parallel Applications: Co-Design Opportunities and Challenges. (Conference)

    Office of Scientific and Technical Information (OSTI)

    Abstract not provided. Authors: Heroux, Michael Allen. Publication Date: 2011-04-01. OSTI Identifier: 1108313. Report Number(s): SAND2011-2822C; 470544. DOE Contract Number: AC04-94AL85000.

  7. Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

    SciTech Connect (OSTI)

    Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

    1997-03-01

    Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving, only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm, demonstrate its capabilities on both two-dimensional and three-dimensional surface geometries, and compare the resulting parallel-produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.

  8. Generating unstructured nuclear reactor core meshes in parallel

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Jain, Rajeev; Tautges, Timothy J.

    2014-10-24

    Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on standalone desktop computers because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Japanese MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.

  9. Generating unstructured nuclear reactor core meshes in parallel

    SciTech Connect (OSTI)

    Jain, Rajeev; Tautges, Timothy J.

    2014-10-24

    Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on standalone desktop computers because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Japanese MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.

  10. Parallelization

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallelization of DEBS. By Brian Cornille. A Senior Thesis submitted in partial fulfillment of the requirements for the degree of Bachelor's of Science (Engineering Physics) at the University of Wisconsin - Madison, 2014. Date of final oral examination: December 16th, 2014. Abstract: This thesis presents the parallel development and testing of the DEBS resistive magnetohydrodynamics (MHD) code [D. D. Schnack et al., "Semiimplicit magnetohydrodynamic calculations", Journal of

  11. Asynchronous parallel generating set search for linearly-constrained optimization.

    SciTech Connect (OSTI)

    Lewis, Robert Michael; Griffin, Joshua D.; Kolda, Tamara Gibson

    2006-08-01

    Generating set search (GSS) is a family of direct search methods that encompasses generalized pattern search and related methods. We describe an algorithm for asynchronous linearly-constrained GSS, which has some complexities that make it different from both the asynchronous bound-constrained case as well as the synchronous linearly-constrained case. The algorithm has been implemented in the APPSPACK software framework and we present results from an extensive numerical study using CUTEr test problems. We discuss the results, both positive and negative, and conclude that GSS is a reliable method for solving small-to-medium sized linearly-constrained optimization problems without derivatives.

  12. Bit error rate tester using fast parallel generation of linear recurring sequences

    DOE Patents [OSTI]

    Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.

    2003-05-06

    A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with a minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
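
    The role of the decimation matrix can be checked numerically. The sketch below is a software illustration only, not the patented circuit: the degree-4 primitive polynomial x^4 + x + 1, the state ordering, the values of n and k, and all function names are assumptions chosen for this example. It builds a companion matrix over GF(2), raises it to the (n*k)th power, and confirms that one multiplication by the result advances the generator state by n*k steps.

```python
def companion_matrix(taps, degree):
    """Companion matrix over GF(2) for a[t+degree] = XOR of a[t+i], i in taps.

    The state vector is (a[t], ..., a[t+degree-1]); this ordering is an
    assumption of the sketch.
    """
    c = [[0] * degree for _ in range(degree)]
    for row in range(degree - 1):
        c[row][row + 1] = 1          # shift: next state[row] = state[row + 1]
    for i in taps:
        c[degree - 1][i] = 1         # feedback row produces the new bit
    return c


def mat_mul(a, b):
    """Matrix product over GF(2) (AND for multiply, parity for add)."""
    n = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(n)) & 1
             for j in range(n)] for i in range(n)]


def mat_pow(m, e):
    """m raised to the e-th power over GF(2), by repeated squaring."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def mat_vec(m, v):
    """Matrix-vector product over GF(2)."""
    return [sum(r & x for r, x in zip(row, v)) & 1 for row in m]


def xor_gate_inputs(m):
    """Count of 1 entries, a rough proxy for XOR inputs in hardware."""
    return sum(sum(row) for row in m)


if __name__ == "__main__":
    C = companion_matrix(taps=(0, 1), degree=4)   # x^4 + x + 1
    n, k = 2, 3                                   # illustrative values
    D = mat_pow(C, n * k)                         # decimation matrix
    state = [1, 0, 0, 0]
    stepped = state
    for _ in range(n * k):
        stepped = mat_vec(C, stepped)
    print(mat_vec(D, state) == stepped, xor_gate_inputs(D))
```

    Counting the 1 entries of the decimation matrix gives a crude feel for why the abstract prefers sparse matrices: fewer ones means fewer XOR inputs in the feedback path.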

  13. Feasibility Study of Biomass Electrical Generation on Tribal Lands

    SciTech Connect (OSTI)

    Tom Roche; Richard Hartmann; Joohn Luton; Warren Hudelson; Roger Blomguist; Jan Hacker; Colene Frye

    2005-03-29

    The goals of the St. Croix Tribe are to develop economically viable energy production facilities using readily available renewable biomass fuel sources at an acceptable cost per kilowatt hour ($/kWh), to provide new and meaningful permanent employment, to retain and expand existing employment (logging), and to provide revenues for both producers and sellers of the finished product. This is a feasibility study including an assessment of available biomass fuel, technology assessment, site selection, economic viability given the foreseeable fuel and generation costs, as well as an assessment of the potential markets for renewable energy.

  14. Experimental and cost analyses of a one kilowatt-hour/day domestic refrigerator-freezer

    SciTech Connect (OSTI)

    Vineyard, E.A.; Sand, J.R.

    1997-05-01

    Over the past ten years, government regulations for energy standards, coupled with the utility industry's promotion of energy-efficient appliances, have prompted appliance manufacturers to reduce energy consumption in refrigerator-freezers by approximately 40%. Global concerns over ozone depletion have also required the appliance industry to eliminate CFC-12 and CFC-11 while concurrently improving energy efficiency to reduce greenhouse emissions. In response to expected future regulations that will be more stringent, several design options were investigated for improving the energy efficiency of a conventionally designed, domestic refrigerator-freezer. The options, such as cabinet and door insulation improvements and a high-efficiency compressor, were incorporated into a prototype refrigerator-freezer cabinet and refrigeration system. Baseline energy consumption of the original 1996 production refrigerator-freezer, along with cabinet heat load and compressor calorimeter test results, were extensively documented to provide a firm basis for experimentally measured energy savings. The goal for the project was to achieve an energy consumption that is 50% below the 1993 National Appliance Energy Conservation Act (NAECA) standard for 20 ft³ (570 L) units. Based on discussions with manufacturers to determine the most promising energy-saving options, a laboratory prototype was fabricated and tested to experimentally verify the energy consumption of a unit with vacuum insulation around the freezer, increased door thicknesses, a high-efficiency compressor, a low-wattage condenser fan, a larger counterflow evaporator, and adaptive defrost control.

  15. Fridge of the future: Designing a one-kilowatt-hour/day domestic refrigerator-freezer

    SciTech Connect (OSTI)

    Vineyard, E.A.; Sand, J.R.

    1998-03-01

    An industry/government Cooperative Research and Development Agreement (CRADA) was established to evaluate and test design concepts for a domestic refrigerator-freezer unit that represents approximately 60% of the US market. The goal of the CRADA was to demonstrate advanced technologies which reduce, by 50 percent, the 1993 NAECA standard energy consumption for a 20 ft³ (570 L) top-mount, automatic-defrost, refrigerator-freezer. For a unit this size, the goal translated to an energy consumption of 1.003 kWh/d. The general objective of the research was to facilitate the introduction of cost-efficient technologies by demonstrating design changes that can be effectively incorporated into new products. A 1996 model refrigerator-freezer was selected as the baseline unit for testing. Since the unit was required to meet the 1993 NAECA standards, the energy consumption was quite low (1.676 kWh/d), thus making further reductions in energy consumption very challenging. Among the energy saving features incorporated into the original design of the baseline unit were a low-wattage evaporator fan, increased insulation thicknesses, and liquid line flange heaters.
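
    The figures in this abstract imply a few quantities it does not state outright. The short calculation below derives them from the two reported numbers (the 1.003 kWh/d goal and the 1.676 kWh/d baseline); the implied standard value and the percentages are arithmetic consequences of those numbers, not values quoted in the record, and the function names are made up for this sketch.

```python
TARGET_KWH_PER_DAY = 1.003    # goal: 50% below the 1993 NAECA standard
BASELINE_KWH_PER_DAY = 1.676  # measured consumption of the 1996 baseline unit


def implied_standard(target, fraction_below=0.5):
    """Standard consumption implied by a target set a given fraction below it."""
    return target / (1.0 - fraction_below)


def fractional_reduction(start, end):
    """Fraction by which consumption falls when going from start to end."""
    return (start - end) / start


standard = implied_standard(TARGET_KWH_PER_DAY)
still_needed = fractional_reduction(BASELINE_KWH_PER_DAY, TARGET_KWH_PER_DAY)
already_below = fractional_reduction(standard, BASELINE_KWH_PER_DAY)

if __name__ == "__main__":
    print(f"implied 1993 standard: {standard:.3f} kWh/d")
    print(f"baseline already below standard by: {already_below:.1%}")
    print(f"further cut needed from baseline: {still_needed:.1%}")
```

    In other words, because the baseline unit already beat the standard, reaching the goal required cutting its consumption by roughly two fifths rather than by half.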

  16. Hydropower generation management under uncertainty via scenario analysis and parallel computation

    SciTech Connect (OSTI)

    Escudero, L.F.; Garcia, C.; Fuente, J.L. de la; Prieto, F.J.

    1996-05-01

    The authors present a modeling framework for the robust solution of hydroelectric power management problems with uncertainty in the values of the water inflows and outflows. A deterministic treatment of the problem provides unsatisfactory results, except for very short time horizons. The authors describe a model based on scenario analysis that allows a satisfactory treatment of uncertainty in the model data for medium and long-term planning problems. Their approach results in a huge model with a network submodel per scenario plus coupling constraints. The size of the problem and the structure of the constraints are adequate for the use of decomposition techniques and parallel computation tools. The authors present computational results for both sequential and parallel implementation versions of the codes, running on a cluster of workstations. The codes have been tested on data obtained from the reservoir network of Iberdrola, a power utility owning 50% of the total installed hydroelectric capacity of Spain, and generating 40% of the total energy demand.

  17. Hydropower generation management under uncertainty via scenario analysis and parallel computation

    SciTech Connect (OSTI)

    Escudero, L.F.; Garcia, C.; Fuente, J.L. de la; Prieto, F.J.

    1995-12-31

    The authors present a modeling framework for the robust solution of hydroelectric power management problems with uncertainty in the values of the water inflows and outflows. A deterministic treatment of the problem provides unsatisfactory results, except for very short time horizons. The authors describe a model based on scenario analysis that allows a satisfactory treatment of uncertainty in the model data for medium and long-term planning problems. This approach results in a huge model with a network submodel per scenario plus coupling constraints. The size of the problem and the structure of the constraints are adequate for the use of decomposition techniques and parallel computation tools. The authors present computational results for both sequential and parallel implementation versions of the codes, running on a cluster of workstations. The codes have been tested on data obtained from the reservoir network of Iberdrola, a power utility owning 50% of the total installed hydroelectric capacity of Spain, and generating 40% of the total energy demand.

  18. Parallel Application Performance on Two Generations of Intel Xeon HPC Platforms

    SciTech Connect (OSTI)

    Chang, Christopher H.; Long, Hai; Sides, Scott; Vaidhynathan, Deepthi; Jones, Wesley

    2015-10-15

    Two next-generation node configurations hosting the Haswell microarchitecture were tested with a suite of microbenchmarks and application examples, and compared with a current Ivy Bridge production node on NREL's Peregrine high-performance computing cluster. A primary conclusion from this study is that the additional cores are of little value to individual task performance; limitations to application parallelism, or resource contention among concurrently running but independent tasks, limit effective utilization of these added cores. Hyperthreading generally impacts throughput negatively, but can improve performance in the absence of detailed attention to runtime workflow configuration. The observations offer some guidance to procurement of future HPC systems at NREL. First, raw core count must be balanced with available resources, particularly memory bandwidth. Balance-of-system will determine value more than processor capability alone. Second, hyperthreading continues to be largely irrelevant to the workloads that are commonly seen, and were tested here, at NREL. Finally, perhaps the most impactful enhancement to productivity might occur through enabling multiple concurrent jobs per node. Given the right type and size of workload, more may be achieved by doing many slow things at once than fast things in order.

  19. Parallel octree-based hexahedral mesh generation for eulerian to lagrangian conversion.

    SciTech Connect (OSTI)

    Staten, Matthew L.; Owen, Steven James

    2010-09-01

    Computational simulation must often be performed on domains where materials are represented as scalar quantities or volume fractions at cell centers of an octree-based grid. Common examples include bio-medical, geotechnical or shock physics calculations where interface boundaries are represented only as discrete statistical approximations. In this work, we introduce new methods for generating Lagrangian computational meshes from Eulerian-based data. We focus specifically on shock physics problems that are relevant to ASC codes such as CTH and Alegra. New procedures for generating all-hexahedral finite element meshes from volume fraction data are introduced. A new primal-contouring approach is introduced for defining a geometric domain. New methods for refinement, node smoothing, resolving non-manifold conditions and defining geometry are also introduced as well as an extension of the algorithm to handle tetrahedral meshes. We also describe new scalable MPI-based implementations of these procedures. We describe a new software module, Sculptor, which has been developed for use as an embedded component of CTH. We also describe its interface and its use within the mesh generation code, CUBIT. Several examples are shown to illustrate the capabilities of Sculptor.

  20. Energy Intensity Indicators: Electricity Generation Energy Intensity

    Broader source: Energy.gov [DOE]

    A kilowatt-hour (kWh) of electric energy delivered to the final user has an energy equivalent to 3,412 British thermal units (Btu). Figure E1, below, tracks how much energy was used by the various...
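
    The conversion factor quoted above lends itself to a small helper. The sketch below uses the 3,412 Btu/kWh figure from this record together with the definition of the kilowatt-hour as 3,600,000 joules; the function names are illustrative and the 350,000 kWh input is just a sample value.

```python
BTU_PER_KWH = 3412          # rounded factor quoted in the record above
JOULES_PER_KWH = 3_600_000  # 1 kW sustained for 3,600 seconds


def kwh_to_btu(kwh):
    """Convert delivered electric energy from kWh to Btu."""
    return kwh * BTU_PER_KWH


def kwh_to_joules(kwh):
    """Convert delivered electric energy from kWh to joules."""
    return kwh * JOULES_PER_KWH


def joules_per_btu():
    """Joules per Btu implied by the two (rounded) factors above."""
    return JOULES_PER_KWH / BTU_PER_KWH


if __name__ == "__main__":
    sample_kwh = 350_000
    print(f"{sample_kwh} kWh = {kwh_to_btu(sample_kwh):,} Btu")
    print(f"{sample_kwh} kWh = {kwh_to_joules(sample_kwh):,} J")
    print(f"implied J per Btu: {joules_per_btu():.1f}")
```

    These factors describe the energy content of electricity delivered to the user; they say nothing about the fuel energy consumed to generate it, which is the subject of the intensity indicator itself.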

  1. Property:PotentialHydropowerGeneration | Open Energy Information

    Open Energy Info (EERE)

    for a particular place. Use this type to express a quantity of energy. The default unit for energy on OpenEI is the Kilowatt hour (kWh), which is 3,600,000 Joules. http:...

  2. Property:PotentialOnshoreWindGeneration | Open Energy Information

    Open Energy Info (EERE)

    onshore wind in a place. Use this type to express a quantity of energy. The default unit for energy on OpenEI is the Kilowatt hour (kWh), which is 3,600,000 Joules. http:...

  3. Property:PotentialBiopowerSolidGeneration | Open Energy Information

    Open Energy Info (EERE)

    for a particular place. Use this type to express a quantity of energy. The default unit for energy on OpenEI is the Kilowatt hour (kWh), which is 3,600,000 Joules. http:...

  4. Nevada: Geothermal Brine Brings Low-Cost Power with Big Potential

    Broader source: Energy.gov [DOE]

    Utilizing EERE funds, ElectraTherm developed a geothermal technology that will generate electricity for less than $0.06 per kilowatt hour.

  5. TRANSIMS Parallelization

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    TRANSIMS Parallelization. Background: TRANSIMS was originally developed by Los Alamos National Laboratory to run exclusively on a Linux cluster environment. In this initial version, the only parallelized component was the microsimulator. It worked

  6. Parallel Batch Scripts

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Environments on Genepool. You can run parallel jobs that use MPI or OpenMP on Genepool as long as you make the appropriate changes to your submission script! To investigate the parallel environments that are available on Genepool, you can use:

    Command               Description
    qconf -sp <pename>    Show the configuration for the specified parallel environment.
    qconf -spl            Show a list of all currently configured parallel environments.

    Basic Parallel

  7. Special parallel processing workshop

    SciTech Connect (OSTI)

    1994-12-01

    This report contains viewgraphs from the Special Parallel Processing Workshop. These viewgraphs deal with topics such as parallel processing performance, message passing, queue structure, and other basic concepts dealing with parallel processing.

  8. Parallel Python GDB

    Energy Science and Technology Software Center (OSTI)

    2012-08-05

    PGDB is a lightweight parallel debugger software product. It utilizes the open source gdb debugger inside of a parallel Python framework.

  9. Net Metering

    Broader source: Energy.gov [DOE]

    Net excess generation (NEG) is treated as a kilowatt-hour (kWh) credit or other compensation on the customer's following bill.* At the beginning of the calendar year, a utility will purchase any...

  10. Net Metering

    Broader source: Energy.gov [DOE]

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At the end of a 12-month...

  11. Super Bowl of Energy: Solar Smashes Records | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    MetLife Stadium, the site of yesterday's Super Bowl, features a ring of 1,350 solar panels that can generate 350,000 kilowatt hours of electricity annually. The number of ...

  12. Tax Credits, Rebates & Savings | Department of Energy

    Broader source: Energy.gov (indexed) [DOE]

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At the end of...

  13. Tax Credits, Rebates & Savings | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At...

  14. Tax Credits, Rebates & Savings | Department of Energy

    Broader source: Energy.gov (indexed) [DOE]

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At the...

  15. Applications of Parallel Computers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Computers Applications of Parallel Computers UCB CS267 Spring 2015 Tuesday & Thursday, 9:30-11:00 Pacific Time Applications of Parallel Computers, CS267, is a graduate-level course...

  16. Parallel flow diffusion battery

    DOE Patents [OSTI]

    Yeh, Hsu-Chi; Cheng, Yung-Sung

    1984-08-07

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  17. Parallel flow diffusion battery

    DOE Patents [OSTI]

    Yeh, H.C.; Cheng, Y.S.

    1984-01-01

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  18. PISTON (Portable Data Parallel Visualization and Analysis)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    in a data-parallel way. By using nVidia's freely downloadable Thrust library and our own tools, we can generate executable codes for different acceleration hardware architectures...

  19. Life Cycle Greenhouse Gas Emissions of Coal-Fired Electricity Generation: Systematic Review and Harmonization

    SciTech Connect (OSTI)

    Whitaker, M.; Heath, G. A.; O'Donoughue, P.; Vorum, M.

    2012-04-01

    This systematic review and harmonization of life cycle assessments (LCAs) of utility-scale coal-fired electricity generation systems focuses on reducing variability and clarifying central tendencies in estimates of life cycle greenhouse gas (GHG) emissions. Screening 270 references for quality LCA methods, transparency, and completeness yielded 53 that reported 164 estimates of life cycle GHG emissions. These estimates for subcritical pulverized, integrated gasification combined cycle, fluidized bed, and supercritical pulverized coal combustion technologies vary from 675 to 1,689 grams CO2-equivalent per kilowatt-hour (g CO2-eq/kWh) (interquartile range [IQR] = 890-1,130 g CO2-eq/kWh; median = 1,001), leading to confusion over reasonable estimates of life cycle GHG emissions from coal-fired electricity generation. By adjusting published estimates to common gross system boundaries and consistent values for key operational input parameters (most importantly, combustion carbon dioxide emission factor [CEF]), the meta-analytical process called harmonization clarifies the existing literature in ways useful for decision makers and analysts by significantly reducing the variability of estimates (~53% in IQR magnitude) while maintaining a nearly constant central tendency (~2.2% in median). Life cycle GHG emissions of a specific power plant depend on many factors and can differ from the generic estimates generated by the harmonization approach, but the tightness of distribution of harmonized estimates across several key coal combustion technologies implies that, for some purposes, first-order estimates of life cycle GHG emissions could be based on knowledge of the technology type, coal mine emissions, thermal efficiency, and CEF alone without requiring full LCAs. Areas where new research is necessary to ensure accuracy are also discussed.

  20. Parallel Atomistic Simulations

    SciTech Connect (OSTI)

    Heffelfinger, Grant S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories: those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.

  1. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)


  2. Parallel integrated thermal management

    DOE Patents [OSTI]

    Bennion, Kevin; Thornton, Matthew

    2014-08-19

    Embodiments discussed herein are directed to managing the heat content of two vehicle subsystems through a single coolant loop having parallel branches for each subsystem.

  3. Eclipse Parallel Tools Platform

    Energy Science and Technology Software Center (OSTI)

    2005-02-18

    Designing and developing parallel programs is an inherently complex task. Developers must choose from the many parallel architectures and programming paradigms that are available, and face a plethora of tools that are required to execute, debug, and analyze parallel programs in these environments. Few, if any, of these tools provide any degree of integration, or indeed any commonality in their user interfaces at all. This further complicates the parallel developer's task, hampering software engineering practices, and ultimately reducing productivity. One consequence of this complexity is that best practice in parallel application development has not advanced to the same degree as more traditional programming methodologies. The result is that there is currently no open-source, industry-strength platform that provides a highly integrated environment specifically designed for parallel application development. Eclipse is a universal tool-hosting platform that is designed to provide a robust, full-featured, commercial-quality, industry platform for the development of highly integrated tools. It provides a wide range of core services for tool integration that allow tool producers to concentrate on their tool technology rather than on platform-specific issues. The Eclipse Integrated Development Environment is an open-source project that is supported by over 70 organizations, including IBM, Intel and HP. The Eclipse Parallel Tools Platform (PTP) plug-in extends the Eclipse framework by providing support for a rich set of parallel programming languages and paradigms, and a core infrastructure for the integration of a wide variety of parallel tools. The first version of the PTP is a prototype that only provides minimal functionality for parallel tool integration, support for a small number of parallel architectures, and basic Fortran integration. Future versions will extend the functionality substantially, provide a number of core parallel tools, and provide support across a wide range of parallel architectures and languages.

  4. Parallel computing works

    SciTech Connect (OSTI)

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C³P), a five year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  5. Life Cycle Greenhouse Gas Emissions of Nuclear Electricity Generation: Systematic Review and Harmonization

    SciTech Connect (OSTI)

    Warner, E. S.; Heath, G. A.

    2012-04-01

    A systematic review and harmonization of life cycle assessment (LCA) literature of nuclear electricity generation technologies was performed to determine causes of and, where possible, reduce variability in estimates of life cycle greenhouse gas (GHG) emissions to clarify the state of knowledge and inform decision making. LCA literature indicates that life cycle GHG emissions from nuclear power are a fraction of traditional fossil sources, but the conditions and assumptions under which nuclear power is deployed can have a significant impact on the magnitude of life cycle GHG emissions relative to renewable technologies. Screening 274 references yielded 27 that reported 99 independent estimates of life cycle GHG emissions from light water reactors (LWRs). The published median, interquartile range (IQR), and range for the pool of LWR life cycle GHG emission estimates were 13, 23, and 220 grams of carbon dioxide equivalent per kilowatt-hour (g CO₂-eq/kWh), respectively. After harmonizing methods to use consistent gross system boundaries and values for several important system parameters, the same statistics were 12, 17, and 110 g CO₂-eq/kWh, respectively. Harmonization (especially of performance characteristics) clarifies the estimation of central tendency and variability. To explain the remaining variability, several additional, highly influential consequential factors were examined using other methods. These factors included the primary source energy mix, uranium ore grade, and the selected LCA method. For example, a scenario analysis of future global nuclear development examined the effects of a decreasing global uranium market-average ore grade on life cycle GHG emissions. Depending on conditions, median life cycle GHG emissions could be 9 to 110 g CO₂-eq/kWh by 2050.

  6. Parallel programming with PCN

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  7. UPC (Unified Parallel C)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    UPC (Unified Parallel C) Description Unified Parallel C is a partitioned global address space (PGAS) language and an extension of the C programming language. Availability UPC is available on Edison and Hopper via both the Cray compilers, as well as through Berkeley UPC, a portable high-performance UPC compiler and runtime implementation. Using UPC To compile a UPC source file using the Cray compilers, you must first swap the Cray compiler with the default compiler. On Hopper: % module swap

  8. Charting a Parallel Course

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    National Security Science. Los Alamos has partnered with the U.S. Navy since the Manhattan Project to ensure U.S. national security. March 22, 2016. A Regulus nuclear-armed cruise missile sits aboard the USS Grayback submarine. The Regulus, designed by Los Alamos, was the first nuclear weapon to enter the Navy's stockpile. (Photo: Open Source)

  9. Parallel phase model : a programming model for high-end parallel machines with manycores.

    SciTech Connect (OSTI)

    Wu, Junfeng; Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

    2009-04-01

    This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

  10. Parallel time integration software

    Energy Science and Technology Software Center (OSTI)

    2014-07-01

    This package implements an optimal-scaling multigrid solver for the (non)linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techniques is limited to spatial parallelism. However, current trends in computer architectures are leading towards systems with more, but not faster, processors. Therefore, faster compute speeds must come from greater parallelism. One approach to achieve parallelism in time is with multigrid, but extending classical multigrid methods for elliptic operators to this setting is a significant achievement. In this software, we implement a non-intrusive, optimal-scaling time-parallel method based on multigrid reduction techniques. The examples in the package demonstrate optimality of our multigrid-reduction-in-time algorithm (MGRIT) for solving a variety of parabolic equations in two and three spatial dimensions. These examples can also be used to show that MGRIT can achieve significant speedup in comparison to sequential time marching on modern architectures.

  11. Parallel optical sampler

    DOE Patents [OSTI]

    Tauke-Pedretti, Anna; Skogen, Erik J; Vawter, Gregory A

    2014-05-20

    An optical sampler includes a first and second 1×n optical beam splitters splitting an input optical sampling signal and an optical analog input signal into n parallel channels, respectively, a plurality of optical delay elements providing n parallel delayed input optical sampling signals, n photodiodes converting the n parallel optical analog input signals into n respective electrical output signals, and n optical modulators modulating the input optical sampling signal or the optical analog input signal by the respective electrical output signals, and providing n successive optical samples of the optical analog input signal. A plurality of output photodiodes and eADCs convert the n successive optical samples to n successive digital samples. The optical modulator may be a photodiode interconnected Mach-Zehnder Modulator. A method of sampling the optical analog input signal is disclosed.

  12. Parallel programming with Ada

    SciTech Connect (OSTI)

    Kok, J.

    1988-01-01

    To the human programmer the ease of coding distributed computing is highly dependent on the suitability of the employed programming language. But with a particular language it is also important whether the possibilities of one or more parallel architectures can efficiently be addressed by available language constructs. In this paper the possibilities are discussed of the high-level language Ada and in particular of its tasking concept as a descriptional tool for the design and implementation of numerical and other algorithms that allow execution of parts in parallel. Language tools are explained and their use for common applications is shown. Conclusions are drawn about the usefulness of several Ada concepts.

  13. Parallel Dislocation Simulator

    Energy Science and Technology Software Center (OSTI)

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  14. Parallel Total Energy

    Energy Science and Technology Software Center (OSTI)

    2004-10-21

    This is a total energy electronic structure code using the Local Density Approximation (LDA) of density functional theory. It uses plane waves as the wave function basis set. It can use both norm-conserving pseudopotentials and ultrasoft pseudopotentials. It can relax the atomic positions according to the total energy. It is a parallel code using MPI.

  15. Parallel programming with PCN

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  16. Parallel Multigrid Equation Solver

    Energy Science and Technology Software Center (OSTI)

    2001-09-07

    Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved linear elasticity problems of up to 76 million degrees of freedom on the ASCI Blue Pacific and ASCI Red machines.

  17. Ultrascalable petaflop parallel supercomputer

    DOE Patents [OSTI]

    Blumrich, Matthias A.; Chen, Dong; Chiu, George; Cipolla, Thomas M.; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Hall, Shawn; Haring, Rudolf A.; Heidelberger, Philip; Kopcsay, Gerard V.; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  18. Parallel grid population

    DOE Patents [OSTI]

    Wald, Ingo; Ize, Santiago

    2015-07-28

    Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
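
    The two-phase method described in this abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the patent: the grid portions are contiguous row slabs, objects are axis-aligned boxes given as inclusive (row0, row1, col0, col1) cell ranges, threads stand in for processors, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def portion_bounds(n, grid_rows):
    """Split rows [0, grid_rows) into n contiguous slabs, one per processor."""
    base, extra = divmod(grid_rows, n)
    bounds, start = [], 0
    for p in range(n):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))  # half-open row interval
        start = stop
    return bounds


def classify(object_set, bounds):
    """Phase 1: for one set of objects, record which portions overlap each object."""
    hits = {p: [] for p in range(len(bounds))}
    for obj_id, (r0, r1, _c0, _c1) in object_set:
        for p, (start, stop) in enumerate(bounds):
            if r0 < stop and r1 >= start:  # inclusive rows r0..r1 touch the slab
                hits[p].append(obj_id)
    return hits


def populate(p, bounds, candidates, boxes, grid_cols):
    """Phase 2: processor p fills only the cells inside its own portion."""
    start, stop = bounds[p]
    cells = {}
    for obj_id in candidates:
        r0, r1, c0, c1 = boxes[obj_id]
        for r in range(max(r0, start), min(r1, stop - 1) + 1):
            for c in range(max(c0, 0), min(c1, grid_cols - 1) + 1):
                cells.setdefault((r, c), []).append(obj_id)
    return cells


def parallel_populate(boxes, grid_rows, grid_cols, n):
    """Populate a grid using n workers, following the two phases above."""
    bounds = portion_bounds(n, grid_rows)
    items = sorted(boxes.items())
    object_sets = [items[i::n] for i in range(n)]  # n distinct sets of objects
    with ThreadPoolExecutor(max_workers=n) as pool:
        phase1 = list(pool.map(lambda s: classify(s, bounds), object_sets))
        per_portion = [
            sorted(o for hits in phase1 for o in hits[p]) for p in range(n)
        ]
        phase2 = list(
            pool.map(
                lambda p: populate(p, bounds, per_portion[p], boxes, grid_cols),
                range(n),
            )
        )
    grid = {}
    for cells in phase2:  # portions are disjoint, so no cell is written twice
        grid.update(cells)
    return grid


boxes = {"a": (0, 1, 0, 1), "b": (1, 3, 1, 2)}
grid = parallel_populate(boxes, grid_rows=4, grid_cols=4, n=2)
```

    Because each worker writes only to its own portion in the second phase, the workers need no locking on grid cells in this sketch; the only exchange is the list of candidate objects per portion produced by the first phase.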

  19. Xyce parallel electronic simulator.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

  20. Hybrid Optimization Parallel Search PACKage

    Energy Science and Technology Software Center (OSTI)

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.

  1. Parallel Harness for Informatic Stream Hashing

    Energy Science and Technology Software Center (OSTI)

    2012-09-11

    PHISH is a lightweight framework which a set of independent processes can use to exchange data as they run on the same desktop machine, on processors of a parallel machine, or on different machines across a network. This enables them to work in a coordinated parallel fashion to perform computations on either streaming, archived, or self-generated data. The PHISH distribution includes a simple, portable library for performing data exchanges in useful patterns either via MPI message-passing or ZMQ sockets. PHISH input scripts are used to describe a data-processing algorithm, and additional tools provided in the PHISH distribution convert the script into a form that can be launched as a parallel job.

  2. Parallel ptychographic reconstruction

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-12-19

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source.

  3. Parallel ptychographic reconstruction

    SciTech Connect (OSTI)

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-12-19

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source.

  4. Unified Parallel Software

    Energy Science and Technology Software Center (OSTI)

    2003-12-01

    UPS (Unified Parallel Software) is a collection of software tools (libraries, scripts, executables) that assist in parallel programming. This consists of:
    o libups.a: C/Fortran callable routines for message passing (utilities written on top of MPI) and file IO (utilities written on top of HDF).
    o libuserd-HDF.so: EnSight user-defined reader for visualizing data files written with UPS File IO.
    o ups_libuserd_query, ups_libuserd_prep.pl, ups_libuserd_script.pl: executables/scripts to get information from data files and to simplify the use of EnSight on those data files.
    o ups_io_rm/ups_io_cp: manipulate data files written with UPS File IO.
    These tools are portable to a wide variety of Unix platforms.

  5. Multilingual interfaces for parallel coupling in multiphysics and multiscale systems.

    SciTech Connect (OSTI)

    Ong, E. T.; Larson, J. W.; Norris, B.; Jacob, R. L.; Tobis, M.; Steder, M.; Mathematics and Computer Science; Univ. of Wisconsin; Australian National Univ.; Univ. of Chicago

    2007-01-01

    Multiphysics and multiscale simulation systems are emerging as a new grand challenge in computational science, largely because of increased computing power provided by the distributed-memory parallel programming model on commodity clusters. These systems often present a parallel coupling problem in their intercomponent data exchanges. Another potential problem in these coupled systems is language interoperability between their various constituent codes. In anticipation of combined parallel coupling/language interoperability challenges, we have created a set of interlanguage bindings for a successful parallel coupling library, the Model Coupling Toolkit (MCT). We describe the method used for automatically generating the bindings using the Babel language interoperability tool, and illustrate with short examples how MCT can be used from the C++ and Python languages. We report preliminary performance results for the MCT interpolation benchmark. We conclude with a discussion of the significance of this work to the rapid prototyping of large parallel coupled systems.

  6. Apply for the Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Apply for the Parallel Computing Summer Research Internship. Creating next-generation leaders in HPC research and applications development. Program Co-Leads: Robert (Bob) Robey, Gabriel Rockefeller, Hai Ah Nam. Professional Staff Assistant: Nicole Aguilar Garcia, (505) 665-3048. Current application deadline is February 5, 2016 with notification by early March 2016. Who can apply? Upper division undergraduate

  7. Template based parallel checkpointing in a massively parallel computer system

    DOE Patents [OSTI]

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
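
    The idea of comparing each node's checkpoint data against a stored template and keeping only the differing blocks with their checksums can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: blocks are compared at fixed offsets only (the rsync protocol also matches blocks at shifted offsets using a rolling checksum), SHA-256 and zlib are stand-ins chosen for the example, the block size is unrealistically small, and the broadcast step is omitted. All names are hypothetical.

```python
import hashlib
import zlib

BLOCK = 4  # hypothetical block size in bytes, tiny so the example is readable


def blocks(data, size=BLOCK):
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksum(block):
    """Checksum of one block; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(block).hexdigest()


def make_delta(node_data, template_sums, size=BLOCK):
    """Keep only blocks whose checksum differs from the template block at that index."""
    delta = {}
    for i, blk in enumerate(blocks(node_data, size)):
        s = checksum(blk)
        if i >= len(template_sums) or template_sums[i] != s:
            delta[i] = (s, zlib.compress(blk))  # lossless compression of the block
    return delta


def restore(template, delta, length, size=BLOCK):
    """Rebuild a node's checkpoint from the template plus its stored delta."""
    out = blocks(template, size)
    for i, (s, compressed) in delta.items():
        blk = zlib.decompress(compressed)
        if checksum(blk) != s:
            raise ValueError("checksum mismatch in block %d" % i)
        while len(out) <= i:
            out.append(b"")
        out[i] = blk
    return b"".join(out)[:length]


template = b"AAAABBBBCCCC"
template_sums = [checksum(b) for b in blocks(template)]
node_data = b"AAAAXXXXCCCC"
delta = make_delta(node_data, template_sums)
restored = restore(template, delta, len(node_data))
```

    In this example only the middle block differs from the template, so the delta holds one block instead of three, which is the source of the reduction in transmitted and stored data that the abstract describes.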

  8. Parallel_HDF5.pptx

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    ... * Some parallel formats using HDF * CFD: CGNS * Meshless Methods: H5Part * FEM: MOAB * General: NetCDF * Hides the complexity of HDF May 21, 2015 167 Mira Performance Boot ...

  9. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  10. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  11. Small file aggregation in a parallel computing system

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Grider, Gary; Zhang, Jingwang

    2014-09-02

    Techniques are provided for small file aggregation in a parallel computing system. An exemplary method for storing a plurality of files generated by a plurality of processes in a parallel computing system comprises aggregating the plurality of files into a single aggregated file; and generating metadata for the single aggregated file. The metadata comprises an offset and a length of each of the plurality of files in the single aggregated file. The metadata can be used to unpack one or more of the files from the single aggregated file.
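
    The offset-and-length metadata described in this abstract can be shown with a short in-memory sketch. The function names and the dictionary layout of the metadata are assumptions made for illustration; the patent abstract does not specify a format.

```python
def aggregate(files):
    """Pack many small files into one blob and record where each one lives.

    files: dict mapping a file name to its bytes.
    Returns (blob, metadata), where metadata maps each name to its offset and length.
    """
    blob = bytearray()
    metadata = {}
    for name, data in files.items():
        metadata[name] = {"offset": len(blob), "length": len(data)}
        blob.extend(data)
    return bytes(blob), metadata


def unpack(blob, metadata, name):
    """Extract one file from the aggregated blob using only its metadata entry."""
    entry = metadata[name]
    start = entry["offset"]
    return blob[start:start + entry["length"]]


small_files = {
    "rank0.out": b"alpha",
    "rank1.out": b"be",
    "rank2.out": b"gamma!",
}
blob, metadata = aggregate(small_files)
```

    A single file can then be read back with `unpack(blob, metadata, "rank1.out")` without touching the other entries, which is the unpacking use of the metadata that the abstract mentions.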

  12. Global synchronization of parallel processors using clock pulse width modulation

    DOE Patents [OSTI]

    Chen, Dong; Ellavsky, Matthew R.; Franke, Ross L.; Gara, Alan; Gooding, Thomas M.; Haring, Rudolf A.; Jeanson, Mark J.; Kopcsay, Gerard V.; Liebsch, Thomas A.; Littrell, Daniel; Ohmacht, Martin; Reed, Don D.; Schenck, Brandon E.; Swetz, Richard A.

    2013-04-02

    A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and perform a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.

  13. An integrated approach to improving the parallel applications development process

    SciTech Connect (OSTI)

    Rasmussen, Craig E; Watson, Gregory R; Tibbitts, Beth R

    2009-01-01

    The development of parallel applications is becoming increasingly important to a broad range of industries. Traditionally, parallel programming was a niche area that was primarily exploited by scientists trying to model extremely complicated physical phenomena. It is becoming increasingly clear, however, that continued hardware performance improvements through clock scaling and feature-size reduction are simply not going to be achievable for much longer. The hardware vendors' approach to addressing this issue is to employ parallelism through multi-processor and multi-core technologies. While there is little doubt that this approach produces scaling improvements, there are still many significant hurdles to be overcome before parallelism can be employed as a general replacement to more traditional programming techniques. The Parallel Tools Platform (PTP) Project was created in 2005 in an attempt to provide developers with new tools aimed at addressing some of the parallel development issues. Since then, the introduction of a new generation of peta-scale and multi-core systems has highlighted the need for such a platform. In this paper, we describe some of the challenges facing parallel application developers, present the current state of PTP, and provide a simple case study that demonstrates how PTP can be used to locate a potential deadlock situation in an MPI code.

  14. Computing contingency statistics in parallel.

    SciTech Connect (OSTI)

    Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre

    2010-09-01

    Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and χ² independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
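
    The map-reduce pattern described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the sample data are hypothetical, and the "processes" are simulated by plain Python lists. Each process counts its own (x, y) pairs, the tables are summed cell by cell, and derived statistics are read off the merged table.

```python
from collections import Counter
from math import log

def local_table(rows):
    """Map step: count the (x, y) pairs seen by one process."""
    return Counter(rows)

def merge_tables(tables):
    """Reduce step: sum the per-process counts cell by cell."""
    total = Counter()
    for t in tables:
        total.update(t)
    return total

def derived_statistics(table):
    """Joint and marginal probabilities plus point-wise mutual information."""
    n = sum(table.values())
    px, py = Counter(), Counter()
    for (x, y), c in table.items():
        px[x] += c / n
        py[y] += c / n
    joint = {k: c / n for k, c in table.items()}
    pmi = {(x, y): log(p / (px[x] * py[y])) for (x, y), p in joint.items()}
    return joint, px, py, pmi

# Hypothetical observations split across two simulated processes.
chunks = [
    [("a", 0), ("a", 0), ("b", 1)],
    [("a", 1), ("b", 1), ("b", 1)],
]
table = merge_tables(local_table(c) for c in chunks)
joint, px, py, pmi = derived_statistics(table)
```

    The merged table has one cell per distinct (x, y) pair, so the amount of data exchanged in the reduce step grows with the number of distinct pairs rather than staying fixed, which is the communication cost the abstract contrasts with moment-based statistics.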

  15. An efficient parallel algorithm for matrix-vector multiplication

    SciTech Connect (OSTI)

    Hendrickson, B.; Leland, R.; Plimpton, S.

    1993-03-01

    The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
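
    To give a feel for where a term proportional to n/√p can come from, the sketch below simulates a generic two-dimensional block decomposition on a √p x √p grid of "processors". It is an assumption-laden illustration, not the algorithm from the paper: the function name is hypothetical, the loop runs serially, and the final accumulation merely stands in for a reduction across each processor row.

```python
from math import isqrt

def block_matvec(A, x, p):
    """Simulate y = A x on a sqrt(p) x sqrt(p) grid of 'processors'."""
    n = len(A)
    q = isqrt(p)
    assert q * q == p and n % q == 0, "sketch assumes p is a square and sqrt(p) divides n"
    b = n // q                       # edge length of each matrix block
    y = [0.0] * n
    for i in range(q):               # processor row
        for j in range(q):           # processor column
            # Processor (i, j) owns block A[i*b:(i+1)*b][j*b:(j+1)*b] and
            # needs only the n/sqrt(p) vector entries x[j*b:(j+1)*b].
            xs = x[j * b:(j + 1) * b]
            for r in range(i * b, (i + 1) * b):
                partial = sum(A[r][j * b + c] * xs[c] for c in range(b))
                y[r] += partial      # stands in for the row-wise reduction
    return y

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
x = [1, 0, -1, 2]
y = block_matvec(A, x, 4)
```

    In this layout each simulated processor touches n/√p entries of x and produces n/√p partial sums, regardless of which matrix entries are zero.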

  16. BAGEL: New-generation parallel quantum chemistry program | Argonne

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    National Nuclear Security Administration B61-12 Life Extension Program Undergoes First Full-Scale Wind Tunnel Test April 14, 2014 WASHINGTON, D.C. - The National Nuclear Security Administration (NNSA) announced today that its Sandia National Laboratories successfully completed the first full-scale wind tunnel test of the B61-12 as part of the NNSA's ongoing effort to refurbish the B61 nuclear bomb. The purpose of this test was to characterize counter torque, the interaction between the spin

  17. Designing a parallel simula machine

    SciTech Connect (OSTI)

    Papazoglou, M.P.; Georgiadis, P.I.; Maritsas, D.G.

    1983-10-01

    The parallel simula machine (PSM) architecture is based upon a master/slave topology, incorporating a master microprocessor. Interconnection circuitry between the master and slave processor modules uses a timesharing system bus and various programmable interrupt control units. Common and private memory modules reside in the PSM, and direct memory access transfers ease the master processor's workload. 5 references.

  18. Soboba Band of Luiseño Indians – 2015 Project

    Broader source: Energy.gov [DOE]

    The Soboba Community Solar Energy Project proposes installation of a 1.0-megawatt (MW) AC ground-mounted photovoltaic (PV) system that, once installed, will generate approximately 1,884,686 kilowatt-hours (kWh)/year, meeting 80% of the annual energy needs of key community facilities.
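
    As a rough check on these figures, the sketch below computes the capacity factor implied by the stated output and rating, and the total annual demand implied by the 80% figure. It assumes a non-leap year of 8,760 hours and treats the 1.0 MW AC rating as the reference capacity; both are assumptions made for illustration, not values taken from the project description.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year

def capacity_factor(annual_kwh, rated_kw):
    """Fraction of the rated output actually delivered over one year."""
    return annual_kwh / (rated_kw * HOURS_PER_YEAR)

cf = capacity_factor(1_884_686, 1_000)   # 1.0 MW = 1,000 kW
implied_need = 1_884_686 / 0.80          # kWh/year if the output covers 80% of demand
```

    Under these assumptions the implied capacity factor is roughly 21.5%, and the facilities' total annual demand would be about 2.36 million kWh.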

  19. Project Reports for Soboba Band of Luiseño Indians – 2015 Project

    Broader source: Energy.gov [DOE]

    Under this grant, the Soboba Band of Luiseño Indians plans to install the Soboba Community Solar Energy Project, a 1.0-megawatt (MW) AC ground-mounted photovoltaic (PV) system that, once installed, will generate approximately 1,884,686 kilowatt-hours (kWh)/year, meeting 80% of the annual energy needs of key community facilities.

  20. PREP | National Nuclear Security Administration

    National Nuclear Security Administration (NNSA)

    Under Secretary Klotz delivers remarks at PREP ribbon-cutting. Under Secretary Klotz delivered remarks at the Pantex Renewable Energy Project (PREP) ribbon-cutting this week. PREP establishes the largest federally-owned wind farm in the country and will generate approximately 47 million kilowatt-hours of electricity annually, more than 60 percent of the...

  1. Cooperative storage of shared files in a parallel computing system with dynamic block size

    DOE Patents [OSTI]

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-11-10

    Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
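
    The block-size rule given as an example in this abstract (total data divided by the number of processes) can be sketched as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, ceiling division is my choice for handling remainders, and concatenating the buffers stands in for the data exchange between processes, which a real system would perform by moving only the surplus bytes.

```python
def rebalance(buffers):
    """Return the block size and one block per process, each close to total/N bytes."""
    n = len(buffers)
    total = sum(len(b) for b in buffers)
    block = -(-total // n)           # ceiling division; the last block may be shorter
    stream = b"".join(buffers)       # stands in for the inter-process exchange
    return block, [stream[i * block:(i + 1) * block] for i in range(n)]

# Hypothetical per-process output of uneven size.
buffers = [b"aaaaaaa", b"b", b"cccc"]
block, blocks = rebalance(buffers)
```

    After rebalancing, every process holds a block of the same dynamically determined size (except possibly the last), and the byte order of the shared object is preserved.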

  2. Parallel Power Grid Simulation Toolkit

    Energy Science and Technology Software Center (OSTI)

    2015-09-14

    ParGrid is a 'wrapper' that integrates a coupled Power Grid Simulation toolkit consisting of a library to manage the synchronization and communication of independent simulations. The included library code in ParGrid, named FSKIT, is intended to support the coupling of multiple continuous and discrete event parallel simulations. The code is designed using modern object oriented C++ methods utilizing C++11 and current Boost libraries to ensure compatibility with multiple operating systems and environments.

  3. Compact Mesh Generator

    Energy Science and Technology Software Center (OSTI)

    2007-02-02

    The CMG is a small, lightweight, structured mesh generation code. It features a simple text input parser that allows setup of various meshes via a small set of text commands. Mesh generation data can be output to text, the silo file format, or the API can be directly queried by applications. It can run serially or in parallel via MPI. The CMG includes the ability to specify various initial conditions on a mesh via mesh tags.

  4. Xyce parallel electronic simulator design.

    SciTech Connect (OSTI)

    Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

    2010-09-01

    This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

  5. Efficient parallel global garbage collection on massively parallel computers

    SciTech Connect (OSTI)

    Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori

    1994-12-31

    On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient Garbage Collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimum pause time of ongoing computations, and (4) has been shown to scale up to 1024 node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts numbers of messages and the other uses network 'bulldozing.' Performance evaluation in actual implementations on a multicomputer with 32-1024 nodes, Fujitsu AP1000, reveals various favorable properties of the algorithm.

  6. Device for balancing parallel strings

    DOE Patents [OSTI]

    Mashikian, Matthew S.

    1985-01-01

    A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.

  7. Information hiding in parallel programs

    SciTech Connect (OSTI)

    Foster, I.

    1992-01-30

    A fundamental principle in program design is to isolate difficult or changeable design decisions. Application of this principle to parallel programs requires identification of decisions that are difficult or subject to change, and the development of techniques for hiding these decisions. We experiment with three complex applications, and identify mapping, communication, and scheduling as areas in which decisions are particularly problematic. We develop computational abstractions that hide such decisions, and show that these abstractions can be used to develop elegant solutions to programming problems. In particular, they allow us to encode common structures, such as transforms, reductions, and meshes, as software cells and templates that can reused in different applications. An important characteristic of these structures is that they do not incorporate mapping, communication, or scheduling decisions: these aspects of the design are specified separately, when composing existing structures to form applications. This separation of concerns allows the same cells and templates to be reused in different contexts.

  8. Parallel computing in enterprise modeling.

    SciTech Connect (OSTI)

    Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

    2008-08-01

    This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

  9. Data communications for a collective operation in a parallel...

    Office of Scientific and Technical Information (OSTI)

    interface of a parallel computer Title: Data communications for a collective operation in a parallel active messaging interface of a parallel computer Algorithm selection for ...

  10. High voltage pulse generator

    DOE Patents [OSTI]

    Fasching, George E.

    1977-03-08

    An improved high-voltage pulse generator has been provided which is especially useful in ultrasonic testing of rock core samples. N capacitors are charged in parallel to V volts and at the proper instant are coupled in series to produce a high-voltage pulse of N times V volts. Rapid switching of the capacitors from the paralleled charging configuration to the series discharging configuration is accomplished by using silicon-controlled rectifiers which are chain self-triggered following the initial triggering of a first one of the rectifiers connected between the first and second of the plurality of charging capacitors. A timing and triggering circuit is provided to properly synchronize triggering pulses to the first SCR at a time when the charging voltage is not being applied to the parallel-connected charging capacitors. Alternate circuits are provided for controlling the application of the charging voltage from a charging circuit to be applied to the parallel capacitors which provides a selection of at least two different intervals in which the charging voltage is turned "off" to allow the SCR's connecting the capacitors in series to turn "off" before recharging begins. The high-voltage pulse-generating circuit including the N capacitors and corresponding SCR's which connect the capacitors in series when triggered "on" further includes diodes and series-connected inductors between the parallel-connected charging capacitors which allow sufficiently fast charging of the capacitors for a high pulse repetition rate and yet allow considerable control of the decay time of the high-voltage pulses from the pulse-generating circuit.
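
    The parallel-charge, series-discharge arithmetic in this abstract can be checked with a short calculation. The sketch assumes ideal, identical capacitors and lossless switching; the function name and the component values are hypothetical and are not taken from the patent.

```python
def series_discharge(n, v, c):
    """Ideal output voltage and energies for n capacitors of c farads charged to v volts."""
    series_capacitance = c / n               # n equal capacitors in series
    output_voltage = n * v                   # voltages add in series
    stored = n * 0.5 * c * v ** 2            # energy held by the bank after charging
    delivered = 0.5 * series_capacitance * output_voltage ** 2
    return output_voltage, stored, delivered

# Hypothetical values: ten 1 microfarad capacitors charged to 500 V.
voltage, stored, delivered = series_discharge(10, 500.0, 1e-6)
```

    In the ideal case the energy available in the series configuration equals the energy stored during parallel charging; the voltage is multiplied by N while the effective capacitance is divided by N.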

  11. Parallel Programming and Optimization for Intel Architecture

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Programming and Optimization for Intel Architecture Parallel Programming and Optimization for Intel Architecture August 14, 2015 by Richard Gerber Intel is sponsoring a series of webinars entitled "Parallel Programming and Optimization for Intel Architecture." Here's the schedule for August (Registration link is: https://attendee.gotowebinar.com/register/6325131222429932289) Mon, August 17 - "Hello world from Intel Xeon Phi coprocessors". Overview of architecture,

  12. Parallel Programming with MPI | Argonne Leadership Computing...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Programming with MPI Event Sponsor: Mathematics and Computer Science Division ...permalinksargonne16mpi.php The Mathematics and Computer Science division of ...

  13. Optimize Parallel Pumping Systems | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    This tip sheet describes how to optimize the performance of multiple pumps operating continuously as part of a parallel pumping system. PUMPING SYSTEMS TIP SHEET 8

  14. Parallel auto-correlative statistics with VTK.

    SciTech Connect (OSTI)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.

  15. Petascale Parallelization of the Gyrokinetic Toroidal Code

    SciTech Connect (OSTI)

    Ethier, Stephane; Adams, Mark; Carter, Jonathan; Oliker, Leonid

    2010-05-01

    The Gyrokinetic Toroidal Code (GTC) is a global, three-dimensional particle-in-cell application developed to study microturbulence in tokamak fusion devices. The global capability of GTC is unique, allowing researchers to systematically analyze important dynamics such as turbulence spreading. In this work we examine a new radial domain decomposition approach to allow scalability onto the latest generation of petascale systems. Extensive performance evaluation is conducted on three high performance computing systems: the IBM BG/P, the Cray XT4, and an Intel Xeon Cluster. Overall results show that the radial decomposition approach dramatically increases scalability, while reducing the memory footprint - allowing for fusion device simulations at an unprecedented scale. After a decade where high-end computing (HEC) was dominated by the rapid pace of improvements to processor frequencies, the performance of next-generation supercomputers is increasingly differentiated by varying interconnect designs and levels of integration. Understanding the tradeoffs of these system designs is a key step towards making effective petascale computing a reality. In this work, we examine a new parallelization scheme for the Gyrokinetic Toroidal Code (GTC) micro-turbulence fusion application. Extensive scalability results and analysis are presented on three HEC systems: the IBM BlueGene/P (BG/P) at Argonne National Laboratory, the Cray XT4 at Lawrence Berkeley National Laboratory, and an Intel Xeon cluster at Lawrence Livermore National Laboratory. Overall results indicate that the new radial decomposition approach successfully attains unprecedented scalability to 131,072 BG/P cores by overcoming the memory limitations of the previous approach. The new version is well suited to utilize emerging petascale resources to access new regimes of physical phenomena.

  16. Composing Data Parallel Code for a SPARQL Graph Engine

    SciTech Connect (OSTI)

    Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste; Haglin, David J.; Feo, John

    2013-09-08

    Big data analytics processes large amounts of data to extract knowledge from it. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph-based representation provides several benefits, such as the possibility to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 Shared-memory Multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.

  17. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOE Patents [OSTI]

    Karasick, Michael S.; Strip, David R.

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.
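
    The directed-edge idea in this abstract can be sketched as follows. The record layout, the function name, and the mapping from processor label to d-edge are assumptions made for illustration; the patent text quoted here states only that each d-edge relates one edge to one face and carries vertex and face descriptions. Each simulated processor uses nothing but its own label and the shared model description to build its record.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DEdge:
    tail: tuple   # coordinates of the edge's start vertex
    head: tuple   # coordinates of the edge's end vertex
    face: int     # index of the single face this d-edge belongs to

def d_edge_for_label(label, vertices, faces):
    """Build the d-edge owned by the processor with this label."""
    for f, loop in enumerate(faces):
        if label < len(loop):
            a, b = loop[label], loop[(label + 1) % len(loop)]
            return DEdge(vertices[a], vertices[b], f)
        label -= len(loop)
    raise IndexError("label exceeds the number of d-edges")

# Hypothetical model: a tetrahedron with consistently oriented faces.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
faces = [(0, 2, 1), (0, 1, 3), (1, 2, 3), (0, 3, 2)]
d_edges = [d_edge_for_label(k, vertices, faces) for k in range(12)]
```

    Because each record is derived from the label alone, the twelve records can be produced independently, which mirrors the abstract's point that no processor-to-processor communication is needed for this step.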

  18. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOE Patents [OSTI]

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.

  19. Differences Between Distributed and Parallel Systems

    SciTech Connect (OSTI)

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  20. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Berg, Jeremy E.; Faraj, Ahmad A.

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
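
    One simple way to picture a Hamiltonian path in a two-dimensional plane of nodes is a "snake" ordering that sweeps each row in alternating directions. The sketch below uses that ordering as an assumption; the abstract does not say how the path is established, and the function names, the mesh size, and the choice to forward in both directions from the root are all hypothetical.

```python
def snake_path(rows, cols):
    """Hamiltonian path over a rows x cols mesh, sweeping rows in alternating directions."""
    path = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in cs)
    return path

def broadcast(path, root, message):
    """Forward the root's message hop by hop along the path, in both directions."""
    i = path.index(root)
    received = {root: message}
    for node in path[i + 1:]:          # toward the end of the path
        received[node] = message
    for node in reversed(path[:i]):    # toward the start of the path
        received[node] = message
    return received

path = snake_path(3, 4)
received = broadcast(path, (1, 2), "payload")
```

    Every hop in this path connects neighboring nodes, so each transfer is a point-to-point send between adjacent nodes of the mesh.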

  1. Buffered coscheduling for parallel programming and enhanced fault tolerance

    DOE Patents [OSTI]

    Petrini, Fabrizio; Feng, Wu-chun

    2006-01-31

    A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors.
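
    The strobe-interval exchange described above can be sketched in a few lines. This is a serial simulation under stated assumptions: the function name is hypothetical, each processor's buffer is reduced to a list of destination ranks, and the loop stands in for the global exchange that a real system would perform over the network.

```python
def strobe_exchange(outgoing):
    """outgoing[p] lists the destination ranks that processor p buffered this interval.

    Returns, for each processor, how many jobs it will receive in the next interval.
    """
    n = len(outgoing)
    incoming = [0] * n
    for dests in outgoing:       # every processor contributes its buffered counts
        for d in dests:
            incoming[d] += 1
    return incoming

# Hypothetical buffers for three processors during one time interval.
incoming = strobe_exchange([[1, 2], [2], [0, 1, 1]])
```

    After the exchange each processor knows its incoming count for the next interval before any of the buffered communications are delivered.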

  2. Final Report: Center for Programming Models for Scalable Parallel Computing

    SciTech Connect (OSTI)

    Mellor-Crummey, John

    2011-09-13

    As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

  3. Automated Parallel Capillary Electrophoretic System

    DOE Patents [OSTI]

    Li, Qingbo; Kane, Thomas E.; Liu, Changsheng; Sonnenschein, Bernard; Sharer, Michael V.; Kernan, John R.

    2000-02-22

    An automated electrophoretic system is disclosed. The system employs a capillary cartridge having a plurality of capillary tubes. The cartridge has a first array of capillary ends projecting from one side of a plate. The first array of capillary ends are spaced apart in substantially the same manner as the wells of a microtitre tray of standard size. This allows one to simultaneously perform capillary electrophoresis on samples present in each of the wells of the tray. The system includes a stacked, dual carousel arrangement to eliminate cross-contamination resulting from reuse of the same buffer tray on consecutive executions of electrophoresis. The system also has a gel delivery module containing a gel syringe and a stepper motor or a high pressure chamber with a pump to quickly and uniformly deliver gel through the capillary tubes. The system further includes a multi-wavelength beam generator to generate a laser beam which produces a beam with a wide range of wavelengths. An off-line capillary reconditioner thoroughly cleans a capillary cartridge to enable simultaneous execution of electrophoresis with another capillary cartridge. The streamlined nature of the off-line capillary reconditioner offers the advantage of increased system throughput with a minimal increase in system cost.

  4. Xyce parallel electronic simulator : users' guide.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  5. Data communications for a collective operation in a parallel active

    Office of Scientific and Technical Information (OSTI)

    messaging interface of a parallel computer (Patent) | DOEPatents. Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints,

  6. Stochastic Parallel PARticle Kinetic Simulator

    Energy Science and Technology Software Center (OSTI)

    2008-07-01

    SPPARKS is a kinetic Monte Carlo simulator which implements kinetic and Metropolis Monte Carlo solvers in a general way so that they can be hooked to applications of various kinds. Specific applications are implemented in SPPARKS as physical models which generate events (e.g. a diffusive hop or chemical reaction) and execute them one-by-one. Applications can run in parallel so long as the simulation domain can be partitioned spatially so that multiple events can be invoked simultaneously. SPPARKS is used to model various kinds of mesoscale materials science scenarios such as grain growth, surface deposition and growth, and reaction kinetics. It can also be used to develop new Monte Carlo models that hook to the existing solver and parallel infrastructure provided by the code.
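
    The event-by-event execution described above can be illustrated with a minimal, serial, rejection-free kinetic Monte Carlo loop. This is only a sketch under stated assumptions: it is not SPPARKS code or its API, the event names and rates are hypothetical, and the rates are held fixed rather than being updated after each event.

```python
import math
import random


def kmc_run(rates, n_steps, seed=0):
    """Run a rejection-free kinetic Monte Carlo loop over a fixed event list.

    rates: dict mapping a (hypothetical) event name to its rate.
    Returns (elapsed_time, counts), where counts tallies executed events.
    """
    rng = random.Random(seed)
    names = list(rates)
    total = sum(rates.values())
    counts = {name: 0 for name in names}
    t = 0.0
    for _ in range(n_steps):
        # Select one event with probability proportional to its rate.
        threshold = rng.random() * total
        acc = 0.0
        chosen = names[-1]
        for name in names:
            acc += rates[name]
            if threshold < acc:
                chosen = name
                break
        counts[chosen] += 1  # "execute" the chosen event
        # Advance the clock by an exponentially distributed waiting time.
        t += -math.log(1.0 - rng.random()) / total
    return t, counts


if __name__ == "__main__":
    elapsed, tally = kmc_run({"hop": 9.0, "reaction": 1.0}, n_steps=1000)
    print(elapsed, tally)
```

    A spatially partitioned parallel version would run a loop of this kind in each subdomain, which is the condition on parallel execution that the abstract mentions.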

  7. Parallel Climate Analysis Toolkit (ParCAT)

    Energy Science and Technology Software Center (OSTI)

    2013-06-30

    The parallel analysis toolkit (ParCAT) provides parallel statistical processing of large climate model simulation datasets. ParCAT provides parallel point-wise average calculations, frequency distributions, sum/differences of two datasets, and difference-of-average and average-of-difference for two datasets for arbitrary subsets of simulation time. ParCAT is a command-line utility that can be easily integrated in scripts or embedded in other applications. ParCAT supports CMIP5 post-processed datasets as well as non-CMIP5 post-processed datasets. ParCAT reads and writes standard netCDF files.

  8. Distributed parallel messaging for multiprocessor systems

    DOE Patents [OSTI]

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  9. Asynchronous parallel pattern search for nonlinear optimization

    SciTech Connect (OSTI)

    P. D. Hough; T. G. Kolda; V. J. Torczon

    2000-01-01

    Parallel pattern search (PPS) can be quite useful for engineering optimization problems characterized by a small number of variables (say 10-50) and by expensive objective function evaluations such as complex simulations that take from minutes to hours to run. However, PPS, which was originally designed for execution on homogeneous and tightly-coupled parallel machines, is not well suited to the more heterogeneous, loosely-coupled, and even fault-prone parallel systems available today. Specifically, PPS is hindered by synchronization penalties and cannot recover in the event of a failure. The authors introduce a new asynchronous and fault tolerant parallel pattern search (AAPS) method and demonstrate its effectiveness on both simple test problems and some engineering optimization problems.
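
    To make the synchronization penalty concrete, the sketch below shows a plain synchronous pattern (compass) search in which every trial point of an iteration is evaluated concurrently and the loop then waits for all of them. It is an illustrative baseline only, not the authors' asynchronous method; the function name, the step-halving rule, and the use of a thread pool are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def pattern_search(f, x0, step=1.0, tol=1e-6, max_iter=10_000):
    """Minimize f by synchronous compass search along +/- coordinate directions."""
    x = list(x0)
    fx = f(x)
    n = len(x)
    with ThreadPoolExecutor() as pool:
        for _ in range(max_iter):
            if step < tol:
                break
            # Build the 2n trial points of the current pattern.
            trials = []
            for i in range(n):
                for sign in (1.0, -1.0):
                    y = list(x)
                    y[i] += sign * step
                    trials.append(y)
            # Synchronization point: wait for every evaluation to finish.
            values = list(pool.map(f, trials))
            best = min(range(len(trials)), key=values.__getitem__)
            if values[best] < fx:
                x, fx = trials[best], values[best]  # accept the improvement
            else:
                step *= 0.5  # no improvement: contract the pattern
    return x, fx


if __name__ == "__main__":
    quad = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2
    print(pattern_search(quad, [0.0, 0.0]))
```

    Because each iteration waits on its slowest evaluation, one slow or failed worker stalls the whole search, which is the weakness the abstract attributes to PPS.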

  10. Feature Clustering for Accelerating Parallel Coordinate Descent

    SciTech Connect (OSTI)

    Scherrer, Chad; Tewari, Ambuj; Halappanavar, Mahantesh; Haglin, David J.

    2012-12-06

    We demonstrate an approach for accelerating calculation of the regularization path for L1 sparse logistic regression problems. We show the benefit of feature clustering as a preconditioning step for parallel block-greedy coordinate descent algorithms.

  11. Parallel I/O in Practice

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    art. This tutorial sheds light on the state-of-the-art in parallel IO and provides the knowledge necessary for attendees to best leverage IO resources available to them. We...

  12. Parallel programming with PCN. Revision 1

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  13. HOPSPACK: Hybrid Optimization Parallel Search Package.

    SciTech Connect (OSTI)

    Gray, Genetha A.; Kolda, Tamara G.; Griffin, Joshua; Taddy, Matt; Martinez-Canales, Monica

    2008-12-01

    In this paper, we describe the technical details of HOPSPACK (Hybrid Optimization Parallel Search Package), a new software platform which facilitates combining multiple optimization routines into a single, tightly-coupled, hybrid algorithm that supports parallel function evaluations. The framework is designed such that existing optimization source code can be easily incorporated with minimal code modification. By maintaining the integrity of each individual solver, the strengths and code sophistication of the original optimization package are retained and exploited.

  14. Evaluating parallel relational databases for medical data analysis.

    SciTech Connect (OSTI)

    Rintoul, Mark Daniel; Wilson, Andrew T.

    2012-03-01

    Hospitals have always generated and consumed large amounts of data concerning patients, treatment and outcomes. As computers and networks have permeated the hospital environment it has become feasible to collect and organize all of this data. This naturally raises the question of how to deal with the resulting mountain of information. In this report we detail a proof-of-concept test using two commercially available parallel database systems to analyze a set of real, de-identified medical records. We examine database scalability as data sizes increase as well as responsiveness under load from multiple users.

  15. Solid oxide fuel cell generator

    DOE Patents [OSTI]

    Di Croce, A.M.; Draper, R.

    1993-11-02

    A solid oxide fuel cell generator has a plenum containing at least two rows of spaced apart, annular, axially elongated fuel cells. An electrical conductor extending between adjacent rows of fuel cells connects the fuel cells of one row in parallel with each other and in series with the fuel cells of the adjacent row. 5 figures.

  16. Solid oxide fuel cell generator

    DOE Patents [OSTI]

    Di Croce, A. Michael; Draper, Robert

    1993-11-02

    A solid oxide fuel cell generator has a plenum containing at least two rows of spaced apart, annular, axially elongated fuel cells. An electrical conductor extending between adjacent rows of fuel cells connects the fuel cells of one row in parallel with each other and in series with the fuel cells of the adjacent row.

  17. Solid state pulsed power generator

    DOE Patents [OSTI]

    Tao, Fengfeng; Saddoughi, Seyed Gholamali; Herbon, John Thomas

    2014-02-11

    A power generator includes one or more full bridge inverter modules coupled to a semiconductor opening switch (SOS) through an inductive resonant branch. Each module includes a plurality of switches that are switched in a fashion causing the one or more full bridge inverter modules to drive the SOS through the resonant circuit to generate pulses to a load connected in parallel with the SOS.

  18. Parallel phase-sensitive three-dimensional imaging camera

    DOE Patents [OSTI]

    Smithpeter, Colin L.; Hoover, Eddie R.; Pain, Bedabrata; Hancock, Bruce R.; Nellums, Robert O.

    2007-09-25

    An apparatus is disclosed for generating a three-dimensional (3-D) image of a scene illuminated by a pulsed light source (e.g. a laser or light-emitting diode). The apparatus, referred to as a phase-sensitive 3-D imaging camera, utilizes a two-dimensional (2-D) array of photodetectors to receive light that is reflected or scattered from the scene and processes an electrical output signal from each photodetector in the 2-D array in parallel using multiple modulators, each having inputs of the photodetector output signal and a reference signal, with the reference signal provided to each modulator having a different phase delay. The output from each modulator is provided to a computational unit which can be used to generate intensity and range information for use in generating a 3-D image of the scene. The 3-D camera is capable of generating a 3-D image using a single pulse of light, or alternately can be used to generate subsequent 3-D images with each additional pulse of light.

  19. Java Parallel Secure Stream for Grid Computing

    SciTech Connect (OSTI)

    Chen, Jie; Akers, Walter; Chen, Ying; Watson, William

    2001-09-01

    The emergence of high speed wide area networks makes grid computing a reality. However, grid applications that need reliable data transfer still have difficulty achieving optimal TCP performance, because the TCP window size must be tuned to improve bandwidth and reduce latency on a high speed wide area network. This paper presents a pure Java package called JPARSS (Java Parallel Secure Stream) that divides data into partitions that are sent over several parallel Java streams simultaneously and allows Java or Web applications to achieve optimal TCP performance in a grid environment without the necessity of tuning the TCP window size. Several experimental results are provided to show that using parallel streams is more effective than tuning TCP window size. In addition, an X.509 certificate based single sign-on mechanism and SSL based connection establishment are integrated into this package. Finally, a few applications using this package will be discussed.
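
    The partition-and-reassemble idea can be sketched as follows. This is a hedged illustration in Python rather than Java, and it is not the JPARSS API: the function names are hypothetical, the "streams" are simulated by worker threads instead of TCP sockets, and the SSL and X.509 parts are omitted.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data: bytes, n_streams: int):
    """Split data into at most n_streams (offset, chunk) partitions."""
    if n_streams < 1:
        raise ValueError("n_streams must be at least 1")
    size = max(1, -(-len(data) // n_streams))  # ceiling division
    return [(off, data[off:off + size]) for off in range(0, len(data), size)]


def send_partition(part):
    """Stand-in for writing one partition to its own stream."""
    offset, chunk = part
    return offset, bytes(chunk)


def transfer(data: bytes, n_streams: int = 4) -> bytes:
    """Send all partitions concurrently, then reassemble them by offset."""
    parts = partition(data, n_streams)
    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        received = list(pool.map(send_partition, parts))
    received.sort(key=lambda item: item[0])
    return b"".join(chunk for _, chunk in received)


if __name__ == "__main__":
    payload = b"abcdefghij"
    print(partition(payload, 3))
    print(transfer(payload, 3) == payload)
```

    Tagging each partition with its offset lets the receiver rebuild the payload regardless of the order in which the streams finish.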

  20. Berkeley Unified Parallel C (UPC) Compiler

    Energy Science and Technology Software Center (OSTI)

    2003-04-06

    This program is a portable, open-source compiler for the UPC language, which is based on the Open64 framework, and has extensive support for optimizations. This compiler operates by translating UPC into ANSI/ISO C for compilation by a native compiler and linking with a UPC Runtime Library. This design eases portability to both shared and distributed memory parallel architectures. For proper operation the "Berkeley Unified Parallel C (UPC) Runtime Library" and its dependencies are required. Compatible replacements which implement "The Berkeley UPC Runtime Specification" are possible.

  1. Xyce parallel electronic simulator release notes.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Hoekstra, Robert John; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Rankin, Eric Lamont; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.

  2. Parallel Implementation of Power System Dynamic Simulation

    SciTech Connect (OSTI)

    Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng; Wu, Di; Chen, Yousu

    2013-07-21

    Dynamic simulation of power system transient stability is important for planning, monitoring, operation, and control of electrical power systems. However, modeling the system dynamics and network involves the computationally intensive time-domain solution of numerous differential and algebraic equations (DAE). This results in a transient stability implementation that may not maintain the real-time constraints of an online security assessment. This paper presents a parallel implementation of the dynamic simulation on a high-performance computing (HPC) platform using parallel simulation algorithms and computation architectures. It enables the simulation to run even faster than real time, enabling the look-ahead capability of upcoming stability problems in the power grid.

  3. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-11-12

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call site statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.

  4. Light beam frequency comb generator

    DOE Patents [OSTI]

    Priatko, G.J.; Kaskey, J.A.

    1992-11-24

    A light beam frequency comb generator uses an acousto-optic modulator to generate a plurality of light beams with frequencies which are uniformly separated and possess common noise and drift characteristics. A well collimated monochromatic input light beam is passed through this modulator to produce a set of both frequency shifted and unshifted optical beams. An optical system directs one or more frequency shifted beams along a path which is parallel to the path of the input light beam such that the frequency shifted beams are made incident on the modulator proximate to but separated from the point of incidence of the input light beam. After the beam is thus returned to and passed through the modulator repeatedly, a plurality of mutually parallel beams are generated which are frequency-shifted different numbers of times and possess common noise and drift characteristics. 2 figs.

  5. Light beam frequency comb generator

    DOE Patents [OSTI]

    Priatko, Gordon J.; Kaskey, Jeffrey A.

    1992-01-01

    A light beam frequency comb generator uses an acousto-optic modulator to generate a plurality of light beams with frequencies which are uniformly separated and possess common noise and drift characteristics. A well collimated monochromatic input light beam is passed through this modulator to produce a set of both frequency shifted and unshifted optical beams. An optical system directs one or more frequency shifted beams along a path which is parallel to the path of the input light beam such that the frequency shifted beams are made incident on the modulator proximate to but separated from the point of incidence of the input light beam. After the beam is thus returned to and passed through the modulator repeatedly, a plurality of mutually parallel beams are generated which are frequency-shifted different numbers of times and possess common noise and drift characteristics.

  6. Parallel Performance of a Combustion Chemistry Simulation

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Skinner, Gregg; Eigenmann, Rudolf

    1995-01-01

    We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.

  7. Message passing with parallel queue traversal

    DOE Patents [OSTI]

    Underwood, Keith D.; Brightwell, Ronald B.; Hemmert, K. Scott

    2012-05-01

    In message passing implementations, associative matching structures are used to permit list entries to be searched in parallel fashion, thereby avoiding the delay of linear list traversal. List management capabilities are provided to support list entry turnover semantics and priority ordering semantics.

  8. Linked-View Parallel Coordinate Plot Renderer

    Energy Science and Technology Software Center (OSTI)

    2011-06-28

    This software allows multiple linked views for interactive querying via map-based data selection, bar chart analytic overlays, and high dynamic range (HDR) line renderings. The major component of the visualization package is a parallel coordinate renderer with binning, curved layouts, shader-based rendering, and other techniques to allow interactive visualization of multidimensional data.

  9. The parallel virtual file system for portals.

    SciTech Connect (OSTI)

    Schutt, James Alan

    2004-04-01

    This report presents the result of an effort to re-implement the Parallel Virtual File System (PVFS) using Portals as the transport. This report provides short overviews of PVFS and Portals, and describes the design and implementation of PVFS over Portals. Finally, the results of performance testing of both stock PVFS and PVFS over Portals are presented.

  10. Parallel programming with PCN. Revision 2

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  11. Collectively loading an application in a parallel computer

    DOE Patents [OSTI]

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.; Mundy, Michael B.

    2016-01-05

    Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.

  12. Ramona Band of Cahuilla Mission Indians- 2002 Project

    Broader source: Energy.gov [DOE]

    The Ramona Band of Cahuilla Mission Indians ("Ramona Band" or "tribe") will be the first tribe to develop its entire reservation off-grid, using renewable energy as the primary power source. The tribe will purchase and install the primary components for a 65-80 kilowatt-hours per day central wind/PV/propane generator hybrid system that will power the reservation's housing, offices, ecotourism, and training businesses. The electricity is planned to be distributed through an underground mini-grid.

  13. Project Reports for Ramona Band of Cahuilla Mission Indians- 2002 Project

    Broader source: Energy.gov [DOE]

    The Ramona Band of Cahuilla Mission Indians ("Ramona Band" or "tribe") will be the first tribe to develop its entire reservation off-grid, using renewable energy as the primary power source. The tribe will purchase and install the primary components for a 65-80 kilowatt-hours per day central wind/PV/propane generator hybrid system that will power the reservation's housing, offices, ecotourism, and training businesses. The electricity is planned to be distributed through an underground mini-grid.

  14. GRIDS: Grid-Scale Rampable Intermittent Dispatchable Storage

    SciTech Connect (OSTI)

    2010-09-01

    GRIDS Project: The 12 projects that comprise ARPA-E's GRIDS Project, short for Grid-Scale Rampable Intermittent Dispatchable Storage, are developing storage technologies that can store renewable energy for use at any location on the grid at an investment cost less than $100 per kilowatt hour. Flexible, large-scale storage would create a stronger and more robust electric grid by enabling renewables to contribute to reliable power generation.

  15. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-29

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  16. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution of a compute node, including specification of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications instruction, the instruction characterized by instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  17. Users manual for the Chameleon parallel programming tools

    SciTech Connect (OSTI)

    Gropp, W.; Smith, B.

    1993-06-01

    Message passing is a common method for writing programs for distributed-memory parallel computers. Unfortunately, the lack of a standard for message passing has hampered the construction of portable and efficient parallel programs. In an attempt to remedy this problem, a number of groups have developed their own message-passing systems, each with its own strengths and weaknesses. Chameleon is a second-generation system of this type. Rather than replacing these existing systems, Chameleon is meant to supplement them by providing a uniform way to access many of these systems. Chameleon's goals are to (a) be very lightweight (low overhead), (b) be highly portable, and (c) help standardize program startup and the use of emerging message-passing operations such as collective operations on subsets of processors. Chameleon also provides a way to port programs written using PICL or Intel NX message passing to other systems, including collections of workstations. Chameleon is tracking the Message-Passing Interface (MPI) draft standard and will provide both an MPI implementation and an MPI transport layer. Chameleon provides support for heterogeneous computing by using p4 and PVM. Chameleon's support for homogeneous computing includes the portable libraries p4, PICL, and PVM and vendor-specific implementation for Intel NX, IBM EUI (SP-1), and Thinking Machines CMMD (CM-5). Support for Ncube and PVM 3.x is also under development.

  18. Locating hardware faults in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-04-13

    Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.

  19. Impact analysis on a massively parallel computer

    SciTech Connect (OSTI)

    Zacharia, T.; Aramayo, G.A.

    1994-06-01

    Advanced mathematical techniques and computer simulation play a major role in evaluating and enhancing the design of beverage cans, industrial, and transportation containers for improved performance. Numerical models are used to evaluate the impact requirements of containers used by the Department of Energy (DOE) for transporting radioactive materials. Many of these models are highly compute-intensive. An analysis may require several hours of computational time on current supercomputers despite the simplicity of the models being studied. As computer simulations and materials databases grow in complexity, massively parallel computers have become important tools. Massively parallel computational research at the Oak Ridge National Laboratory (ORNL) and its application to the impact analysis of shipping containers is briefly described in this paper.

  20. Parallel machine architecture for production rule systems

    DOE Patents [OSTI]

    Allen, Jr., John D.; Butler, Philip L.

    1989-01-01

    A parallel processing system for production rule programs utilizes a host processor for storing production rule right hand sides (RHS) and a plurality of rule processors for storing left hand sides (LHS). The rule processors operate in parallel in the recognize phase of the system Recognize-Act Cycle to match their respective LHS's against a stored list of working memory elements (WME) in order to find a self-consistent set of WME's. The list of WME is dynamically varied during the Act phase of the system in which the host executes or fires rule RHS's for those rules for which a self-consistent set has been found by the rule processors. The host transmits instructions for creating or deleting working memory elements as dictated by the rule firings until the rule processors are unable to find any further self-consistent working memory element sets, at which time the production rule system is halted.
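
    The Recognize-Act Cycle can be sketched in a few lines. The example below is a serial software illustration under stated assumptions, not the patented architecture: rules are hypothetical tuples of (name, LHS facts, facts to add, facts to delete), matching is simple subset testing, and the first matching rule is fired.

```python
def run_production_system(rules, facts, max_cycles=100):
    """Repeat recognize/act until no rule can change working memory.

    rules: list of (name, lhs, adds, deletes), each of lhs/adds/deletes a set.
    Returns (final_working_memory, list_of_fired_rule_names).
    """
    wm = set(facts)
    fired = []
    for _ in range(max_cycles):
        # Recognize phase: every LHS is matched against working memory.
        # Each test is independent, so it could be done by a separate rule processor.
        matches = [
            (name, adds, deletes)
            for name, lhs, adds, deletes in rules
            if lhs <= wm and (adds - wm or deletes & wm)
        ]
        if not matches:
            break  # no further matches: the system halts
        # Act phase: the host fires one rule, creating and deleting elements.
        name, adds, deletes = matches[0]
        wm |= adds
        wm -= deletes
        fired.append(name)
    return wm, fired


if __name__ == "__main__":
    rules = [
        ("heat", {"cold"}, {"heater_on"}, set()),
        ("warm", {"heater_on", "cold"}, {"warm"}, {"cold"}),
    ]
    print(run_production_system(rules, {"cold"}))
```

    Only rules whose firing would change working memory are treated as matches here, which is one simple way to guarantee that the cycle terminates.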

  1. Parallel Integrated Thermal Management - Energy Innovation Portal

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Integrated Thermal Management. National Renewable Energy Laboratory. Technology Marketing Summary: Many current cooling systems for hybrid electric vehicles (HEVs) with a high power electric drive system utilize a low temperature liquid cooling loop for cooling the power electronics system and electric machines associated with the electric

  2. Parallel Molecular Dynamics Program for Molecules

    Energy Science and Technology Software Center (OSTI)

    1995-03-07

    ParBond is a parallel classical molecular dynamics code that models bonded molecular systems, typically of an organic nature. It uses classical force fields for both non-bonded Coulombic and Van der Waals interactions and for 2-, 3-, and 4-body bonded (bond, angle, dihedral, and improper) interactions. It integrates Newton's equations of motion for the molecular system and evaluates various thermodynamical properties of the system as it progresses.
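
    As a small illustration of integrating Newton's equations for a bonded interaction, the sketch below advances two atoms joined by a harmonic 2-body bond in one dimension using the velocity Verlet scheme. It is an assumption-laden toy, not ParBond code: the abstract does not name the integrator, and the force constant, rest length, and function names are hypothetical.

```python
def bond_force(x1, x2, k=1.0, r0=1.0):
    """Forces on atoms 1 and 2 from U = 0.5 * k * (x2 - x1 - r0) ** 2."""
    stretch = (x2 - x1) - r0
    f1 = k * stretch  # a stretched bond pulls atom 1 toward atom 2
    return f1, -f1


def total_energy(x, v, m, k=1.0, r0=1.0):
    """Kinetic plus bond potential energy of the two-atom system."""
    kinetic = sum(0.5 * mi * vi * vi for mi, vi in zip(m, v))
    potential = 0.5 * k * ((x[1] - x[0]) - r0) ** 2
    return kinetic + potential


def velocity_verlet(x, v, m, dt, n_steps, k=1.0, r0=1.0):
    """Integrate the two-atom system for n_steps of size dt."""
    x, v = list(x), list(v)
    f = bond_force(x[0], x[1], k, r0)
    for _ in range(n_steps):
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / m[i]  # half-step velocity update
            x[i] += dt * v[i]               # full-step position update
        f = bond_force(x[0], x[1], k, r0)   # forces at the new positions
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / m[i]  # second half-step velocity update
    return x, v


if __name__ == "__main__":
    masses = [1.0, 1.0]
    x_end, v_end = velocity_verlet([0.0, 1.2], [0.0, 0.0], masses, 0.01, 1000)
    print(total_energy(x_end, v_end, masses))
```

    Tracking the total energy over the run is a simple example of evaluating a system property as the simulation progresses.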

  3. Parallel Heuristics for Scalable Community Detection

    SciTech Connect (OSTI)

    Lu, Howard; Kalyanaraman, Anantharaman; Halappanavar, Mahantesh; Choudhury, Sutanay

    2014-05-17

    Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability to problems that can be solved on desktops. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose multiple heuristics that are designed to break the sequential barrier. Our heuristics are agnostic to the underlying parallel architecture. For evaluation purposes, we implemented our heuristics on shared memory (OpenMP) and distributed memory (MapReduce-MPI) machines, and tested them over real world graphs derived from multiple application domains (internet, biological, natural language processing). Experimental results demonstrate the ability of our heuristics to converge to high modularity solutions comparable to those output by the serial algorithm in nearly the same number of iterations, while also drastically reducing time to solution.
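
    The quantity that the Louvain method optimizes can be computed directly. The sketch below evaluates modularity for a given partition of an undirected, unweighted graph without self-loops, using Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of c. It illustrates only the objective, not the paper's parallel heuristics; the function and argument names are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs with u != v, each edge listed once.
    community: dict mapping every vertex to its community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # sum of vertex degrees in the community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single edge.
    graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(graph, split))
```

    In the serial Louvain method, each vertex move changes these per-community totals for the next vertex considered, which is one way to see why the abstract calls the method inherently sequential.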

  4. FORTRAN Extensions for Modular Parallel Processing

    Energy Science and Technology Software Center (OSTI)

    1996-01-12

    FORTRAN M is a small set of extensions to FORTRAN that supports a modular approach to the construction of sequential and parallel programs. FORTRAN M programs use channels to plug together processes which may be written in FORTRAN M or FORTRAN 77. Processes communicate by sending and receiving messages on channels. Channels and processes can be created dynamically, but programs remain deterministic unless specialized nondeterministic constructs are used.

  5. Runtime System Library for Parallel Weather Modules

    Energy Science and Technology Software Center (OSTI)

    1997-07-22

    RSL is a Fortran-callable runtime library for use in implementing regular-grid weather forecast models, with nesting, on scalable distributed memory parallel computers. It provides high-level routines for finite-difference stencil communications and inter-domain exchange of data for nested forcing and feedback. RSL supports a unique point-wise domain-decomposition strategy to facilitate load-balancing.

  6. Xyce(™) Parallel Electronic Simulator

    Energy Science and Technology Software Center (OSTI)

    2013-10-03

    The Xyce Parallel Electronic Simulator simulates electronic circuit behavior in DC, AC, HB, MPDE, and transient modes using standard analog (DAE) and/or device (PDE) models, including several age- and radiation-aware devices. It supports a variety of computing platforms, both serial and parallel. It also uses a variety of modern solution algorithms, including dynamic parallel load balancing and iterative solvers. Xyce is primarily used to simulate the voltage and current behavior of a circuit network (a network of electronic devices connected via a conductive network). As a tool, it is mainly used for the design and analysis of electronic circuits. Kirchhoff's conservation laws are enforced over a network using modified nodal analysis. This results in a set of differential algebraic equations (DAEs). The resulting nonlinear problem is solved iteratively using a fully coupled Newton method, which in turn results in a linear system that is solved by either a standard sparse-direct solver or iteratively using Trilinos linear solver packages, also developed at Sandia National Laboratories.

  7. Xyce parallel electronic simulator : reference guide.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.

  8. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    SciTech Connect (OSTI)

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design is given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  9. Parallel processor-based raster graphics system architecture

    DOE Patents [OSTI]

    Littlefield, Richard J.

    1990-01-01

    An apparatus for generating raster graphics images from the graphics command stream includes a plurality of graphics processors connected in parallel, each adapted to receive any part of the graphics command stream for processing the command stream part into pixel data. The apparatus also includes a frame buffer for mapping the pixel data to pixel locations and an interconnection network for interconnecting the graphics processors to the frame buffer. Through the interconnection network, each graphics processor may access any part of the frame buffer concurrently with another graphics processor accessing any other part of the frame buffer. The plurality of graphics processors can thereby transmit concurrently pixel data to pixel locations in the frame buffer.

  10. Parallel pulse processing and data acquisition for high speed, low error flow cytometry

    DOE Patents [OSTI]

    Engh, G.J. van den; Stokdijk, W.

    1992-09-22

    A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate. 17 figs.

  11. Parallel pulse processing and data acquisition for high speed, low error flow cytometry

    DOE Patents [OSTI]

    van den Engh, Gerrit J.; Stokdijk, Willem

    1992-01-01

    A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate.

  12. Parallel Environment for the Creation of Stochastics 1.0

    Energy Science and Technology Software Center (OSTI)

    2011-01-06

    PECOS is a computational library for creating and manipulating realizations of stochastic quantities, including scalar uncertain variables, random fields, and stochastic processes. It offers a unified interface to univariate and multivariate polynomial approximations using either orthogonal or interpolation polynomials; numerical integration drivers for Latin hypercube sampling, quadrature, cubature, and sparse grids; and fast Fourier transforms using third party libraries. The PECOS core also offers statistical utilities and transformations between various representations of stochastic uncertainty. PECOS provides a C++ API through which users can generate and transform realizations of stochastic quantities. It is currently used by Sandia’s DAKOTA, Stokhos, and Encore software packages for uncertainty quantification and verification. PECOS generates random sample sets and multi-dimensional integration grids, typically used in forward propagation of scalar uncertainty in computational models (uncertainty quantification (UQ)). PECOS also generates samples of random fields (RFs) and stochastic processes (SPs) from a set of user-defined power spectral densities (PSDs). The RF/SP may be either Gaussian or non-Gaussian and either stationary or nonstationary, and the resulting sample is intended for run-time query by parallel finite element simulation codes. Finally, PECOS supports nonlinear transformations of random variables via the Nataf transformation and extensions.

  13. Generation Planning (pbl/generation)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Generation Hydro Power Wind Power Monthly GSP BPA White Book Dry Year Tools Firstgov Generation Planning Thumbnail image of BPA White Book BPA White Book (1998-2014) Draft Dry...

  14. Electrostatic generator/motor configurations

    DOE Patents [OSTI]

    Post, Richard F

    2014-02-04

    Electrostatic generator/motor designs are provided that generally may include a first cylindrical stator centered about a longitudinal axis; a second cylindrical stator centered about the axis; and a first cylindrical rotor centered about the axis and located between the first cylindrical stator and the second cylindrical stator. The first cylindrical stator, the second cylindrical stator and the first cylindrical rotor may be concentrically aligned. A magnetic field having field lines about parallel with the longitudinal axis is provided.

  15. Parallel log structured file system collective buffering to achieve a compact representation of scientific and/or dimensional data

    DOE Patents [OSTI]

    Grider, Gary A.; Poole, Stephen W.

    2015-09-01

    Collective buffering and data pattern solutions are provided for storage, retrieval, and/or analysis of data in a collective parallel processing environment. For example, a method can be provided for data storage in a collective parallel processing environment. The method comprises receiving data to be written for a plurality of collective processes within a collective parallel processing environment, extracting a data pattern for the data to be written for the plurality of collective processes, generating a representation describing the data pattern, and saving the data and the representation.
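
    As a rough illustration of what "generating a representation describing the data pattern" could look like, the sketch below collapses a regular list of per-process write extents into a single strided descriptor. This is a hypothetical form of such a representation, not the one claimed in the patent; the function names and the descriptor fields (`start`, `length`, `stride`, `count`) are assumptions made for the example.

```python
def extract_strided_pattern(extents):
    """Return a compact descriptor if the extents form a regular stride.

    extents: list of (offset, length) pairs, one per process, ordered by rank.
    Returns a dict with start, length, stride, and count, or None when the
    extents are irregular and must be stored explicitly.
    """
    if len(extents) < 2:
        return None
    start, length = extents[0]
    stride = extents[1][0] - start
    for i, (offset, size) in enumerate(extents):
        if size != length or offset != start + i * stride:
            return None
    return {"start": start, "length": length, "stride": stride, "count": len(extents)}


def expand_pattern(pattern):
    """Rebuild the full extent list from the compact descriptor."""
    return [
        (pattern["start"] + i * pattern["stride"], pattern["length"])
        for i in range(pattern["count"])
    ]


# Eight hypothetical ranks, each writing 1024 bytes at a 4096-byte spacing.
extents = [(rank * 4096, 1024) for rank in range(8)]
pattern = extract_strided_pattern(extents)
print(pattern)
print(expand_pattern(pattern) == extents)
```

    Four integers stand in for the whole extent list here, and the descriptor stays the same size however many processes take part, provided the pattern remains regular.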

  16. Storing files in a parallel computing system based on user-specified parser function

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  17. A mirror for lab-based quasi-monochromatic parallel x-rays

    SciTech Connect (OSTI)

    Nguyen, Thanhhai; Lu, Xun; Lee, Chang Jun; Jeon, Insu; Jung, Jin-Ho; Jin, Gye-Hwan; Kim, Sung Youb

    2014-09-15

    A multilayered parabolic mirror with six W/Al bilayers was designed and fabricated to generate monochromatic parallel x-rays using a lab-based x-ray source. Using this mirror, curved bright bands were obtained in x-ray images as reflected x-rays. The parallelism of the reflected x-rays was investigated using the shape of the bands. The intensity and monochromatic characteristics of the reflected x-rays were evaluated through measurements of the x-ray spectra in the band. High intensity, nearly monochromatic, and parallel x-rays, which can be used for high resolution x-ray microscopes and local radiation therapy systems, were obtained.

  18. A Massively Parallel Solver for the Mechanical Harmonic Analysis...

    Office of Scientific and Technical Information (OSTI)

    A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities. ACE3P is a 3D massively parallel simulation suite that...

  19. Characterizing the parallelism in rule-based expert systems

    SciTech Connect (OSTI)

    Douglass, R.J.

    1984-01-01

    A brief review of two classes of rule-based expert systems is presented, followed by a detailed analysis of potential sources of parallelism at the production or rule level, the subrule level (including match, select, and act parallelism), and at the search level (including AND, OR, and stream parallelism). The potential amount of parallelism from each source is discussed and characterized in terms of its granularity, inherent serial constraints, efficiency, speedup, dynamic behavior, and communication volume, frequency, and topology. Subrule parallelism will yield, at best, two- to tenfold speedup, and rule level parallelism will yield a modest speedup on the order of 5 to 10 times. Rule level can be combined with OR, AND, and stream parallelism in many instances to yield further parallel speedups.

  20. Parallelizing AT with MatlabMPI

    SciTech Connect (OSTI)

    Li, Evan Y.; /Brown U. /SLAC

    2011-06-22

    The Accelerator Toolbox (AT) is a high-level collection of tools and scripts specifically oriented toward solving problems dealing with computational accelerator physics. It is integrated into the MATLAB environment, which provides an accessible, intuitive interface for accelerator physicists, allowing researchers to focus the majority of their efforts on simulations and calculations, rather than programming and debugging difficulties. Efforts toward parallelization of AT have been put in place to upgrade its performance to modern standards of computing. We utilized the packages MatlabMPI and pMatlab, which were developed by MIT Lincoln Laboratory, to set up a message-passing environment that could be called within MATLAB, which set up the necessary pre-requisites for multithread processing capabilities. On local quad-core CPUs, we were able to demonstrate processor efficiencies of roughly 95% and speed increases of nearly 380%. By exploiting the efficacy of modern-day parallel computing, we were able to demonstrate incredibly efficient speed increments per processor in AT's beam-tracking functions. Extrapolating from prediction, we can expect to reduce week-long computation runtimes to less than 15 minutes. This is a huge performance improvement and has enormous implications for the future computing power of the accelerator physics group at SSRL. However, one of the downfalls of parringpass is its current lack of transparency; the pMatlab and MatlabMPI packages must first be well-understood by the user before the system can be configured to run the scripts. In addition, the instantiation of argument parameters requires internal modification of the source code. Thus, parringpass cannot be directly run from the MATLAB command line, which detracts from its flexibility and user-friendliness. Future work in AT's parallelization will focus on development of external functions and scripts that can be called from within MATLAB and configured on multiple nodes, while expending minimal communication overhead with the integrated MATLAB library.
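
    The figures quoted in this abstract can be related with a short calculation. The sketch below assumes that the "nearly 380%" speed increase denotes a speedup factor of about 3.8 on four cores, and that parallel efficiency stays constant as processors are added; neither assumption is stated in the abstract, and the function names are illustrative.

```python
import math


def parallel_efficiency(speedup, processors):
    """Efficiency = achieved speedup divided by the ideal (linear) speedup."""
    return speedup / processors


def processors_needed(target_speedup, efficiency):
    """Processors required for a target speedup at a fixed efficiency."""
    return math.ceil(target_speedup / efficiency)


# Assumed reading of the abstract: a 3.8x speedup on a quad-core CPU.
efficiency = parallel_efficiency(3.8, 4)

# Reducing a one-week run to 15 minutes requires this speedup factor.
week_in_minutes = 7 * 24 * 60
target_speedup = week_in_minutes / 15

processors = processors_needed(target_speedup, efficiency)
print(efficiency, target_speedup, processors)
```

    Under these assumptions the quad-core result corresponds to the reported efficiency of roughly 95%, and the week-to-15-minutes extrapolation implies a speedup of several hundred, which would need a correspondingly large processor count.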

  1. Parallel heater system for subsurface formations

    DOE Patents [OSTI]

    Harris, Christopher Kelvin (Houston, TX); Karanikas, John Michael (Houston, TX); Nguyen, Scott Vinh (Houston, TX)

    2011-10-25

    A heating system for a subsurface formation is disclosed. The system includes a plurality of substantially horizontally oriented or inclined heater sections located in a hydrocarbon containing layer in the formation. At least a portion of two of the heater sections are substantially parallel to each other. The ends of at least two of the heater sections in the layer are electrically coupled to a substantially horizontal, or inclined, electrical conductor oriented substantially perpendicular to the ends of the at least two heater sections.

  2. Carbothermic reduction with parallel heat sources

    DOE Patents [OSTI]

    Troup, Robert L. (Murrysville, PA); Stevenson, David T. (Washington Township, Washington County, PA)

    1984-12-04

    Disclosed are apparatus and method of carbothermic direct reduction for producing an aluminum alloy from a raw material mix including aluminum oxide, silicon oxide, and carbon wherein parallel heat sources are provided by a combustion heat source and by an electrical heat source at essentially the same position in the reactor, e.g., such as at the same horizontal level in the path of a gravity-fed moving bed in a vertical reactor. The present invention includes providing at least 79% of the heat energy required in the process by the electrical heat source.

  3. Requirements for Parallel I/O,

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Requirements for Parallel I/O, Visualization and Analysis. Prabhat (LBL/NERSC), Quincey Koziol (The HDF Group). NERSC ASCR Requirements for 2017, January 15, 2014, LBNL. 1. Project Description * m636 repo * LBL Vis Base Program (Bethel PI) [PM: Nowell] * Conduct fundamental and applied vis/analytics R&D to address exascale challenges * ExaHDF5 Project (Prabhat, Quincey PIs) [PM: Nowell] * Scale Parallel I/O, and data management technologies for current

  4. A brief parallel I/O tutorial.

    SciTech Connect (OSTI)

    Ward, H. Lee

    2010-03-01

    This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about the I/O transfer per request.
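
    The advice to maximize the data moved per call can be sketched as follows. This example is not taken from the tutorial; it uses a hypothetical in-memory `CountingSink` in place of a real parallel file system so that the number of write requests can be counted, and the 1 MiB aggregation limit is an arbitrary assumption.

```python
class CountingSink:
    """Stand-in for a file: stores the bytes and counts write requests."""

    def __init__(self):
        self.calls = 0
        self.chunks = []

    def write(self, data):
        self.calls += 1
        self.chunks.append(bytes(data))
        return len(data)

    def contents(self):
        return b"".join(self.chunks)


def write_each(sink, records):
    """Naive approach: one write request per small record."""
    for record in records:
        sink.write(record)


def write_aggregated(sink, records, max_bytes=1 << 20):
    """Collect records in memory and issue few, large write requests."""
    buffer = bytearray()
    for record in records:
        if buffer and len(buffer) + len(record) > max_bytes:
            sink.write(buffer)
            buffer = bytearray()
        buffer += record
    if buffer:
        sink.write(buffer)


records = [bytes([i % 256]) * 64 for i in range(1000)]  # 1000 records of 64 bytes

naive, batched = CountingSink(), CountingSink()
write_each(naive, records)
write_aggregated(batched, records)
print(naive.calls, batched.calls, naive.contents() == batched.contents())
```

    Both strategies store identical bytes, but the aggregated version reaches the sink with far fewer requests, which is the behavior the abstract recommends for a shared I/O system.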

  5. Parallel State Estimation Assessment with Practical Data

    SciTech Connect (OSTI)

    Chen, Yousu; Jin, Shuangshuang; Rice, Mark J.; Huang, Zhenyu

    2014-10-31

    This paper presents a full-cycle parallel state estimation (PSE) implementation using a preconditioned conjugate gradient algorithm. The developed code is able to solve large-size power system state estimation within 5 seconds using real-world data, comparable to the Supervisory Control And Data Acquisition (SCADA) rate. This achievement allows the operators to know the system status much faster to help improve grid reliability. Case study results of the Bonneville Power Administration (BPA) system with real measurements are presented. The benefits of fast state estimation are also discussed.
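
    For readers unfamiliar with the solver named in this abstract, the sketch below shows a serial preconditioned conjugate gradient iteration on a small symmetric positive definite system. It is not the authors' code: the Jacobi (diagonal) preconditioner, the function name `pcg_jacobi`, and the 3x3 example matrix are assumptions, and the abstract does not say which preconditioner or parallelization strategy was used. Weighted least squares state estimation commonly leads to symmetric positive definite linear systems of this kind, which is what makes the method applicable.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Uses conjugate gradients with a Jacobi preconditioner (M = diag(A)).
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)                                  # initial search direction
    rz = dot(r, z)
    for iteration in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol:
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_next = dot(r, z)
        p = [zi + (rz_next / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_next
    return x, max_iter


# Small illustrative system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]
solution, iterations = pcg_jacobi(A, b)
print(solution, iterations)
```

    In a parallel implementation the matrix-vector product and the dot products are the operations distributed across processors; the iteration itself keeps the same structure.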

  6. SimFS: A Large Scale Parallel File System Simulator

    Energy Science and Technology Software Center (OSTI)

    2011-08-30

    The software provides both framework and tools to simulate a large-scale parallel file system such as Lustre.

  7. Processing data communications events by awakening threads in parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2016-03-15

    Processing data communications events in a parallel active messaging interface (`PAMI`) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.

  8. Switch for serial or parallel communication networks

    DOE Patents [OSTI]

    Crosette, Dario B.

    1994-01-01

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination.

  9. Switch for serial or parallel communication networks

    DOE Patents [OSTI]

    Crosette, D.B.

    1994-07-19

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination. 9 figs.

  10. Parallel tetrahedral mesh refinement with MOAB.

    SciTech Connect (OSTI)

    Thompson, David C.; Pebay, Philippe Pierre

    2008-12-01

    In this report, we present the novel functionality of parallel tetrahedral mesh refinement which we have implemented in MOAB. This report details work done to implement parallel, edge-based, tetrahedral refinement into MOAB. The theoretical basis for this work is contained in [PT04, PT05, TP06] while information on design, performance, and operation specific to MOAB is contained herein. As MOAB is intended mainly for use in pre-processing and simulation (as opposed to the post-processing bent of previous papers), the primary use case is different: rather than refining elements with non-linear basis functions, the goal is to increase the number of degrees of freedom in some region in order to more accurately represent the solution to some system of equations that cannot be solved analytically. Also, MOAB has a unique mesh representation which impacts the algorithm. This introduction contains a brief review of streaming edge-based tetrahedral refinement. The remainder of the report is broken into three sections: design and implementation, performance, and conclusions. Appendix A contains instructions for end users (simulation authors) on how to employ the refiner.

  11. Magnetohydrodynamic generator electrode

    DOE Patents [OSTI]

    Marchant, David D.; Killpatrick, Don H.; Herman, Harold; Kuczen, Kenneth D.

    1979-01-01

    An improved electrode for use as a current collector in the channel of a magnetohydrodynamic (MHD) generator utilizes an elongated monolithic cap of dense refractory material compliantly mounted to the MHD channel frame for collecting the current. The cap has a central longitudinal channel which contains a first layer of porous refractory ceramic as a high-temperature current leadout from the cap and a second layer of resilient wire mesh in contact with the first layer as a low-temperature current leadout between the first layer and the frame. Also described is a monolithic ceramic insulator compliantly mounted to the frame parallel to the electrode by a plurality of flexible metal strips.

  12. Sub-Second Parallel State Estimation

    SciTech Connect (OSTI)

    Chen, Yousu; Rice, Mark J.; Glaesemann, Kurt R.; Wang, Shaobu; Huang, Zhenyu

    2014-10-31

    This report describes the performance of the Pacific Northwest National Laboratory (PNNL) sub-second parallel state estimation (PSE) tool using utility data from the Bonneville Power Administration (BPA) and discusses the benefits of the fast computational speed for power system applications. The test data were provided by BPA. They are two days' worth of hourly snapshots that include power system data and measurement sets in a commercial tool format. These data are extracted from the commercial tool and fed into the PSE tool. With the help of advanced solvers, the PSE tool is able to solve each BPA hourly state estimation problem within one second, which is more than 10 times faster than today's commercial tool. This improved computational performance can help increase the reliability value of state estimation in many aspects: (1) the shorter the time required for execution of state estimation, the more time remains for operators to take appropriate actions, and/or to apply automatic or manual corrective control actions. This increases the chances of arresting or mitigating the impact of cascading failures; (2) the SE can be executed multiple times within the time allowance. Therefore, the robustness of SE can be enhanced by repeating the execution of the SE with adaptive adjustments, including removing bad data and/or adjusting different initial conditions to compute a better estimate within the same time as a traditional state estimator's single estimate. The sub-second SE has other benefits: the PSE results can potentially be used in local and/or wide-area automatic corrective control actions that currently depend on raw measurements, minimizing the impact of bad measurements and providing opportunities to enhance power grid reliability and efficiency. PSE also can enable other advanced tools that rely on SE outputs and could be used to further improve operators' actions and automated controls to mitigate effects of severe events on the grid. The power grid continues to grow and the number of measurements is increasing at an accelerated rate due to the variety of smart grid devices being introduced. A parallel state estimation implementation will have better performance than traditional, sequential state estimation by utilizing the power of high performance computing (HPC). This increased performance positions parallel state estimators as valuable tools for operating the increasingly complex power grid.

  13. Digitally programmable signal generator and method

    DOE Patents [OSTI]

    Priatko, G.J.; Kaskey, J.A.

    1989-11-14

    Disclosed is a digitally programmable waveform generator for generating completely arbitrary digital or analog waveforms from very low frequencies to frequencies in the gigasample per second range. A memory array with multiple parallel outputs is addressed; then the parallel output data is latched into buffer storage from which it is serially multiplexed out at a data rate many times faster than the access time of the memory array itself. While data is being multiplexed out serially, the memory array is accessed with the next required address and presents its data to the buffer storage before the serial multiplexing of the last group of data is completed, allowing this new data to then be latched into the buffer storage for smooth continuous serial data output. In a preferred implementation, a plurality of these serial data outputs are paralleled to form the input to a digital to analog converter, providing a programmable analog output. 6 figs.

  14. Digitally programmable signal generator and method

    DOE Patents [OSTI]

    Priatko, Gordon J.; Kaskey, Jeffrey A.

    1989-01-01

    A digitally programmable waveform generator for generating completely arbitrary digital or analog waveforms from very low frequencies to frequencies in the gigasample per second range. A memory array with multiple parallel outputs is addressed; then the parallel output data is latched into buffer storage from which it is serially multiplexed out at a data rate many times faster than the access time of the memory array itself. While data is being multiplexed out serially, the memory array is accessed with the next required address and presents its data to the buffer storage before the serial multiplexing of the last group of data is completed, allowing this new data to then be latched into the buffer storage for smooth continuous serial data output. In a preferred implementation, a plurality of these serial data outputs are paralleled to form the input to a digital to analog converter, providing a programmable analog output.

  15. Paradyn a parallel nonlinear, explicit, three-dimensional finite-element code for solid and structural mechanics user manual

    SciTech Connect (OSTI)

    Hoover, C G; DeGroot, A J; Sherwood, R J

    2000-06-01

    ParaDyn is a parallel version of the DYNA3D computer program, a three-dimensional explicit finite-element program for analyzing the dynamic response of solids and structures. The ParaDyn program has been used as a production tool for over three years for analyzing problems which range in size from a few tens of thousands of elements to between one-million and ten-million elements. ParaDyn runs on parallel computers provided by the Department of Energy Accelerated Strategic Computing Initiative (ASCI) and the Department of Defense High Performance Computing and Modernization Program. Preprocessing and post-processing software utilities and tools are designed to facilitate the generation of partitioned domains for processors on a massively parallel computer and the visualization of both resultant data and boundary data generated in a parallel simulation. This manual provides a brief overview of the parallel implementation; describes techniques for running the ParaDyn program, tools and utilities; and provides examples of parallel simulations.

  16. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Jones, Terry R.; Koenig, Gregory A.

    2014-04-30

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high-performance computers, including the world's current most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18,000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming applications and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.

  17. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Jones, Terry R.; Koenig, Gregory A.

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high-performance computers, including the world's current most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18,000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming applications and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.

  18. Optimized data communications in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A

    2014-10-21

    A parallel computer includes nodes that include a network adapter that couples the node in a point-to-point network and supports communications in opposite directions of each dimension. Optimized communications include: receiving, by a network adapter of a receiving compute node, a packet--from a source direction--that specifies a destination node and deposit hints. Each hint is associated with a direction within which the packet is to be deposited. If a hint indicates the packet to be deposited in the opposite direction: the adapter delivers the packet to an application on the receiving node; forwards the packet to a next node in the opposite direction if the receiving node is not the destination; and forwards the packet to a node in a direction of a subsequent dimension if the hints indicate that the packet is to be deposited in the direction of the subsequent dimension.

  19. LAPACK BLAS Parallel BLAS ScaLAPACK

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    [Figure residue: ScaLAPACK software hierarchy (ScaLAPACK, PBLAS, LAPACK, BLAS, and BLACS over message passing primitives, e.g., MPI, PVM; global vs. local addressing) and a block-distributed M x N matrix with block sizes MB and NB. Referenced man pages: intro_blas3, intro_blacs, intro_lapack, intro_scalapack.]

  20. Parallelism of the SANDstorm hash algorithm.

    SciTech Connect (OSTI)

    Torgerson, Mark Dolan; Draelos, Timothy John; Schroeppel, Richard Crabtree

    2009-09-01

    Mainstream cryptographic hashing algorithms are not parallelizable. This limits their speed and prevents them from taking advantage of the current trend toward multi-core platforms. Limited speed in turn limits their usefulness as an authentication mechanism in secure communications. Sandia researchers have created a new cryptographic hashing algorithm, SANDstorm, which was specifically designed to take advantage of multi-core processing and be parallelizable on a wide range of platforms. This report describes a late-start LDRD effort to verify the parallelizability claims of the SANDstorm designers. We have shown, with operating code and bench testing, that the SANDstorm algorithm may be trivially parallelized on a wide range of hardware platforms. Implementations using OpenMP demonstrate a linear speedup with multiple cores. We have also shown significant performance gains with optimized C code and the use of assembly instructions to exploit particular platform capabilities.

  1. Optimized data communications in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2014-08-19

    A parallel computer includes nodes that include a network adapter that couples the node in a point-to-point network and supports communications in opposite directions of each dimension. Optimized communications include: receiving, by a network adapter of a receiving compute node, a packet--from a source direction--that specifies a destination node and deposit hints. Each hint is associated with a direction within which the packet is to be deposited. If a hint indicates the packet to be deposited in the opposite direction: the adapter delivers the packet to an application on the receiving node; forwards the packet to a next node in the opposite direction if the receiving node is not the destination; and forwards the packet to a node in a direction of a subsequent dimension if the hints indicate that the packet is to be deposited in the direction of the subsequent dimension.

  2. CS-Studio Scan System Parallelization

    SciTech Connect (OSTI)

    Kasemir, Kay; Pearson, Matthew R

    2015-01-01

    For several years, the Control System Studio (CS-Studio) Scan System has successfully automated the operation of beam lines at the Oak Ridge National Laboratory (ORNL) High Flux Isotope Reactor (HFIR) and Spallation Neutron Source (SNS). As it is applied to additional beam lines, we need to support simultaneous adjustments of temperatures or motor positions. While this can be implemented via virtual motors or similar logic inside the Experimental Physics and Industrial Control System (EPICS) Input/Output Controllers (IOCs), doing so requires a priori knowledge of experimenters' requirements. By adding support for the parallel control of multiple process variables (PVs) to the Scan System, we can better support ad hoc automation of experiments that benefit from such simultaneous PV adjustments.

  3. Parallel detecting, spectroscopic ellipsometers/polarimeters

    DOE Patents [OSTI]

    Furtak, Thomas E.

    2002-01-01

    The parallel detecting spectroscopic ellipsometer/polarimeter sensor has no moving parts and operates in real-time for in-situ monitoring of the thin film surface properties of a sample within a processing chamber. It includes a multi-spectral source of radiation for producing a collimated beam of radiation directed towards the surface of the sample through a polarizer. The thus polarized collimated beam of radiation impacts and is reflected from the surface of the sample, thereby changing its polarization state due to the intrinsic material properties of the sample. The light reflected from the sample is separated into four separate polarized filtered beams, each having individual spectral intensities. Data about said four individual spectral intensities is collected within the processing chamber, and is transmitted into one or more spectrometers. The data of all four individual spectral intensities is then analyzed using transformation algorithms, in real-time.

  4. Link failure detection in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Megerian, Mark G.; Smith, Brian E.

    2010-11-09

    Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
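
    The grouping and test-message steps in this abstract can be sketched as a small simulation. This is only an illustration under stated assumptions: the mesh is taken to be two-dimensional without wraparound, the two groups are chosen by coordinate parity, and the names `group_of`, `neighbors`, `detect_failures`, and the `broken` set of simulated failed links are hypothetical, not taken from the patent.

```python
from itertools import product

def group_of(node):
    """Checkerboard colouring: adjacent mesh nodes always land in different groups."""
    x, y = node
    return (x + y) % 2  # 0 = first group, 1 = second group

def neighbors(node, width, height):
    """Yield the mesh neighbours of a node (assumed: no wraparound links)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)

def detect_failures(width, height, broken):
    """Return directed links (sender, receiver) whose first test message was lost.

    `broken` is a hypothetical set of failed directed links used to simulate faults.
    """
    nodes = list(product(range(width), range(height)))
    # Step 1: every first-group node sends a test message to each neighbour,
    # all of which belong to the second group.
    received = set()
    for sender in nodes:
        if group_of(sender) == 0:
            for receiver in neighbors(sender, width, height):
                if (sender, receiver) not in broken:
                    received.add((sender, receiver))
    # Step 2: every second-group node checks each adjacent first-group node
    # and reports the links on which nothing arrived.
    missing = []
    for receiver in nodes:
        if group_of(receiver) == 1:
            for sender in neighbors(receiver, width, height):
                if (sender, receiver) not in received:
                    missing.append((sender, receiver))
    return sorted(missing)
```

    Only the first-group-to-second-group direction is exercised here; because each pair of adjacent nodes is joined by a pair of links, a second round with the roles swapped would be needed to cover the opposite direction.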

  5. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Faraj, Daniel A

    2014-11-18

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
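
    The forwarding rules in this abstract can be traced with a short simulation. The sketch below assumes the nodes form a tree described by a `parent` mapping (with `None` marking the physical root) and that any node may act as the logical root; the function name `broadcast` and the queue-based delivery order are illustrative choices, not part of the patent.

```python
from collections import deque

def broadcast(parent, logical_root):
    """Simulate the broadcast; return nodes in the order they receive the message.

    `parent` maps each node to its parent in the physical tree
    (None for the physical root). The mapping is a hypothetical input format.
    """
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    # The logical root transmits to every directly connected node.
    queue = deque((logical_root, child) for child in children[logical_root])
    if parent[logical_root] is not None:
        queue.append((logical_root, parent[logical_root]))

    order = []
    while queue:
        sender, node = queue.popleft()
        order.append(node)
        # Forward to all children except the one the message came from.
        # When the message arrived from the parent, no child is excluded.
        for child in children[node]:
            if child != sender:
                queue.append((node, child))
        # A message that arrived from a child keeps climbing towards the
        # physical root; the physical root itself has no parent to send to.
        if sender != parent[node] and parent[node] is not None:
            queue.append((node, parent[node]))
    return order
```

    With `parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}` and logical root `1`, the message goes down to nodes 3 and 4, up to the physical root 0, and from there down the other subtree, so every node other than the logical root receives it exactly once.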

  6. Internode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.
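
    A minimal sketch of the buffering scheme described above, assuming Python lists stand in for the message buffers. The class names `MessagingUnit` and `Process`, the `drain` method, and the decision to clear the unit's buffer after copying are assumptions made for illustration; the abstract itself only says the message is copied into main memory.

```python
class MessagingUnit:
    """Stands in for the per-node messaging unit and its own memory."""

    def __init__(self, process_ids):
        # Boot time: allocate one buffer per process that will be initialized later.
        self._buffers = {pid: [] for pid in process_ids}

    def receive(self, pid, message):
        # May run before process `pid` exists; the message is parked in the unit.
        self._buffers[pid].append(message)

    def drain(self, pid):
        """Hand over parked messages (assumption: the unit's buffer is then cleared)."""
        messages, self._buffers[pid] = self._buffers[pid], []
        return messages


class Process:
    """Stands in for a process that is initialized after messages may have arrived."""

    def __init__(self, pid, unit):
        self.pid = pid
        # On initialization, establish a buffer in main memory and copy in
        # whatever the messaging unit received on this process's behalf.
        self.main_memory_buffer = list(unit.drain(pid))
```

    The point of the design is that a sender never has to wait for, or even know about, the receiver's initialization state: early messages are held by the messaging unit until the process comes up.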

  7. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Faraj, Ahmad A

    2013-04-16

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.

  8. Internode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Miller, Douglas R.; Parker, Jeffrey J.; Ratterman, Joseph D.; Smith, Brian E.

    2013-09-03

    Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.

  9. Intranode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

    2013-07-23

    Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.

  10. Intranode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

    2014-01-07

    Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.

  11. Substantially parallel flux uncluttered rotor machines

    SciTech Connect (OSTI)

    Hsu, John S.

    2012-12-11

    A permanent magnet-less and brushless synchronous system includes a stator that generates a magnetic rotating field when sourced by polyphase alternating currents. An uncluttered rotor is positioned within the magnetic rotating field and is spaced apart from the stator. An excitation core is spaced apart from the stator and the uncluttered rotor and magnetically couples the uncluttered rotor. The brushless excitation source generates a magnet torque by inducing magnetic poles near an outer peripheral surface of the uncluttered rotor, and the stator currents also generate a reluctance torque by a reaction of the difference between the direct and quadrature magnetic paths of the uncluttered rotor. The system can be used either as a motor or a generator.

  12. Fuel dissipater for pressurized fuel cell generators

    DOE Patents [OSTI]

    Basel, Richard A.; King, John E.

    2003-11-04

    An apparatus and method are disclosed for eliminating the chemical energy of fuel remaining in a pressurized fuel cell generator (10) when the electrical power output of the fuel cell generator is terminated during transient operation, such as a shutdown; where, two electrically resistive elements (two of 28, 53, 54, 55) at least one of which is connected in parallel, in association with contactors (26, 57, 58, 59), a multi-point settable sensor relay (23) and a circuit breaker (24), are automatically connected across the fuel cell generator terminals (21, 22) at two or more contact points, in order to draw current, thereby depleting the fuel inventory in the generator.

  13. Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale

    SciTech Connect (OSTI)

    Daily, Jeffrey A.

    2015-04-21

    The field of bioinformatics and computational biology is currently experiencing a data revolution. The exciting prospect of making fundamental biological discoveries is fueling the rapid development and deployment of numerous cost-effective, high-throughput next-generation sequencing technologies. The result is that the DNA and protein sequence repositories are being bombarded with new sequence information. Databases continue to report Moore's law-like growth in size, roughly doubling every 18 months. In what seems to be a paradigm shift, individual projects are now capable of generating billions of raw sequences that need to be analyzed in the presence of already annotated sequence information. While it is clear that data-driven methods, such as sequence homology detection, are becoming the mainstay in the field of computational life sciences, the algorithmic advancements essential for implementing complex data analytics at scale have mostly lagged behind. Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or "homologous") on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment at large scale is currently not feasible; instead, heuristic methods are used at the expense of quality. In this dissertation, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale.
Results for a collection of 2.56M sequences show parallel efficiencies of ~75-100% on up to 8K cores, representing a time-to-solution of 33 seconds. We extend this work with a detailed analysis of single-node sequence alignment performance using the latest CPU vector instruction set extensions. Preliminary results reveal that current sequence alignment algorithms are unable to fully utilize widening vector registers.
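
    To make the "all pairs of highly similar sequences" task concrete, here is a serial sketch that uses Smith-Waterman local alignment as the optimal pairwise scorer. The abstract does not name the alignment algorithm or its scoring scheme, so Smith-Waterman, the linear gap penalty, the score values, and the function names are all assumptions; none of the dissertation's parallel machinery (work stealing, replication, filters) is shown.

```python
from itertools import combinations

def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty).

    Scoring parameters are illustrative defaults, not values from the source.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def homologous_pairs(sequences, threshold):
    """Return index pairs (i, j), i < j, whose alignment score meets the threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(sequences), 2)
        if smith_waterman_score(a, b) >= threshold
    ]
```

    The pair loop is quadratic in the number of sequences and each alignment is quadratic in sequence length, which is why the brute-force form is infeasible for millions of sequences without filtering and parallel load balancing.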

  14. Microwave generator

    DOE Patents [OSTI]

    Kwan, T.J.T.; Snell, C.M.

    1987-03-31

    A microwave generator is provided for generating microwaves substantially from virtual cathode oscillation. Electrons are emitted from a cathode and accelerated to an anode which is spaced apart from the cathode. The anode has an annular slit therethrough effective to form the virtual cathode. The anode is at least one range thickness relative to electrons reflecting from the virtual cathode. A magnet is provided to produce an optimum magnetic field having the field strength effective to form an annular beam from the emitted electrons in substantial alignment with the annular anode slit. The magnetic field, however, does permit the reflected electrons to axially diverge from the annular beam. The reflected electrons are absorbed by the anode in returning to the real cathode, such that substantially no reflexing electrons occur. The resulting microwaves are produced with a single dominant mode and are substantially monochromatic relative to conventional virtual cathode microwave generators. 6 figs.

  15. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-09-16

    Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.

  16. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-09-02

    Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.

  17. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Davis, Kristan D.; Faraj, Daniel A.

    2014-07-22

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and ranges of message sizes so that each algorithm is associated with a separate range of message sizes; receiving in an origin endpoint of the PAMI a data communications instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint, the data communications message characterized by a message size; selecting, from among the associated algorithms and ranges, a data communications algorithm in dependence upon the message size; and transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
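
    Associating each algorithm with a separate range of message sizes amounts to a sorted-range lookup. The sketch below assumes ranges are given by their lower bounds and that each range extends up to the next bound; the class name `AlgorithmTable` and the algorithm names in the usage note are hypothetical and are not PAMI identifiers.

```python
from bisect import bisect_right

class AlgorithmTable:
    """Maps non-overlapping message-size ranges to algorithm names."""

    def __init__(self, entries):
        # entries: iterable of (lower_bound_in_bytes, algorithm_name); each
        # algorithm covers sizes from its bound up to the next entry's bound.
        ordered = sorted(entries)
        self._bounds = [low for low, _ in ordered]
        self._names = [name for _, name in ordered]

    def select(self, message_size):
        """Return the algorithm whose range contains `message_size`."""
        index = bisect_right(self._bounds, message_size) - 1
        if index < 0:
            raise ValueError("message size is below every registered range")
        return self._names[index]
```

    For example, a table built from `[(0, "eager"), (4096, "rendezvous")]` selects `"eager"` for a 100-byte message and `"rendezvous"` for a 4096-byte one; the binary search keeps selection logarithmic in the number of registered ranges.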

  18. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-11-18

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (`RTS`) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.
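
    The step in which the first target endpoint assigns separate portions of the transfer data to several target endpoints can be sketched as a range split. The abstract does not say how portions are sized, so the near-even contiguous split below is an assumption, as are the function name `assign_portions` and the representation of the RTS contents as a `location` and `size`.

```python
def assign_portions(location, size, endpoints):
    """Split [location, location + size) into contiguous portions, one per endpoint.

    Returns a dict mapping each endpoint to (offset, length). Earlier endpoints
    absorb the remainder, so lengths differ by at most one unit.
    """
    if not endpoints:
        raise ValueError("need at least one target endpoint")
    base, extra = divmod(size, len(endpoints))
    portions = {}
    offset = location
    for index, endpoint in enumerate(endpoints):
        length = base + (1 if index < extra else 0)
        portions[endpoint] = (offset, length)
        offset += length
    return portions
```

    Each target endpoint can then fetch its own portion independently, which is what allows the transfer to proceed in parallel across the plurality of endpoints.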

  19. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2015-02-03

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (`RTS`) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.

  20. Data communications for a collective operation in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A

    2013-07-16

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and bit masks; receiving in an origin endpoint of the PAMI a collective instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint; constructing a bit mask for the received collective instruction; selecting, from among the associated algorithms and bit masks, a data communications algorithm in dependence upon the constructed bit mask; and executing the collective instruction, transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.

  1. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Davis, Kristan D; Faraj, Daniel A

    2013-07-09

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and ranges of message sizes so that each algorithm is associated with a separate range of message sizes; receiving in an origin endpoint of the PAMI a data communications instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint, the data communications message characterized by a message size; selecting, from among the associated algorithms and ranges, a data communications algorithm in dependence upon the message size; and transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.

  2. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-06-09

    Fencing data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task; the compute nodes coupled for data communications through the PAMI and through data communications resources including at least one segment of shared random access memory; including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers through a segment of shared memory; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.
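
    One way to read "no FENCE accounting" is that, when transfers between two endpoints complete in the order they were issued, a FENCE can simply be queued behind the SENDs instead of counting outstanding transfers. The sketch below illustrates that reading only; the class `OrderedChannel`, its methods, and the callback convention are hypothetical and do not reflect the PAMI API.

```python
from collections import deque

class OrderedChannel:
    """Simulates a deterministic (in-order) path between two endpoints."""

    def __init__(self):
        self._queue = deque()
        self.delivered = []

    def send(self, data):
        # Queue a SEND; it completes when `progress` reaches it.
        self._queue.append(("SEND", data))

    def fence(self, on_complete):
        # Queue a FENCE behind all earlier SENDs; no counters are kept.
        self._queue.append(("FENCE", on_complete))

    def progress(self):
        """Execute one queued instruction in issue order; return False when idle."""
        if not self._queue:
            return False
        kind, payload = self._queue.popleft()
        if kind == "SEND":
            self.delivered.append(payload)
        else:
            # Ordering guarantees every earlier SEND has already completed.
            payload(list(self.delivered))
        return True
```

    A SEND issued after the FENCE is still delivered, but only after the FENCE callback has fired, so the callback observes exactly the transfers that were initiated before it.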

  3. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-06-02

    Fencing data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task; the compute nodes coupled for data communications through the PAMI and through data communications resources including at least one segment of shared random access memory; including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers through a segment of shared memory; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  4. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-08-11

    Fencing data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint comprising a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI and through data communications resources including a deterministic data communications network, including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  5. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-06-30

    Fencing data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint comprising a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI and through data communications resources including a deterministic data communications network, including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  6. Data communications for a collective operation in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2015-11-19

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and bit masks; receiving in an origin endpoint of the PAMI a collective instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint; constructing a bit mask for the received collective instruction; selecting, from among the associated algorithms and bit masks, a data communications algorithm in dependence upon the constructed bit mask; and executing the collective instruction, transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
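
    The bit-mask selection described above can be sketched as follows. The flag names, the registry contents, and the thresholds are assumptions made for illustration; the abstract does not say which properties the mask encodes.

```python
# Hypothetical feature flags describing a collective instruction.
CONTIGUOUS = 0b001
SMALL_MESSAGE = 0b010
POWER_OF_TWO_RANKS = 0b100

# Hypothetical registry of (required bits, algorithm name), most specific first.
# The final entry requires no bits, so it always matches as a fallback.
REGISTRY = [
    (SMALL_MESSAGE | POWER_OF_TWO_RANKS, "recursive_doubling"),
    (CONTIGUOUS, "pipelined_ring"),
    (0, "generic_tree"),
]


def build_mask(nbytes: int, nranks: int, contiguous: bool) -> int:
    """Construct a bit mask from properties of the collective instruction."""
    mask = 0
    if contiguous:
        mask |= CONTIGUOUS
    if nbytes <= 1024:
        mask |= SMALL_MESSAGE
    if nranks > 0 and (nranks & (nranks - 1)) == 0:
        mask |= POWER_OF_TWO_RANKS
    return mask


def select_algorithm(mask: int) -> str:
    """Return the first registered algorithm whose required bits are all set."""
    for required, name in REGISTRY:
        if (mask & required) == required:
            return name
    raise LookupError("no algorithm registered for mask")
```

    Ordering the registry from most to least specific keeps the lookup a simple linear scan; a real implementation might index algorithms by mask directly.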

  7. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2013-09-03

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  8. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A; Mamidala, Amith R

    2014-02-11

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  9. Methods and apparatus for multi-resolution replication of files in a parallel computing system using semantic information

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Grider, Gary; Torres, Aaron

    2015-10-20

    Techniques are provided for storing files in a parallel computing system using different resolutions. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a sub-file. The method comprises the steps of obtaining semantic information related to the file; generating a plurality of replicas of the file with different resolutions based on the semantic information; and storing the file and the plurality of replicas of the file in one or more storage nodes of the parallel computing system. The different resolutions comprise, for example, a variable number of bits and/or a different sub-set of data elements from the file. A plurality of the sub-files can be merged to reproduce the file.
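
    The two kinds of reduced resolution mentioned above (fewer bits per value, or a subset of the data elements) can be sketched for a list of float32 values. The `semantic_info` dictionary and its keys are hypothetical; the abstract does not specify how semantic information is represented.

```python
import struct


def reduce_precision(values, keep_mantissa_bits):
    """Zero the low mantissa bits of each value, treated as float32 (0..23 kept)."""
    drop = 23 - keep_mantissa_bits
    mask = (0xFFFFFFFF >> drop) << drop
    out = []
    for v in values:
        (bits,) = struct.unpack("<I", struct.pack("<f", v))
        (reduced,) = struct.unpack("<f", struct.pack("<I", bits & mask))
        out.append(reduced)
    return out


def subsample(values, stride):
    """Keep every stride-th element."""
    return values[::stride]


def make_replicas(values, semantic_info):
    """Build replicas keyed by a short description of their resolution."""
    replicas = {}
    for bits in semantic_info.get("mantissa_bits", []):
        replicas[f"bits={bits}"] = reduce_precision(values, bits)
    for stride in semantic_info.get("strides", []):
        replicas[f"stride={stride}"] = subsample(values, stride)
    return replicas
```

    The full-resolution data would be stored alongside these replicas, so a reader can choose the cheapest copy that is accurate enough for its purpose.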

  10. The Swift Parallel Scripting Language for ALCF Systems | Argonne Leadership

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Computing Facility Projects bgclang Compiler Cobalt Scheduler GLEAN Petrel Swift The Swift Parallel Scripting Language for ALCF Systems Swift is an implicitly parallel functional language that makes it easier to script higher-level applications or workflows composed from serial or parallel programs. Recently made available across ALCF systems, it has been used to script application workflows in a broad range of diverse disciplines from protein structure prediction to modeling global

  11. Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications

    Office of Scientific and Technical Information (OSTI)

    (Conference) | SciTech Connect Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications Citation Details In-Document Search Title: Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications This paper describes a kernel scheduling algorithm that is based on co-scheduling principles and that is intended for parallel applications running on 1000 cores or more where inter-node scalability is key. Experimental results for a Linux implementation on a Cray XT5 machine are

  12. Linux Kernel Co-Scheduling and Bulk Synchronous Parallelism (Journal

    Office of Scientific and Technical Information (OSTI)

    Article) | SciTech Connect Linux Kernel Co-Scheduling and Bulk Synchronous Parallelism Citation Details In-Document Search Title: Linux Kernel Co-Scheduling and Bulk Synchronous Parallelism This paper describes a kernel scheduling algorithm that is based on coscheduling principles and that is intended for parallel applications running on 1000 cores or more. Experimental results for a Linux implementation on a Cray XT5 machine are presented. The results indicate that Linux is a suitable

  13. De Novo Ultrascale Atomistic Simulations On High-End Parallel

    Office of Scientific and Technical Information (OSTI)

    Supercomputers (Journal Article) | SciTech Connect De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers Citation Details In-Document Search Title: De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end

  14. Berkeley Unified Parallel C (UPC) Runtime Library

    Energy Science and Technology Software Center (OSTI)

    2003-03-31

    This software comprises a portable, open source implementation of a runtime library to support applications written in the Unified Parallel C (UPC) language. This library implements the UPC-specific functionality, including shared memory allocation and locks. The network-dependent functionality is implemented as a thin wrapper around a separate library implementing the GASNet (Global-Address Space Networking) specification. For true shared memory machines, GASNet is bypassed in favor of direct memory operations and local synchronization mechanisms. The Berkeley UPC Runtime Library is currently the only implementation of the "Berkeley UPC Runtime Specification", and thus the only runtime library usable with the Berkeley UPC Compiler. Also, it is the only UPC runtime known to the author to provide two shared pointer representations: one for arbitrary blocksizes and one to optimize for the common cases of phaseless and blocksize=1. For distributed memory environments a library implementing the GASNet (Global-Address Space Networking) specification is required for communication. While no specialized hardware is required, a high-speed interconnect supported by the GASNet implementation is suggested for performance. If no supported high-speed interconnect is available, GASNet can run over MPI. An external library is required for certain local memory allocation operations. A well defined interface allows for multiple implementations of this library, but at present the "umalloc" library from LBNL is the only compatible implementation.

  15. Flexible Language Constructs for Large Parallel Programs

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Rosing, Matt; Schnabel, Robert

    1994-01-01

    The goal of the research described in this article is to develop flexible language constructs for writing large data parallel numerical programs for distributed memory (multiple instruction multiple data [MIMD]) multiprocessors. Previously, several models have been developed to support synchronization and communication. Models for global synchronization include single instruction multiple data (SIMD), single program multiple data (SPMD), and sequential programs annotated with data distribution statements. The two primary models for communication include implicit communication based on shared memory and explicit communication based on messages. None of these models by itself seems sufficient to permit the natural and efficient expression of the variety of algorithms that occur in large scientific computations. In this article, we give an overview of a new language that combines many of these programming models in a clean manner. This is done in a modular fashion such that different models can be combined to support large programs. Within a module, the selection of a model depends on the algorithm and its efficiency requirements. In this article, we give an overview of the language and discuss some of the critical implementation details.

  16. A garbage collection algorithm for shared memory parallel processors

    SciTech Connect (OSTI)

    Crammond, J.

    1988-12-01

    This paper describes a technique for adapting the Morris sliding garbage collection algorithm to execute on parallel machines with shared memory. The algorithm is described within the framework of an implementation of the parallel logic language Parlog. However, the algorithm is a general one and can easily be adapted to parallel Prolog systems and to other languages. The performance of the algorithm executing a few simple Parlog benchmarks is analyzed. Finally, it is shown how the technique for parallelizing the sequential algorithm can be adapted for a semi-space copying algorithm.

  17. Mesoscale Simulations of Particulate Flows with Parallel Distributed...

    Office of Scientific and Technical Information (OSTI)

    Title: Mesoscale Simulations of Particulate Flows with Parallel Distributed Lagrange Multiplier Technique Fluid particulate flows are common phenomena in nature and industry. ...

  18. A set of parallel, implicit methods for a reconstructed discontinuous...

    Office of Scientific and Technical Information (OSTI)

    Furthermore, an SPMD (single program, multiple data) programming paradigm based on MPI is proposed to achieve parallelism. The numerical results on complex geometries...

  19. Optimizing Pinhole and Parallel Hole Collimation for Scintimammography...

    Office of Scientific and Technical Information (OSTI)

    Using analytic formulas, pinhole and parallel hole collimator parameters were calculated that satisfy this object resolution with optimal geometric sensitivity. Analyses were ...

  20. A Comprehensive Look at High Performance Parallel I/O

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    A Comprehensive Look at High Performance Parallel I/O A Comprehensive Look at High Performance Parallel I/O Book Signing @ SC14! Nov. 18, 5 p.m. in Booth 1939 November 10, 2014 Contact: Linda Vu, +1 510 495 2402, lvu@lbl.gov HighPerf Parallel IO In the 1990s, high performance computing (HPC) made a dramatic transition to massively parallel processors. As this model solidified over the next 20 years, supercomputing performance increased from gigaflops (billions of calculations per second) to

  1. A set of parallel, implicit methods for a reconstructed discontinuous...

    Office of Scientific and Technical Information (OSTI)

    Journal Article: A set of parallel, implicit methods for a reconstructed discontinuous Galerkin method for compressible flows on 3D hybrid grids Citation Details In-Document Search...

  2. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and...

    Office of Scientific and Technical Information (OSTI)

    Technical Report: PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes Citation Details In-Document Search...

  3. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and...

    Office of Scientific and Technical Information (OSTI)

    PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes Lichtner, Peter OFM Research; Karra, Satish Los...

  4. Parallel performance optimizations on unstructured mesh-based simulations

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; Huck, Kevin; Hollingsworth, Jeffrey; Malony, Allen; Williams, Samuel; Oliker, Leonid

    2015-06-01

    This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.

  5. Magnetocumulative generator

    DOE Patents [OSTI]

    Pettibone, J.S.; Wheeler, P.C.

    1981-06-08

    An improved magnetocumulative generator is described that is useful for producing magnetic fields of very high energy content over large spatial volumes. The polar directed pleated magnetocumulative generator has a housing providing a housing chamber with an electrically conducting surface. The chamber forms a coaxial system having a small radius portion and a large radius portion. When a magnetic field is injected into the chamber, from an external source, most of the magnetic flux associated therewith positions itself in the small radius portion. The propagation of an explosive detonation through high-explosive layers disposed adjacent to the housing causes a phased closure of the chamber which sweeps most of the magnetic flux into the large radius portion of the coaxial system. The energy content of the magnetic field is greatly increased by flux stretching as well as by flux compression. The energy enhanced magnetic field is utilized within the housing chamber itself.

  6. Thermoelectric generator

    DOE Patents [OSTI]

    Pryslak, N.E.

    1974-02-26

    A thermoelectric generator having a rigid coupling or "stack" between the heat source and the hot strap joining the thermoelements is described. The stack includes a member of an insulating material, such as ceramic, for electrically isolating the thermoelements from the heat source, and a pair of members of a ductile material, such as gold, one each on each side of the insulating member, to absorb thermal differential expansion stresses in the stack. (Official Gazette)

  7. PLASMA GENERATOR

    DOE Patents [OSTI]

    Foster, J.S. Jr.

    1958-03-11

    This patent describes apparatus for producing an electrically neutral ionized gas discharge, termed a plasma, substantially free from contamination with neutral gas particles. The plasma generator of the present invention comprises a plasma chamber wherein gas introduced into the chamber is ionized by a radiofrequency source. A magnetic field is used to focus the plasma in line with an exit. This magnetic field cooperates with a differential pressure created across the exit to draw a uniform and uncontaminated plasma from the plasma chamber.

  8. Cluster generator

    DOE Patents [OSTI]

    Donchev, Todor I.; Petrov, Ivan G.

    2011-05-31

    Described herein is an apparatus and a method for producing atom clusters based on a gas discharge within a hollow cathode. The hollow cathode includes one or more walls. The one or more walls define a sputtering chamber within the hollow cathode and include a material to be sputtered. A hollow anode is positioned at an end of the sputtering chamber, and atom clusters are formed when a gas discharge is generated between the hollow anode and the hollow cathode.

  9. Photon generator

    DOE Patents [OSTI]

    Srinivasan-Rao, Triveni

    2002-01-01

    A photon generator includes an electron gun for emitting an electron beam, a laser for emitting a laser beam, and an interaction ring wherein the laser beam repetitively collides with the electron beam for emitting a high energy photon beam therefrom in the exemplary form of x-rays. The interaction ring is a closed loop, sized and configured for circulating the electron beam with a period substantially equal to the period of the laser beam pulses for effecting repetitive collisions.

  10. Electric generator

    DOE Patents [OSTI]

    Foster, Jr., John S.; Wilson, James R.; McDonald, Jr., Charles A.

    1983-01-01

    1. In an electrical energy generator, the combination comprising a first elongated annular electrical current conductor having at least one bare surface extending longitudinally and facing radially inwards therein, a second elongated annular electrical current conductor disposed coaxially within said first conductor and having an outer bare surface area extending longitudinally and facing said bare surface of said first conductor, the contiguous coaxial areas of said first and second conductors defining an inductive element, means for applying an electrical current to at least one of said conductors for generating a magnetic field encompassing said inductive element, and explosive charge means disposed concentrically with respect to said conductors including at least the area of said inductive element, said explosive charge means including means disposed to initiate an explosive wave front in said explosive advancing longitudinally along said inductive element, said wave front being effective to progressively deform at least one of said conductors to bring said bare surfaces thereof into electrically conductive contact to progressively reduce the inductance of the inductive element defined by said conductors and transferring explosive energy to said magnetic field effective to generate an electrical potential between undeformed portions of said conductors ahead of said explosive wave front.

  11. Current parallel I/O limitations to scalable data analysis.

    SciTech Connect (OSTI)

    Mascarenhas, Ajith Arthur; Pebay, Philippe Pierre

    2011-07-01

    This report describes the limitations to parallel scalability which we have encountered when applying our otherwise optimally scalable parallel statistical analysis tool kit to large data sets distributed across the parallel file system of the current premier DOE computational facility. This report describes our study to evaluate the effect of parallel I/O on the overall scalability of a parallel data analysis pipeline using our scalable parallel statistics tool kit [PTBM11]. To this end, we tested it using the Jaguar-pf DOE/ORNL peta-scale platform on a large combustion simulation data set under a variety of process counts and domain decomposition scenarios. In this report we have recalled the foundations of the parallel statistical analysis tool kit which we have designed and implemented, with the specific double intent of reproducing typical data analysis workflows, and achieving optimal design for scalable parallel implementations. We have briefly reviewed those earlier results and publications which allow us to conclude that we have achieved both goals. However, in this report we have further established that, when used in conjunction with a state-of-the-art parallel I/O system, as can be found on the premier DOE peta-scale platform, the scaling properties of the overall analysis pipeline comprising parallel data access routines degrade rapidly. This finding is problematic and must be addressed if peta-scale data analysis is to be made scalable, or even possible. In order to attempt to address these parallel I/O limitations, we will investigate the use of the Adaptable IO System (ADIOS) [LZL+10] to improve I/O performance, while maintaining flexibility for a variety of IO options, such as MPI IO and POSIX IO. This system is developed at ORNL and other collaborating institutions, and is being tested extensively on Jaguar-pf. Simulation code being developed on these systems will also use ADIOS to output the data, thereby making it easier for other systems, such as ours, to process that data.

  12. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-18

    Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface (`PAMI`) of a parallel computer are provided. Embodiments include establishing by a parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  13. Parallel In Situ Indexing for Data-intensive Computing

    SciTech Connect (OSTI)

    Kim, Jinoh; Abbasi, Hasan; Chacon, Luis; Docan, Ciprian; Klasky, Scott; Liu, Qing; Podhorszki, Norbert; Shoshani, Arie; Wu, Kesheng

    2011-09-09

    As computing power increases exponentially, vast amounts of data are created by many scientific research activities. However, the bandwidth for storing the data to disks and reading the data from disks has been improving at a much slower pace. These two trends produce an ever-widening data access gap. Our work brings together two distinct technologies to address this data access issue: indexing and in situ processing. From decades of database research literature, we know that indexing is an effective way to address the data access issue, particularly for accessing a relatively small fraction of data records. As data sets increase in size, more and more analysts need to use selective data access, which makes indexing even more important for improving data access. The challenge is that most implementations of indexing technology are embedded in large database management systems (DBMS), but most scientific datasets are not managed by any DBMS. In this work, we choose to include indexes with the scientific data instead of requiring the data to be loaded into a DBMS. We use compressed bitmap indexes from the FastBit software, which are known to be highly effective for query-intensive workloads common to scientific data analysis. To use the indexes, we need to build them first. The index building procedure needs to access the whole data set and may also require a significant amount of compute time. In this work, we adapt the in situ processing technology to generate the indexes, thus removing the need to read data from disks and allowing the indexes to be built in parallel. The in situ data processing system used is ADIOS, a middleware for high-performance I/O. Our experimental results show that the indexes can improve the data access time up to 200 times depending on the fraction of data selected, and that using an in situ data processing system can effectively reduce the time needed to create the indexes, up to 10 times with our in situ technique when using identical parallel settings.
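
    To make the idea of a bitmap index concrete, here is a minimal, uncompressed sketch using Python integers as bitmaps. It is not the FastBit API and omits the compression FastBit applies; the function names and the binning scheme are assumptions for illustration only.

```python
def build_bitmap_index(values, bin_edges):
    """One bitmap (a Python int) per bin; bit i is set when values[i] is in that bin.

    Bins are half-open intervals [bin_edges[b], bin_edges[b + 1]).
    """
    bitmaps = [0] * (len(bin_edges) - 1)
    for i, v in enumerate(values):
        for b in range(len(bitmaps)):
            if bin_edges[b] <= v < bin_edges[b + 1]:
                bitmaps[b] |= 1 << i
                break
    return bitmaps


def query_bins(bitmaps, bins):
    """OR the bitmaps of the requested bins and return the matching row numbers."""
    hits = 0
    for b in bins:
        hits |= bitmaps[b]
    return [i for i in range(hits.bit_length()) if (hits >> i) & 1]
```

    A range query touches only the bitmaps for the bins it covers, which is why such an index helps most when a small fraction of the records is selected.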

  14. Parallel 3-D method of characteristics in MPACT

    SciTech Connect (OSTI)

    Kochunas, B.; Downar, T. J.; Liu, Z.

    2013-07-01

    A new parallel 3-D MOC kernel has been developed and implemented in MPACT which makes use of the modular ray tracing technique to reduce computational requirements and to facilitate parallel decomposition. The parallel model makes use of both distributed and shared memory parallelism which are implemented with the MPI and OpenMP standards, respectively. The kernel is capable of parallel decomposition of problems in space, angle, and by characteristic rays up to O(10^4) processors. Initial verification of the parallel 3-D MOC kernel was performed using the Takeda 3-D transport benchmark problems. The eigenvalues computed by MPACT are within the statistical uncertainty of the benchmark reference and agree well with the averages of other participants. The MPACT k_eff differs from the benchmark results for rodded and un-rodded cases by 11 and -40 pcm, respectively. The calculations were performed for various numbers of processors and parallel decompositions up to 15625 processors; all producing the same result at convergence. The parallel efficiency of the worst case was 60%, while very good efficiency (>95%) was observed for cases using 500 processors. The overall run time for the 500 processor case was 231 seconds and 19 seconds for the case with 15625 processors. Ongoing work is focused on developing theoretical performance models and the implementation of acceleration techniques to minimize the number of iterations to converge. (authors)

  15. Monthly Generation System Peak (pbl/generation)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Generation > Generation Hydro Power Wind Power Monthly GSP BPA White Book Dry Year Tools Firstgov Monthly Generation System Peak (GSP) This site is no longer maintained. Page last...

  16. SWAMP+: multiple subsequence alignment using associative massive parallelism

    SciTech Connect (OSTI)

    Steinfadt, Shannon Irene [Los Alamos National Laboratory; Baker, Johnnie W [KENT STATE UNIV.

    2010-10-18

    A new parallel algorithm SWAMP+ incorporates the Smith-Waterman sequence alignment on an associative parallel model known as ASC. It is a highly sensitive parallel approach that expands traditional pairwise sequence alignment. This is the first parallel algorithm to provide multiple non-overlapping, non-intersecting subsequence alignments with the accuracy of Smith-Waterman. The efficient algorithm provides multiple alignments similar to BLAST while creating a better workflow for the end users. The parallel portions of the code run in O(m+n) time using m processors. When m = n, the algorithmic analysis becomes O(n) with a coefficient of two, yielding a linear speedup. Implementation of the algorithm on the SIMD ClearSpeed CSX620 confirms this theoretical linear speedup with real timings.
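
    The O(m+n) bound quoted above matches the standard wavefront view of Smith-Waterman: cells on the same anti-diagonal do not depend on each other, so each anti-diagonal can be one parallel step. The sketch below is a sequential Python illustration of that ordering, not the SWAMP+ or ASC implementation; the scoring parameters are hypothetical defaults.

```python
def smith_waterman_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best local alignment score, number of anti-diagonal steps).

    Cells with the same i + j are independent of one another, so each pass
    of the outer loop could be executed by up to min(m, n) processors at once.
    """
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    steps = 0
    for d in range(2, m + n + 1):  # anti-diagonal: i + j == d
        steps += 1
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best, steps
```

    For non-empty sequences of lengths m and n the outer loop runs m + n - 1 times, which is the source of the linear step count when enough processors are available.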

  17. Broadcasting collective operation contributions throughout a parallel computer

    DOE Patents [OSTI]

    Faraj, Ahmad

    2012-02-21

    Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.

  18. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Olivier, Stephen L.; de Supinski, Bronis R.; Schulz, Martin; Prins, Jan F.

    2013-01-01

    Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.

  19. Parallel architecture for real-time simulation. Master's thesis

    SciTech Connect (OSTI)

    Cockrell, C.D.

    1989-01-01

    This thesis is concerned with the development of a very fast and highly efficient parallel computer architecture for real-time simulation of continuous systems. Currently, several parallel processing systems exist that may be capable of executing a complex simulation in real-time. These systems are examined and the pros and cons of each system are discussed. The thesis then introduces a custom-designed parallel architecture based upon The University of Alabama's OPERA architecture. Each component of this system is discussed and the rationale for its selection is presented. The problem selected for the test and evaluation of the proposed architecture, real-time simulation of the Space Shuttle Main Engine, is explored, identifying the areas where parallelism can be exploited and parallel processing applied. Results from the test and evaluation phase are presented and compared with the results of the same problem processed on a uniprocessor system.

  20. Microelectromechanical power generator and vibration sensor

    DOE Patents [OSTI]

    Roesler, Alexander W.; Christenson, Todd R.

    2006-11-28

    A microelectromechanical (MEM) apparatus is disclosed which can be used to generate electrical power in response to an external source of vibrations, or to sense the vibrations and generate an electrical output voltage in response thereto. The MEM apparatus utilizes a meandering electrical pickup located near a shuttle which holds a plurality of permanent magnets. Upon movement of the shuttle in response to vibrations coupled thereto, the permanent magnets move in a direction substantially parallel to the meandering electrical pickup, and this generates a voltage across the meandering electrical pickup. The MEM apparatus can be fabricated by LIGA or micromachining.

  1. Magnetocumulative generator

    DOE Patents [OSTI]

    Pettibone, Joseph S. (Livermore, CA); Wheeler, Paul C. (Livermore, CA)

    1983-01-01

    An improved magnetocumulative generator is described that is useful for producing magnetic fields of very high energy content over large spatial volumes. The polar directed pleated magnetocumulative generator has a housing (100, 101, 102, 103, 104, 105) providing a housing chamber (106) with an electrically conducting surface. The chamber (106) forms a coaxial system having a small radius portion and a large radius portion. When a magnetic field is injected into the chamber (106), from an external source, most of the magnetic flux associated therewith positions itself in the small radius portion. The propagation of an explosive detonation through high-explosive layers (107, 108) disposed adjacent to the housing causes a phased closure of the chamber (106) which sweeps most of the magnetic flux into the large radius portion of the coaxial system. The energy content of the magnetic field is greatly increased by flux stretching as well as by flux compression. The energy enhanced magnetic field is utilized within the housing chamber itself.

  2. Formation of electron kappa distributions due to interactions with parallel propagating whistler waves

    SciTech Connect (OSTI)

    Tao, X.; Lu, Q. [Mengcheng National Geophysical Observatory, School of Earth and Space Sciences, University of Science and Technology of China, Hefei, Anhui 230026]

    2014-02-15

    In space plasmas, charged particles are frequently observed to possess a high-energy tail, which is often modeled by a kappa-type distribution function. In this work, the formation of the electron kappa distribution in the generation of parallel propagating whistler waves is investigated using fully nonlinear particle-in-cell (PIC) simulations. Previous research concluded that the bi-Maxwellian character of electron distributions is preserved in PIC simulations. We now demonstrate that for interactions between electrons and parallel propagating whistler waves, a non-Maxwellian high-energy tail can be formed, and a kappa distribution can be used to fit the electron distribution in the time-asymptotic limit. The κ-parameter is found to decrease with increasing initial temperature anisotropy or decreasing ratio of electron plasma frequency to cyclotron frequency. The results might be helpful for understanding the origin of electron kappa distributions observed in space plasmas.
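
    For readers unfamiliar with the term, a commonly used isotropic form of the kappa distribution is written out below. It is given only as background: the abstract does not state which variant (for example, an anisotropic bi-kappa form) the authors fit, so the symbols n (density), θ (effective thermal speed), T (temperature) and m (particle mass) are assumptions of this sketch.

```latex
f_\kappa(v) = \frac{n}{\left(\pi \kappa \theta^{2}\right)^{3/2}}
              \frac{\Gamma(\kappa + 1)}{\Gamma\!\left(\kappa - \tfrac{1}{2}\right)}
              \left(1 + \frac{v^{2}}{\kappa\,\theta^{2}}\right)^{-(\kappa + 1)},
\qquad
\theta^{2} = \frac{2\kappa - 3}{\kappa}\,\frac{k_{B} T}{m},
\qquad \kappa > \tfrac{3}{2}.
```

    In the limit κ → ∞ this form approaches a Maxwellian, so a smaller κ corresponds to a more pronounced high-energy tail.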

  3. Analysis and selection of optimal function implementations in massively parallel computer

    DOE Patents [OSTI]

    Archer, Charles Jens; Peters, Amanda; Ratterman, Joseph D.

    2011-05-31

    An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.
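
    The general idea of profiling several implementations and generating a dispatcher from the collected data can be sketched as follows. This is a hypothetical, single-dimension illustration (input length only) and is not the patented method: the names `measure`, `build_selector` and `make_input` are invented for the example, and the patent describes varying input parameters in a plurality of dimensions.

```python
import bisect
import time


def measure(impl, arg, repeats=3):
    """Best-of-N wall-clock time of impl(arg), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        impl(arg)
        best = min(best, time.perf_counter() - start)
    return best


def build_selector(impls, sizes, make_input, cost=measure):
    """Profile every implementation at each sample size and return a dispatcher.

    impls: dict mapping a name to a callable taking one sequence argument.
    sizes: sample input lengths to profile.
    make_input: callable building a test input of a given length.
    cost: callable (impl, arg) -> number; defaults to wall-clock timing.
    """
    sizes = sorted(sizes)
    winners = []
    for size in sizes:
        arg = make_input(size)
        timings = {name: cost(fn, arg) for name, fn in impls.items()}
        winners.append(min(timings, key=timings.get))

    def select(arg):
        # Use the winner of the nearest profiled size at or above len(arg),
        # falling back to the largest profiled size.
        idx = min(bisect.bisect_left(sizes, len(arg)), len(sizes) - 1)
        return impls[winners[idx]](arg)

    select.table = dict(zip(sizes, winners))  # exposed for inspection
    return select
```

    The `cost` parameter is injectable so that the selection logic can be exercised with a deterministic cost model instead of real timings.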

  4. Parallel 3D Finite Element Particle-in-Cell Simulations with Pic3P

    SciTech Connect (OSTI)

    Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Ng, C.; Schussman, G.; Ko, K.; Ben-Zvi, I.; Kewisch, J.; /Brookhaven

    2009-06-19

    SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic Particle-In-Cell code Pic3P. Designed for simulations of beam-cavity interactions dominated by space charge effects, Pic3P solves the complete set of Maxwell-Lorentz equations self-consistently and includes space-charge, retardation and boundary effects from first principles. Higher-order Finite Element methods with adaptive refinement on conformal unstructured meshes lead to highly efficient use of computational resources. Massively parallel processing with dynamic load balancing enables large-scale modeling of photoinjectors with unprecedented accuracy, aiding the design and operation of next-generation accelerator facilities. Applications include the LCLS RF gun and the BNL polarized SRF gun.

  5. Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

    SciTech Connect (OSTI)

    Candel, A; Kabel, A.; Lee, L.; Li, Z.; Ng, C.; Schussman, G.; Ko, K.; Syratchev, I.; /CERN

    2009-06-19

    In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.

  6. Triboelectric generator

    DOE Patents [OSTI]

    Wang, Zhong L; Fan, Fengru; Lin, Long; Zhu, Guang; Pan, Caofeng; Zhou, Yusheng

    2015-11-03

    A generator includes a thin first contact charging layer and a thin second contact charging layer. The thin first contact charging layer includes a first material that has a first rating on a triboelectric series. The thin first contact charging layer has a first side with a first conductive electrode applied thereto and an opposite second side. The thin second contact charging layer includes a second material that has a second rating on a triboelectric series that is more negative than the first rating. The thin second contact charging layer has a first side with a second conductive electrode applied thereto and an opposite second side. The thin second contact charging layer is disposed adjacent to the first contact charging layer so that the second side of the second contact charging layer is in contact with the second side of the first contact charging layer.

  7. Xyce Parallel Electronic Simulator : users' guide, version 4.1.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2009-02-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
    As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  8. Xyce parallel electronic simulator : users' guide. Version 5.1.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2009-11-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
    As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  9. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    SciTech Connect (OSTI)

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  10. Parallel object-oriented data mining system

    DOE Patents [OSTI]

    Kamath, Chandrika; Cantu-Paz, Erick

    2004-01-06

    A data mining system uncovers patterns, associations, anomalies and other statistically significant structures in data. Data files are read and displayed. Objects in the data files are identified. Relevant features for the objects are extracted. Patterns among the objects are recognized based upon the features. Data from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) sky survey was used to search for bent doubles. This test was conducted on data from the Very Large Array in New Mexico which seeks to locate a special type of quasar (radio-emitting stellar object) called bent doubles. The FIRST survey has generated more than 32,000 images of the sky to date. Each image is 7.1 megabytes, yielding more than 100 gigabytes of image data in the entire data set.

  11. Nemesis I: Parallel Enhancements to ExodusII

    Energy Science and Technology Software Center (OSTI)

    2006-03-28

    NEMESIS I is an enhancement to the EXODUS II finite element database model used to store and retrieve data for unstructured parallel finite element analyses. NEMESIS I adds data structures which facilitate the partitioning of a scalar (standard serial) EXODUS II file onto parallel disk systems found on many parallel computers. Since the NEMESIS I application programming interface (API) can be used to append information to an existing EXODUS II database, any existing software that reads EXODUS II files can be used on files which contain NEMESIS I information. The NEMESIS I information is written and read via C or C++ callable functions which comprise the NEMESIS I API.

  12. A Framework for Parallel Nonlinear Optimization by Partitioning Localized Constraints

    SciTech Connect (OSTI)

    Xu, You; Chen, Yixin

    2008-06-28

    We present a novel parallel framework for solving large-scale continuous nonlinear optimization problems based on constraint partitioning. The framework distributes constraints and variables to parallel processors and uses an existing solver to handle the partitioned subproblems. In contrast to most previous decomposition methods that require either separability or convexity of constraints, our approach is based on a new constraint partitioning theory and can handle nonconvex problems with inseparable global constraints. We also propose a hypergraph partitioning method to recognize the problem structure. Experimental results show that the proposed parallel algorithm can efficiently solve some difficult test cases.

  13. pcircle - A Suite of Scalable Parallel File System Tools

    Energy Science and Technology Software Center (OSTI)

    2015-10-01

    Most file system software is written for conventional local file systems; it is serialized and cannot take advantage of a large-scale parallel file system. The "pcircle" software builds on the ubiquitous MPI in cluster computing environments and the "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular, it implements parallel data copy and parallel data checksumming, with advanced features such as async progress reporting, checkpoint and restart, and integrity checking.
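
    The idea of checksumming a file in parallel can be sketched on a single machine as below. This is not pcircle's code and does not use MPI or work stealing: it assumes a thread pool, fixed-size chunks, and a signature formed by hashing the per-chunk SHA-1 digests in offset order. The function names are hypothetical.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def _chunk_digest(path, offset, length):
    """SHA-1 digest of one chunk; each call opens its own file handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.sha1(f.read(length)).digest()


def parallel_checksum(path, chunk_size=4 * 1024 * 1024, workers=4):
    """Hash fixed-size chunks concurrently and combine them into one signature.

    The result depends on chunk_size and is not equal to a plain SHA-1 of the
    whole file, so both sides of a comparison must use the same chunk size.
    """
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so digests stay in offset order.
        digests = list(pool.map(
            lambda off: _chunk_digest(path, off, chunk_size), offsets))
    return hashlib.sha1(b"".join(digests)).hexdigest()
```

    Because the chunks are independent, the same decomposition could in principle be distributed across processes or nodes; the combination step only needs the digests in a fixed order.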

  14. TECA: A Parallel Toolkit for Extreme Climate Analysis

    SciTech Connect (OSTI)

    Prabhat, Mr; Ruebel, Oliver; Byna, Surendra; Wu, Kesheng; Li, Fuyu; Wehner, Michael; Bethel, E. Wes

    2012-03-12

    We present TECA, a parallel toolkit for detecting extreme events in large climate datasets. Modern climate datasets expose parallelism across a number of dimensions: spatial locations, timesteps and ensemble members. We design TECA to exploit these modes of parallelism and demonstrate a prototype implementation for detecting and tracking three classes of extreme events: tropical cyclones, extra-tropical cyclones and atmospheric rivers. We process a modern TB-sized CAM5 simulation dataset with TECA, and demonstrate good runtime performance for the three case studies.

  15. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-11

    Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  16. Methods and apparatus for capture and storage of semantic information with sub-files in a parallel computing system

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Torres, Aaron

    2015-02-03

    Techniques are provided for storing files in a parallel computing system using sub-files with semantically meaningful boundaries. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a plurality of sub-files. The method comprises the steps of obtaining a user specification of semantic information related to the file; providing the semantic information as a data structure description to a data formatting library write function; and storing the semantic information related to the file with one or more of the sub-files in one or more storage nodes of the parallel computing system. The semantic information provides a description of data in the file. The sub-files can be replicated based on semantically meaningful boundaries.

  17. A Comprehensive Look at High Performance Parallel I/O

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    In this era of "big data," high performance parallel I/O (the way disk drives efficiently read and write information on HPC systems) is extremely important. Yet the last book to ...

  18. Parallel Botulinum Neurotoxin/A Immuno- and Enzyme Activity Assays...

    Office of Scientific and Technical Information (OSTI)

    Title: Parallel Botulinum Neurotoxin/A Immuno- and Enzyme Activity Assays Using the Versatile RapiDx Platform. Abstract not provided. Authors: Sommer, Gregory Jon; Wang, Ying-Chih ...

  19. Massively Parallel Models of the Human Circulatory System (Conference) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Title: Massively Parallel Models of the Human Circulatory System. Authors: Randles, A; Draeger, E W; Oppelstrup, T; Krauss, W; Gunnels, J. Publication Date: 2015-04-24. OSTI Identifier: 1241975. Report Number(s): LLNL-CONF-670030. DOE Contract Number: AC52-07NA27344. Resource Type: Conference. Resource Relation: Conference: Presented at: Supercomputing 2015, Austin, TX, United States,

  20. Mesoscale Simulations of Particulate Flows with Parallel Distributed Lagrange Multiplier Technique (Conference) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Title: Mesoscale Simulations of Particulate Flows with Parallel Distributed Lagrange Multiplier Technique. Fluid particulate flows are common phenomena in nature and industry. Modeling of such flows at micro and macro levels as well as establishing relationships between these approaches are needed to

  1. Mesoscale simulations of particulate flows with parallel distributed Lagrange multiplier technique (Journal Article) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Title: Mesoscale simulations of particulate flows with parallel distributed Lagrange multiplier technique. Authors: Kanarska, Y; Lomov, I; Antoun, T. Publication Date: 2010-09-10. OSTI Identifier: 1120915. Report Number(s): LLNL-JRNL-455392. DOE Contract Number: W-7405-ENG-48

  2. Parallel and Antiparallel Interfacial Coupling in AF-FM Bilayers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Wednesday, 30 August 2006. Cooling an antiferromagnetic-ferromagnetic bilayer in a magnetic field typically results in a remanent (zero-field) magnetization in the ferromagnet (FM) that is always in the direction of the field during cooling (positive Mrem). Strikingly, when FeF2 is the antiferromagnet (AF), cooling in a field can lead to a remanent

  3. Interface for Parallel I/O from Componentized Visualization Algorithms

    Energy Science and Technology Software Center (OSTI)

    2008-09-16

    The software is an interface layer over file I/O with features specifically designed for efficient parallel reads and writes. The interface provides multiple concrete implementations that easily allow the replacement of one interface with another. This feature allows a reader or writer implementation to work independently of whether parallel file I/O is available or desired. The software also contains extensions to some readers to allow it to use the file I/O functionality.

  4. Discontinuous Methods for Accurate, Massively Parallel Quantum Molecular Dynamics

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    John Pask is Lead Principal Investigator for Discontinuous Methods for Accurate, Massively Parallel Quantum Molecular Dynamics. Research: We develop and apply a recent breakthrough, the Discontinuous Galerkin electronic structure method, to reach for the first time the required length and time scales to attain a detailed quantum mechanical understanding of the chemistry and dynamics at the SEI layer in

  5. Parallel Integral Curves (Book) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Book: Parallel Integral Curves. Authors: Pugmire, Dave [ORNL]; Peterka, Tom [Argonne National Laboratory (ANL)]; Garth, Christoph [unknown]. Publication Date: 2012-01-01. OSTI Identifier: 1096343. DOE Contract Number: DE-AC05-00OR22725. Resource Type: Book. Publisher: Chapman & Hall/CRC Press, Tampa, FL, USA. Research Org: Oak Ridge National Laboratory (ORNL)

  6. Clock Agreement Among Parallel Supercomputer Nodes (Dataset) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Dataset: Clock Agreement Among Parallel Supercomputer Nodes. This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines

  7. Parallel ptychographic reconstruction (Journal Article) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Journal Article: Parallel ptychographic reconstruction. Authors: Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris. Publication Date: 2014-12-19. OSTI Identifier: 1222306. Grant/Contract Number: FC02-06ER25777. Type: Published Article. Journal Name: Optics Express. Additional Journal Information: Journal Volume: 22; Journal Issue: 26; Journal

  8. The structural simulation toolkit :a tool for exploring parallel architectures and applications. (Conference) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    Title: The structural simulation toolkit :a tool for exploring parallel architectures and applications. No abstract prepared. Authors: Kogge, Peter [University of Notre Dame, Notre Dame, IN]; Murphy, Richard C.; Rodrigues, Arun F.; Underwood, Keith Douglas. Publication

  9. Long-range triplet supercurrents induced by singlet supercurrents parallel to magnetic interfaces

    SciTech Connect (OSTI)

    Alidoust, Mohammad; Halterman, Klaus

    2014-11-17

    Employing a spin-parameterized Keldysh-Usadel technique for the diffusive regime, we demonstrate that even in the low proximity limit, considerable long-ranged triplet supercurrents can be effectively generated by spin-singlet supercurrents flowing parallel to the interfaces of uniform double ferromagnet interlayers with noncollinear exchange fields independent of actual junction geometry. The triplet supercurrents are found to be most pronounced when the thicknesses of the ferromagnet strips are unequal. To experimentally verify this generic phenomenon, we propose an accessible and controllable structure that can fully isolate the long-range triplet effects.

  10. Sort-First, Distributed Memory Parallel Visualization and Rendering

    SciTech Connect (OSTI)

    Bethel, E. Wes; Humphreys, Greg; Paul, Brian; Brederson, J. Dean

    2003-07-15

    While commodity computing and graphics hardware has increased in capacity and dropped in cost, it is still quite difficult to make effective use of such systems for general-purpose parallel visualization and graphics. We describe the results of a recent project that provides a software infrastructure suitable for general-purpose use by parallel visualization and graphics applications. Our work combines and extends two technologies: Chromium, a stream-oriented framework that implements the OpenGL programming interface; and OpenRM Scene Graph, a pipelined-parallel scene graph interface for graphics data management. Using this combination, we implement a sort-first, distributed memory, parallel volume rendering application. We describe the performance characteristics in terms of bandwidth requirements and highlight key algorithmic considerations needed to implement the sort-first system. We characterize system performance using a distributed memory parallel volume rendering application, and present performance gains realized by using scene specific knowledge to accelerate rendering through reduced network bandwidth. The contribution of this work is an exploration of general-purpose, sort-first architecture performance characteristics as applied to distributed memory, commodity hardware, along with a description of the algorithmic support needed to realize parallel, sort-first implementations.
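
    In a sort-first scheme, the screen is divided into regions and each primitive is routed, before rendering, to the node or nodes whose region it overlaps. The sketch below illustrates only that routing step under simple assumptions: a uniform grid of tiles and primitives already reduced to screen-space bounding boxes. The function name and data layout are hypothetical and are not taken from Chromium or OpenRM.

```python
def assign_to_tiles(boxes, width, height, tiles_x, tiles_y):
    """Map each tile (tx, ty) to the indices of primitives overlapping it.

    boxes: list of (xmin, ymin, xmax, ymax) screen-space bounding boxes.
    The screen is width x height pixels, split into tiles_x * tiles_y tiles.
    """
    tile_w = width / tiles_x
    tile_h = height / tiles_y
    buckets = {(tx, ty): [] for tx in range(tiles_x) for ty in range(tiles_y)}

    for idx, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        # Cull primitives that lie entirely off screen.
        if xmax < 0 or ymax < 0 or xmin >= width or ymin >= height:
            continue
        # Convert the box to an inclusive tile range, clamped to the grid.
        tx0 = max(0, min(tiles_x - 1, int(xmin // tile_w)))
        tx1 = max(0, min(tiles_x - 1, int(xmax // tile_w)))
        ty0 = max(0, min(tiles_y - 1, int(ymin // tile_h)))
        ty1 = max(0, min(tiles_y - 1, int(ymax // tile_h)))
        for tx in range(tx0, tx1 + 1):
            for ty in range(ty0, ty1 + 1):
                buckets[(tx, ty)].append(idx)
    return buckets
```

    A primitive that straddles a tile boundary appears in several buckets and is therefore sent to several nodes, which is one reason bandwidth requirements matter in sort-first designs.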

  11. Allinea DDT as a Parallel Debugging Alternative to Totalview

    SciTech Connect (OSTI)

    Antypas, K.B.

    2007-03-05

    Totalview, from the Etnus Corporation, is a sophisticated and feature-rich software debugger for parallel applications. As Totalview has gained in popularity and market share, its price has increased to the point where it is often prohibitively expensive for massively parallel supercomputers. Additionally, many of Totalview's advanced features are not used by members of the scientific computing community. For these reasons, supercomputing centers have begun to search for a basic parallel debugging tool that can serve as an alternative to Totalview. DDT (Distributed Debugging Tool) from Allinea Software is a relatively new parallel debugging tool which aims to provide much of the same functionality as Totalview. This review outlines the basic features and limitations of DDT to determine if it can be a reasonable substitute for Totalview. DDT was tested on the NERSC platforms Bassi, Seaborg, Jacquard and Davinci with Fortran90, C, and C++ codes using MPI and OpenMP for parallelism.

  12. Effect on Non-Uniform Heat Generation on Thermionic Reactions

    SciTech Connect (OSTI)

    Schock, Alfred

    2012-01-19

    The penalty resulting from non-uniform heat generation in a thermionic reactor is examined. Operation at sub-optimum cesium pressure is shown to reduce this penalty, but at the risk of a condition analogous to burnout. For high pressure diodes, a simple empirical correlation between current, voltage and heat flux is developed and used to analyze the performance penalty associated with two different heat flux profiles, for series- and parallel-connected converters. The results demonstrate that series-connected converters require much finer power flattening than parallel converters. For example, a 10% variation in heat generation across a series array can result in a 25 to 50% power penalty.

  13. Parallel Tensor Compression for Large-Scale Scientific Data.

    SciTech Connect (OSTI)

    Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan

    2015-10-01

    As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
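
    The 8 TB figure in the abstract above can be checked with a few lines of arithmetic. The sketch below assumes 8-byte (double precision) values, which the abstract does not state; the function name is illustrative only.

```python
# Grid and run sizes quoted in the abstract: three spatial axes of 512 points,
# 64 variables per grid point, 128 time steps (a dense five-way tensor).
SHAPE = (512, 512, 512, 64, 128)
BYTES_PER_VALUE = 8  # assumption: double precision, not stated in the abstract


def tensor_bytes(shape, bytes_per_value=BYTES_PER_VALUE):
    """Return the storage needed for a dense tensor of the given shape."""
    count = 1
    for extent in shape:
        count *= extent
    return count * bytes_per_value


total = tensor_bytes(SHAPE)
print(total / 2**40)     # size in binary terabytes (TiB)
print(total / 10_000)    # bytes left at the quoted best-case compression ratio
```

    Under that assumption the product is exactly 2^43 bytes, i.e. 8 TiB, and a compression ratio of 10000 would bring it below one gigabyte.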

  14. A Parallel Ghosting Algorithm for The Flexible Distributed Mesh Database

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Mubarak, Misbah; Seol, Seegyoung; Lu, Qiukai; Shephard, Mark S.

    2013-01-01

    Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to avoid parallel performance degradation when the neighbors are on different processors. This article presents a parallel algorithm for creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation purposes while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during the ghost creation process thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n ghost layers up to a point where the whole partitioned mesh can be ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures up to a core count of 32,768 processors. The algorithm also leads to scalable results when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
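
    The idea of ghost layers described above can be illustrated with a small serial sketch. This is not the article's parallel algorithm: it assumes the whole element adjacency graph and the owner map are available in one process, and the names `ghost_layers`, `adjacency` and `owner` are hypothetical.

```python
def ghost_layers(adjacency, owner, part, n_layers):
    """Return the off-part elements within n_layers adjacency hops of `part`.

    adjacency: dict mapping element id -> iterable of neighboring element ids
    owner:     dict mapping element id -> id of the part that owns it
    """
    owned = {elem for elem, p in owner.items() if p == part}
    ghosts = set()
    frontier = set(owned)
    for _ in range(n_layers):
        next_frontier = set()
        for elem in frontier:
            for neighbor in adjacency[elem]:
                if neighbor not in owned and neighbor not in ghosts:
                    next_frontier.add(neighbor)
        if not next_frontier:
            break  # every reachable off-part element is already ghosted
        ghosts |= next_frontier
        frontier = next_frontier
    return ghosts


# A 1D chain of 8 elements split into two parts of 4 elements each.
adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
owner = {i: 0 if i < 4 else 1 for i in range(8)}
print(sorted(ghost_layers(adjacency, owner, part=0, n_layers=2)))  # [4, 5]
```

    Asking for more layers than the mesh is deep simply ghosts the rest of the mesh, which mirrors point (3) of the abstract.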

  15. final report for Center for Programming Models for Scalable Parallel Computing

    SciTech Connect (OSTI)

    Johnson, Ralph E

    2013-04-10

    This is the final report of the work on parallel programming patterns that was part of the Center for Programming Models for Scalable Parallel Computing.

  16. Parallel garbage collection on a virtual memory system

    SciTech Connect (OSTI)

    Abraham, S.G.; Patel, J.H.

    1987-01-01

    Since most artificial intelligence applications are programmed in list processing languages, it is important to design architectures to support efficient garbage collection. This paper presents an architecture and an associated algorithm for parallel garbage collection on a virtual memory system. All the previously proposed parallel algorithms attempt to collect cells released by the list processor during the garbage collection cycle. We do not attempt to collect such cells. As a consequence, the list processor incurs little overhead in the proposed scheme, since it need not synchronize with the collector. Most parallel algorithms are designed for shared memory machines which have certain implicit synchronization functions on variable access. The proposed algorithm is designed for virtual memory systems where both the list processor and the garbage collector have private memories. The enforcement of coherence between the two private memories can be expensive and is not necessary in our scheme. 15 refs., 3 figs.

  17. Parallel halo finding in N-body cosmology simulations

    SciTech Connect (OSTI)

    Pfitzner, D.W.; Salmon, J.K.

    1996-12-31

    Cosmological N-body simulations on parallel computers produce large datasets - about five hundred Megabytes at a single output time, or tens of Gigabytes over the course of a simulation. These large datasets require further analysis before they can be compared to astronomical observations. We have implemented two methods for performing halo finding, a key part of the knowledge discovery process, on parallel machines. One of these is a parallel implementation of the friends of friends (FOF) algorithm, widely used in the field of N-body cosmology. The new isodensity (ID) method has been developed to overcome some of the shortcomings of FOF. Both have been implemented on a variety of computer systems, and successfully used to extract halos from simulations with up to 256^3 (or about 16.8 million) particles, which are among the largest N-body cosmology simulations in existence.
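
    For readers unfamiliar with friends of friends, the following is a minimal serial sketch of the grouping rule: two particles are "friends" if they lie within a linking length of each other, and a halo is a connected set of friends. It uses a brute-force pair search and a union-find structure; it is not the parallel implementation the abstract describes, and the function name and sample coordinates are made up for illustration.

```python
import math


def friends_of_friends(points, linking_length):
    """Group point indices into FOF clusters using union-find (O(n^2) pairs)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two groups

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
print(friends_of_friends(points, linking_length=0.6))  # [[0, 1, 2], [3, 4]]
print(256**3)  # 16777216, the "about 16.8 million" particles quoted above
```

    Note that particles 0 and 2 end up in the same group even though they are farther apart than the linking length, because particle 1 links them.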

  18. Parallel vacuum arc discharge with microhollow array dielectric and anode

    SciTech Connect (OSTI)

    Feng, Jinghua; Zhou, Lin; Fu, Yuecheng; Zhang, Jianhua; Xu, Rongkun; Chen, Faxin; Li, Linbo; Meng, Shijian

    2014-07-15

    An electrode configuration with microhollow array dielectric and anode was developed to obtain parallel vacuum arc discharge. Compared with the conventional electrodes, more than 10 parallel microhollow discharges were ignited for the new configuration, which increased the discharge area significantly and made the cathode erode more uniformly. The vacuum discharge channel number could be increased effectively by decreasing the distances between holes or increasing the arc current. Experimental results revealed that plasmas ejected from the adjacent hollow and the relatively high arc voltage were two key factors leading to the parallel discharge. The characteristics of plasmas in the microhollow were investigated as well. The spectral line intensity and electron density of plasmas in the microhollow increased markedly as the microhollow diameter decreased.

  19. Parallel Scaling Characteristics of Selected NERSC User Project Codes

    SciTech Connect (OSTI)

    Skinner, David; Verdier, Francesca; Anand, Harsh; Carter, Jonathan; Durst, Mark; Gerber, Richard

    2005-03-05

    This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on Seaborg, a 6080-CPU IBM SP and the largest parallel computer at NERSC. The scale of the workload in terms of concurrency and problem size is analyzed. Drawing on batch queue logs, performance data and feedback from researchers, we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.

  20. Coiled transmission line pulse generators

    DOE Patents [OSTI]

    McDonald, Kenneth Fox

    2010-11-09

    Methods and apparatus are provided for fabricating and constructing solid dielectric "Coiled Transmission Line" pulse generators in radial or axial coiled geometries. The pour and cure fabrication process enables a wide variety of geometries and form factors. The volume between the conductors is filled with liquid blends of monomers, polymers, oligomers, and/or cross-linkers and dielectric powders; and then cured to form high field strength and high dielectric constant solid dielectric transmission lines that intrinsically produce ideal rectangular high voltage pulses when charged and switched into matched impedance loads. Voltage levels may be increased by Marx and/or Blumlein principles incorporating spark gap or, preferentially, solid state switches (such as optically triggered thyristors) which produce reliable, high repetition rate operation. Moreover, these Marxed pulse generators can be DC charged and do not require additional pulse forming circuitry, pulse forming lines, transformers, or a high voltage spark gap output switch. The apparatus accommodates a wide range of voltages, impedances, pulse durations, pulse repetition rates, and duty cycles. The resulting mobile or flight platform friendly cylindrical geometric configuration is much more compact, light-weight, and robust than conventional linear geometries, or pulse generators constructed from conventional components. Installing additional circuitry may accommodate optional pulse shape improvements. The Coiled Transmission Lines can also be connected in parallel to decrease the impedance, or in series to increase the pulse length.
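
    The last sentence of the abstract above follows from textbook transmission-line relations, sketched here under stated assumptions: an ideal lossless line charged and switched into a matched load delivers a pulse lasting twice its one-way transit time, identical lines in parallel divide the impedance by their number, and "in series" is read as joining lines end to end. The numeric values are illustrative and do not come from the patent.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def pulse_duration(length_m, relative_permittivity):
    """Pulse length (s) of an ideal charged line discharged into a matched load."""
    velocity = C / math.sqrt(relative_permittivity)  # wave speed in the dielectric
    return 2.0 * length_m / velocity                 # two one-way transit times


def parallel_impedance(line_impedance_ohm, n_lines):
    """Impedance (ohm) of n identical lines connected in parallel."""
    return line_impedance_ohm / n_lines


# Illustrative values only: a 10 m line with relative permittivity 100.
single = pulse_duration(10.0, 100.0)
doubled = pulse_duration(20.0, 100.0)  # two such lines joined end to end
print(single, doubled)
print(parallel_impedance(50.0, 5))     # five 50-ohm lines in parallel
```

    A high dielectric constant slows the wave, so a given pulse length needs less physical line, which is consistent with the compactness the abstract emphasizes.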

  1. Methods for operating parallel computing systems employing sequenced communications

    DOE Patents [OSTI]

    Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

    1999-01-01

    A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.

  2. Methods for operating parallel computing systems employing sequenced communications

    DOE Patents [OSTI]

    Benner, R.E.; Gustafson, J.L.; Montry, G.R.

    1999-08-10

    A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.

  3. Building a Parallel Cloud Storage System using OpenStacks Swift Object Store and Transformative Parallel I/O

    SciTech Connect (OSTI)

    Burns, Andrew J.; Lora, Kaleb D.; Martinez, Esteban; Shorter, Martel L.

    2012-07-30

    Our project consists of bleeding-edge research into replacing the traditional storage archives with a parallel, cloud-based storage solution. It uses OpenStack's Swift Object Store cloud software, which we benchmarked for write speed and scalability. Our project is unique because Swift is typically used for reads and we are mostly concerned with write speeds. Cloud Storage is a viable archive solution because: (1) Container management for larger parallel archives might ease the migration workload; (2) Many tools that are written for cloud storage could be utilized for local archive; and (3) Current large cloud storage practices in industry could be utilized to manage a scalable archive solution.

  4. Parallel heat transport in integrable and chaotic magnetic fields

    SciTech Connect (OSTI)

    Del-Castillo-Negrete, Diego B [ORNL; Chacon, Luis [ORNL

    2012-01-01

    The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion, space plasmas, and astrophysics research. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), χ∥, and the perpendicular, χ⊥, conductivities (χ∥/χ⊥ may exceed 10^10 in fusion plasmas); (ii) Magnetic field line chaos, which in general complicates (and may preclude) the construction of magnetic field line coordinates; and (iii) Nonlocal parallel transport in the limit of small collisionality. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields in arbitrary geometry. The method avoids by construction the numerical pollution issues of grid-based algorithms. The potential of the approach is demonstrated with nontrivial applications to integrable (magnetic island chain), weakly chaotic (devil's staircase), and fully chaotic magnetic field configurations. For the latter, numerical solutions of the parallel heat transport equation show that the effective radial transport, with local and non-local closures, is non-diffusive, thus casting doubts on the applicability of quasilinear diffusion descriptions. General conditions for the existence of non-diffusive, multivalued flux-gradient relations in the temperature evolution are derived.

  5. A massively parallel fractional step solver for incompressible flows

    SciTech Connect (OSTI)

    Houzeaux, G.; Vazquez, M.; Aubry, R.; Cela, J.M.

    2009-09-20

    This paper presents a parallel implementation of fractional step solvers for the incompressible Navier-Stokes equations using an algebraic approach. Under this framework, predictor-corrector and incremental projection schemes are seen as sub-classes of the same class, making apparent their differences and similarities. An additional advantage of this approach is that it sets a common basis for a parallelization strategy, which can be extended to other split techniques or to compressible flows. The predictor-corrector scheme consists of solving the momentum equation and a modified 'continuity' equation (namely a simple iteration for the pressure Schur complement) consecutively in order to converge to the monolithic solution, thus avoiding fractional errors. On the other hand, the incremental projection scheme solves only one iteration of the predictor-corrector per time step and adds a correction equation to fulfill the mass conservation. As shown in the paper, these two schemes are very well suited for massively parallel implementation. In fact, when compared with monolithic schemes, simpler solvers and preconditioners can be used to solve the non-symmetric momentum equations (GMRES, Bi-CGSTAB) and to solve the symmetric continuity equation (CG, Deflated CG). This gives the algorithm good speedup properties. The implementation of the mesh partitioning technique is presented, as well as the parallel performance and speedups for thousands of processors.

  6. An intercalation-locked parallel-stranded DNA tetraplex

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Tripathi, S.; Zhang, D.; Paukstelis, P. J.

    2015-01-27

    DNA has proved to be an excellent material for nanoscale construction because complementary DNA duplexes are programmable and structurally predictable. However, in the absence of Watson–Crick pairings, DNA can be structurally more diverse. Here, we describe the crystal structures of d(ACTCGGATGAT) and the brominated derivative, d(ACBrUCGGABrUGAT). These oligonucleotides form parallel-stranded duplexes with a crystallographically equivalent strand, resulting in the first examples of DNA crystal structures that contain four different symmetric homo base pairs. Two of the parallel-stranded duplexes are coaxially stacked in opposite directions and locked together to form a tetraplex through intercalation of the 5'-most A–A base pairs between adjacent G–G pairs in the partner duplex. The intercalation region is a new type of DNA tertiary structural motif with similarities to the i-motif. 1H–1H nuclear magnetic resonance and native gel electrophoresis confirmed the formation of a parallel-stranded duplex in solution. Finally, we modified specific nucleotide positions and added d(GAY) motifs to oligonucleotides and were readily able to obtain similar crystals. This suggests that this parallel-stranded DNA structure may be useful in the rational design of DNA crystals and nanostructures.

  7. Hardware packet pacing using a DMA in a parallel computer

    DOE Patents [OSTI]

    Chen, Dong; Heidelberger, Phillip; Vranas, Pavlos

    2013-08-13

    Method and system for hardware packet pacing using a direct memory access controller in a parallel computer which, in one aspect, keeps track of a total number of bytes put on the network as a result of a remote get operation, using a hardware token counter.

  8. An intercalation-locked parallel-stranded DNA tetraplex

    SciTech Connect (OSTI)

    Tripathi, S.; Zhang, D.; Paukstelis, P. J.

    2015-01-27

    DNA has proved to be an excellent material for nanoscale construction because complementary DNA duplexes are programmable and structurally predictable. However, in the absence of Watson–Crick pairings, DNA can be structurally more diverse. Here, we describe the crystal structures of d(ACTCGGATGAT) and the brominated derivative, d(ACBrUCGGABrUGAT). These oligonucleotides form parallel-stranded duplexes with a crystallographically equivalent strand, resulting in the first examples of DNA crystal structures that contain four different symmetric homo base pairs. Two of the parallel-stranded duplexes are coaxially stacked in opposite directions and locked together to form a tetraplex through intercalation of the 5'-most A–A base pairs between adjacent G–G pairs in the partner duplex. The intercalation region is a new type of DNA tertiary structural motif with similarities to the i-motif. 1H–1H nuclear magnetic resonance and native gel electrophoresis confirmed the formation of a parallel-stranded duplex in solution. Finally, we modified specific nucleotide positions and added d(GAY) motifs to oligonucleotides and were readily able to obtain similar crystals. This suggests that this parallel-stranded DNA structure may be useful in the rational design of DNA crystals and nanostructures.

  9. Xyce Parallel Electronic Simulator Users Guide Version 6.2.

    SciTech Connect (OSTI)

    Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Schiek, Richard; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason; Baur, David Gregory

    2014-09-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. Trademarks: The information herein is subject to change without notice. Copyright (c) 2002-2014 Sandia Corporation. All rights reserved. Xyce(TM) Electronic Simulator and Xyce(TM) are trademarks of Sandia Corporation. Portions of the Xyce(TM) code are: Copyright (c) 2002, The Regents of the University of California. Produced at the Lawrence Livermore National Laboratory. Written by Alan Hindmarsh, Allan Taylor, Radu Serban. UCRL-CODE-2002-59 All rights reserved.
    Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. The EKV3 MOSFET model was developed by the EKV Team of the Electronics Laboratory-TUC of the Technical University of Crete. All other trademarks are property of their respective owners. Contacts: Bug Reports (Sandia only): http://joseki.sandia.gov/bugzilla, http://charleston.sandia.gov/bugzilla. World Wide Web: http://xyce.sandia.gov, http://charleston.sandia.gov/xyce (Sandia only). Email: xyce@sandia.gov (outside Sandia), xyce-sandia@sandia.gov (Sandia only).

  10. Wind energy systems have low operating expenses because they have no fuel cost.

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Wind energy systems have low operating expenses because they have no fuel cost. Photo by Jenny Hager Photography, NREL 15990. 1. Wind energy is cost competitive with other fuel sources. The average levelized price of wind power purchase agreements signed in 2013 was approximately 2.5 cents per kilowatt-hour, a price that is not only cost competitive with new gas-fired power plants but also compares favorably to a range of fuel cost projections of gas-fired generation extending out through

  11. EERE Success Story-Nevada: Geothermal Brine Brings Low-Cost Power with

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Big Potential | Department of Energy. August 21, 2013. Utilizing a $1 million EERE investment, heat from geothermal fluids, a byproduct of gold mining, will be generating electricity this year for less than $0.06 per kilowatt hour with ElectraTherm's new plug-and-play technology. Building on this first-of-its-kind success, this

  12. Electric power monthly, February 1999 with data for November 1998

    SciTech Connect (OSTI)

    1999-02-01

    The Electric Power Monthly presents monthly electricity statistics for a wide audience including Congress, Federal and State agencies, the electric utility industry, and the general public. The purpose of this publication is to provide energy decision makers with accurate and timely information that may be used in forming various perspectives on electric issues that lie ahead. Statistics are provided for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt-hour of electricity sold.

  13. Tax Incentives

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Tax Incentives. The federal Renewable Electricity Production Tax Credit (PTC), established by the Energy Policy Act of 1992, allows owners of qualified renewable energy facilities to receive tax credits for each kilowatt-hour (kWh) of electricity generated by the facility over a 10-year period. Qualified wind power projects are eligible to receive 2.3 cents per kWh for the production of electricity from utility-scale wind turbines (indexed for inflation). dsireusa.org/incentives/incentive.

  14. Solar Real-Time Pricing: Is Real-Time Electricity Pricing Beneficial to

    Broader source: Energy.gov (indexed) [DOE]

    More than 15,000 solar energy professionals from 75 countries were on hand at Solar Power International (SPI) in Anaheim, CA to network, share ideas, and participate in educational programming related to growing the American solar energy market. SPI 2015 was held at the Anaheim Convention Center, which features a 2.4 MW solar PV system on its rooftop. The system generates an estimated 3.6 million kilowatt-hours of electricity annually - enough energy to power 600 homes for a year.
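
    The figures quoted above imply two derived quantities that are easy to check: the annual consumption attributed to each home and the capacity factor of the rooftop system. The sketch below uses only the numbers in the snippet plus one assumption, an 8,760-hour (non-leap) year; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year

CAPACITY_MW = 2.4      # rated size of the rooftop PV system
ANNUAL_KWH = 3.6e6     # estimated annual generation
HOMES = 600            # homes the snippet says this could power


def kwh_per_home(annual_kwh, homes):
    """Annual energy (kWh) implied for each home."""
    return annual_kwh / homes


def capacity_factor(annual_kwh, capacity_mw, hours=HOURS_PER_YEAR):
    """Actual output divided by output at full rated power all year."""
    max_kwh = capacity_mw * 1000 * hours  # MW -> kW, then kW * h
    return annual_kwh / max_kwh


print(kwh_per_home(ANNUAL_KWH, HOMES))                    # 6000.0
print(round(capacity_factor(ANNUAL_KWH, CAPACITY_MW), 3)) # 0.171
```

    In other words, the snippet assumes about 6,000 kWh per home per year and a system producing roughly 17% of its nameplate output averaged over the year.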

  15. Team Cumberland Presentation

    Office of Environmental Management (EM)

    Tax Incentives. The federal Renewable Electricity Production Tax Credit (PTC), established by the Energy Policy Act of 1992, allows owners of qualified renewable energy facilities to receive tax credits for each kilowatt-hour (kWh) of electricity generated by the facility over a 10-year period. Qualified wind power projects are eligible to receive 2.3 cents per kWh for the production of electricity from utility-scale wind turbines (indexed for inflation). dsireusa.org/incentives/incentive.

  16. Engineering innovation to reduce wind power COE

    SciTech Connect (OSTI)

    Ammerman, Curtt Nelson

    2011-01-10

    There are enough wind resources in the US to provide 10 times the electric power we currently use; however, wind power only accounts for 2% of our total electricity production. One of the main limitations to wind use is cost. Wind power currently costs 5 to 8 cents per kilowatt-hour, which is more than twice the cost of electricity generated by burning coal. Our Intelligent Wind Turbine LDRD Project is applying LANL's leading-edge engineering expertise in modeling and simulation, experimental validation, and advanced sensing technologies to challenges faced in the design and operation of modern wind turbines.

  17. Explicit spatial scattering for load balancing in conservatively synchronized parallel discrete-event simulations

    SciTech Connect (OSTI)

    Thulasidasan, Sunil; Kasiviswanathan, Shiva; Eidenbenz, Stephan; Romero, Philip

    2010-01-01

    We re-examine the problem of load balancing in conservatively synchronized parallel, discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic 'hot-spots' - regions that generate significantly more simulation events than others. Examples of such domains include simulation of urban regions, transportation networks and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations, the speed of execution of the simulation is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies in achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead. Further, even if hot-spots change over the course of the simulation, if the underlying feature of spatial clustering is retained, load continues to be balanced with spatial scattering, leading us to the observation that spatial scattering can often obviate the need for dynamic load balancing.
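
    The effect described above can be seen in a toy one-dimensional model. The sketch compares a conventional block partition, which keeps neighboring cells on the same processor, with a scatter partition that deals neighboring cells out to different processors, and reports the load of the most heavily loaded processor, which is what sets the pace in a conservatively synchronized run. The cell loads, the round-robin scatter rule and the function names are assumptions for illustration; messaging cost is ignored and this is not the authors' simulator.

```python
def block_partition(n_cells, n_procs):
    """Assign contiguous runs of cells to each processor."""
    size = -(-n_cells // n_procs)  # ceiling division
    return [cell // size for cell in range(n_cells)]


def scatter_partition(n_cells, n_procs):
    """Assign neighboring cells to different processors (round-robin)."""
    return [cell % n_procs for cell in range(n_cells)]


def max_load(loads, assignment, n_procs):
    """Load on the busiest processor under the given assignment."""
    totals = [0] * n_procs
    for load, proc in zip(loads, assignment):
        totals[proc] += load
    return max(totals)


# 16 cells on 4 processors; the last 4 cells form a hot-spot.
loads = [1] * 12 + [10] * 4
n_procs = 4
print(max_load(loads, block_partition(len(loads), n_procs), n_procs))    # 40
print(max_load(loads, scatter_partition(len(loads), n_procs), n_procs))  # 13
```

    With the block partition one processor owns the whole hot-spot and carries a load of 40; scattering spreads the hot cells evenly, so every processor carries 13, the ideal share of the total load of 52.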

  18. Parallelization and checkpointing of GPU applications through program transformation

    SciTech Connect (OSTI)

    Solano-Quinde, Lizandro Damián

    2012-11-15

    GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for GPUs tractable has consolidated GPUs as an alternative for accelerating general-purpose applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) Industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running in multi-GPU systems. Furthermore, multi-GPU systems help to solve the GPU memory limitation for applications with a large memory footprint. Parallelizing single-GPU applications has been approached by libraries that distribute the workload at runtime; however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. As with any computing engine today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems presents new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed. The goal of this work is to exploit higher levels of parallelism and to develop support for application-level fault tolerance in applications using multiple GPUs. Our techniques reduce the burden of enhancing single-GPU applications to support these features. To achieve our goal, this work designs and implements a framework for enhancing a single-GPU OpenCL application through application transformation.

  19. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    O'keefe, Matthew; Parr, Terence; Edgar, B. Kevin; Anderson, Steve; Woodward, Paul; Dietz, Hank

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  20. Parallel, distributed and GPU computing technologies in single-particle electron microscopy

    SciTech Connect (OSTI)

    Schmeisser, Martin; Heisen, Burkhard C.; Luettich, Mario; Busche, Boris; Hauer, Florian; Koske, Tobias; Knauber, Karl-Heinz; Stark, Holger

    2009-07-01

    An introduction to the current paradigm shift towards concurrency in software. Most known methods for the determination of the structure of macromolecular complexes are limited or at least restricted at some point by their computational demands. Recent developments in information technology such as multicore, parallel and GPU processing can be used to overcome these limitations. In particular, graphics processing units (GPUs), which were originally developed for rendering real-time effects in computer games, are now ubiquitous and provide unprecedented computational power for scientific applications. Each parallel-processing paradigm alone can improve overall performance; the increased computational performance obtained by combining all paradigms, unleashing the full power of today's technology, makes certain applications feasible that were previously virtually impossible. In this article, state-of-the-art paradigms are introduced, the tools and infrastructure needed to apply these paradigms are presented and a state-of-the-art infrastructure and solution strategy for moving scientific applications to the next generation of computer hardware is outlined.

  1. Establishing a group of endpoints in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.; Xue, Hanhong

    2016-02-02

    A parallel computer executes a number of tasks, each task includes a number of endpoints and the endpoints are configured to support collective operations. In such a parallel computer, establishing a group of endpoints receiving a user specification of a set of endpoints included in a global collection of endpoints, where the user specification defines the set in accordance with a predefined virtual representation of the endpoints, the predefined virtual representation is a data structure setting forth an organization of tasks and endpoints included in the global collection of endpoints and the user specification defines the set of endpoints without a user specification of a particular endpoint; and defining a group of endpoints in dependence upon the predefined virtual representation of the endpoints and the user specification.

  2. JPARSS: A Java Parallel Network Package for Grid Computing

    SciTech Connect (OSTI)

    Chen, Jie; Akers, Walter; Chen, Ying; Watson, William

    2002-03-01

    The emergence of high speed wide area networks makes grid computing a reality. However, grid applications that need reliable data transfer still have difficulty achieving optimal TCP performance, because the TCP window size must be tuned to improve bandwidth and to reduce latency on a high speed wide area network. This paper presents a Java package called JPARSS (Java Parallel Secure Stream (Socket)) that divides data into partitions that are sent over several parallel Java streams simultaneously and allows Java or Web applications to achieve optimal TCP performance in a grid environment without the necessity of tuning TCP window size. This package enables single sign-on, certificate delegation and secure or plain-text data transfer using several security components based on X.509 certificate and SSL. Several experiments will be presented to show that using Java parallel streams is more effective than tuning TCP window size. In addition a simple architecture using Web services
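
    The partition-and-parallel-stream idea described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JPARSS API: the names partition, send_chunk and parallel_transfer are hypothetical, worker threads stand in for the parallel streams, and no sockets or security layers are involved.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data: bytes, n_streams: int) -> list[tuple[int, bytes]]:
    """Split data into at most n_streams contiguous (offset, chunk) pairs."""
    if n_streams < 1:
        raise ValueError("n_streams must be >= 1")
    size = -(-len(data) // n_streams)  # ceiling division
    if size == 0:
        return []
    return [(off, data[off:off + size]) for off in range(0, len(data), size)]


def send_chunk(item: tuple[int, bytes]) -> tuple[int, bytes]:
    """Hypothetical stand-in for writing one partition to its own stream."""
    offset, chunk = item
    return offset, bytes(chunk)


def parallel_transfer(data: bytes, n_streams: int = 4) -> bytes:
    """Send every partition concurrently, then reassemble by offset."""
    parts = partition(data, n_streams)
    if not parts:
        return b""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        received = list(pool.map(send_chunk, parts))
    received.sort(key=lambda pair: pair[0])
    return b"".join(chunk for _, chunk in received)
```

    Carrying the offset with each chunk lets the receiver reassemble the payload regardless of the order in which the streams finish.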

  3. Beam Dynamics Studies of Parallel-Bar Deflecting Cavities

    SciTech Connect (OSTI)

    S. Ahmed, G. Krafft, K. Detrick, S. Silva, J. Delayen, M. Spata, M. Tiefenback, A. Hofler, K. Beard

    2011-03-01

    We have performed three-dimensional simulations of beam dynamics for parallel-bar transverse electromagnetic mode (TEM) type RF separators: normal- and super-conducting. The compact size of these cavities as compared to conventional TM$_{110}$ type structures is more attractive particularly at low frequency. Highly concentrated electromagnetic fields between the parallel bars provide strong electrical stability to the beam for any mechanical disturbance. An array of six 2-cell normal conducting cavities or a one- or two-cell superconducting structure are enough to produce the required vertical displacement at the Lambertson magnet. Both the normal and super-conducting structures show very small emittance dilution due to the vertical kick of the beam.

  4. Performance evaluation of a parallel sparse lattice Boltzmann solver

    SciTech Connect (OSTI)

    Axner, L.; Bernsdorf, J.; Zeiser, T.; Lammers, P.; Linxweiler, J.; Hoekstra, A.G.

    2008-05-01

    We develop a performance prediction model for a parallelized sparse lattice Boltzmann solver and present performance results for simulations of flow in a variety of complex geometries. A special focus is on partitioning and memory/load balancing strategy for geometries with a high solid fraction and/or complex topology such as porous media, fissured rocks and geometries from medical applications. The topology of the lattice nodes representing the fluid fraction of the computational domain is mapped on a graph. Graph decomposition is performed with both multilevel recursive-bisection and multilevel k-way schemes based on modified Kernighan-Lin and Fiduccia-Mattheyses partitioning algorithms. Performance results and optimization strategies are presented for a variety of platforms, showing a parallel efficiency of almost 80% for the largest problem size. A good agreement between the performance model and experimental results is demonstrated.

  5. Ultrafast stimulated Raman parallel adiabatic passage by shaped pulses

    SciTech Connect (OSTI)

    Dridi, G.; Guerin, S.; Hakobyan, V.; Jauslin, H. R.; Eleuch, H.

    2009-10-15

    We present a general and versatile technique of population transfer based on parallel adiabatic passage by femtosecond shaped pulses. Their amplitude and phase are specifically designed to optimize the adiabatic passage corresponding to parallel eigenvalues at all times. We show that this technique allows the robust adiabatic population transfer in a Raman system with the total pulse area as low as 3π, corresponding to a fluence of one order of magnitude below the conventional stimulated Raman adiabatic passage process. This process of short duration, typically picosecond and subpicosecond, is easily implementable with the modern pulse shaper technology and opens the possibility of ultrafast robust population transfer with interesting applications in quantum information processing.

  6. Methodology for Augmenting Existing Paths with Additional Parallel Transects

    SciTech Connect (OSTI)

    Wilson, John E.

    2013-09-30

    Visual Sample Plan (VSP) is sample planning software that is used, among other purposes, to plan transect sampling paths to detect areas that were potentially used for munition training. This module was developed for application on a large site where existing roads and trails were to be used as primary sampling paths. Gap areas between these primary paths needed to be found and covered with parallel transect paths. These gap areas represent areas on the site that are more than a specified distance from a primary path. These added parallel paths needed to optionally be connected together into a single path, the shortest path possible. The paths also needed to optionally be attached to existing primary paths, again with the shortest possible path. Finally, the process must be repeatable and predictable so that the same inputs (primary paths, specified distance, and path options) will result in the same set of new paths every time. This methodology was developed to meet those specifications.

  7. Data-Parallel Mesh Connected Components Labeling and Analysis

    SciTech Connect (OSTI)

    Harrison, Cyrus; Childs, Hank; Gaither, Kelly

    2011-04-10

    We present a data-parallel algorithm for identifying and labeling the connected sub-meshes within a domain-decomposed 3D mesh. The identification task is challenging in a distributed-memory parallel setting because connectivity is transitive and the cells composing each sub-mesh may span many or all processors. Our algorithm employs a multi-stage application of the Union-find algorithm and a spatial partitioning scheme to efficiently merge information across processors and produce a global labeling of connected sub-meshes. Marking each vertex with its corresponding sub-mesh label allows us to isolate mesh features based on topology, enabling new analysis capabilities. We briefly discuss two specific applications of the algorithm and present results from a weak scaling study. We demonstrate the algorithm at concurrency levels up to 2197 cores and analyze meshes containing up to 68 billion cells.
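
    A minimal serial sketch of the union-find labeling idea follows. It is not the authors' implementation: the two loops only mimic a local stage followed by a cross-processor merge stage, and the owner mapping, edge list and function names are assumptions made for illustration.

```python
class UnionFind:
    """Disjoint-set forest with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Keep the smaller id as root so labels are deterministic.
            self.parent[max(ra, rb)] = min(ra, rb)


def label_components(owner, edges):
    """owner maps cell -> processor id; edges lists adjacent cell pairs."""
    uf = UnionFind()
    for cell in owner:
        uf.find(cell)
    # Stage 1: merge cells that are adjacent within one processor's block.
    for a, b in edges:
        if owner[a] == owner[b]:
            uf.union(a, b)
    # Stage 2: merge across block boundaries to obtain global labels.
    for a, b in edges:
        if owner[a] != owner[b]:
            uf.union(a, b)
    return {cell: uf.find(cell) for cell in owner}
```

    In this sketch each connected sub-mesh ends up labeled with its smallest cell id, which makes the result independent of the order in which edges are processed.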

  8. Performing a local reduction operation on a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Faraj, Daniel A.

    2012-12-11

    A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.

  9. Parallel Algorithms for Graph Optimization using Tree Decompositions

    SciTech Connect (OSTI)

    Sullivan, Blair D; Weerapurage, Dinesh P; Groer, Christopher S

    2012-06-01

    Although many $\mathcal{NP}$-hard graph optimization problems can be solved in polynomial time on graphs of bounded tree-width, the adoption of these techniques into mainstream scientific computation has been limited due to the high memory requirements of the necessary dynamic programming tables and excessive runtimes of sequential implementations. This work addresses both challenges by proposing a set of new parallel algorithms for all steps of a tree decomposition-based approach to solve the maximum weighted independent set problem. A hybrid OpenMP/MPI implementation includes a highly scalable parallel dynamic programming algorithm leveraging the MADNESS task-based runtime, and computational results demonstrate scaling. This work enables a significant expansion of the scale of graphs on which exact solutions to maximum weighted independent set can be obtained, and forms a framework for solving additional graph optimization problems with similar techniques.
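
    To give a feel for the dynamic programming involved, the sketch below solves maximum weighted independent set on a plain tree, the simplest bounded tree-width case. It is a sequential illustration only, not the parallel algorithm of this record; the function name and the input conventions (vertices numbered from 0, an undirected edge list) are assumptions.

```python
def max_weight_independent_set(weights, edges, root=0):
    """Return the best total weight of an independent set in a tree."""
    n = len(weights)
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative depth-first traversal: parents appear before children.
    parent = [-1] * n
    visited = [False] * n
    visited[root] = True
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                parent[v] = u
                stack.append(v)

    take = list(weights)  # best value in u's subtree when u is selected
    skip = [0] * n        # best value in u's subtree when u is left out
    for u in reversed(order):  # children are finished before their parent
        p = parent[u]
        if p >= 0:
            take[p] += skip[u]
            skip[p] += max(take[u], skip[u])
    return max(take[root], skip[root])
```

    Each vertex keeps a two-entry table (selected or not); on a general tree decomposition the table for a bag grows exponentially with the bag size, which is the memory cost the abstract refers to.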

  10. Administering truncated receive functions in a parallel messaging interface

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-12-09

    Administering truncated receive functions in a parallel messaging interface ('PMI') of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.
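
    The receive-all, deliver-part behavior described above can be illustrated with a toy function. This is a sketch under assumptions, not the patented interface: truncated_receive is a hypothetical name, and the kept portion is assumed to be a leading prefix of the message.

```python
def truncated_receive(message: bytes, keep: int) -> tuple[bytes, int]:
    """Return (delivered bytes, number of bytes discarded).

    The messaging layer accepts the whole message, hands the application
    only the first `keep` bytes, and drops the remainder.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    received = bytes(message)      # the layer receives all of the data
    delivered = received[:keep]    # only this part reaches the application
    discarded = len(received) - len(delivered)
    return delivered, discarded
```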

  11. Performing a local reduction operation on a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A; Faraj, Daniel A

    2013-06-04

    A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.

  12. Coupled Serial and Parallel Non-uniform SQUIDs

    SciTech Connect (OSTI)

    Longhini, Patrick; In, Visarath; Berggren, Susan; Palacios, Antonio; Leese de Escobar, Anna

    2011-04-19

    In this work we numerically model series and parallel non-uniform superconducting quantum interference device (SQUID) arrays. Previous work has shown that for a series SQUID array constructed with a random distribution of loop sizes (i.e., different areas for each SQUID loop) there exists a unique 'anti-peak' at zero magnetic field in the voltage versus applied magnetic field (V-B) response. Similar results extend to a parallel SQUID array, where the difference lies in the arrangement of the Josephson junctions. Other system parameters such as bias current, the number of loops, and mutual inductances are varied to demonstrate the change in dynamic range and linearity of the V-B response. Application of the SQUID array as a low noise amplifier (LNA) would increase link margins and affect the entire communication system. For unmanned aerial vehicles (UAVs), where size, weight and power are limited, the SQUID array would allow use of practical 'electrically small' antennas that provide acceptable gain.

  13. Laser Safety Method For Duplex Open Loop Parallel Optical Link

    DOE Patents [OSTI]

    Baumgartner, Steven John; Hedin, Daniel Scott; Paschal, Matthew James

    2003-12-02

    A method and apparatus are provided to ensure that laser optical power does not exceed a "safe" level in an open loop parallel optical link in the event that a fiber optic ribbon cable is broken or otherwise severed. A duplex parallel optical link includes a transmitter and receiver pair and a fiber optic ribbon that includes a designated number of channels that cannot be split. The duplex transceiver includes a corresponding transmitter and receiver that are physically attached to each other and cannot be detached therefrom, so as to ensure safe, laser optical power in the event that the fiber optic ribbon cable is broken or severed. Safe optical power is ensured by redundant current and voltage safety checks.

  14. Scalable parallel solution coupling for multi-physics reactor simulation.

    SciTech Connect (OSTI)

    Tautges, T. J.; Caceres, A.; Mathematics and Computer Science

    2009-01-01

    Reactor simulation depends on the coupled solution of various physics types, including neutronics, thermal/hydraulics, and structural mechanics. This paper describes the formulation and implementation of a parallel solution coupling capability being developed for reactor simulation. The coupling process consists of mesh and coupler initialization, point location, field interpolation, and field normalization. We report here our test of this capability on an example problem, namely, a reflector assembly from an advanced burner test reactor. Performance of this coupler in parallel is reasonable for the chosen problem size and range of processor counts. The runtime is dominated by startup costs, which amortize over the entire coupled simulation. Future efforts will include adding more sophisticated interpolation and normalization methods, to accommodate different numerical solvers used in various physics modules and to obtain better conservation properties for certain field types.

  15. Parallel resistivity and ohmic heating of laboratory dipole plasmas

    SciTech Connect (OSTI)

    Fox, W.

    2012-08-15

    The parallel resistivity is calculated in the long-mean-free-path regime for the dipole plasma geometry; this is shown to be a neoclassical transport problem in the limit of a small number of circulating electrons. In this regime, the resistivity is substantially higher than the Spitzer resistivity due to the magnetic trapping of a majority of the electrons. This suggests that heating the outer flux surfaces of the plasma with low-frequency parallel electric fields can be substantially more efficient than might be naively estimated. Such a skin-current heating scheme is analyzed by deriving an equation for diffusion of skin currents into the plasma, from which quantities such as the resistive skin-depth, lumped-circuit impedance, and power deposited in the plasma can be estimated. Numerical estimates indicate that this may be a simple and efficient way to couple power into experiments in this geometry.

  16. Xyce Parallel Electronic Simulator : reference guide, version 4.1.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2009-02-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

  17. Parallel and Antiparallel Interfacial Coupling in AF-FM Bilayers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Cooling an antiferromagnetic-ferromagnetic bilayer in a magnetic field typically results in a remanent (zero-field) magnetization in the ferromagnet (FM) that is always in the direction of the field during cooling (positive Mrem). Strikingly, when FeF2 is the antiferromagnet (AF), cooling in a field can lead to a remanent magnetization opposite to the field (negative Mrem). A collaboration led by researchers from the Stanford

  18. Parallel and Antiparallel Interfacial Coupling in AF-FM Bilayers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Cooling an antiferromagnetic-ferromagnetic bilayer in a magnetic field typically results in a remanent (zero-field) magnetization in the ferromagnet (FM) that is always in the direction of the field during cooling (positive Mrem). Strikingly, when FeF2 is the antiferromagnet (AF), cooling in a field can lead to a remanent magnetization opposite to the field (negative Mrem). A collaboration led by researchers from the Stanford

  19. Multithreaded processor architecture for parallel symbolic computation. Technical report

    SciTech Connect (OSTI)

    Fujita, T.

    1987-09-01

    This paper describes the Multilisp Architecture for Symbolic Applications (MASA), which is a multithreaded processor architecture for parallel symbolic computation with various features intended for effective Multilisp program execution. The principal mechanisms exploited for this processor are multiple contexts, interleaved pipeline execution from separate instruction streams, and synchronization based on a bit in each memory cell. The tagged architecture approach is taken for Lisp program execution, and trap conditions are provided for future object manipulation and garbage collection.

  20. Xyce parallel electronic simulator reference guide, version 6.0.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David G.

    2013-08-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].

  1. Substantially Parallel Flux Uncluttered Rotor Machines (U-Machine) - Energy

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Innovation Portal Substantially Parallel Flux Uncluttered Rotor Machines (U-Machine) Oak Ridge National Laboratory Projected performance/speed characteristics of the novel U-machine motor Technology Marketing Summary: A general concern based on the supply and demand trend of the permanent magnet (PM) raw materials suggests the need for elimination of these materials from electric motors

  2. A Parallel Stochastic Framework for Reservoir Characterization and History Matching

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Thomas, Sunil G.; Klie, Hector M.; Rodriguez, Adolfo A.; Wheeler, Mary F.

    2011-01-01

    The spatial distribution of parameters that characterize the subsurface is never known to any reasonable level of accuracy required to solve the governing PDEs of multiphase flow or species transport through porous media. This paper presents a numerically cheap, yet efficient, accurate and parallel framework to estimate reservoir parameters, for example, medium permeability, using sensor information from measurements of the solution variables such as phase pressures, phase concentrations, fluxes, and seismic and well log data. Numerical results are presented to demonstrate the method.

  3. Parallel and Antiparallel Interfacial Coupling in AF-FM Bilayers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Cooling an antiferromagnetic-ferromagnetic bilayer in a magnetic field typically results in a remanent (zero-field) magnetization in the ferromagnet (FM) that is always in the direction of the field during cooling (positive Mrem). Strikingly, when FeF2 is the antiferromagnet (AF), cooling in a field can lead to a remanent magnetization opposite to the field (negative Mrem). A collaboration led by researchers from the Stanford

  4. Xyce parallel electronic simulator reference guide, version 6.1

    SciTech Connect (OSTI)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason C.; Baur, David Gregory

    2014-03-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide [1]. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide [1].

  5. Electronically commutated serial-parallel switching for motor windings

    DOE Patents [OSTI]

    Hsu, John S. (Oak Ridge, TN)

    2012-03-27

    A method and a circuit for controlling an ac machine comprises controlling a full bridge network of commutation switches which are connected between a multiphase voltage source and the phase windings to switch the phase windings between a parallel connection and a series connection while providing commutation discharge paths for electrical current resulting from inductance in the phase windings. This provides extra torque for starting a vehicle from lower battery current.

  6. Parallel Element Agglomeration Algebraic Multigrid and Upscaling Library

    Energy Science and Technology Software Center (OSTI)

    2015-02-19

    ParFELAG is a parallel distributed memory C++ library for numerical upscaling of finite element discretizations. It provides optimal complexity algorithms to build multilevel hierarchies and solvers that can be used for solving a wide class of partial differential equations (elliptic, hyperbolic, saddle point problems) on general unstructured meshes (under the assumption that the topology of the agglomerated entities is correct). Additionally, a novel multilevel solver for saddle point problems with divergence constraint is implemented.

  7. Cluster generator (Patent) | DOEPatents

    Office of Scientific and Technical Information (OSTI)

    Cluster generator Title: Cluster generator Described herein is an apparatus and a method for producing atom clusters based on a gas discharge within a hollow cathode. The hollow ...

  8. Biomass: Biogas Generator

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    BIOGAS GENERATOR Curriculum: Biomass Power (organic chemistry, chemicalcarbon cycles, ... to burn Summary: Students build a simple digester to generate a quantity of gas to burn. ...

  9. Parallel garbage collection without synchronization overhead. Technical report

    SciTech Connect (OSTI)

    Patel, J.H.

    1984-08-01

    Incremental garbage-collection schemes incur substantial overhead that is directly translated as reduced execution efficiency for the user. Parallel garbage-collection schemes implemented via time-slicing on a serial processor also incur this overhead, which might even be aggravated due to context switching. It is useful, therefore, to examine the possibility of implementing a parallel garbage-collection algorithm using a separate processor operating asynchronously with the main-list processor. The overhead in such a scheme arises from the synchronization necessary to manage the two processors, maintaining memory consistency. In this paper, the authors present an architecture and supporting parallel garbage-collection algorithms designed for a virtual memory system with separate processors for list processing and for garbage collection. Each processor has its own primary memory; in addition, there is a small common memory which both processors may access. Individual memories swap off a common secondary memory, but no locking mechanism is required. In particular, a page may reside in both memories simultaneously, and indeed may be accessed and modified freely by each processor. A secondary memory controller ensures consistency without necessitating numerous lockouts on the pages.

  10. PArallel Reacting Multiphase FLOw Computational Fluid Dynamic Analysis

    Energy Science and Technology Software Center (OSTI)

    2002-06-01

    PARMFLO is a parallel multiphase reacting flow computational fluid dynamics (CFD) code. It can perform steady or unsteady simulations in three space dimensions. It is intended for use in engineering CFD analysis of industrial flow system components. Its parallel processing capabilities allow it to be applied to problems that use at least an order of magnitude more computational cells than the number that can be used on a typical single processor workstation (about 10^6 cells in parallel processing mode versus about io cells in serial processing mode). Alternately, by spreading the work of a CFD problem that could be run on a single workstation over a group of computers on a network, it can bring the runtime down by an order of magnitude or more (typically from many days to less than one day). The software was implemented using the industry standard Message-Passing Interface (MPI) and domain decomposition in one spatial direction. The phases of a flow problem may include an ideal gas mixture with an arbitrary number of chemical species, and dispersed droplet and particle phases. Regions of porous media may also be included within the domain. The porous media may be packed beds, foams, or monolith catalyst supports. With these features, the code is especially suited to analysis of mixing of reactants in the inlet chamber of catalytic reactors coupled to computation of product yields that result from the flow of the mixture through the catalyst coated support structure.
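
    Domain decomposition in one spatial direction amounts to giving each process a contiguous slab of grid planes. The sketch below shows one common way to compute such slabs; it is an assumption-laden illustration, not PARMFLO code, and slab_ranges is a hypothetical helper that involves no MPI calls.

```python
def slab_ranges(n_planes: int, n_ranks: int) -> list[tuple[int, int]]:
    """Return half-open [start, stop) plane ranges, one per rank.

    Planes are split as evenly as possible; the first
    (n_planes % n_ranks) ranks each receive one extra plane.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be >= 1")
    base, extra = divmod(n_planes, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        count = base + (1 if rank < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges
```

    With 10 planes and 4 ranks this yields slabs of 3, 3, 2 and 2 planes, so the load imbalance is never more than one plane.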

  11. Parallel Computing Environments and Methods for Power Distribution System Simulation

    SciTech Connect (OSTI)

    Lu, Ning; Taylor, Zachary T.; Chassin, David P.; Guttromson, Ross T.; Studham, Scott S.

    2005-11-10

    The development of cost-effective high-performance parallel computing on multi-processor super computers makes it attractive to port excessively time consuming simulation software from personal computers (PC) to super computers. The power distribution system simulator (PDSS) takes a bottom-up approach and simulates load at appliance level, where detailed thermal models for appliances are used. This approach works well for a small power distribution system consisting of a few thousand appliances. When the number of appliances increases, the simulation uses up the PC memory and its run time increases to a point where the approach is no longer feasible to model a practical large power distribution system. This paper presents an effort made to port a PC-based power distribution system simulator (PDSS) to a 128-processor shared-memory super computer. The paper offers an overview of the parallel computing environment and a description of the modification made to the PDSS model. The performances of the PDSS running on a standalone PC and on the super computer are compared. Future research direction of utilizing parallel computing in the power distribution system simulation is also addressed.

  12. Automatic Thread-Level Parallelization in the Chombo AMR Library

    SciTech Connect (OSTI)

    Christen, Matthias; Keen, Noel; Ligocki, Terry; Oliker, Leonid; Shalf, John; Van Straalen, Brian; Williams, Samuel

    2011-05-26

    The increasing on-chip parallelism has some substantial implications for HPC applications. Currently, hybrid programming models (typically MPI+OpenMP) are employed for mapping software to the hardware in order to leverage the hardware's architectural features. In this paper, we present an approach that automatically introduces thread-level parallelism into Chombo, a parallel adaptive mesh refinement framework for finite difference type PDE solvers. In Chombo, core algorithms are specified in ChomboFortran, a macro language extension to F77 that is part of the Chombo framework. This domain-specific language forms an already used target language for an automatic migration of the large number of existing algorithms into a hybrid MPI+OpenMP implementation. It also provides access to the auto-tuning methodology that enables tuning certain aspects of an algorithm to hardware characteristics. Performance measurements are presented for a few of the most relevant kernels with respect to a specific application benchmark using this technique as well as benchmark results for the entire application. The kernel benchmarks show that, using auto-tuning, up to a factor of 11 in performance was gained with 4 threads with respect to the serial reference implementation.

  13. Design and performance of a scalable, parallel statistics toolkit.

    SciTech Connect (OSTI)

    Thompson, David C.; Bennett, Janine Camille; Pebay, Philippe Pierre

    2010-11-01

    Most statistical software packages implement a broad range of techniques but do so in an ad hoc fashion, leaving users who do not have a broad knowledge of statistics at a disadvantage since they may not understand all the implications of a given analysis or how to test the validity of results. These packages are also largely serial in nature, or target multicore architectures instead of distributed-memory systems, or provide only a small number of statistics in parallel. This paper surveys a collection of parallel implementations of statistics algorithms developed as part of a common framework over the last 3 years. The framework strategically groups modeling techniques with associated verification and validation techniques to make the underlying assumptions of the statistics clearer. Furthermore, it employs a design pattern specifically targeted at distributed-memory parallelism, where architectural advances in large-scale high-performance computing have been focused. Moment-based statistics (which include descriptive, correlative, and multicorrelative statistics, principal component analysis (PCA), and k-means statistics) scale nearly linearly with the data set size and number of processes. Entropy-based statistics (which include order and contingency statistics) do not scale well when the data in question is continuous or quasi-diffuse but do scale well when the data is discrete and compact. We confirm and extend our earlier results by now establishing near-optimal scalability with up to 10,000 processes.
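
    One reason moment-based statistics parallelize well is that partial results can be merged exactly. The following is a minimal sketch of that idea, not the toolkit's own code: it uses the standard pairwise update for count, mean, and sum of squared deviations, and the names `Moments`, `local_moments`, `merge`, and `combine` are hypothetical. Each process would summarize its own data block, and only the small summaries need to be communicated.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Moments:
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    @property
    def variance(self) -> float:
        """Unbiased sample variance (0.0 when fewer than two values)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def local_moments(values: Iterable[float]) -> Moments:
    """Summarize the data block held by one process."""
    data = list(values)
    if not data:
        return Moments()
    mean = sum(data) / len(data)
    return Moments(len(data), mean, sum((v - mean) ** 2 for v in data))


def merge(a: Moments, b: Moments) -> Moments:
    """Combine two partial summaries without revisiting the raw data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


def combine(parts: Iterable[Moments]) -> Moments:
    """Reduce the summaries from all processes into one global summary."""
    return reduce(merge, parts, Moments())
```

    Because `merge` is associative up to floating-point rounding, the reduction can be arranged as a tree across processes, which is an assumption about deployment rather than something stated in the abstract.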

  14. Identifying failure in a tree network of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Pinnow, Kurt W.; Wallenfelt, Brian P.

    2010-08-24

    Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and testing individually potential problem nodes and links to potential problem nodes.

  15. Improved parallel solution techniques for the integral transport matrix method

    SciTech Connect (OSTI)

    Zerr, Robert J; Azmy, Yousry Y

    2010-11-23

    Alternative solution strategies to the parallel block Jacobi (PBJ) method for the solution of the global problem with the integral transport matrix method operators have been designed and tested. The most straightforward improvement to the Jacobi iterative method is the Gauss-Seidel alternative. The parallel red-black Gauss-Seidel (PGS) algorithm can improve on the number of iterations and reduce work per iteration by applying an alternating red-black color-set to the subdomains and assigning multiple sub-domains per processor. A parallel GMRES(m) method was implemented as an alternative to stationary iterations. Computational results show that the PGS method can improve on the PBJ method execution by up to approximately 50% when eight sub-domains per processor are used. However, compared to traditional source iterations with diffusion synthetic acceleration, it is still approximately an order of magnitude slower. The best-performing cases are optically thick because sub-domains decouple, yielding faster convergence. Further tests revealed that 64 sub-domains per processor was the best performing level of sub-domain division. An acceleration technique that improves the convergence rate would greatly improve the ITMM. The GMRES(m) method with a diagonal block preconditioner consumes approximately the same time as the PBJ solver but could be improved by an as yet undeveloped, more efficient preconditioner.

  16. An extensible operating system design for large-scale parallel machines.

    SciTech Connect (OSTI)

    Riesen, Rolf E.; Ferreira, Kurt Brian

    2009-04-01

    Running untrusted user-level code inside an operating system kernel was studied in the 1990s but has not really caught on. We believe the time has come to resurrect kernel extensions for operating systems that run on highly parallel clusters and supercomputers. The reason is that the usage model for these machines differs significantly from that of a desktop machine or a server. In addition, vendors are starting to add features, such as floating-point accelerators, multicore processors, and reconfigurable compute elements. An operating system for such machines must be adaptable to the requirements of specific applications and provide abstractions to access next-generation hardware features, without sacrificing performance or scalability.

  17. A Study of Successive Over-relaxation Method Parallelization over Modern HPC Languages

    SciTech Connect (OSTI)

    Mittal, Sparsh [ORNL]

    2014-01-01

    Successive over-relaxation (SOR) is a computationally intensive, yet extremely important iterative solver for linear systems. Due to recent trends of exponential growth in the amount of data generated and increasing problem sizes, serial platforms have proved to be insufficient in providing the required computational power. In this paper, we present parallel implementations of the red-black SOR method using three modern programming languages, namely Chapel, D and Go. We employ the SOR method for solving a 2D steady-state heat conduction problem. We discuss the optimizations incorporated and the features of these languages which are crucial for improving the program performance. Experiments have been performed using 2, 4, and 8 threads and performance results are compared with serial execution. The analysis of results provides important insights into the working of the SOR method.
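
    For orientation, a serial red-black SOR sweep for the 2D steady-state heat conduction (Laplace) problem can be sketched as follows. This is an illustrative Python version, not the paper's Chapel, D, or Go code; the function name `red_black_sor`, the default relaxation factor, and the stopping rule are assumptions. Points of one colour depend only on points of the other colour, so each half-sweep could be divided among threads without data races.

```python
from typing import List


def red_black_sor(grid: List[List[float]], omega: float = 1.5,
                  tol: float = 1e-8, max_sweeps: int = 10_000) -> int:
    """Relax the interior of `grid` in place; boundary rows/columns stay fixed.

    Returns the number of sweeps performed. Stops early once the largest
    update in a full sweep falls below `tol`.
    """
    rows, cols = len(grid), len(grid[0])
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for colour in (0, 1):  # update all points with (i + j) % 2 == colour
            for i in range(1, rows - 1):
                # First interior column in this row with the current colour.
                start = 1 + (i + 1 + colour) % 2
                for j in range(start, cols - 1, 2):
                    average = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                      + grid[i][j - 1] + grid[i][j + 1])
                    change = omega * (average - grid[i][j])
                    grid[i][j] += change
                    max_change = max(max_change, abs(change))
        if max_change < tol:
            return sweep
    return max_sweeps
```

    A threaded version would split the rows of each colour among workers and synchronize between the two half-sweeps.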

  18. Generator stator core vent duct spacer posts

    DOE Patents [OSTI]

    Griffith, John Wesley; Tong, Wei

    2003-06-24

    Generator stator cores are constructed by stacking many layers of magnetic laminations. Ventilation ducts may be inserted between these layers by inserting spacers into the core stack. The ventilation ducts allow for the passage of cooling gas through the core during operation. The spacers or spacer posts are positioned between groups of the magnetic laminations to define the ventilation ducts. The spacer posts are secured with longitudinal axes thereof substantially parallel to the core axis. With this structure, core tightness can be assured while maximizing ventilation duct cross section for gas flow and minimizing magnetic loss in the spacers.

  19. High voltage pulse generator. [Patent application

    DOE Patents [OSTI]

    Fasching, G.E.

    1975-06-12

    An improved high-voltage pulse generator is described which is especially useful in ultrasonic testing of rock core samples. A number N of capacitors are charged in parallel to V volts and at the proper instant are coupled in series to produce a high-voltage pulse of N times V volts. Rapid switching of the capacitors from the parallel charging configuration to the series discharging configuration is accomplished by using silicon-controlled rectifiers (SCRs) which are chain self-triggered following the initial triggering of the first rectifier connected between the first and second capacitors. A timing and triggering circuit is provided to properly synchronize triggering pulses to the first SCR at a time when the charging voltage is not being applied to the parallel-connected charging capacitors. The output voltage can be readily increased by adding additional charging networks. The circuit allows the peak level of the output to be easily varied over a wide range by using a variable autotransformer in the charging circuit.

  20. Fencing network direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-07-07

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to a deterministic data communications network through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and the deterministic data communications network; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  1. Fencing network direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-07-14

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to a deterministic data communications network through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and the deterministic data communications network; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  2. Xyce Parallel Electronic Simulator Users Guide Version 6.4

    SciTech Connect (OSTI)

    Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Schiek, Richard; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason; Baur, David Gregory

    2015-12-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. Trademarks: The information herein is subject to change without notice. Copyright (c) 2002-2015 Sandia Corporation. All rights reserved. Xyce(TM) Electronic Simulator and Xyce(TM) are trademarks of Sandia Corporation. Portions of the Xyce(TM) code are: Copyright (c) 2002, The Regents of the University of California. Produced at the Lawrence Livermore National Laboratory. Written by Alan Hindmarsh, Allan Taylor, Radu Serban. UCRL-CODE-2002-59. All rights reserved.
Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. The EKV3 MOSFET model was developed by the EKV Team of the Electronics Laboratory-TUC of the Technical University of Crete. All other trademarks are property of their respective owners. Contacts: Bug reports (Sandia only): http://joseki.sandia.gov/bugzilla, http://charleston.sandia.gov/bugzilla. World Wide Web: http://xyce.sandia.gov, http://charleston.sandia.gov/xyce (Sandia only). Email: xyce@sandia.gov (outside Sandia), xyce-sandia@sandia.gov (Sandia only).

  3. Xyce Parallel Electronic Simulator - Users' Guide Version 2.1.

    SciTech Connect (OSTI)

    Hutchinson, Scott A; Hoekstra, Robert J.; Russo, Thomas V.; Rankin, Eric; Pawlowski, Roger P.; Fixel, Deborah A; Schiek, Richard; Bogdan, Carolyn W.; Shirley, David N.; Campbell, Phillip M.; Keiter, Eric R.

    2005-06-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices. Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an "in-house" capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed.
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory. Acknowledgements: The authors would like to acknowledge the entire Sandia National Laboratories HPEMS (High Performance Electrical Modeling and Simulation) team, including Steve Wix, Carolyn Bogdan, Regina Schells, Ken Marx, Steve Brandon and Bill Ballard, for their support on this project. We also appreciate very much the work of Jim Emery, Becky Arnold and Mike Williamson for the help in reviewing this document. Lastly, a very special thanks to Hue Lai for typesetting this document with LaTeX. Trademarks: The information herein is subject to change without notice. Copyright (c) 2002-2003 Sandia Corporation. All rights reserved. Xyce(TM) Electronic Simulator and Xyce(TM) are trademarks of Sandia Corporation. Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Silicon Graphics, the Silicon Graphics logo and IRIX are registered trademarks of Silicon Graphics, Inc. Microsoft, Windows and Windows 2000 are registered trademarks of Microsoft Corporation. Solaris and UltraSPARC are registered trademarks of Sun Microsystems Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. HP and Alpha are registered trademarks of Hewlett-Packard Company. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. All other trademarks are property of their respective owners. Contacts: Bug reports: http://tvrusso.sandia.gov/bugzilla. Email: xyce-support@sandia.gov. World Wide Web: http://www.cs.sandia.gov/xyce

  4. Gamma ray generator

    DOE Patents [OSTI]

    Firestone, Richard B; Reijonen, Jani

    2014-05-27

    An embodiment of a gamma ray generator includes a neutron generator and a moderator. The moderator is coupled to the neutron generator. The moderator includes a neutron capture material. In operation, the neutron generator produces neutrons and the neutron capture material captures at least some of the neutrons to produce gamma rays. An application of the gamma ray generator is as a source of gamma rays for calibration of gamma ray detectors.

  5. Optimizing Parallel Access to the BaBar Database System Using...

    Office of Scientific and Technical Information (OSTI)

    Optimizing Parallel Access to the BaBar Database System Using CORBA Servers

  6. Parallel I/O Software Infrastructure for Large-Scale Systems

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel I/O Software Infrastructure for Large-Scale Systems. An illustration of how MPI-IO file domain...

  7. cray-hdf5-parallel/1.8.13 garbling integers in intel environment

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    cray-hdf5-parallel/1.8.13 garbling integers in intel environment. September 11, 2014. This problem was fixed on 11...

  8. BlueGene/L Applications: Parallelism on a Massive Scale (Journal...

    Office of Scientific and Technical Information (OSTI)

    BlueGene/L Applications: Parallelism on a Massive Scale

  9. Final Report: Migration Mechanisms for Large-scale Parallel Applications

    SciTech Connect (OSTI)

    Jason Nieh

    2009-10-30

    Process migration is the ability to transfer a process from one machine to another. It is a useful facility in distributed computing environments, especially as computing devices become more pervasive and Internet access becomes more ubiquitous. The potential benefits of process migration, among others, are fault resilience by migrating processes off of faulty hosts, data access locality by migrating processes closer to the data, better system response time by migrating processes closer to users, dynamic load balancing by migrating processes to less loaded hosts, and improved service availability and administration by migrating processes before host maintenance so that applications can continue to run with minimal downtime. Although process migration provides substantial potential benefits and many approaches have been considered, achieving transparent process migration functionality has been difficult in practice. To address this problem, our work has designed, implemented, and evaluated new and powerful transparent process checkpoint-restart and migration mechanisms for desktop, server, and parallel applications that operate across heterogeneous cluster and mobile computing environments. A key aspect of this work has been to introduce lightweight operating system virtualization to provide processes with private, virtual namespaces that decouple and isolate processes from dependencies on the host operating system instance. This decoupling enables processes to be transparently checkpointed and migrated without modifying, recompiling, or relinking applications or the operating system. 
Building on this lightweight operating system virtualization approach, we have developed novel technologies that enable (1) coordinated, consistent checkpoint-restart and migration of multiple processes, (2) fast checkpointing of process and file system state to enable restart of multiple parallel execution environments and time travel, (3) process migration across heterogeneous software environments, (4) network checkpoint-restart and migration of distributed and parallel applications, (5) a utility computing infrastructure for mobile desktop cloud computing based on process checkpoint-restart and migration functionality, (6) a process migration security architecture for protecting applications and infrastructure from denial-of-service attacks, and (7) a checkpoint-restart mobile computing system using portable storage devices.

  10. Implementation of a parallel multilevel secure process. Master's thesis

    SciTech Connect (OSTI)

    Pratt, D.R.

    1988-06-01

    This thesis demonstrates an implementation of a parallel multilevel secure process. This is done within the framework of an electronic-mail system. Security is implemented by GEMSOS, the operating system of the Gemini Trusted Computer Base. A brief history of computer secrecy is followed by a discussion of security kernels. Event counts and sequences are used to provide concurrency control and are covered in detail. The specifications for the system are based upon the requirements for a Headquarters of a hypothetical Marine Battalion in garrison.

  11. Computing NLTE Opacities -- Node Level Parallel Calculation

    SciTech Connect (OSTI)

    Holladay, Daniel

    2015-09-11

    Presentation. The goal: to produce a robust library capable of computing reasonably accurate opacities inline with the assumption of LTE relaxed (non-LTE). Near term: demonstrate acceleration of non-LTE opacity computation. Far term (if funded): connect to application codes with in-line capability and compute opacities. Study science problems. Use efficient algorithms that expose many levels of parallelism and utilize good memory access patterns for use on advanced architectures. Portability to multiple types of hardware including multicore processors, manycore processors such as KNL, GPUs, etc. Easily coupled to radiation hydrodynamics and thermal radiative transfer codes.

  12. Local rollback for fault-tolerance in parallel computing systems

    DOE Patents [OSTI]

    Blumrich, Matthias A.; Chen, Dong; Gara, Alan; Giampapa, Mark E.; Heidelberger, Philip; Ohmacht, Martin; Steinmacher-Burow, Burkhard; Sugavanam, Krishnan

    2012-01-24

    A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.

  13. Performing a global barrier operation in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-12-09

    Executing computing tasks on a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.

  14. SERODS optical data storage with parallel signal transfer

    DOE Patents [OSTI]

    Vo-Dinh, Tuan

    2003-06-24

    Surface-enhanced Raman optical data storage (SERODS) systems having increased reading and writing speeds, that is, increased data transfer rates, are disclosed. In the various SERODS read and write systems, the surface-enhanced Raman scattering (SERS) data is written and read using a two-dimensional process called parallel signal transfer (PST). The various embodiments utilize laser light beam excitation of the SERODS medium, optical filtering, beam imaging, and two-dimensional light detection. Two- and three-dimensional SERODS media are utilized. The SERODS write systems employ either a different laser or a different level of laser power.

  15. SERODS optical data storage with parallel signal transfer

    DOE Patents [OSTI]

    Vo-Dinh, Tuan

    2003-09-02

    Surface-enhanced Raman optical data storage (SERODS) systems having increased reading and writing speeds, that is, increased data transfer rates, are disclosed. In the various SERODS read and write systems, the surface-enhanced Raman scattering (SERS) data is written and read using a two-dimensional process called parallel signal transfer (PST). The various embodiments utilize laser light beam excitation of the SERODS medium, optical filtering, beam imaging, and two-dimensional light detection. Two- and three-dimensional SERODS media are utilized. The SERODS write systems employ either a different laser or a different level of laser power.

  16. Parallel optics technology assessment for the versatile link project

    SciTech Connect (OSTI)

    Chramowicz, J.; Kwan, S.; Rivera, R.; Prosser, A.; /Fermilab

    2011-01-01

    This poster describes the assessment of commercially available and prototype parallel optics modules for possible use as back end components for the Versatile Link common project. The assessment covers SNAP12 transmitter and receiver modules as well as optical engine technologies in dense packaging options. Tests were performed using vendor evaluation boards (SNAP12) as well as custom evaluation boards (optical engines). The measurements obtained were used to compare the performance of these components with single channel SFP+ components operating at a transmission wavelength of 850 nm over multimode fibers.

  17. Routing performance analysis and optimization within a massively parallel computer

    DOE Patents [OSTI]

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  18. Scripts for Scalable Monitoring of Parallel Filesystem Infrastructure

    SciTech Connect (OSTI)

    2014-02-27

    Scripts for scalable monitoring of parallel filesystem infrastructure provide frameworks for monitoring the health of block storage arrays and large InfiniBand fabrics. The block storage framework uses Python multiprocessing to scale the number of monitored arrays with the number of processors in the system. This enables live monitoring of HPC-scale filesystems with 10-50 storage arrays. For InfiniBand monitoring, there are scripts included that monitor the InfiniBand health of each host, along with visualization tools for mapping complex fabric topologies.
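
    The general pattern of fanning health checks out over a pool of worker processes can be sketched with the standard `multiprocessing` module. This is not the released scripts' code: the record layout (`name`, `failed_disks`) and the functions `check_array`, `monitor`, and `unhealthy` are hypothetical stand-ins, and a real probe would query each array's management interface.

```python
from multiprocessing import Pool
from typing import Dict, List, Optional, Tuple


def check_array(array: dict) -> Tuple[str, bool]:
    """Placeholder probe for one storage array: returns (name, healthy).

    Assumed record layout: {"name": str, "failed_disks": int}.
    """
    return array["name"], array["failed_disks"] == 0


def monitor(arrays: List[dict], processes: Optional[int] = None) -> Dict[str, bool]:
    """Probe all arrays in parallel.

    With `processes=None`, Pool starts one worker per available CPU, so the
    number of arrays checked concurrently grows with the processor count.
    """
    with Pool(processes=processes) as pool:
        return dict(pool.map(check_array, arrays))


def unhealthy(report: Dict[str, bool]) -> List[str]:
    """Names of arrays whose probe reported a problem, sorted for display."""
    return sorted(name for name, ok in report.items() if not ok)
```

    When run as a script, `monitor(...)` should be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on platforms that spawn rather than fork.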

  19. Scripts for Scalable Monitoring of Parallel Filesystem Infrastructure

    Energy Science and Technology Software Center (OSTI)

    2014-02-27

    Scripts for scalable monitoring of parallel filesystem infrastructure provide frameworks for monitoring the health of block storage arrays and large InfiniBand fabrics. The block storage framework uses Python multiprocessing to scale the number of monitored arrays with the number of processors in the system. This enables live monitoring of HPC-scale filesystems with 10-50 storage arrays. For InfiniBand monitoring, there are scripts included that monitor the InfiniBand health of each host, along with visualization tools for mapping complex fabric topologies.

  20. Digital intermediate frequency QAM modulator using parallel processing

    DOE Patents [OSTI]

    Pao, Hsueh-Yuan; Tran, Binh-Nien

    2008-05-27

    The digital Intermediate Frequency (IF) modulator applies to various modulation types and offers a simple and low cost method to implement a high-speed digital IF modulator using field programmable gate arrays (FPGAs). The architecture eliminates multipliers and sequential processing by storing the pre-computed modulated cosine and sine carriers in ROM look-up-tables (LUTs). The high-speed input data stream is parallel processed using the corresponding LUTs, which reduces the main processing speed, allowing the use of low cost FPGAs.
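
    The look-up-table idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented FPGA design: the 16-QAM level set, the eight samples per symbol, the two carrier cycles per symbol, and every name below are assumptions chosen for illustration. The tables hold the pre-multiplied cosine and sine carriers, so generating output needs only lookups and one subtraction per sample.

```python
import math

LEVELS = (-3, -1, 1, 3)   # assumed 16-QAM amplitude levels on each axis
SAMPLES_PER_SYMBOL = 8    # assumed number of IF samples per symbol
CYCLES_PER_SYMBOL = 2     # assumed whole carrier cycles per symbol


def build_luts():
    """Precompute level*cos and level*sin for every level and sample index."""
    phases = [2 * math.pi * CYCLES_PER_SYMBOL * n / SAMPLES_PER_SYMBOL
              for n in range(SAMPLES_PER_SYMBOL)]
    cos_lut = [[level * math.cos(p) for p in phases] for level in LEVELS]
    sin_lut = [[level * math.sin(p) for p in phases] for level in LEVELS]
    return cos_lut, sin_lut


COS_LUT, SIN_LUT = build_luts()


def modulate(symbols):
    """Turn (i_index, q_index) pairs into IF samples, s = I*cos - Q*sin.

    No multiplication happens here: each product was stored in the tables.
    """
    samples = []
    for i_index, q_index in symbols:
        i_row = COS_LUT[i_index]
        q_row = SIN_LUT[q_index]
        samples.extend(i - q for i, q in zip(i_row, q_row))
    return samples
```

    Because each symbol reads its own table rows independently, several symbols could be looked up at once, which is the sense in which the input stream can be processed in parallel.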

  1. Parallel Computation of Persistent Homology using the Blowup Complex

    SciTech Connect (OSTI)

    Lewis, Ryan; Morozov, Dmitriy

    2015-04-27

    We describe a parallel algorithm that computes persistent homology, an algebraic descriptor of a filtered topological space. Our algorithm is distinguished by operating on a spatial decomposition of the domain, as opposed to a decomposition with respect to the filtration. We rely on a classical construction, called the Mayer-Vietoris blowup complex, to glue global topological information about a space from its disjoint subsets. We introduce an efficient algorithm to perform this gluing operation, which may be of independent interest, and describe how to process the domain hierarchically. We report on a set of experiments that help assess the strengths and identify the limitations of our method.

  2. Cylindrical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo

    2005-06-14

    A cylindrical neutron generator is formed with a coaxial RF-driven plasma ion source and target. A deuterium (or deuterium and tritium) plasma is produced by RF excitation in a cylindrical plasma ion generator using an RF antenna. A cylindrical neutron generating target is coaxial with the ion generator, separated by plasma and extraction electrodes which contain many slots. The plasma generator emanates ions radially over 360° and the cylindrical target is thus irradiated by ions over its entire circumference. The plasma generator and target may be as long as desired. The plasma generator may be in the center and the neutron target on the outside, or the plasma generator may be on the outside and the target on the inside. In a nested configuration, several concentric targets and plasma generating regions are nested to increase the neutron flux.

  3. Cylindrical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo

    2008-04-22

    A cylindrical neutron generator is formed with a coaxial RF-driven plasma ion source and target. A deuterium (or deuterium and tritium) plasma is produced by RF excitation in a cylindrical plasma ion generator using an RF antenna. A cylindrical neutron generating target is coaxial with the ion generator, separated by plasma and extraction electrodes which contain many slots. The plasma generator emanates ions radially over 360° and the cylindrical target is thus irradiated by ions over its entire circumference. The plasma generator and target may be as long as desired. The plasma generator may be in the center and the neutron target on the outside, or the plasma generator may be on the outside and the target on the inside. In a nested configuration, several concentric targets and plasma generating regions are nested to increase the neutron flux.

  4. Cylindrical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo

    2009-12-29

    A cylindrical neutron generator is formed with a coaxial RF-driven plasma ion source and target. A deuterium (or deuterium and tritium) plasma is produced by RF excitation in a cylindrical plasma ion generator using an RF antenna. A cylindrical neutron generating target is coaxial with the ion generator, separated by plasma and extraction electrodes which contain many slots. The plasma generator emanates ions radially over 360° and the cylindrical target is thus irradiated by ions over its entire circumference. The plasma generator and target may be as long as desired. The plasma generator may be in the center and the neutron target on the outside, or the plasma generator may be on the outside and the target on the inside. In a nested configuration, several concentric targets and plasma generating regions are nested to increase the neutron flux.

  5. Using Backup Generators: Choosing the Right Backup Generator...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Using Backup Generators: Choosing the Right Backup Generator - Homeowners. Determine the amount ...

  6. High-performance parallel interface to synchronous optical network gateway

    DOE Patents [OSTI]

    St. John, W.B.; DuBois, D.H.

    1996-12-03

    Disclosed is a system of sending and receiving gateways that interconnects high speed data interfaces, e.g., HIPPI interfaces, through fiber optic links, e.g., a SONET network. An electronic stripe distributor distributes bytes of data from a first interface at the sending gateway onto parallel fiber optics of the fiber optic link to form transmitted data. An electronic stripe collector receives the transmitted data on the parallel fiber optics and reforms the data into a format effective for input to a second interface at the receiving gateway. Preferably, an error correcting syndrome is constructed at the sending gateway and sent with a data frame so that transmission errors can be detected and corrected on a real-time basis. Since the high speed data interface operates faster than any of the fiber optic links, the transmission rate must be adapted to match the available number of fiber optic links, so the sending and receiving gateways monitor the availability of fiber links and adjust the data throughput accordingly. In another aspect, the receiving gateway must have sufficient available buffer capacity to accept an incoming data frame. A credit-based flow control system provides for continuously updating the sending gateway on the available buffer capacity at the receiving gateway. 7 figs.
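
    The stripe distributor and collector described in this abstract can be sketched in a few lines of Python. The round-robin byte order and the function names are assumptions made for illustration; the abstract says only that bytes are distributed onto parallel fibers and reformed at the receiving end.

```python
def stripe(data: bytes, num_links: int) -> list[bytes]:
    """Distribute bytes round-robin over num_links lanes (assumed ordering)."""
    if num_links < 1:
        raise ValueError("at least one link is required")
    # Lane k carries bytes k, k + num_links, k + 2 * num_links, ...
    return [data[k::num_links] for k in range(num_links)]


def collect(lanes: list[bytes]) -> bytes:
    """Reform the original byte stream from lanes produced by stripe()."""
    total = sum(len(lane) for lane in lanes)
    out = bytearray(total)
    for k, lane in enumerate(lanes):
        # Write lane k back into every len(lanes)-th position, starting at k.
        out[k::len(lanes)] = lane
    return bytes(out)
```

    Calling stripe() again with a smaller num_links is one simple way to picture the throughput adjustment when fewer fiber links are available.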

  7. High-performance parallel interface to synchronous optical network gateway

    DOE Patents [OSTI]

    St. John, Wallace B.; DuBois, David H.

    1996-01-01

    A system of sending and receiving gateways interconnects high speed data interfaces, e.g., HIPPI interfaces, through fiber optic links, e.g., a SONET network. An electronic stripe distributor distributes bytes of data from a first interface at the sending gateway onto parallel fiber optics of the fiber optic link to form transmitted data. An electronic stripe collector receives the transmitted data on the parallel fiber optics and reforms the data into a format effective for input to a second interface at the receiving gateway. Preferably, an error correcting syndrome is constructed at the sending gateway and sent with a data frame so that transmission errors can be detected and corrected on a real-time basis. Since the high speed data interface operates faster than any of the fiber optic links, the transmission rate must be adapted to match the available number of fiber optic links, so the sending and receiving gateways monitor the availability of fiber links and adjust the data throughput accordingly. In another aspect, the receiving gateway must have sufficient available buffer capacity to accept an incoming data frame. A credit-based flow control system provides for continuously updating the sending gateway on the available buffer capacity at the receiving gateway.
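
    The credit-based flow control mentioned at the end of this abstract can be sketched as follows. This is a generic illustration of the technique, not the patented mechanism: the class names, the one-credit-per-frame accounting, and the buffer size are all assumptions.

```python
from collections import deque


class Receiver:
    """Receiving gateway with a fixed number of frame buffers (assumed model)."""

    def __init__(self, buffer_frames: int):
        self.free = buffer_frames
        self.queue = deque()

    def accept(self, frame: bytes) -> None:
        if self.free == 0:
            raise RuntimeError("buffer overrun: sender ignored its credits")
        self.free -= 1
        self.queue.append(frame)

    def drain(self) -> int:
        """Consume one buffered frame and return the credit that frees up."""
        self.queue.popleft()
        self.free += 1
        return 1


class Sender:
    """Sending gateway that transmits only while it holds credits."""

    def __init__(self, receiver: Receiver):
        self.receiver = receiver
        self.credits = receiver.free  # initial credits equal the free buffers

    def try_send(self, frame: bytes) -> bool:
        if self.credits == 0:
            return False  # hold the frame until a credit update arrives
        self.credits -= 1
        self.receiver.accept(frame)
        return True

    def add_credits(self, count: int) -> None:
        self.credits += count
```

    The sender never needs to ask whether space is available; it only counts credits, which is why the receiver must keep it continuously updated.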

  8. Long-time dynamics through parallel trajectory splicing

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Perez, Danny; Cubuk, Ekin D.; Waterland, Amos; Kaxiras, Efthimios; Voter, Arthur F.

    2015-11-24

    Simulating the atomistic evolution of materials over long time scales is a longstanding challenge, especially for complex systems where the distribution of barrier heights is very heterogeneous. Such systems are difficult to investigate using conventional long-time scale techniques, and the fact that they tend to remain trapped in small regions of configuration space for extended periods of time strongly limits the physical insights gained from short simulations. We introduce a novel simulation technique, Parallel Trajectory Splicing (ParSplice), that aims at addressing this problem through the timewise parallelization of long trajectories. The computational efficiency of ParSplice stems from a speculation strategy whereby predictions of the future evolution of the system are leveraged to increase the amount of work that can be concurrently performed at any one time, hence improving the scalability of the method. ParSplice is also able to accurately account for, and potentially reuse, a substantial fraction of the computational work invested in the simulation. We validate the method on a simple Ag surface system and demonstrate substantial increases in efficiency compared to previous methods. As a result, we then demonstrate the power of ParSplice through the study of topology changes in Ag42Cu13 core–shell nanoparticles.

  9. Scalable Library for the Parallel Solution of Sparse Linear Systems

    Energy Science and Technology Software Center (OSTI)

    1993-07-14

    BlockSolve is a scalable parallel software library for the solution of large sparse, symmetric systems of linear equations. It runs on a variety of parallel architectures and can easily be ported to others. BlockSolve is primarily intended for the solution of sparse linear systems that arise from physical problems having multiple degrees of freedom at each node point. For example, when the finite element method is used to solve practical problems in structural engineering, each node will typically have anywhere from 3-6 degrees of freedom associated with it. BlockSolve is written to take advantage of problems of this nature; however, it is still reasonably efficient for problems that have only one degree of freedom associated with each node, such as the three-dimensional Poisson problem. It does not require that the matrices have any particular structure other than being sparse and symmetric. BlockSolve is intended to be used within real application codes. It is designed to work best in the context of our experience, which indicated that most application codes solve the same linear systems with several different right-hand sides and/or linear systems with the same structure, but different matrix values, multiple times.

  10. Long-time dynamics through parallel trajectory splicing

    SciTech Connect (OSTI)

    Perez, Danny; Cubuk, Ekin D.; Waterland, Amos; Kaxiras, Efthimios; Voter, Arthur F.

    2015-11-24

    Simulating the atomistic evolution of materials over long time scales is a longstanding challenge, especially for complex systems where the distribution of barrier heights is very heterogeneous. Such systems are difficult to investigate using conventional long-time scale techniques, and the fact that they tend to remain trapped in small regions of configuration space for extended periods of time strongly limits the physical insights gained from short simulations. We introduce a novel simulation technique, Parallel Trajectory Splicing (ParSplice), that aims at addressing this problem through the timewise parallelization of long trajectories. The computational efficiency of ParSplice stems from a speculation strategy whereby predictions of the future evolution of the system are leveraged to increase the amount of work that can be concurrently performed at any one time, hence improving the scalability of the method. ParSplice is also able to accurately account for, and potentially reuse, a substantial fraction of the computational work invested in the simulation. We validate the method on a simple Ag surface system and demonstrate substantial increases in efficiency compared to previous methods. As a result, we then demonstrate the power of ParSplice through the study of topology changes in Ag42Cu13 core–shell nanoparticles.

  11. PermWeb: remote parallel and distributed volume visualization

    SciTech Connect (OSTI)

    Wittenbrink, C.M.; Kim, K.; Story, J.; Pang, A.; Hollerbach, K.; Max, N.

    1997-01-01

    In this paper we present a system for visualizing volume data from remote supercomputers (PermWeb). We have developed both parallel volume rendering algorithms, and the World Wide Web software for accessing the data at the remote sites. The implementation uses Hypertext Markup Language (HTML), Java, and Common Gateway Interface (CGI) scripts to connect World Wide Web (WWW) servers/clients to our volume renderers. The front ends are interactive Java classes for specification of view, shading, and classification inputs. We present performance results, and implementation details for connections to our computing resources at the University of California Santa Cruz including a MasPar MP-2, SGI Reality Engine-RE2, and SGI Challenge machines. We apply the system to the task of visualizing trabecular bone from finite element simulations. Fast volume rendering on remote compute servers through a web interface allows us to increase the accessibility of the results to more users. User interface issues, overviews of parallel algorithm developments, and overall system interfaces and protocols are presented. Access is available through Uniform Resource Locator (URL) http://www.cse.ucsc.edu/research/slvg/. 26 refs., 7 figs.

  12. Energy Proportionality and Performance in Data Parallel Computing Clusters

    SciTech Connect (OSTI)

    Kim, Jinoh; Chou, Jerry; Rotem, Doron

    2011-02-14

    Energy consumption in datacenters has recently become a major concern due to the rising operational costs and scalability issues. Recent solutions to this problem propose the principle of energy proportionality, i.e., the amount of energy consumed by the server nodes must be proportional to the amount of work performed. For data parallelism and fault tolerance purposes, most common file systems used in MapReduce-type clusters maintain a set of replicas for each data block. A covering set is a group of nodes that together contain at least one replica of the data blocks needed for performing computing tasks. In this work, we develop and analyze algorithms to maintain energy proportionality by discovering a covering set that minimizes energy consumption while placing the remaining nodes in low-power standby mode. Our algorithms can also discover covering sets in heterogeneous computing environments. In order to allow more data parallelism, we generalize our algorithms so that they can discover k-covering sets, i.e., a set of nodes that contain at least k replicas of the data blocks. Our experimental results show that we can achieve substantial energy saving without significant performance loss in diverse cluster configurations and working environments.
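
    To make the covering-set definition concrete, here is a minimal Python sketch that finds one covering set with the standard greedy set-cover heuristic. The heuristic is an assumption used for illustration and is not claimed to be the authors' algorithm; the node names, block names, and function name are hypothetical, and energy cost per node is ignored.

```python
def greedy_covering_set(replicas: dict[str, set[str]], needed: set[str]) -> list[str]:
    """Pick nodes until every needed block has at least one replica online.

    replicas maps a node name to the set of data blocks it stores.
    Nodes left out of the result are candidates for standby mode.
    """
    uncovered = set(needed)
    chosen = []
    while uncovered:
        # Greedy step: take the node that covers the most uncovered blocks.
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(replicas), key=lambda node: len(replicas[node] & uncovered))
        gained = replicas[best] & uncovered
        if not gained:
            raise ValueError("some needed blocks have no replica on any node")
        chosen.append(best)
        uncovered -= gained
    return chosen
```

    For example, with four nodes holding blocks {a, b}, {b, c}, {c, d}, and {a}, the sketch keeps the first and third nodes active and leaves the other two free to enter standby.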

  13. Efficient VLSI networks for parallel processing based on orthogonal trees

    SciTech Connect (OSTI)

    Nath, D.; Maheshwari, S.N.; Bhatt, P.C.P.

    1983-06-01

    Two interconnection networks for parallel processing, namely the orthogonal trees network and the orthogonal tree cycles (OTN and OTC) are discussed. Both networks are suitable for VLSI implementation and have been analysed using Thompson's model of VLSI. While the OTN and OTC have time performances similar to fast networks such as the perfect shuffle network (PSN), the cube connected cycles (CCC), etc., they have substantially better area $\times$ time$^2$ performances for a number of matrix and graph problems. For instance, the connected components and a minimal spanning tree of an undirected n-vertex graph can be found in $O(\log^4 n)$ time on the OTC with an area $\times$ time$^2$ performance of $O(n^2 \log^8 n)$ and $O(n^2 \log^9 n)$ respectively. This is asymptotically much better than the performances of the CCC, PSN and MESH. The OTC and OTN can be looked upon as general purpose parallel processors since a number of other problems such as sorting and DFT can be solved on them with an area $\times$ time$^2$ performance matching that of other networks. Finally, programming the OTN and OTC is simple and they are also amenable to pipelining a series of problems. 33 references.

  14. Nonlinear parallel momentum transport in strong electrostatic turbulence

    SciTech Connect (OSTI)

    Wang, Lu; Wen, Tiliang; Diamond, P. H.

    2015-05-15

    Most existing theoretical studies of momentum transport focus on calculating the Reynolds stress based on quasilinear theory, without considering the nonlinear momentum flux $\langle \tilde{v}_r \tilde{n} \tilde{u}_\parallel \rangle$. However, a recent experiment on TORPEX found that the nonlinear toroidal momentum flux induced by blobs makes a significant contribution as compared to the Reynolds stress [Labit et al., Phys. Plasmas 18, 032308 (2011)]. In this work, the nonlinear parallel momentum flux in strong electrostatic turbulence is calculated by using a three dimensional Hasegawa-Mima equation, which is relevant for tokamak edge turbulence. It is shown that the nonlinear diffusivity is smaller than the quasilinear diffusivity from Reynolds stress. However, the leading order nonlinear residual stress can be comparable to the quasilinear residual stress, and so may be important to intrinsic rotation in tokamak edge plasmas. A key difference from the quasilinear residual stress is that parallel fluctuation spectrum asymmetry is not required for nonlinear residual stress.

  15. A Programming Model Performance Study Using the NAS Parallel Benchmarks

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Shan, Hongzhang; Blagojević, Filip; Min, Seung-Jai; Hargrove, Paul; Jin, Haoqiang; Fuerlinger, Karl; Koniges, Alice; Wright, Nicholas J.

    2010-01-01

    Harnessing the power of multicore platforms is challenging due to the additional levels of parallelism present. In this paper we use the NAS Parallel Benchmarks to study three programming models, MPI, OpenMP and PGAS to understand their performance and memory usage characteristics on current multicore architectures. To understand these characteristics we use the Integrated Performance Monitoring tool and other ways to measure communication versus computation time, as well as the fraction of the run time spent in OpenMP. The benchmarks are run on two different Cray XT5 systems and an Infiniband cluster. Our results show that in general the three programming models exhibit very similar performance characteristics. In a few cases, OpenMP is significantly faster because it explicitly avoids communication. For these particular cases, we were able to re-write the UPC versions and achieve equal performance to OpenMP. Using OpenMP was also the most advantageous in terms of memory usage. Also we compare performance differences between the two Cray systems, which have quad-core and hex-core processors. We show that at scale the performance is almost always slower on the hex-core system because of increased contention for network resources.

  16. Underwater power generator

    SciTech Connect (OSTI)

    Bowley, W.W.

    1983-05-10

    Apparatus and method for generating electrical power by disposing a plurality of power producing modules in a substantially constant velocity ocean current and mechanically coupling the output of the modules to drive a single electrical generator is disclosed.

  17. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing

    SciTech Connect (OSTI)

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K.; Culley, David E.; Collett, James R.; Martin, Joel X.; Druzhinina, Irina S.; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P.; Baker, Scott E.; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels, such as ethanol, and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single nucleotide variants, 15 small deletions or insertions and 18 larger deletions leading to the loss of more than 100 kb of genomic DNA. From these events we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild type strain QM6a. Our analysis provides the first genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus, and suggests new areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.

  18. Motor/generator

    DOE Patents [OSTI]

    Hickam, Christopher Dale (Glasford, IL)

    2008-05-13

    A motor/generator is provided for connecting between a transmission input shaft and an output shaft of a prime mover. The motor/generator may include a motor/generator housing, a stator mounted to the motor/generator housing, a rotor mounted at least partially within the motor/generator housing and rotatable about a rotor rotation axis, and a transmission-shaft coupler drivingly coupled to the rotor. The transmission-shaft coupler may include a clamp, which may include a base attached to the rotor and a plurality of adjustable jaws.

  19. Aggregating job exit statuses of a plurality of compute nodes executing a parallel application

    DOE Patents [OSTI]

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Mundy, Michael B.

    2015-07-21

    Aggregating job exit statuses of a plurality of compute nodes executing a parallel application, including: identifying a subset of compute nodes in the parallel computer to execute the parallel application; selecting one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; initiating execution of the parallel application on the subset of compute nodes; receiving an exit status from each compute node in the subset of compute nodes, where the exit status for each compute node includes information describing execution of some portion of the parallel application by the compute node; aggregating each exit status from each compute node in the subset of compute nodes; and sending an aggregated exit status for the subset of compute nodes in the parallel computer.
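
    The aggregation step in this abstract can be pictured with a short Python sketch of what a job leader might compute. The abstract does not say how statuses are combined, so the rule used here (report the highest exit code and list the nodes that failed) and all names are assumptions for illustration only.

```python
def aggregate_exit_statuses(statuses: dict[int, int]) -> dict:
    """Combine per-node exit codes into one summary (assumed combining rule).

    statuses maps a compute node rank to the exit code it reported.
    """
    failed = sorted(rank for rank, code in statuses.items() if code != 0)
    return {
        "nodes_reporting": len(statuses),
        "failed_nodes": failed,
        # Assumed convention: the job's exit code is the worst one reported.
        "exit_code": max(statuses.values(), default=0),
    }
```

    Sending one summary like this instead of every individual status keeps the message count independent of the number of compute nodes.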

  20. The parallel I/O architecture of the High Performance Storage System (HPSS)

    SciTech Connect (OSTI)

    Watson, R.W.; Coyne, R.A.

    1995-02-01

    Rapid improvements in computational science, processing capability, main memory sizes, data collection devices, multimedia capabilities and integration of enterprise data are producing very large datasets (10s-100s of gigabytes to terabytes). This rapid growth of data has resulted in a serious imbalance in I/O and storage system performance and functionality. One promising approach to restoring balanced I/O and storage system performance is use of parallel data transfer techniques for client access to storage, device-to-device transfers, and remote file transfers. This paper describes the parallel I/O architecture and mechanisms, Parallel Transport Protocol, parallel FTP, and parallel client Application Programming Interface (API) used by the High Performance Storage System (HPSS). Parallel storage integration issues with a local parallel file system are also discussed.

  1. Adding Data Management Services to Parallel File Systems

    SciTech Connect (OSTI)

    Brandt, Scott

    2015-03-04

    The objective of this project, called DAMASC for “Data Management in Scientific Computing”, is to coalesce data management with parallel file system management to present a declarative interface to scientists for managing, querying, and analyzing extremely large data sets efficiently and predictably. Managing extremely large data sets is a key challenge of exascale computing. The overhead, energy, and cost of moving massive volumes of data demand designs where computation is close to storage. In current architectures, compute/analysis clusters access data in a physically separate parallel file system and largely leave it to the scientist to reduce data movement. Over the past decades the high-end computing community has adopted middleware with multiple layers of abstractions and specialized file formats such as NetCDF-4 and HDF5. These abstractions provide a limited set of high-level data processing functions, but have inherent functionality and performance limitations: middleware that provides access to the highly structured contents of scientific data files stored in the (unstructured) file systems can only optimize to the extent that file system interfaces permit; the highly structured formats of these files often impede native file system performance optimizations. We are developing Damasc, an enhanced high-performance file system with native rich data management services. Damasc will enable efficient queries and updates over files stored in their native byte-stream format while retaining the inherent performance of file system data storage via declarative queries and updates over views of underlying files.
Damasc has four key benefits for the development of data-intensive scientific code: (1) applications can use important data-management services, such as declarative queries, views, and provenance tracking, that are currently available only within database systems; (2) the use of these services becomes easier, as they are provided within a familiar file-based ecosystem; (3) common optimizations, e.g., indexing and caching, are readily supported across several file formats, avoiding effort duplication; and (4) performance improves significantly, as data processing is integrated more tightly with data storage. Our key contributions are: SciHadoop, which explores changes to MapReduce assumptions by taking advantage of the semantics of structured data while preserving MapReduce’s failure and resource management; DataMods, which extends common abstractions of parallel file systems so they become programmable, such that they can be extended to natively support a variety of data models and can be hooked into emerging distributed runtimes such as Stanford’s Legion; and Miso, which combines Hadoop and relational data warehousing to minimize time to insight, taking into account the overhead of ingesting data into data warehousing.

  2. Solar thermoelectric generator

    DOE Patents [OSTI]

    Toberer, Eric S.; Baranowski, Lauryn L.; Warren, Emily L.

    2016-05-03

    Solar thermoelectric generators (STEGs) are solid state heat engines that generate electricity from concentrated sunlight. A novel detailed balance model for STEGs is provided and applied to both state-of-the-art and idealized materials. STEGs can produce electricity by using sunlight to heat one side of a thermoelectric generator. While concentrated sunlight can be used to achieve extremely high temperatures (and thus improved generator efficiency), the solar absorber also emits a significant amount of black body radiation. This emitted light is the dominant loss mechanism in these generators. In this invention, we propose a solution to this problem that eliminates virtually all of the emitted black body radiation. This enables solar thermoelectric generators to operate at higher efficiency and achieve said efficiency with lower levels of optical concentration. The solution is suitable for both single and dual axis solar thermoelectric generators.

  3. Update on Development of Mesh Generation Algorithms in MeshKit

    SciTech Connect (OSTI)

    Jain, Rajeev; Vanderzee, Evan; Mahadevan, Vijay

    2015-09-30

    MeshKit uses a graph-based design for coding all its meshing algorithms, which includes the Reactor Geometry (and mesh) Generation (RGG) algorithms. This report highlights the developmental updates of all the algorithms, results and future work. Parallel versions of algorithms, documentation and performance results are reported. RGG GUI design was updated to incorporate new features requested by the users; boundary layer generation and parallel RGG support were added to the GUI. Key contributions to the release, upgrade and maintenance of other SIGMA1 libraries (CGM and MOAB) were made. Several fundamental meshing algorithms for creating a robust parallel meshing pipeline in MeshKit are under development. Results and current status of automated, open-source and high quality nuclear reactor assembly mesh generation algorithms such as trimesher, quadmesher, interval matching and multi-sweeper are reported.

  4. Runtime optimization of an application executing on a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A; Smith, Brian E

    2014-11-25

    Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.
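
    The branching in this abstract reduces to a small decision function. The Python sketch below mirrors only the conditions stated in the abstract; the function name and the representation of call sites as strings are assumptions for illustration.

```python
def should_establish_tuning_session(is_root_based: bool, call_sites: list[str]) -> bool:
    """Decide whether a collective operation runs inside a tuning session.

    call_sites holds the call site each compute node identified for the
    collective operation (one entry per node; string labels are assumed).
    """
    if not is_root_based:
        # Non-root-based collectives are always tuned.
        return True
    # Root-based collectives are tuned only if every node agrees on the call site.
    return len(set(call_sites)) == 1
```

    When the function returns False, the collective operation still executes, just without a tuning session.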

  5. Runtime optimization of an application executing on a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A; Smith, Brian E

    2014-11-18

    Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.

  6. Tracking moving radar targets with parallel, velocity-tuned filters

    DOE Patents [OSTI]

    Bickel, Douglas L.; Harmony, David W.; Bielek, Timothy P.; Hollowell, Jeff A.; Murray, Margaret S.; Martinez, Ana

    2013-04-30

    Radar data associated with radar illumination of a movable target is processed to monitor motion of the target. A plurality of filter operations are performed in parallel on the radar data so that each filter operation produces target image information. The filter operations are defined to have respectively corresponding velocity ranges that differ from one another. The target image information produced by one of the filter operations represents the target more accurately than the target image information produced by the remainder of the filter operations when a current velocity of the target is within the velocity range associated with the one filter operation. In response to the current velocity of the target being within the velocity range associated with the one filter operation, motion of the target is tracked based on the target image information produced by the one filter operation.
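
    The selection step at the end of this abstract can be sketched in Python. The filter names, the velocity ranges, and the half-open range convention are assumptions for illustration; the radar filtering itself is not modeled, only the choice of which parallel filter's output to track with.

```python
# Hypothetical bank of filters: (name, low, high) velocity ranges in m/s.
FILTER_BANK = [
    ("slow", 0.0, 5.0),
    ("medium", 5.0, 15.0),
    ("fast", 15.0, 30.0),
]


def select_filter(velocity, filter_bank=FILTER_BANK):
    """Return the name of the filter whose [low, high) range contains velocity."""
    for name, low, high in filter_bank:
        if low <= velocity < high:
            return name
    return None  # no filter in the bank is tuned to this velocity


def tracking_image(velocity, filter_outputs):
    """Pick the image information to track with.

    filter_outputs maps each filter name to the image information that
    filter produced; all filters are assumed to have run in parallel already.
    """
    name = select_filter(velocity)
    return None if name is None else filter_outputs[name]
```

    Every filter produces output on every pass; the current velocity only decides which output the tracker trusts.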

  7. Final Report: Super Instruction Architecture for Scalable Parallel Computations

    SciTech Connect (OSTI)

    Sanders, Beverly Ann; Bartlett, Rodney; Deumens, Erik

    2013-12-23

    The most advanced methods for reliable and accurate computation of the electronic structure of molecular and nano systems are the coupled-cluster techniques. These high-accuracy methods help us to understand, for example, how biological enzymes operate and contribute to the design of new organic explosives. The ACES III software provides a modern, high-performance implementation of these methods optimized for high-performance parallel computer systems, ranging from small clusters typical in individual research groups, through larger clusters available in campus and regional computer centers, all the way to high-end petascale systems at national labs, including exploiting GPUs if available. This project enhanced the ACES III software package and used it to study interesting scientific problems.

  8. Design of dynamic load-balancing tools for parallel applications

    SciTech Connect (OSTI)

    Devine, K.D.; Hendrickson, B.A.; Boman, E.G.; St. John, M.; Vaughan, C.T.

    2000-01-03

    The design of general-purpose dynamic load-balancing tools for parallel applications is more challenging than the design of static partitioning tools. Both algorithmic and software engineering issues arise. The authors have addressed many of these issues in the design of the Zoltan dynamic load-balancing library. Zoltan has an object-oriented interface that makes it easy to use and provides separation between the application and the load-balancing algorithms. It contains a suite of dynamic load-balancing algorithms, including both geometric and graph-based algorithms. Its design makes it valuable both as a partitioning tool for a variety of applications and as a research test-bed for new algorithmic development. In this paper, the authors describe Zoltan's design and demonstrate its use in an unstructured-mesh finite element application.

  9. Optimized collectives using a DMA on a parallel computer

    DOE Patents [OSTI]

    Chen, Dong; Gabor, Dozsa; Giampapa, Mark E.; Heidelberger, Philip

    2011-02-08

    Optimizing collective operations using direct memory access controller on a parallel computer, in one aspect, may comprise establishing a byte counter associated with a direct memory access controller for each submessage in a message. The byte counter includes at least a base address of memory and a byte count associated with a submessage. A byte counter associated with a submessage is monitored to determine whether at least a block of data of the submessage has been received. The block of data has a predetermined size, for example, a number of bytes. The block is processed when the block has been fully received, for example, when the byte count indicates all bytes of the block have been received. The monitoring and processing may continue for all blocks in all submessages in the message.
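    The block-by-block monitoring described above might look like the following sketch. The function name `ready_blocks` and its parameters are hypothetical, and the sketch assumes the byte count rises as data arrives; the abstract does not say which direction the counter runs.

```python
from typing import List


def ready_blocks(total_bytes: int, bytes_received: int,
                 block_size: int, blocks_done: int) -> List[int]:
    """Return indices of blocks that are fully received but not yet processed.

    total_bytes    -- size of the submessage
    bytes_received -- value read from the submessage's byte counter (assumed to count up)
    block_size     -- predetermined block size in bytes
    blocks_done    -- number of blocks already processed
    """
    full = bytes_received // block_size
    if bytes_received >= total_bytes:
        # The final block may be shorter than block_size; count it once
        # the whole submessage has arrived (ceiling division).
        full = -(-total_bytes // block_size)
    return list(range(blocks_done, full))
```

    A caller would poll the counter of each submessage, process the returned blocks, and repeat until every block of every submessage has been handled.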

  10. Microchannel cross load array with dense parallel input

    DOE Patents [OSTI]

    Swierkowski, Stefan P.

    2004-04-06

    An architecture or layout for microchannel arrays using T or Cross (+) loading for electrophoresis or other injection and separation chemistry that are performed in microfluidic configurations. This architecture enables a very dense layout of arrays of functionally identical shaped channels and it also solves the problem of simultaneously enabling efficient parallel shapes and biasing of the input wells, waste wells, and bias wells at the input end of the separation columns. One T load architecture uses circular holes with common rows, but not columns, which allows the flow paths for each channel to be identical in shape, using multiple mirror image pieces. Another T load architecture enables the access hole array to be formed on a biaxial, collinear grid suitable for EDM micromachining (square holes), with common rows and columns.

  11. Runtime optimization of an application executing on a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.; Smith, Brian E.

    2013-01-29

    Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.
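    The branching in this abstract reduces to one small predicate. The sketch below is only an illustration: `should_tune` and its arguments are hypothetical names, and the call sites are modeled as plain values reported by each compute node.

```python
from typing import Hashable, Sequence


def should_tune(is_root_based: bool, call_sites: Sequence[Hashable]) -> bool:
    """Decide whether to run the collective operation in a tuning session.

    is_root_based -- True if the collective has a root (for example, a broadcast)
    call_sites    -- the call site each compute node identified for the collective
    """
    if not is_root_based:
        # Collectives that are not root-based are always tuned.
        return True
    # Root-based collectives are tuned only when every node agrees on the call site.
    return len(set(call_sites)) == 1
```

    For example, `should_tune(True, ["a.c:10", "a.c:10"])` selects a tuning session, while a mismatch between nodes falls back to executing the collective without one.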

  12. DMA shared byte counters in a parallel computer

    DOE Patents [OSTI]

    Chen, Dong; Gara, Alan G.; Heidelberger, Philip; Vranas, Pavlos

    2010-04-06

    A parallel computer system is constructed as a network of interconnected compute nodes. Each of the compute nodes includes at least one processor, a memory and a DMA engine. The DMA engine includes a processor interface for interfacing with the at least one processor, DMA logic, a memory interface for interfacing with the memory, a DMA network interface for interfacing with the network, injection and reception byte counters, injection and reception FIFO metadata, and status registers and control registers. The injection FIFOs maintain memory locations of the injection FIFO metadata memory locations including its current head and tail, and the reception FIFOs maintain the reception FIFO metadata memory locations including its current head and tail. The injection byte counters and reception byte counters may be shared between messages.

  13. Nonlocal microscopic theory of quantum friction between parallel metallic slabs

    SciTech Connect (OSTI)

    Despoja, Vito

    2011-05-15

    We present a new derivation of the friction force between two metallic slabs moving with constant relative parallel velocity, based on T=0 quantum-field theory formalism. By including a fully nonlocal description of dynamically screened electron fluctuations in the slab, and avoiding the usual matching-condition procedure, we generalize previous expressions for the friction force, to which our results reduce in the local limit. Analyzing the friction force calculated in the two local models and in the nonlocal theory, we show that for physically relevant velocities local theories using the plasmon and Drude models of dielectric response are inappropriate to describe friction, which is due to excitation of low-energy electron-hole pairs, which are properly included in nonlocal theory. We also show that inclusion of dissipation in the nonlocal electronic response has negligible influence on friction.

  14. Determining collective barrier operation skew in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2015-11-24

    Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.
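    The final skew calculation is simple enough to show directly. In this sketch, `barrier_skew` is a hypothetical helper that assumes the per-node barrier completion times have already been measured, one for each node's turn as the delayed node.

```python
from typing import Mapping


def barrier_skew(completion_times: Mapping[int, float]) -> float:
    """Skew = maximum barrier completion time minus the minimum.

    completion_times maps a node id to the completion time measured
    when that node was the delayed node.
    """
    if not completion_times:
        raise ValueError("at least one completion time is required")
    times = completion_times.values()
    return max(times) - min(times)
```

    With completion times of 12.0, 15.5, and 11.0 (in any consistent time unit), the skew is 15.5 - 11.0 = 4.5.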

  15. Determining collective barrier operation skew in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2015-12-24

    Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.

  16. Stochastic PArallel Rarefied-gas Time-accurate Analyzer

    Energy Science and Technology Software Center (OSTI)

    2014-01-24

    The SPARTA package is software for simulating low-density fluids via the Direct Simulation Monte Carlo (DSMC) method, which is a particle-based method for tracking particle trajectories and collisions as a model of a multi-species gas. The main component of SPARTA is a simulation code which allows the user to specify a simulation domain, populate it with particles, embed triangulated surfaces as boundary conditions for the flow, overlay a grid for finding pairs of collision partners, and evolve the system in time via explicit timestepping. The package also includes various pre- and post-processing tools, useful for setting up simulations and analyzing the results. The simulation code runs either in serial on a single processor or desktop machine, or can be run in parallel using the MPI message-passing library, to enable faster performance on large problems.

  17. Synchronizing compute node time bases in a parallel computer

    DOE Patents [OSTI]

    Chen, Dong; Faraj, Daniel A; Gooding, Thomas M; Heidelberger, Philip

    2015-01-27

    Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.

  18. Synchronizing compute node time bases in a parallel computer

    DOE Patents [OSTI]

    Chen, Dong; Faraj, Daniel A; Gooding, Thomas M; Heidelberger, Philip

    2014-12-30

    Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.
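    One way to read the latency step: if the pulse reaches a node a known time after the root sends it, setting that node's time base to the latency makes every node agree on when the pulse left the root. The sketch below computes root-to-node latencies in a tree. It assumes latency adds up link by link, which the abstract does not state, and `root_latencies` and its arguments are hypothetical names.

```python
from typing import Dict


def root_latencies(parent: Dict[int, int],
                   link_latency: Dict[int, int],
                   root: int) -> Dict[int, int]:
    """Latency from the root to every node of a tree network.

    parent       -- maps each non-root node to its parent
    link_latency -- maps each non-root node to the latency of the link from its parent
    """
    latency = {root: 0}

    def resolve(node: int) -> int:
        # Assumed model: a node's latency is its parent's latency plus one link.
        if node not in latency:
            latency[node] = resolve(parent[node]) + link_latency[node]
        return latency[node]

    for node in parent:
        resolve(node)
    return latency


def time_bases_on_pulse(latency: Dict[int, int]) -> Dict[int, int]:
    """On receiving the pulse, each node sets its time base to its own latency."""
    return dict(latency)
```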

  19. Development of parallel DEM for the open source code MFIX

    SciTech Connect (OSTI)

    Gopalakrishnan, Pradeep; Tafti, Danesh

    2013-02-01

    The paper presents the development of a parallel Discrete Element Method (DEM) solver for the open source code, Multiphase Flow with Interphase eXchange (MFIX) based on the domain decomposition method. The performance of the code was evaluated by simulating a bubbling fluidized bed with 2.5 million particles. The DEM solver shows strong scalability up to 256 processors with an efficiency of 81%. Further, to analyze weak scaling, the static height of the fluidized bed was increased to hold 5 and 10 million particles. The results show that global communication cost increases with problem size while the computational cost remains constant. Further, the effects of static bed height on the bubble hydrodynamics and mixing characteristics are analyzed.
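    To put the reported efficiency in perspective, parallel efficiency is conventionally speedup divided by processor count. The sketch below inverts that relation. The abstract does not state the baseline run, so the single-processor baseline here is an assumption, and `speedup_from_efficiency` is a hypothetical helper.

```python
def speedup_from_efficiency(efficiency: float, processors: int,
                            baseline_processors: int = 1) -> float:
    """Speedup implied by a strong-scaling efficiency.

    efficiency = speedup / (processors / baseline_processors)
    The baseline of one processor is an assumption, not a figure from the paper.
    """
    return efficiency * processors / baseline_processors


# Under that assumption, 81% efficiency on 256 processors implies a speedup of about 207.
implied = speedup_from_efficiency(0.81, 256)
```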

  20. Parallel Access of Out-Of-Core Dense Extendible Arrays

    SciTech Connect (OSTI)

    Otoo, Ekow J; Rotem, Doron

    2007-07-26

    Datasets used in scientific and engineering applications are often modeled as dense multi-dimensional arrays. For very large datasets, the corresponding array models are typically stored out-of-core as array files. The array elements are mapped onto linear consecutive locations that correspond to the linear ordering of the multi-dimensional indices. Two conventional mappings used are the row-major order and the column-major order of multi-dimensional arrays. Such conventional mappings of dense array files severely limit the performance of applications and the extendibility of the dataset. Firstly, an array file that is organized in, say, row-major order causes applications that subsequently access the data in column-major order to have abysmal performance. Secondly, any subsequent expansion of the array file is limited to only one dimension. Expansions of such out-of-core conventional arrays along arbitrary dimensions require storage reorganization that can be very expensive. We present a solution for storing out-of-core dense extendible arrays that resolves the two limitations. The method uses a mapping function F*(), together with information maintained in axial vectors, to compute the linear address of an extendible array element when passed its k-dimensional index. We also give the inverse function, F^-1*(), for deriving the k-dimensional index when given the linear address. We show how the mapping function, in combination with MPI-IO and a parallel file system, allows for the growth of the extendible array without reorganization and with no significant performance degradation of applications accessing elements in any desired order. We give methods for reading and writing sub-arrays into and out of parallel applications that run on a cluster of workstations. The axial vectors are replicated and maintained in each node that accesses sub-array elements.
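    The two conventional mappings mentioned in this abstract can be written out as follows. This sketch covers only the standard row-major and column-major address calculations; it does not attempt the paper's extendible mapping F*(), whose definition is not given in the abstract. The function names are illustrative.

```python
from typing import Sequence


def row_major_index(index: Sequence[int], shape: Sequence[int]) -> int:
    """Linear address of a k-dimensional index; the last dimension varies fastest."""
    addr = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of bounds")
        addr = addr * n + i
    return addr


def column_major_index(index: Sequence[int], shape: Sequence[int]) -> int:
    """Linear address of a k-dimensional index; the first dimension varies fastest."""
    addr = 0
    for i, n in zip(reversed(index), reversed(shape)):
        if not 0 <= i < n:
            raise IndexError("index out of bounds")
        addr = addr * n + i
    return addr
```

    In the row-major formula the extent of the first dimension never enters the address, so the array can grow along that one dimension without moving existing elements; growth along any other dimension changes every address, which is the reorganization cost the abstract refers to.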

  1. RGG: Reactor geometry (and mesh) generator

    SciTech Connect (OSTI)

    Jain, R.; Tautges, T.

    2012-07-01

    The reactor geometry (and mesh) generator RGG takes advantage of information about repeated structures in both assembly and core lattices to simplify the creation of geometry and mesh. It is released as open source software as a part of the MeshKit mesh generation library. The methodology operates in three stages. First, assembly geometry models of various types are generated by a tool called AssyGen. Next, the assembly model or models are meshed by using MeshKit tools or the CUBIT mesh generation tool-kit, optionally based on a journal file output by AssyGen. After one or more assembly model meshes have been constructed, a tool called CoreGen uses a copy/move/merge process to arrange the model meshes into a core model. In this paper, we present the current state of tools and new features in RGG. We also discuss the parallel-enabled CoreGen, which in several cases achieves super-linear speedups since the problems fit in available RAM at higher processor counts. Several RGG applications - 1/6 VHTR model, 1/4 PWR reactor core, and a full-core model for Monju - are reported. (authors)

  2. Table 14a. Average Electricity Prices, Projected vs. Actual

    U.S. Energy Information Administration (EIA) Indexed Site

    a. Average Electricity Prices, Projected vs. Actual. Projected Price in Constant Dollars (constant dollars, cents per kilowatt-hour in "dollar year" specific to each AEO) ...

  3. Tax Credits, Rebates & Savings | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Boilers, Heat Pumps, Programmable Thermostats, Other EE Orcas Power & Light- MORE Green Power Program Incentive payments will be paid per kilowatt hour (kWh) of production,...

  4. Tax Credits, Rebates & Savings | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Insulation, Windows, Motor VFDs, Comprehensive Measures/Whole Building, Other EE Orcas Power & Light- MORE Green Power Program Incentive payments will be paid per kilowatt hour...

  5. NREL: Concentrating Solar Power Research - Southwest Concentrating...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    of deployment, combined with research and development to reduce technology component costs, could help reduce concentrating solar power electricity costs to $0.07/kilowatt-hour. ...

  6. System Advisor Model (SAM) | Open Energy Information

    Open Energy Info (EERE)

    total electricity production in kilowatt-hours for the first year based on hourly weather data for a particular location, and physical specifications of the power system...

  7. Method of grid generation

    DOE Patents [OSTI]

    Barnette, Daniel W.

    2002-01-01

    The present invention provides a method of grid generation that uses the geometry of the problem space and the governing relations to generate a grid. The method can generate a grid with minimized discretization errors, and with minimal user interaction. The method of the present invention comprises assigning grid cell locations so that, when the governing relations are discretized using the grid, at least some of the discretization errors are substantially zero. Conventional grid generation is driven by the problem space geometry; grid generation according to the present invention is driven by problem space geometry and by governing relations. The present invention accordingly can provide two significant benefits: more efficient and accurate modeling since discretization errors are minimized, and reduced cost grid generation since less human interaction is required.

  8. Steam generator support system

    DOE Patents [OSTI]

    Moldenhauer, J.E.

    1987-08-25

    A support system for connection to an outer surface of a J-shaped steam generator for use with a nuclear reactor or other liquid metal cooled power source is disclosed. The J-shaped steam generator is mounted with the bent portion at the bottom. An arrangement of elongated rod members provides both horizontal and vertical support for the steam generator. The rod members are interconnected to the steam generator assembly and a support structure in a manner which provides for thermal distortion of the steam generator without the transfer of bending moments to the support structure and in a like manner substantially minimizes forces being transferred between the support structure and the steam generator as a result of seismic disturbances. 4 figs.

  9. Steam generator support system

    DOE Patents [OSTI]

    Moldenhauer, James E.

    1987-01-01

    A support system for connection to an outer surface of a J-shaped steam generator for use with a nuclear reactor or other liquid metal cooled power source. The J-shaped steam generator is mounted with the bent portion at the bottom. An arrangement of elongated rod members provides both horizontal and vertical support for the steam generator. The rod members are interconnected to the steam generator assembly and a support structure in a manner which provides for thermal distortion of the steam generator without the transfer of bending moments to the support structure and in a like manner substantially minimizes forces being transferred between the support structure and the steam generator as a result of seismic disturbances.

  10. Generation | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Southeastern’s Power Operations employees perform the tasks of declaring, scheduling, dispatching, and accounting for capacity and energy generated at the 22 hydroelectric projects in the agency’s 11-state marketing area. Southeastern has Certified System Operators, meeting the criteria set forth by the North American Electric Reliability Corporation.

  11. Next Generation Materials:

    Energy Savers [EERE]

    Next Generation Manufacturing Processes: New process technologies can rejuvenate U.S. manufacturing. Novel processing concepts can open pathways to double net energy productivity, enabling rapid manufacture of energy-efficient, high-quality products at competitive cost. Four process technology areas are expected to generate large energy, carbon, and economic benefits across the manufacturing sector.

  12. Distributed generation hits market

    SciTech Connect (OSTI)

    1997-10-01

    The pace at which vendors are developing and marketing gas turbines and reciprocating engines for small-scale applications may signal the widespread growth of distributed generation. Loosely defined to refer to applications in which power generation equipment is located close to end users who have near-term power capacity needs, distributed generation encompasses a broad range of technologies and load requirements. Disagreement is inevitable, but many industry observers associate distributed generation with applications anywhere from 25 kW to 25 MW. Ten years ago, distributed generation users only represented about 2% of the world market. Today, that figure has increased to about 4 or 5%, and probably could settle in the 20% range within a 3-to-5-year period, according to Michael Jones, San Diego, Calif.-based Solar Turbines Inc. power generation marketing manager. The US Energy Information Administration predicts about 175 GW of generation capacity will be added domestically by 2010. If 20% comes from smaller plants, distributed generation could account for about 35 GW. Even with more competition, it's highly unlikely distributed generation will totally replace current market structures and central stations. Distributed generation may be best suited for making market inroads when and where central systems need upgrading, and should prove its worth when the system can't handle peak demands. Typical applications include small reciprocating engine generators at remote customer sites or larger gas turbines to boost the grid. Additional market opportunities include standby capacity, peak shaving, power quality, cogeneration and capacity rental for immediate demand requirements. Integration of distributed generation systems--using gas-fueled engines, gas-fired combustion engines and fuel cells--can upgrade power quality for customers and reduce operating costs for electric utilities.

  13. Isolated trigger pulse generator

    DOE Patents [OSTI]

    Aaland, Kristian (Livermore, CA)

    1980-02-19

    A trigger pulse generation system capable of delivering a multiplicity of isolated 100 kV trigger pulses with picosecond simultaneity.

  14. Renewable Electricity Generation

    SciTech Connect (OSTI)

    2012-09-01

    This document highlights DOE's Office of Energy Efficiency and Renewable Energy's advancements in renewable electricity generation technologies including solar, water, wind, and geothermal.

  15. Isolated trigger pulse generator

    DOE Patents [OSTI]

    Aaland, K.

    1980-02-19

    A trigger pulse generation system capable of delivering a multiplicity of isolated 100 kV trigger pulses with picosecond simultaneity. 2 figs.

  16. Thermophotovoltaic energy generation

    DOE Patents [OSTI]

    Celanovic, Ivan; Chan, Walker; Bermel, Peter; Yeng, Adrian Y. X.; Marton, Christopher; Ghebrebrhan, Michael; Araghchini, Mohammad; Jensen, Klavs F.; Soljacic, Marin; Joannopoulos, John D.; Johnson, Steven G.; Pilawa-Podgurski, Robert; Fisher, Peter

    2015-08-25

    Inventive systems and methods for the generation of energy using thermophotovoltaic cells are described. Also described are systems and methods for selectively emitting electromagnetic radiation from an emitter for use in thermophotovoltaic energy generation systems. In at least some of the inventive energy generation systems and methods, a voltage applied to the thermophotovoltaic cell (e.g., to enhance the power produced by the cell) can be adjusted to enhance system performance. Certain embodiments of the systems and methods described herein can be used to generate energy relatively efficiently.

  17. SNE TRAFFIC GENERATOR

    Energy Science and Technology Software Center (OSTI)

    003027MLTPL00 Network Traffic Generator for Low-rate Small Network Equipment Software http://eln.lbl.gov/sne_traffic_gen.html

  18. " Generation, by Program Sponsorship...

    U.S. Energy Information Administration (EIA) Indexed Site

    by Total Inputs of Energy for Heat, Power, and Electricity Generation, by Program Sponsorship, Industry Group, Selected Industries, and Type of Energy-Management Program, ...

  19. " Generation by Program Sponsorship...

    U.S. Energy Information Administration (EIA) Indexed Site

    A49. Total Inputs of Energy for Heat, Power, and Electricity Generation by Program Sponsorship, Industry Group, Selected Industries, and Type of Energy-Management Program, ...

  20. NEGATIVE GATE GENERATOR

    DOE Patents [OSTI]

    Jones, C.S.; Eaton, T.E.

    1958-02-01

    This patent relates to pulse generating circuits and more particularly to rectangular pulse generators. The pulse generator of the present invention incorporates thyratrons as switching elements to discharge a first capacitor through a load resistor to initiate and provide the body of a pulse, and subsequently discharge a second capacitor to impress the potential of its charge, with opposite potential polarity, across the load resistor to terminate the pulse. Accurate rectangular pulses in the millimicrosecond range are produced across a low impedance by this generator.

  1. Talkin Bout Wind Generation

    Office of Energy Efficiency and Renewable Energy (EERE)

    The amount of electricity generated by the wind industry started to grow back around 1999, and since 2007 has been increasing at a rapid pace.

  2. Parallel processing data network of master and slave transputers controlled by a serial control network

    DOE Patents [OSTI]

    Crosetto, D.B.

    1996-12-31

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor to a plurality of slave processors to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer, a digital signal processor, a parallel transfer controller, and two three-port memory devices. A communication switch within each node connects it to a fast parallel hardware channel through which all high density data arrives or leaves the node. 6 figs.

  3. Parallel processing data network of master and slave transputers controlled by a serial control network

    DOE Patents [OSTI]

    Crosetto, Dario B.

    1996-01-01

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200) to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high density data arrives or leaves the node.

  4. Laser beam generating apparatus

    DOE Patents [OSTI]

    Warner, Bruce E. (Livermore, CA); Duncan, David B. (Auburn, CA)

    1994-01-01

    Laser beam generating apparatus including a septum segment disposed longitudinally within the tubular structure of the apparatus. The septum provides for radiatively dissipating heat buildup within the tubular structure and for generating relatively uniform laser beam pulses so as to minimize or eliminate radial pulse delays (the chevron effect).

  5. Laser beam generating apparatus

    DOE Patents [OSTI]

    Warner, Bruce E. (Livermore, CA); Duncan, David B. (Auburn, CA)

    1993-01-01

    Laser beam generating apparatus including a septum segment disposed longitudinally within the tubular structure of the apparatus. The septum provides for radiatively dissipating heat buildup within the tubular structure and for generating relatively uniform laser beam pulses so as to minimize or eliminate radial pulse delays (the chevron effect).

  6. Internal split field generator

    DOE Patents [OSTI]

    Thundat, Thomas George (Knoxville, TN); Van Neste, Charles W. (Kingston, TN); Vass, Arpad Alexander (Oak Ridge, TN)

    2012-01-03

    A generator includes a coil of conductive material. A stationary magnetic field source applies a stationary magnetic field to the coil. An internal magnetic field source is disposed within a cavity of the coil to apply a moving magnetic field to the coil. The stationary magnetic field interacts with the moving magnetic field to generate an electrical energy in the coil.

  7. Solid aerosol generator

    DOE Patents [OSTI]

    Prescott, Donald S.; Schober, Robert K.; Beller, John

    1992-01-01

    An improved solid aerosol generator used to produce a gas borne stream of dry, solid particles of predetermined size and concentration. The improved solid aerosol generator nebulizes a feed solution of known concentration with a flow of preheated gas and dries the resultant wet heated aerosol in a grounded, conical heating chamber, achieving high recovery and flow rates.

  8. Improved solid aerosol generator

    DOE Patents [OSTI]

    Prescott, D.S.; Schober, R.K.; Beller, J.

    1988-07-19

    An improved solid aerosol generator used to produce a gas borne stream of dry, solid particles of predetermined size and concentration. The improved solid aerosol generator nebulizes a feed solution of known concentration with a flow of preheated gas and dries the resultant wet heated aerosol in a grounded, conical heating chamber, achieving high recovery and flow rates. 2 figs.

  9. Solid aerosol generator

    DOE Patents [OSTI]

    Prescott, D.S.; Schober, R.K.; Beller, J.

    1992-03-17

    An improved solid aerosol generator used to produce a gas borne stream of dry, solid particles of predetermined size and concentration is disclosed. The improved solid aerosol generator nebulizes a feed solution of known concentration with a flow of preheated gas and dries the resultant wet heated aerosol in a grounded, conical heating chamber, achieving high recovery and flow rates. 2 figs.

  10. Laser beam generating apparatus

    DOE Patents [OSTI]

    Warner, B.E.; Duncan, D.B.

    1994-02-15

    Laser beam generating apparatus including a septum segment disposed longitudinally within the tubular structure of the apparatus is described. The septum provides for radiatively dissipating heat buildup within the tubular structure and for generating relatively uniform laser beam pulses so as to minimize or eliminate radial pulse delays (the chevron effect). 7 figures.

  11. Laser beam generating apparatus

    DOE Patents [OSTI]

    Warner, B.E.; Duncan, D.B.

    1993-12-28

    Laser beam generating apparatus including a septum segment disposed longitudinally within the tubular structure of the apparatus. The septum provides for radiatively dissipating heat buildup within the tubular structure and for generating relatively uniform laser beam pulses so as to minimize or eliminate radial pulse delays (the chevron effect). 11 figures.

  12. Multi-petascale highly efficient parallel supercomputer (Patent)

    Office of Scientific and Technical Information (OSTI)

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many

  13. NETZ-a compact high speed parallel microprogrammed machine for signal processing

    SciTech Connect (OSTI)

    Dinur, J.; Lahat, M.

    1984-01-01

    A very fast processor, called NETZ, of unconventional architecture, was developed for real-time execution of highly complex computational algorithms. The unconventional architecture design includes advanced techniques such as the incorporation of two processors working in parallel, parallel processing and pipelining including a high-speed hardware multiplier, the use of a special loop counter, and the use of a variable-length computation cycle. A horizontal microprogrammed control unit allows fast parallel execution. 7 references.

  14. Geothermal Generation | Open Energy Information

    Open Energy Info (EERE)

    Geothermal Generation. Global Geothermal Energy Generation: Global Geothermal Electricity Generation in 2007 (in millions...

  15. Rivulet Flow In Vertical Parallel-Wall Channel

    SciTech Connect (OSTI)

    D. M. McEligot; G. E. McCreery; P. Meakin

    2006-04-01

    In comparison with studies of rivulet flow over external surfaces, rivulet flow confined by two surfaces has received almost no attention. Fully-developed rivulet flow in vertical parallel-wall channels was characterized, both experimentally and analytically, for flows intermediate between a lower flow limit of drop flow and an upper limit where the rivulets meander. Although this regime is the simplest rivulet flow regime, it does not appear to have been previously investigated in detail. Experiments were performed that measured rivulet widths for aperture spacing ranging from 0.152 mm to 0.914 mm. The results were compared with a simple steady-state analytical model for laminar flow. The model divides the rivulet cross-section into an inner region, which is dominated by viscous and gravitational forces and where essentially all flow is assumed to occur, and an outer region, dominated by capillary forces, where the geometry is determined by the contact angle between the fluid and the wall. Calculations using the model provided excellent agreement with data for inner rivulet widths and good agreement with measurements of outer rivulet widths.

  16. Comparing current cluster, massively parallel, and accelerated systems

    SciTech Connect (OSTI)

    Barker, Kevin J; Davis, Kei; Hoisie, Adolfy; Kerbyson, Darren J; Pakin, Scott; Lang, Mike; Sancho Pitarch, Jose C

    2010-01-01

    Currently there is large architectural diversity in high performance computing systems. They include 'commodity' cluster systems that optimize per-node performance for small jobs, massively parallel processors (MPPs) that optimize aggregate performance for large jobs, and accelerated systems that optimize both per-node and aggregate performance but only for applications custom-designed to take advantage of such systems. Because of these dissimilarities, meaningful comparisons of achievable performance are not straightforward. In this work we utilize a methodology that combines both empirical analysis and performance modeling to compare clusters (represented by a 4,352-core IB cluster), MPPs (represented by a 147,456-core BG/P), and accelerated systems (represented by the 129,600-core Roadrunner) across a workload of four applications. Strengths of our approach include the ability to compare architectures - as opposed to specific implementations of an architecture - attribute each application's performance bottlenecks to characteristics unique to each system, and to explore performance scenarios in advance of their availability for measurement. Our analysis illustrates that application performance is essentially unrelated to relative peak performance but that application performance can be both predicted and explained using modeling.

  17. Parallel Block Structured Adaptive Mesh Refinement on Graphics Processing Units

    SciTech Connect (OSTI)

    Beckingsale, D. A.; Gaudin, W. P.; Hornung, R. D.; Gunney, B. T.; Gamblin, T.; Herdman, J. A.; Jarvis, S. A.

    2014-11-17

    Block-structured adaptive mesh refinement is a technique that can be used when solving partial differential equations to reduce the number of zones necessary to achieve the required accuracy in areas of interest. These areas (shock fronts, material interfaces, etc.) are recursively covered with finer mesh patches that are grouped into a hierarchy of refinement levels. Despite the potential for large savings in computational requirements and memory usage without a corresponding reduction in accuracy, AMR adds overhead in managing the mesh hierarchy, adding complex communication and data movement requirements to a simulation. In this paper, we describe the design and implementation of a native GPU-based AMR library, including: the classes used to manage data on a mesh patch, the routines used for transferring data between GPUs on different nodes, and the data-parallel operators developed to coarsen and refine mesh data. We validate the performance and accuracy of our implementation using three test problems and two architectures: an eight-node cluster, and over four thousand nodes of Oak Ridge National Laboratory’s Titan supercomputer. Our GPU-based AMR hydrodynamics code performs up to 4.87× faster than the CPU-based implementation, and has been scaled to over four thousand GPUs using a combination of MPI and CUDA.
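
    The coarsen and refine operators mentioned in the abstract above can be illustrated with a minimal CPU-side sketch. It assumes a refinement ratio of 2, patches stored as plain lists of rows, averaging for coarsening and piecewise-constant copying for refinement. The function names are hypothetical and this is not the library's GPU implementation.

```python
# Sketch only: 2-D patches as lists of rows, refinement ratio fixed at 2.
# `coarsen` and `refine` are hypothetical names, not the library's API.

def coarsen(fine):
    """Average each 2x2 block of fine zones into one coarse zone."""
    rows, cols = len(fine), len(fine[0])
    assert rows % 2 == 0 and cols % 2 == 0, "patch dimensions must be even"
    return [
        [
            (fine[i][j] + fine[i][j + 1] + fine[i + 1][j] + fine[i + 1][j + 1]) / 4.0
            for j in range(0, cols, 2)
        ]
        for i in range(0, rows, 2)
    ]


def refine(coarse):
    """Copy each coarse zone into a 2x2 block of fine zones (piecewise constant)."""
    fine = []
    for row in coarse:
        fine_row = [value for value in row for _ in range(2)]
        fine.append(fine_row)
        fine.append(list(fine_row))  # second copy so the rows are independent
    return fine
```

    With these choices, coarsening a freshly refined patch returns the original coarse values, which is a convenient consistency check for any such operator pair.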

  18. Massively parallel processor networks with optical express channels

    DOE Patents [OSTI]

    Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.

    1999-01-01

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.

  19. Massively parallel processor networks with optical express channels

    DOE Patents [OSTI]

    Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.

    1999-08-24

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.

  20. Executing a gather operation on a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Ratterman, Joseph D.

    2012-03-20

    Methods, apparatus, and computer program products are disclosed for executing a gather operation on a parallel computer according to embodiments of the present invention. Embodiments include configuring, by the logical root, a result buffer of the logical root, the result buffer having positions, each position corresponding to a ranked node in the operational group and for storing contribution data gathered from that ranked node. Embodiments also include repeatedly for each position in the result buffer: determining, by each compute node of an operational group, whether the current position in the result buffer corresponds with the rank of the compute node, if the current position in the result buffer corresponds with the rank of the compute node, contributing, by that compute node, the compute node's contribution data, if the current position in the result buffer does not correspond with the rank of the compute node, contributing, by that compute node, a value of zero for the contribution data, and storing, by the logical root in the current position in the result buffer, results of a bitwise OR operation of all the contribution data by all compute nodes of the operational group for the current position, the results received through the global combining network.
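
    The gather procedure described above can be simulated serially. The sketch below assumes each node's contribution is a non-negative integer and models the global combining network as a plain bitwise-OR reduction. The function name `gather_via_or` is hypothetical.

```python
from functools import reduce
from operator import or_


def gather_via_or(contributions):
    """Simulate the gather; contributions[rank] is the integer held by that node."""
    num_nodes = len(contributions)
    result_buffer = [0] * num_nodes  # one position per ranked node
    for position in range(num_nodes):
        # Every node contributes: its own data if its rank matches, else zero.
        offered = [
            contributions[rank] if rank == position else 0
            for rank in range(num_nodes)
        ]
        # Stand-in for the global combining network: bitwise OR of all offers.
        result_buffer[position] = reduce(or_, offered, 0)
    return result_buffer
```

    Because x OR 0 = x, the zeros contributed by non-matching nodes leave the matching node's data unchanged, so each buffer position ends up holding exactly that node's contribution.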

  1. System and method for a parallel immunoassay system

    DOE Patents [OSTI]

    Stevens, Fred J.

    2002-01-01

    A method and system for detecting a target antigen using massively parallel immunoassay technology. In this system, high affinity antibodies of the antigen are covalently linked to small beads or particles. The beads are exposed to a solution containing DNA-oligomer-mimics of the antigen. The mimics which are reactive with the covalently attached antibody or antibodies will bind to the appropriate antibody molecule on the bead. The particles or beads are then washed to remove any unbound DNA-oligomer-mimics and are then immobilized or trapped. The bead-antibody complexes are then exposed to a test solution which may contain the targeted antigens. If the antigen is present it will replace the mimic since it has a greater affinity for the respective antibody. The particles are then removed from the solution leaving a residual solution. This residual solution is applied to a DNA chip containing many samples of complementary DNA. If the DNA tag from a mimic binds with its complementary DNA, it indicates the presence of the target antigen. A fluorescent tag can be used to more easily identify the bound DNA tag.

  2. A two-level parallel direct search implementation for arbitrarily sized objective functions

    SciTech Connect (OSTI)

    Hutchinson, S.A.; Shadid, N.; Moffat, H.K.

    1994-12-31

    In the past, many optimization schemes for massively parallel computers have attempted to achieve parallel efficiency using one of two methods. In the case of large and expensive objective function calculations, the optimization itself may be run in serial and the objective function calculations parallelized. In contrast, if the objective function calculations are relatively inexpensive and can be performed on a single processor, then the actual optimization routine itself may be parallelized. In this paper, a scheme based upon the Parallel Direct Search (PDS) technique is presented which allows the objective function calculations to be done on an arbitrarily large number (p{sub 2}) of processors. If p, the number of processors available, is greater than or equal to 2p{sub 2}, then the optimization may be parallelized as well. This allows for efficient use of computational resources since the objective function calculations can be performed on the number of processors that allow for peak parallel efficiency and then further speedup may be achieved by parallelizing the optimization. Results are presented for an optimization problem which involves the solution of a PDE using a finite-element algorithm as part of the objective function calculation. The optimum number of processors for the finite-element calculations is less than p/2. Thus, the PDS method is also parallelized. Performance comparisons are given for a nCUBE 2 implementation.
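
    The two-level split described above comes down to simple arithmetic on the processor counts. The sketch below is an illustration under stated assumptions, not the authors' implementation: ranks are numbered 0 to p-1, each objective evaluation gets a contiguous group of p2 ranks, and leftover ranks stay idle. The function name is hypothetical.

```python
def partition_processors(p, p2):
    """Split ranks 0..p-1 into groups of p2 ranks, one group per evaluation.

    Returns (groups, search_is_parallel). The search level can run in
    parallel only when at least two groups fit, i.e. when p >= 2 * p2.
    """
    if p2 < 1 or p < p2:
        raise ValueError("need 1 <= p2 <= p")
    num_groups = p // p2
    groups = [list(range(g * p2, (g + 1) * p2)) for g in range(num_groups)]
    return groups, num_groups >= 2
```

    For example, with p = 8 and p2 = 3 two groups fit and the search can evaluate two trial points at once, while p = 7 and p2 = 4 leaves only one group, so only the objective function is parallelized.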

  3. A two-level parallel direct search implementation for arbitrarily sized objective functions

    SciTech Connect (OSTI)

    Hutchinson, S.A.; Shadid, J.N.; Moffat, H.K.; Ng, K.T.

    1994-02-21

    In the past, many optimization schemes for massively parallel computers have attempted to achieve parallel efficiency using one of two methods. In the case of large and expensive objective function calculations, the optimization itself may be run in serial and the objective function calculations parallelized. In contrast, if the objective function calculations are relatively inexpensive and can be performed on a single processor, then the actual optimization routine itself may be parallelized. In this paper, a scheme based upon the Parallel Direct Search (PDS) technique is presented which allows the objective function calculations to be done on an arbitrarily large number (p{sub 2}) of processors. If p, the number of processors available, is greater than or equal to 2p{sub 2}, then the optimization may be parallelized as well. This allows for efficient use of computational resources since the objective function calculations can be performed on the number of processors that allow for peak parallel efficiency and then further speedup may be achieved by parallelizing the optimization. Results are presented for an optimization problem which involves the solution of a PDE using a finite-element algorithm as part of the objective function calculation. The optimum number of processors for the finite-element calculations is less than p/2. Thus, the PDS method is also parallelized. Performance comparisons are given for a nCUBE 2 implementation.

  4. Cpl6: The New Extensible, High-Performance Parallel Coupler for the...

    Office of Scientific and Technical Information (OSTI)

    Cpl6: The New Extensible, High-Performance Parallel Coupler for the Community Climate System Model

  5. Combined fuel and air staged power generation system

    SciTech Connect (OSTI)

    Rabovitser, Iosif K; Pratapas, John M; Boulanov, Dmitri

    2014-05-27

    A method and apparatus for generation of electric power employing fuel and air staging in which a first stage gas turbine and a second stage partial oxidation gas turbine are operated in parallel. A first portion of fuel and oxidant are provided to the first stage gas turbine which generates a first portion of electric power and a hot oxidant. A second portion of fuel and oxidant are provided to the second stage partial oxidation gas turbine which generates a second portion of electric power and a hot syngas. The hot oxidant and the hot syngas are provided to a bottoming cycle employing a fuel-fired boiler by which a third portion of electric power is generated.

  6. A NEW GENERATION CHEMICAL FLOODING SIMULATOR

    SciTech Connect (OSTI)

    Gary A. Pope; Kamy Sepehrnoori; Mojdeh Delshad

    2005-01-01

    The premise of this research is that a general-purpose reservoir simulator for several improved oil recovery processes can and should be developed so that high-resolution simulations of a variety of very large and difficult problems can be achieved using state-of-the-art algorithms and computers. Such a simulator is not currently available to the industry. The goal of this proposed research is to develop a new-generation chemical flooding simulator that is capable of efficiently and accurately simulating oil reservoirs with at least a million gridblocks in less than one day on massively parallel computers. Task 1 is the formulation and development of solution scheme, Task 2 is the implementation of the chemical module, and Task 3 is validation and application. In this final report, we will detail our progress on Tasks 1 through 3 of the project.

  7. Compact neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo; Lou, Tak Pui

    2005-03-22

    A compact neutron generator has at its outer circumference a toroidal shaped plasma chamber in which a tritium (or other) plasma is generated. A RF antenna is wrapped around the plasma chamber. A plurality of tritium ion beamlets are extracted through spaced extraction apertures of a plasma electrode on the inner surface of the toroidal plasma chamber and directed inwardly toward the center of the neutron generator. The beamlets pass through spaced acceleration and focusing electrodes to a neutron generating target at the center of the neutron generator. The target is typically made of titanium tubing. Water is flowed through the tubing for cooling. The beam can be pulsed rapidly to achieve ultrashort neutron bursts. The target may be moved rapidly up and down so that the average power deposited on the surface of the target may be kept at a reasonable level. The neutron generator can produce fast neutrons from a T-T reaction which can be used for luggage and cargo interrogation applications. A luggage or cargo inspection system has a pulsed T-T neutron generator or source at the center, surrounded by associated gamma detectors and other components for identifying explosives or other contraband.

  8. Synthetic guide star generation

    DOE Patents [OSTI]

    Payne, Stephen A.; Page, Ralph H.; Ebbers, Christopher A.; Beach, Raymond J.

    2004-03-09

    A system for assisting in observing a celestial object and providing synthetic guide star generation. A lasing system provides radiation at a frequency at or near 938 nm and radiation at a frequency at or near 1583 nm. The lasing system includes a fiber laser operating between 880 nm and 960 nm and a fiber laser operating between 1524 nm and 1650 nm. A frequency-conversion system mixes the radiation and generates light at a frequency at or near 589 nm. A system directs the light at a frequency at or near 589 nm toward the celestial object and provides synthetic guide star generation.
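
    The three wavelengths quoted above are consistent with sum-frequency mixing, in which the output frequency is the sum of the two input frequencies, so that 1/λ_out = 1/λ_1 + 1/λ_2. The abstract says only that the radiation is mixed, so treating it as sum-frequency generation is an assumption of this short check, and the function name is hypothetical.

```python
def sum_frequency_wavelength_nm(lambda1_nm, lambda2_nm):
    """Wavelength of light whose frequency is the sum of the two input frequencies."""
    return 1.0 / (1.0 / lambda1_nm + 1.0 / lambda2_nm)


mixed = sum_frequency_wavelength_nm(938.0, 1583.0)
print(f"{mixed:.1f} nm")  # prints "589.0 nm"
```

    The result, about 589 nm, matches the output wavelength stated in the abstract.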

  9. Synthetic guide star generation

    DOE Patents [OSTI]

    Payne, Stephen A. (Castro Valley, CA); Page, Ralph H. (Castro Valley, CA); Ebbers, Christopher A. (Livermore, CA); Beach, Raymond J. (Livermore, CA)

    2008-06-10

    A system for assisting in observing a celestial object and providing synthetic guide star generation. A lasing system provides radiation at a frequency at or near 938 nm and radiation at a frequency at or near 1583 nm. The lasing system includes a fiber laser operating between 880 nm and 960 nm and a fiber laser operating between 1524 nm and 1650 nm. A frequency-conversion system mixes the radiation and generates light at a frequency at or near 589 nm. A system directs the light at a frequency at or near 589 nm toward the celestial object and provides synthetic guide star generation.

  10. Magnetic field generator

    DOE Patents [OSTI]

    Krienin, Frank (Shoreham, NY)

    1990-01-01

    A magnetic field generating device provides a useful magnetic field within a specific region, while keeping nearby surrounding regions virtually field free. By placing an appropriate current density along a flux line of the source, the stray field effects of the generator may be contained. One current carrying structure may support a truncated cosine distribution, and it may be surrounded by a current structure which follows a flux line that would occur in a full coaxial double cosine distribution. Strong magnetic fields may be generated and contained using superconducting cables to approximate required current surfaces.

  11. Mann 3600 Pattern Generator

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Mann 3600 Pattern Generator Description: The GCA Mann 3600 pattern generator is designed for patterning standard 5" x 5" mask plates for use in optical lithography. Pattern designs are created in AutoCAD. The AutoCAD file is then converted into binary format, which can be fractured into data read by the pattern generator. The illumination source for exposures is a high pressure Hg arc lamp. The light is filtered and projected onto a shutter, which controls the exposure dose. A set of

  12. Graph Generator Survey

    SciTech Connect (OSTI)

    Lothian, Josh; Powers, Sarah S; Sullivan, Blair D; Baker, Matthew B; Schrock, Jonathan; Poole, Stephen W

    2013-12-01

    The benchmarking effort within the Extreme Scale Systems Center at Oak Ridge National Laboratory seeks to provide High Performance Computing benchmarks and test suites of interest to the DoD sponsor. The work described in this report is a part of the effort focusing on graph generation. A previously developed benchmark, SystemBurn, allowed the emulation of different application behavior profiles within a single framework. To complement this effort, similar capabilities are desired for graph-centric problems. This report examines existing synthetic graph generator implementations in preparation for further study on the properties of their generated synthetic graphs.

  13. PULSE SYNTHESIZING GENERATOR

    DOE Patents [OSTI]

    Kerns, Q.A.

    1963-08-01

    An electronic circuit for synthesizing electrical current pulses having very fast rise times includes several sinewave generators tuned to progressively higher harmonic frequencies with signal amplitudes and phases selectable according to the Fourier series of the waveform that is to be synthesized. Phase control is provided by periodically triggering the generators at precisely controlled times. The outputs of the generators are combined in a coaxial transmission line. Any frequency-dependent delays that occur in the transmission line can be readily compensated for so that the desired signal wave shape is obtained at the output of the line. (AEC)
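
    The principle of summing harmonic sine generators according to a Fourier series can be shown numerically. The sketch below is illustrative only: it picks a unit square wave as the target waveform (the abstract does not name one), uses the textbook amplitudes 4/(πn) for odd n with zero phase, and the function names are hypothetical.

```python
import math


def synthesize(t, fundamental_hz, harmonics):
    """Sum sine generators; `harmonics` maps harmonic number -> (amplitude, phase)."""
    return sum(
        amplitude * math.sin(2.0 * math.pi * n * fundamental_hz * t + phase)
        for n, (amplitude, phase) in harmonics.items()
    )


def square_wave_harmonics(count):
    """Fourier-series terms of a unit square wave: amplitude 4/(pi*n), odd n only."""
    return {n: (4.0 / (math.pi * n), 0.0) for n in range(1, 2 * count, 2)}


# More generators (higher harmonics) give a steeper edge near t = 0.
for count in (1, 5, 50):
    value = synthesize(0.01, 1.0, square_wave_harmonics(count))
    print(count, round(value, 3))
```

    Adding higher harmonics steepens the leading edge, which is why the circuit needs generators tuned to progressively higher frequencies to obtain fast rise times.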

  14. Generating electricity from viruses

    SciTech Connect (OSTI)

    Lee, Seung-Wuk

    2013-10-31

    Berkeley Lab's Seung-Wuk Lee discusses "Generating electricity from viruses" in this Oct. 28, 2013 talk, which is part of a Science at the Theater event entitled Eight Big Ideas.

  15. Generating electricity from viruses

    ScienceCinema (OSTI)

    Lee, Seung-Wuk

    2014-06-23

    Berkeley Lab's Seung-Wuk Lee discusses "Generating electricity from viruses" in this Oct. 28, 2013 talk, which is part of a Science at the Theater event entitled Eight Big Ideas.

  16. Biomass for Electricity Generation

    Reports and Publications (EIA)

    2002-01-01

    This paper examines issues affecting the uses of biomass for electricity generation. The methodology used in the National Energy Modeling System to account for various types of biomass is discussed, and the underlying assumptions are explained.

  17. Vector generator scan converter

    DOE Patents [OSTI]

    Moore, J.M.; Leighton, J.F.

    1988-02-05

    High printing speeds for graphics data are achieved with a laser printer by transmitting compressed graphics data from a main processor over an I/O channel to a vector generator scan converter which reconstructs a full graphics image for input to the laser printer through a raster data input port. The vector generator scan converter includes a microprocessor with associated microcode memory containing a microcode instruction set, a working memory for storing compressed data, vector generator hardware for drawing a full graphic image from vector parameters calculated by the microprocessor, image buffer memory for storing the reconstructed graphics image and an output scanner for reading the graphics image data and inputting the data to the printer. The vector generator scan converter eliminates the bottleneck created by the I/O channel for transmitting graphics data from the main processor to the laser printer, and increases printer speed up to thirty fold. 7 figs.
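
    The reconstruction step, turning compact vector parameters back into a full raster image, can be sketched in software. The patent does this in dedicated hardware and the abstract does not name an algorithm. The example below swaps in Bresenham's line algorithm as a well-known stand-in, with a hypothetical function name and a list-of-rows image buffer.

```python
def draw_vector(image, x0, y0, x1, y1):
    """Set the pixels of `image` (list of rows of 0/1) along a line segment.

    Uses the integer-only Bresenham line algorithm (an assumption of this
    sketch; the patent's vector generator hardware is not described here).
    """
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy
    while True:
        image[y0][x0] = 1
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:  # step in x
            error += dy
            x0 += step_x
        if doubled <= dx:  # step in y
            error += dx
            y0 += step_y
    return image


# Two endpoint pairs (the "compressed" form) expand into a 4x4 raster.
buffer = [[0] * 4 for _ in range(4)]
draw_vector(buffer, 0, 0, 3, 3)
draw_vector(buffer, 0, 3, 3, 3)
```

    Sending four integers per line segment instead of every pixel is the kind of saving that relieves the I/O channel bottleneck described in the abstract.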

  18. Vector generator scan converter

    DOE Patents [OSTI]

    Moore, James M.; Leighton, James F.

    1990-01-01

    High printing speeds for graphics data are achieved with a laser printer by transmitting compressed graphics data from a main processor over an I/O (input/output) channel to a vector generator scan converter which reconstructs a full graphics image for input to the laser printer through a raster data input port. The vector generator scan converter includes a microprocessor with associated microcode memory containing a microcode instruction set, a working memory for storing compressed data, vector generator hardware for drawing a full graphic image from vector parameters calculated by the microprocessor, image buffer memory for storing the reconstructed graphics image and an output scanner for reading the graphics image data and inputting the data to the printer. The vector generator scan converter eliminates the bottleneck created by the I/O channel for transmitting graphics data from the main processor to the laser printer, and increases printer speed up to thirty fold.

  19. Denison Dam Historical Generation

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    [Chart: Denison Dam Historical Generation; axis in MWh, 50,000 to 500,000]

  20. Scram signal generator

    DOE Patents [OSTI]

    Johanson, Edward W. (New Lenox, IL); Simms, Richard (Westmont, IL)

    1981-01-01

    A scram signal generating circuit for nuclear reactor installations monitors a flow signal representing the flow rate of the liquid sodium coolant which is circulated through the reactor, and initiates reactor shutdown for a rapid variation in the flow signal, indicative of fuel motion. The scram signal generating circuit includes a long-term drift compensation circuit which processes the flow signal and generates an output signal representing the flow rate of the coolant. The output signal remains substantially unchanged for small variations in the flow signal, attributable to long term drift in the flow rate, but a rapid change in the flow signal, indicative of a fast flow variation, causes a corresponding change in the output signal. A comparator circuit compares the output signal with a reference signal, representing a given percentage of the steady state flow rate of the coolant, and generates a scram signal to initiate reactor shutdown when the output signal equals the reference signal.
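
    The behavior described above (ignore slow drift, trip on a rapid flow change) can be mimicked with a discrete-time sketch. The patent describes a circuit, so everything here is an assumption made for illustration: a slow exponential moving average stands in for the drift compensation, the output is the flow normalized by that baseline, and the comparator trips when the output falls to or below the reference fraction. All names and parameter values are hypothetical.

```python
def scram_monitor(flow_samples, drift_alpha=0.01, trip_fraction=0.8):
    """Return the index of the first sample that trips the scram, or None.

    drift_alpha   -- small smoothing factor, so the baseline follows slow drift only
    trip_fraction -- reference level as a fraction of the steady-state flow
    """
    baseline = flow_samples[0]  # slowly tracked estimate of steady-state flow
    for index, flow in enumerate(flow_samples):
        # Drift-compensated output: flow relative to the slow baseline.
        output = flow / baseline
        if output <= trip_fraction:
            return index
        # Slow update lets the baseline absorb long-term drift.
        baseline += drift_alpha * (flow - baseline)
    return None


slow_drift = [100.0 - 0.05 * i for i in range(600)]  # gradual 30% decline
fast_drop = [100.0] * 50 + [60.0] * 10               # sudden 40% loss of flow
print(scram_monitor(slow_drift), scram_monitor(fast_drop))
```

    In this example the gradual decline never trips the comparator because the baseline follows it, while the sudden drop trips at the first low sample.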

  1. Relativistic electron beam generator

    DOE Patents [OSTI]

    Mooney, L.J.; Hyatt, H.M.

    1975-11-11

    A relativistic electron beam generator for laser media excitation is described. The device employs a diode type relativistic electron beam source having a cathode shape which provides a rectangular output beam with uniform current density.

  2. Oscillating fluid power generator

    DOE Patents [OSTI]

    Morris, David C

    2014-02-25

    A system and method for harvesting the kinetic energy of a fluid flow for power generation with a vertically oriented, aerodynamic wing structure comprising one or more airfoil elements pivotably attached to a mast. When activated by the moving fluid stream, the wing structure oscillates back and forth, generating lift first in one direction then in the opposite direction. This oscillating movement is converted to unidirectional rotational movement in order to provide motive power to an electricity generator. Unlike other oscillating devices, this device is designed to harvest the maximum aerodynamic lift forces available for a given oscillation cycle. Because the system is not subjected to the same intense forces and stresses as turbine systems, it can be constructed less expensively, reducing the cost of electricity generation. The system can be grouped in more compact clusters, be less evident in the landscape, and present reduced risk to avian species.

  3. Xyce Parallel Electronic Simulator Reference Guide Version 6.4

    SciTech Connect (OSTI)

    Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Schiek, Richard; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason; Baur, David Gregory

    2015-12-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1]. Trademarks: The information herein is subject to change without notice. Copyright © 2002-2015 Sandia Corporation. All rights reserved. Xyce™ Electronic Simulator and Xyce™ are trademarks of Sandia Corporation. Portions of the Xyce™ code are: Copyright © 2002, The Regents of the University of California. Produced at the Lawrence Livermore National Laboratory. Written by Alan Hindmarsh, Allan Taylor, Radu Serban. UCRL-CODE-2002-59. All rights reserved. Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. The EKV3 MOSFET model was developed by the EKV Team of the Electronics Laboratory-TUC of the Technical University of Crete. All other trademarks are property of their respective owners. Contacts: Bug Reports (Sandia only): http://joseki.sandia.gov/bugzilla, http://charleston.sandia.gov/bugzilla. World Wide Web: http://xyce.sandia.gov, http://charleston.sandia.gov/xyce (Sandia only). Email: xyce@sandia.gov (outside Sandia), xyce-sandia@sandia.gov (Sandia only).

  4. EIA - Electricity Generating Capacity

    U.S. Energy Information Administration (EIA) Indexed Site

    Electricity Generating Capacity
    Release Date: January 3, 2013 | Next Release: August 2013

    Year   Existing Units by Energy Source   Unit Additions   Unit Retirements
    2011   XLS                               XLS              XLS
    2010   XLS                               XLS              XLS
    2009   XLS                               XLS              XLS
    2008   XLS                               XLS              XLS
    2007   XLS                               XLS              XLS
    2006   XLS                               XLS              XLS
    2005   XLS                               XLS              XLS
    2004   XLS                               XLS              XLS
    2003   XLS                               XLS              XLS

    Source: Form EIA-860, "Annual Electric Generator Report."
    Related links: Electric Power Monthly; Electric Power Annual; Form EIA-860 Source Data

  5. Steam generator tube failures

    SciTech Connect (OSTI)

    MacDonald, P.E.; Shah, V.N.; Ward, L.W.; Ellison, P.G.

    1996-04-01

    A review and summary of the available information on steam generator tubing failures and the impact of these failures on plant safety is presented. The following topics are covered: pressurized water reactor (PWR), Canadian deuterium uranium (CANDU) reactor, and Russian water moderated, water cooled energy reactor (VVER) steam generator degradation, PWR steam generator tube ruptures, the thermal-hydraulic response of a PWR plant with a faulted steam generator, the risk significance of steam generator tube rupture accidents, tubing inspection requirements and fitness-for-service criteria in various countries, and defect detection reliability and sizing accuracy. A significant number of steam generator tubes are defective and are removed from service or repaired each year. This widespread damage has been caused by many diverse degradation mechanisms, some of which are difficult to detect and predict. In addition, spontaneous tube ruptures have occurred at the rate of about one every 2 years over the last 20 years, and incipient tube ruptures (tube failures usually identified with leak detection monitors just before rupture) have been occurring at the rate of about one per year. These ruptures have caused complex plant transients which have not always been easy for the reactor operators to control. Our analysis shows that if more than 15 tubes rupture during a main steam line break, the system response could lead to core melting. Although spontaneous and induced steam generator tube ruptures are small contributors to the total core damage frequency calculated in probabilistic risk assessments, they are risk significant because the radionuclides are likely to bypass the reactor containment building. The frequency of steam generator tube ruptures can be significantly reduced through appropriate and timely inspections and repairs or removal from service.

  6. Fuel cell generator

    DOE Patents [OSTI]

    Makiel, Joseph M.

    1985-01-01

    A high temperature solid electrolyte fuel cell generator comprising a housing means defining a plurality of chambers including a generator chamber and a combustion products chamber, a porous barrier separating the generator and combustion product chambers, a plurality of elongated annular fuel cells each having a closed end and an open end with the open ends disposed within the combustion product chamber, the cells extending from the open end through the porous barrier and into the generator chamber, a conduit for each cell, each conduit extending into a portion of each cell disposed within the generator chamber, each conduit having means for discharging a first gaseous reactant within each fuel cell, exhaust means for exhausting the combustion product chamber, manifolding means for supplying the first gaseous reactant to the conduits with the manifolding means disposed within the combustion product chamber between the porous barrier and the exhaust means and the manifolding means further comprising support and bypass means for providing support of the manifolding means within the housing while allowing combustion products from the first and a second gaseous reactant to flow past the manifolding means to the exhaust means, and means for flowing the second gaseous reactant into the generator chamber.

  7. Streamline Integration using MPI-Hybrid Parallelism on a Large Multi-Core Architecture

    SciTech Connect (OSTI)

    Camp, David; Garth, Christoph; Childs, Hank; Pugmire, Dave; Joy, Kenneth I.

    2010-11-01

    Streamline computation in a very large vector field data set represents a significant challenge due to the non-local and data-dependent nature of streamline integration. In this paper, we conduct a study of the performance characteristics of hybrid parallel programming and execution as applied to streamline integration on a large, multicore platform. With multi-core processors now prevalent in clusters and supercomputers, there is a need to understand the impact of these hybrid systems in order to make the best implementation choice. We use two MPI-based distribution approaches based on established parallelization paradigms, parallelize-over-seeds and parallelize-over-blocks, and present a novel MPI-hybrid algorithm for each approach to compute streamlines. Our findings indicate that the work sharing between cores in the proposed MPI-hybrid parallel implementation results in much improved performance and consumes less communication and I/O bandwidth than a traditional, non-hybrid distributed implementation.

  8. Parallel performance of a preconditioned CG solver for unstructured finite element applications

    SciTech Connect (OSTI)

    Shadid, J.N.; Hutchinson, S.A.; Moffat, H.K.

    1994-06-01

    A parallel unstructured finite element (FE) implementation designed for message passing machines is described. This implementation employs automated problem partitioning algorithms for load balancing unstructured grids, a distributed sparse matrix representation of the global finite element equations and a parallel conjugate gradient (CG) solver. In this paper a number of issues related to the efficient implementation of parallel unstructured mesh applications are presented. These include the differences between structured and unstructured mesh parallel applications, major communication kernels for unstructured CG solvers, automatic mesh partitioning algorithms, and the influence of mesh partitioning metrics on parallel performance. Initial results are presented for example finite element (FE) heat transfer analysis applications on a 1024 processor nCUBE 2 hypercube. Results indicate over 95% scaled efficiencies are obtained for some large problems despite the required unstructured data communication.
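
    As a rough illustration of the solver named in this abstract, the following is a minimal serial sketch of a preconditioned conjugate gradient iteration. The Jacobi (diagonal) preconditioner, the function name, and the small test matrix are assumptions made for illustration; the record does not say which preconditioner or data layout the authors used. The comments mark the two operations that would need communication in a message passing implementation.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG for a symmetric positive-definite matrix A.

    Serial sketch only; the preconditioner choice is an assumption.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x                      # initial residual
    m_inv = 1.0 / np.diag(A)           # inverse of the diagonal (Jacobi) preconditioner
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p                     # mat-vec: needs neighbor exchange when distributed
        alpha = rz / (p @ Ap)          # dot product: needs a global reduction when distributed
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p      # new search direction
        rz = rz_new
    return x

# Hypothetical test problem: 1-D Laplacian stiffness matrix (SPD, tridiagonal).
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = conjugate_gradient(A, b)
```

    In a distributed version, the matrix rows and vector entries would be split according to the mesh partition, which is why partition quality affects the cost of the matrix-vector product.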

  9. Parallel performance of a preconditioned CG solver for unstructured finite element applications

    SciTech Connect (OSTI)

    Shadid, J.N.; Hutchinson, S.A.; Moffat, H.K.

    1994-12-31

    A parallel unstructured finite element (FE) implementation designed for message passing MIMD machines is described. This implementation employs automated problem partitioning algorithms for load balancing unstructured grids, a distributed sparse matrix representation of the global finite element equations and a parallel conjugate gradient (CG) solver. In this paper a number of issues related to the efficient implementation of parallel unstructured mesh applications are presented. These include the differences between structured and unstructured mesh parallel applications, major communication kernels for unstructured CG solvers, automatic mesh partitioning algorithms, and the influence of mesh partitioning metrics on parallel performance. Initial results are presented for example finite element (FE) heat transfer analysis applications on a 1024 processor nCUBE 2 hypercube. Results indicate over 95% scaled efficiencies are obtained for some large problems despite the required unstructured data communication.

  10. Photovoltaic power generation system free of bypass diodes

    DOE Patents [OSTI]

    Lentine, Anthony L.; Okandan, Murat; Nielson, Gregory N.

    2015-07-28

    A photovoltaic power generation system that includes a solar panel that is free of bypass diodes is described herein. The solar panel includes a plurality of photovoltaic sub-modules, wherein at least two of photovoltaic sub-modules in the plurality of photovoltaic sub-modules are electrically connected in parallel. A photovoltaic sub-module includes a plurality of groups of electrically connected photovoltaic cells, wherein at least two of the groups are electrically connected in series. A photovoltaic group includes a plurality of strings of photovoltaic cells, wherein a string of photovoltaic cells comprises a plurality of photovoltaic cells electrically connected in series. The strings of photovoltaic cells are electrically connected in parallel, and the photovoltaic cells are microsystem-enabled photovoltaic cells.

  11. Final Report for "Analyzing and visualizing next generation climate data"

    SciTech Connect (OSTI)

    Pletzer, Alexander

    2012-11-13

    The project "Analyzing and visualizing next generation climate data" adds block-structured (mosaic) grid support, parallel processing, and 2D/3D curvilinear interpolation to the open-source UV-CDAT climate data analysis tool. Block-structured grid support complies with the Gridspec extension submitted to the Climate and Forecast metadata conventions. It contains two parts: aggregation of data spread over multiple mosaic tiles (M-SPEC) and aggregation of temporal data stored in different files (F-SPEC). Together, M-SPEC and F-SPEC allow users to interact with data stored in multiple files as if the data were in a single file. For computationally expensive tasks, a flexible, multi-dimensional, multi-type distributed array class allows users to process data in parallel using remote memory access. Both nodal and cell-based interpolation are supported; users can choose between different interpolation libraries, including ESMF and LibCF, depending on their particular needs.

  12. MCNP LWR Core Generator

    SciTech Connect (OSTI)

    Fischer, Noah A.

    2012-08-14

    The reactor core input generator allows for MCNP input files to be tailored to design specifications and generated in seconds. Full reactor models can now easily be created by specifying a small set of parameters and generating an MCNP input for a full reactor core. Axial zoning of the core will allow for density variation in the fuel and moderator, with pin-by-pin fidelity, so that BWR cores can more accurately be modeled. LWR core work in progress: (1) Reflectivity option for specifying 1/4, 1/2, or full core simulation; (2) Axial zoning for moderator densities that vary with height; (3) Generating multiple types of assemblies for different fuel enrichments; and (4) Parameters for specifying BWR box walls. Fuel pin work in progress: (1) Radial and azimuthal zoning for generating further unique materials in fuel rods; (2) Options for specifying different types of fuel for MOX or multiple burn assemblies; (3) Additional options for replacing fuel rods with burnable poison rods; and (4) Control rod/blade modeling.

  13. MHD Generating system

    DOE Patents [OSTI]

    Petrick, Michael; Pierson, Edward S.; Schreiner, Felix

    1980-01-01

    According to the present invention, coal combustion gas is the primary working fluid and copper or a copper alloy is the electrodynamic fluid in the MHD generator, thereby eliminating the heat exchangers between the combustor and the liquid-metal MHD working fluids, allowing the use of a conventional coal-fired steam bottoming plant, and making the plant simpler, more efficient and cheaper. In operation, the gas and liquid are combined in a mixer and the resulting two-phase mixture enters the MHD generator. The MHD generator acts as a turbine and electric generator in one unit wherein the gas expands, drives the liquid across the magnetic field and thus generates electrical power. The gas and liquid are separated, and the available energy in the gas is recovered before the gas is exhausted to the atmosphere. Where the combustion gas contains sulfur, oxygen is bubbled through a side loop to remove sulfur therefrom as a concentrated stream of sulfur dioxide. The combustor is operated substoichiometrically to control the oxide level in the copper.

  14. Superconducting thermoelectric generator

    DOE Patents [OSTI]

    Metzger, J.D.; El-Genk, M.S.

    1994-01-01

    Thermoelectricity is produced by applying a temperature differential to dissimilar electrically conducting or semiconducting materials, thereby producing a voltage that is proportional to the temperature difference. Thermoelectric generators use this effect to directly convert heat into electricity; however, presently known generators have low efficiencies due to the production of high currents, which in turn cause large resistive heating losses. Some thermoelectric generators operate at efficiencies between 4% and 7% in the 800 to 1200 °C range. According to its major aspects and broadly stated, the present invention is an apparatus and method for producing electricity from heat. In particular, the invention is a thermoelectric generator that juxtaposes a superconducting material and a semiconducting material - so that the superconducting and the semiconducting materials touch - to convert heat energy into electrical energy without resistive losses in the temperature range below the critical temperature of the superconducting material. Preferably, an array of superconducting material is encased in one of several possible configurations within a second material having a high thermal conductivity, preferably a semiconductor, to form a thermoelectric generator.

  15. Parallel Computation of the Topology of Level Sets

    SciTech Connect (OSTI)

    Pascucci, V; Cole-McLaughlin, K

    2004-12-16

    This paper introduces two efficient algorithms that compute the Contour Tree of a 3D scalar field F and its augmented version with the Betti numbers of each isosurface. The Contour Tree is a fundamental data structure in scientific visualization that is used to preprocess the domain mesh to allow optimal computation of isosurfaces with minimal overhead storage. The Contour Tree can also be used to build user interfaces reporting the complete topological characterization of a scalar field, as shown in Figure 1. Data exploration time is reduced since the user understands the evolution of level set components with changing isovalue. The Augmented Contour Tree provides even more accurate information, segmenting the range space of the scalar field into portions of invariant topology. The exploration time for a single isosurface is also improved since its genus is known in advance. Our first new algorithm augments any given Contour Tree with the Betti numbers of all possible corresponding isocontours in time linear in the size of the tree. Moreover, we show how to extend the scheme introduced in [3] with the Betti number computation without increasing its complexity. Thus, we improve the time complexity of our previous approach [10] from O(m log m) to O(n log n + m), where m is the number of cells and n is the number of vertices in the domain of F. Our second contribution is a new divide-and-conquer algorithm that computes the Augmented Contour Tree with improved efficiency. The approach computes the output Contour Tree by merging two intermediate Contour Trees and is independent of the interpolant. In this way we confine any knowledge regarding a specific interpolant to an independent function that computes the tree for a single cell. We have implemented this function for the trilinear interpolant and plan to replace it with higher order interpolants when needed. The time complexity is O(n + t log n), where t is the number of critical points of F. For the first time we can compute the Contour Tree in linear time in many practical cases where t = O(n^(1-ε)). We report the running times for a parallel implementation, showing good scalability with the number of processors.

  16. Life Cycle Greenhouse Gas Emissions of Trough and Tower Concentrating Solar Power Electricity Generation: Systematic Review and Harmonization

    SciTech Connect (OSTI)

    Burkhardt, J. J.; Heath, G.; Cohen, E.

    2012-04-01

    In reviewing life cycle assessment (LCA) literature of utility-scale concentrating solar power (CSP) systems, this analysis focuses on reducing variability and clarifying the central tendency of published estimates of life cycle greenhouse gas (GHG) emissions through a meta-analytical process called harmonization. From 125 references reviewed, 10 produced 36 independent GHG emissions estimates passing screens for quality and relevance: 19 for parabolic trough (trough) technology and 17 for power tower (tower) technology. The interquartile ranges (IQRs) of published estimates for troughs and towers were 83 and 20 grams of carbon dioxide equivalent per kilowatt-hour (g CO2-eq/kWh), respectively; median estimates were 26 and 38 g CO2-eq/kWh for trough and tower, respectively. Two levels of harmonization were applied. Light harmonization reduced variability in published estimates by using consistent values for key parameters pertaining to plant design and performance. The IQR and median were reduced by 87% and 17%, respectively, for troughs. For towers, the IQR and median decreased by 33% and 38%, respectively. Next, five trough LCAs reporting detailed life cycle inventories were identified. The variability and central tendency of their estimates are reduced by 91% and 81%, respectively, after light harmonization. By harmonizing these five estimates to consistent values for global warming intensities of materials and expanding system boundaries to consistently include electricity and auxiliary natural gas combustion, variability is reduced by an additional 32% while central tendency increases by 8%. These harmonized values provide useful starting points for policy makers in evaluating life cycle GHG emissions from CSP projects without the requirement to conduct a full LCA for each new project.

  17. Spherical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo

    2006-11-21

    A spherical neutron generator is formed with a small spherical target and a spherical shell RF-driven plasma ion source surrounding the target. A deuterium (or deuterium and tritium) ion plasma is produced by RF excitation in the plasma ion source using an RF antenna. The plasma generation region is a spherical shell between an outer chamber and an inner extraction electrode. A spherical neutron generating target is at the center of the chamber and is biased negatively with respect to the extraction electrode which contains many holes. Ions passing through the holes in the extraction electrode are focused onto the target which produces neutrons by D-D or D-T reactions.

  18. Thermoacoustic magnetohydrodynamic electrical generator

    DOE Patents [OSTI]

    Wheatley, J.C.; Swift, G.W.; Migliori, A.

    1984-11-16

    A thermoacoustic magnetohydrodynamic electrical generator includes an intrinsically irreversible thermoacoustic heat engine coupled to a magnetohydrodynamic electrical generator. The heat engine includes an electrically conductive liquid metal as the working fluid and includes two heat exchange and thermoacoustic structure assemblies which drive the liquid in a push-pull arrangement to cause the liquid metal to oscillate at a resonant acoustic frequency on the order of 1000 Hz. The engine is positioned in the field of a magnet and is oriented such that the liquid metal oscillates in a direction orthogonal to the field of the magnet, whereby an alternating electrical potential is generated in the liquid metal. Low-loss, low-inductance electrical conductors electrically connected to opposite sides of the liquid metal conduct an output signal to a transformer adapted to convert the low-voltage, high-current output signal to a more usable higher voltage, lower current signal.

  19. Thermoacoustic magnetohydrodynamic electrical generator

    DOE Patents [OSTI]

    Wheatley, John C.; Swift, Gregory W.; Migliori, Albert

    1986-01-01

    A thermoacoustic magnetohydrodynamic electrical generator includes an intrinsically irreversible thermoacoustic heat engine coupled to a magnetohydrodynamic electrical generator. The heat engine includes an electrically conductive liquid metal as the working fluid and includes two heat exchange and thermoacoustic structure assemblies which drive the liquid in a push-pull arrangement to cause the liquid metal to oscillate at a resonant acoustic frequency on the order of 1,000 Hz. The engine is positioned in the field of a magnet and is oriented such that the liquid metal oscillates in a direction orthogonal to the field of the magnet, whereby an alternating electrical potential is generated in the liquid metal. Low-loss, low-inductance electrical conductors electrically connected to opposite sides of the liquid metal conduct an output signal to a transformer adapted to convert the low-voltage, high-current output signal to a more usable higher voltage, lower current signal.

  20. Sidetone generator flowmeter

    DOE Patents [OSTI]

    Fritz, Robert J.

    1986-01-01

    A flowmeter is provided which uses the sidetones generated in a cavity formed in the wall of a flowpipe or the like in response to fluid flowing past the cavity to provide a measure of the flow velocity of that fluid. The dimensions of the cavity are such as to provide a dominant vibratory frequency which is sensed by a pressure sensor. The flowmeter is adapted for use for a range of frequencies in which the Strouhal number is constant and under these conditions the vibratory frequency is directly related to the flow rate. The tone generator cavity and pressure transducer form a unit which is connected in-line in the flowpipe.
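
    The constant-Strouhal-number relationship described in this abstract can be sketched as a one-line conversion from measured tone frequency to flow velocity, using the standard definition St = f L / U. The choice of the cavity length as the characteristic length L, the function name, and the numeric values below are assumptions for illustration; the record gives no dimensions or calibration constants.

```python
def flow_velocity(frequency_hz, cavity_length_m, strouhal):
    """Flow velocity U from St = f * L / U, i.e. U = f * L / St.

    Valid only in the range where the Strouhal number is constant.
    """
    if strouhal <= 0:
        raise ValueError("Strouhal number must be positive")
    return frequency_hz * cavity_length_m / strouhal

# Hypothetical values: 200 Hz dominant tone, 1 cm cavity, St = 0.5.
velocity = flow_velocity(200.0, 0.01, 0.5)   # metres per second

# With St and L fixed, doubling the frequency doubles the inferred velocity.
velocity_doubled = flow_velocity(400.0, 0.01, 0.5)
```

    Because St and L are fixed for a given cavity, the pressure sensor's dominant frequency maps linearly onto flow velocity, which is the property the flowmeter relies on.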

  1. Sidetone generator flowmeter

    DOE Patents [OSTI]

    Fritz, R.J.

    1983-11-03

    A flowmeter is provided which uses the sidetones generated in a cavity formed in the wall of a flowpipe or the like in response to fluid flowing past the cavity to provide a measure of the flow velocity of that fluid. The dimensions of the cavity are such as to provide a dominant vibratory frequency which is sensed by a pressure sensor. The flowmeter is adapted for use for a range of frequencies in which the Strouhal number is constant and under these conditions the vibratory frequency is directly related to the flow rate. The tone generator cavity and pressure transducer form a unit which is connected in-line in the flowpipe.

  2. External split field generator

    DOE Patents [OSTI]

    Thundat, Thomas George (Knoxville, TN); Van Neste, Charles W. (Kingston, TN); Vass, Arpad Alexander (Oak Ridge, TN)

    2012-02-21

    A generator includes a coil disposed about a core. A first stationary magnetic field source may be disposed on a first end portion of the core and a second stationary magnetic field source may be disposed on a second end portion of core. The first and second stationary magnetic field sources apply a stationary magnetic field to the coil. An external magnetic field source may be disposed outside the coil to apply a moving magnetic field to the coil. Electrical energy is generated in response to an interaction between the coil, the moving magnetic field, and the stationary magnetic field.

  3. Using Backup Generators: Choosing the Right Backup Generator - Homeowners

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Determine the amount of power you will need: How much power do you need to operate equipment and appliances connected to the generator? Portable generators made for household use can provide temporary power to a small number of selected appliances or lights. For example, light bulb wattage indicates the power needed

  4. Xyce parallel electronic simulator users' guide, Version 6.0.1.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David Gregory.

    2014-01-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  5. Xyce parallel electronic simulator users guide, version 6.1

    SciTech Connect (OSTI)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason C.; Baur, David Gregory

    2014-03-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

  6. Iridium 191-m generator

    DOE Patents [OSTI]

    Treves, S.; Cheng, C.C.

    1988-03-08

    Potassium osmate, of the formula K2OsO2(OH)4, is used to make a column for the generation of Ir-191m, which is used in first pass angiography to detect cardiac defects in patients. 2 figs.

  7. Iridium 191-M generator

    DOE Patents [OSTI]

    Treves, Salvador; Cheng, Chris C.

    1988-03-08

    Potassium osmate, of the formula K2OsO2(OH)4, is used to make a column for the generation of Ir-191m, which is used in first pass angiography to detect cardiac defects in patients.

  8. Hydrogen Generation for Refineries

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    2014 DE-FG02-08ER85135 Hydrogen Generation for Refineries DOE Phase II SBIR Dr. Girish Srinivas P.I. gsrinivas@tda.com 303-940-2321 Dr. Steven Gebhard, P.E. Dr. Robert Copeland Mr. ...

  9. Using Backup Generators: Choosing the Right Backup Generator...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Fuel sources may present additional safety and permitting issues. Choose the generator's ... Determine any utility requirements or building codes-Before you buy a generator, ask your ...

  10. Stochastic dynamics of small ensembles of non-processive molecular motors: The parallel cluster model

    SciTech Connect (OSTI)

    Erdmann, Thorsten; Albert, Philipp J.; Schwarz, Ulrich S.

    2013-11-07

    Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.

  11. APPSPACK 4.0 : asynchronous parallel pattern search for derivative-free optimization.

    SciTech Connect (OSTI)

    Gray, Genetha Anne; Kolda, Tamara Gibson

    2004-12-01

    APPSPACK is software for solving unconstrained and bound constrained optimization problems. It implements an asynchronous parallel pattern search method that has been specifically designed for problems characterized by expensive function evaluations. Using APPSPACK to solve optimization problems has several advantages: No derivative information is needed; the procedure for evaluating the objective function can be executed via a separate program or script; the code can be run in serial or parallel, regardless of whether or not the function evaluation itself is parallel; and the software is freely available. We describe the underlying algorithm, data structures, and features of APPSPACK version 4.0 as well as how to use and customize the software.
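
    To make the idea of derivative-free pattern search concrete, here is a minimal serial sketch of a bound-constrained compass search. It is not APPSPACK's asynchronous parallel algorithm and does not use APPSPACK's interface: the function name, parameters, and the quadratic test objective are all hypothetical. It only shows the shared core idea of polling coordinate directions and shrinking the step when no poll point improves the objective.

```python
def pattern_search(f, x0, lower, upper, step=1.0, tol=1e-6, max_evals=10_000):
    """Serial compass search on a box; uses only function values, no derivatives."""
    x, fx, evals = list(x0), f(x0), 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                # Step along +/- coordinate i, clipped to the bounds.
                trial[i] = min(max(x[i] + sign * step, lower[i]), upper[i])
                if trial[i] == x[i]:
                    continue           # clipped onto the current point; nothing to evaluate
                f_trial = f(trial)
                evals += 1
                if f_trial < fx:       # accept any improving poll point
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5                # unsuccessful sweep: contract the pattern
    return x, fx

# Hypothetical objective with its minimum at (1, -2), inside the box [-5, 5]^2.
def objective(v):
    return (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2

best_x, best_f = pattern_search(objective, [0.0, 0.0], [-5.0, -5.0], [5.0, 5.0])
```

    In this sketch the poll points are evaluated one after another; an asynchronous parallel method would instead dispatch the evaluations to workers and act on results as they return, which matters when each evaluation is expensive.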

  12. GASIFICATION FOR DISTRIBUTED GENERATION

    SciTech Connect (OSTI)

    Ronald C. Timpe; Michael D. Mann; Darren D. Schmidt

    2000-05-01

    A recent emphasis in gasification technology development has been directed toward reduced-scale gasifier systems for distributed generation at remote sites. The domestic distributed power generation market over the next decade is expected to be 5-6 gigawatts per year. The global increase is expected at 20 gigawatts over the next decade. The economics of gasification for distributed power generation are significantly improved when fuel transport is minimized. Until recently, gasification technology has been synonymous with coal conversion. Presently, however, interest centers on providing clean-burning fuel to remote sites that are not necessarily near coal supplies but have sufficient alternative carbonaceous material to feed a small gasifier. Gasifiers up to 50 MW are of current interest, with emphasis on those of 5-MW generating capacity. Internal combustion engines offer a more robust system for utilizing the fuel gas, while fuel cells and microturbines offer higher electric conversion efficiencies. The initial focus of this multiyear effort was on internal combustion engines and microturbines as more realistic near-term options for distributed generation. In this project, we studied emerging gasification technologies that can provide gas from regionally available feedstock as fuel to power generators under 30 MW in a distributed generation setting. Larger-scale gasification, primarily coal-fed, has been used commercially for more than 50 years to produce clean synthesis gas for the refining, chemical, and power industries. Commercial-scale gasification activities are under way at 113 sites in 22 countries in North and South America, Europe, Asia, Africa, and Australia, according to the Gasification Technologies Council. Gasification studies were carried out on alfalfa, black liquor (a high-sodium waste from the pulp industry), cow manure, and willow on the laboratory scale and on alfalfa, black liquor, and willow on the bench scale. 
    Initial parametric tests evaluated through reactivity and product composition were carried out on thermogravimetric analysis (TGA) equipment. These tests were evaluated and then followed by bench-scale studies at 1123 K using an integrated bench-scale fluidized-bed gasifier (IBG) which can be operated in the semicontinuous batch mode. Products from tests were solid (ash), liquid (tar), and gas. Tar was separated on an open chromatographic column. Analysis of the gas product was carried out using on-line Fourier transform infrared spectroscopy (FT-IR). For selected tests, gas was collected periodically and analyzed using a refinery gas analyzer GC (gas chromatograph). The solid product was not extensively analyzed. This report is a part of a search into emerging gasification technologies that can provide power under 30 MW in a distributed generation setting. Larger-scale gasification has been used commercially for more than 50 years to produce clean synthesis gas for the refining, chemical, and power industries, and it is probable that scaled-down applications for use in remote areas will become viable. The appendix to this report contains a list, description, and sources of currently available gasification technologies that could be or are being commercially applied for distributed generation. This list was gathered from current sources and provides information about the supplier, the relative size range, and the status of the technology.

  13. Electric power monthly, September 1996, with data for June 1996

    SciTech Connect (OSTI)

    1996-09-01

    The Coal and Electric Data and Renewables Division, Office of Coal, Nuclear, Electric and Alternate Fuels, Energy Information Administration (EIA), Department of Energy prepares the Electric Power Monthly (EPM). This publication provides monthly statistics at the State, Census division, and U.S. levels for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt hour of electricity sold. In addition, data on net generation, fuel consumption, fuel stocks, quantity and cost of fossil fuels are also displayed for the North American Electric Reliability Council (NERC) regions. The EIA publishes statistics in the EPM on net generation by energy source; consumption, stocks, quantity, quality, and cost of fossil fuels; and capability of new generating units by company and plant.

  14. Electric power monthly, December 1996 with data for September 1996

    SciTech Connect (OSTI)

    1996-12-01

    The report presents monthly electricity statistics for a wide audience including Congress, Federal and State agencies, the electric utility industry, and the general public. The purpose of this publication is to provide energy decisionmakers with accurate and timely information that may be used in forming various perspectives on electric issues that lie ahead. This publication provides monthly statistics at the State, Census division, and US levels for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt hour of electricity sold. In addition, data on net generation, fuel consumption, fuel stocks, quantity and cost of fossil fuels are also displayed for the North American Electric Reliability Council (NERC) regions. The EIA publishes statistics on net generation by energy source; consumption, stocks, quantity, quality, and cost of fossil fuels; and capability of new generating units by company and plant. 57 tabs.

  15. Electric power monthly, July 1999, with data for April 1999

    SciTech Connect (OSTI)

    1999-07-01

    The Electric Power Division, Office of Coal, Nuclear, Electric and Alternate Fuels, Energy Information Administration (EIA), Department of Energy prepares the Electric Power Monthly (EPM). This publication provides monthly statistics at the State, Census division, and US levels for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt hour of electricity sold. In addition, data on net generation, fuel consumption, fuel stocks, quantity and cost of fossil fuels are also displayed for the North American Electric Reliability Council (NERC) regions. The EIA publishes statistics in the EPM on net generation by energy source; consumption, stocks, quantity, quality, and cost of fossil fuels; and capability of new generating units by company and plant. 1 fig., 64 tabs.

  16. Exelon Generation

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Exelon Generation 4300 Winfield Road Warrenville, Illinois 60555 Writer's Direct Dial: ... On March 14, 2011, representatives of Exelon Generation Company, LLC and Exelon Nuclear ...

  17. Solaire Generation | Open Energy Information

    Open Energy Info (EERE)

    Generation Place: New York, New York Zip: 10001 Sector: Solar Product: New York-based rooftop PV mounting systems and solar canopy maker. References: Solaire Generation [1] This...

  18. Hydro Power (pbl/generation)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Generation > Generation Hydro Power FCRPS Hydro Projects FCRPS Information Kiosk Current Hydrological Info Fish Funding Agreement FCRPS Definitions Wind Power Monthly GSP BPA White...

  19. SSE Generation | Open Energy Information

    Open Energy Info (EERE)

    Name: SSE Generation Place: Perth, Scotland, United Kingdom Zip: PH1 3AQ Sector: Renewable Energy Product: Owns and operates around half...

  20. Thermoacoustic magnetohydrodynamic electrical generator

    SciTech Connect (OSTI)

    Wheatley, J.C.; Swift, G.W.; Migliori, A.

    1986-07-08

    A thermoacoustic magnetohydrodynamic electrical generator is described comprising a magnet having a magnetic field, an elongate hollow housing containing an electrically conductive liquid and a thermoacoustic structure positioned in the liquid, and heat exchange means thermally connected to the thermoacoustic structure for inducing the liquid to oscillate at an acoustic resonant frequency within the housing. The housing is positioned in the magnetic field and oriented such that the direction of the magnetic field and the direction of oscillatory motion of the liquid are substantially orthogonal to one another. First and second electrical conductor means are connected to the liquid on opposite sides of the housing along an axis which is substantially orthogonal to both the direction of the magnetic field and the direction of oscillatory motion of the liquid. An alternating current output signal is generated in the conductor means at a frequency corresponding to the frequency of the oscillatory motion of the liquid.