National Library of Energy BETA

Sample records for kilowatt-hour parallel generation

  1. A Parallelized Hash Generator System

    E-Print Network [OSTI]

    EDA385 A Parallelized Hash Generator System Niklas Aldén ael10nal@student.lu.se Gabriel J cracker uses the MD5 hash function to generate a hash from a randomly generated character sequence

  2. Computer Assisted Parallel Program Generation

    E-Print Network [OSTI]

    Kawata, Shigeo

    2015-01-01

    Parallel computation is widely employed in scientific research, engineering activities and product development. Writing parallel programs is not always a simple task, depending on the problems solved. Large-scale scientific computing, huge data analyses and precise visualizations, for example, require parallel computation, and parallel computing in turn requires parallelization techniques. In this chapter, parallel program generation support is discussed, and a computer-assisted parallel program generation system, P-NCAS, is introduced. Computer-assisted problem solving is one of the key methods to promote innovations in science and engineering, and contributes to enriching our society and our life toward a programming-free environment in computing science. Research on problem solving environments (PSEs) started in the 1970s to enhance programming power. P-NCAS is one of the PSEs; the PSE concept provides an integrated human-friendly computational software and hardware system to solve a target ...

  3. Building the Next Generation of Parallel Applications: Co-Design...

    Office of Scientific and Technical Information (OSTI)

    Building the Next Generation of Parallel Applications: Co-Design Opportunities and Challenges.

  4. Parallel problem generation for structured problems in mathematical programming 

    E-Print Network [OSTI]

    Qiang, Feng

    2015-11-26

    The aim of this research is to investigate parallel problem generation for structured optimization problems. The result of this research has produced a novel parallel model generator tool, namely the Parallel Structured ...

  5. Array-Based Hierarchical Mesh Generation in Parallel | Argonne...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Array-Based Hierarchical Mesh Generation in Parallel Event Sponsor: Mathematics and Computing Science Seminar Start Date: Aug 20 2015 - 1:00am Building/Room: Building 240/Room 4301...

  6. Generating unstructured nuclear reactor core meshes in parallel

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Jain, Rajeev; Tautges, Timothy J.

    2014-10-24

    Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on a standalone desktop computer because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Korean MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.

  7. Generating unstructured nuclear reactor core meshes in parallel

    SciTech Connect (OSTI)

    Jain, Rajeev; Tautges, Timothy J.

    2014-10-24

    Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on a standalone desktop computer because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Korean MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.

  8. Automatic generation of executable communication specifications from parallel applications

    SciTech Connect (OSTI)

    Pakin, Scott [Los Alamos National Laboratory]; Wu, Xing [NCSU]; Mueller, Frank [NCSU]

    2011-01-19

    Portable parallel benchmarks are widely used and highly effective for (a) the evaluation, analysis and procurement of high-performance computing (HPC) systems and (b) quantifying the potential benefits of porting applications for new hardware platforms. Yet, past techniques to synthetically parameterize hand-coded HPC benchmarks prove insufficient for today's rapidly-evolving scientific codes, particularly when subject to multi-scale science modeling or when utilizing domain-specific libraries. To address these problems, this work contributes novel methods to automatically generate highly portable and customizable communication benchmarks from HPC applications. We utilize ScalaTrace, a lossless, yet scalable, parallel application tracing framework to collect selected aspects of the run-time behavior of HPC applications, including communication operations and execution time, while abstracting away the details of the computation proper. We subsequently generate benchmarks with identical run-time behavior from the collected traces. A unique feature of our approach is that we generate benchmarks in CONCEPTUAL, a domain-specific language that enables the expression of sophisticated communication patterns using a rich and easily understandable grammar yet compiles to ordinary C + MPI. Experimental results demonstrate that the generated benchmarks are able to preserve the run-time behavior - including both the communication pattern and the execution time - of the original applications. Such automated benchmark generation is particularly valuable for proprietary, export-controlled, or classified application codes: when supplied to a third party, our auto-generated benchmarks ensure performance fidelity but without the risks associated with releasing the original code. This ability to automatically generate performance-accurate benchmarks from parallel applications is novel and without precedent, to our knowledge.

  9. Reconciliation of Retailer Claims, 2005 Commission Report

    E-Print Network [OSTI]

    operator to also report generation (in kilowatt-hours), generator technology, and fuel type consumed (as

  10. Solving Parallel Machine Scheduling Problems by Column Generation

    E-Print Network [OSTI]

    Powell, Warren B.

    and bound 1 Introduction We consider a class of problems of scheduling n independent jobs $N = \{1, 2, \ldots, n\}$ on m identical, uniform, or unrelated parallel machines $M = \{1, 2, \ldots, m\}$ with an objective of Systems Engineering University of Pennsylvania Philadelphia, PA 19104-6315 Warren B. Powell Department

  11. Full expandable model of parallel self-excited induction generators

    E-Print Network [OSTI]

    Simões, Marcelo Godoy

    for wind and small hydro power plants [1, 2]. They have advantages over conventional synchronous generators, in a wind or small hydro power plant, is subjected to various transient conditions, such as initial self-speed generators in renewable energy systems. Small hydro and wind generating systems have constraints on the size

  12. Bit error rate tester using fast parallel generation of linear recurring sequences

    DOE Patents [OSTI]

    Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.

    2003-05-06

    A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
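
    The relationship between a companion matrix and its power can be sketched in software. The example below is only an illustration, not the patented circuit: it assumes a 4-bit register with the primitive polynomial x^4 + x + 1, and the names `TAPS`, `N_BITS` and `K_GENERATORS` are hypothetical. It checks that multiplying the state by the companion matrix raised to the (n*k)th power over GF(2) gives the same state as clocking the register n*k times.

```python
def companion_matrix(taps, m):
    """Companion matrix over GF(2): rows 0..m-2 shift, the last row holds the feedback taps."""
    rows = [[1 if j == i + 1 else 0 for j in range(m)] for i in range(m - 1)]
    rows.append([1 if j in taps else 0 for j in range(m)])
    return rows

def mat_mul(a, b):
    """Matrix product with all arithmetic reduced modulo 2."""
    size = len(a)
    return [[sum(a[i][t] & b[t][j] for t in range(size)) & 1
             for j in range(size)] for i in range(size)]

def mat_pow(mat, exp):
    """Raise a GF(2) matrix to a non-negative power by repeated squaring."""
    size = len(mat)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    while exp:
        if exp & 1:
            result = mat_mul(result, mat)
        mat = mat_mul(mat, mat)
        exp >>= 1
    return result

def mat_vec(mat, vec):
    """Apply a GF(2) matrix to a state vector."""
    return [sum(r & v for r, v in zip(row, vec)) & 1 for row in mat]

def step(state, taps):
    """Clock the register once: shift and append the XOR of the tapped bits."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return state[1:] + [feedback]

M = 4                        # register length (assumed for illustration)
TAPS = (0, 1)                # recurrence s[n+4] = s[n] XOR s[n+1], i.e. x^4 + x + 1
N_BITS, K_GENERATORS = 2, 3  # hypothetical n and k

C = companion_matrix(TAPS, M)
D = mat_pow(C, N_BITS * K_GENERATORS)   # decimation matrix C^(n*k)

state = [1, 0, 0, 0]
stepped = state
for _ in range(N_BITS * K_GENERATORS):
    stepped = step(stepped, TAPS)
jumped = mat_vec(D, state)              # one multiplication replaces n*k clocks
```

    Each 1 in a row of `D` corresponds to one input of an XOR in a hardware realization, which is why a sparse decimation matrix translates into fewer gates.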

  13. Adapting the serial Alpgen event generator to simulate LHC collisions on millions of parallel threads

    E-Print Network [OSTI]

    Childers, J T; LeCompte, T J; Papka, M E; Benjamin, D P

    2015-01-01

    As the LHC moves to higher energies and luminosity, the demand for computing resources increases accordingly and will soon outpace the growth of the Worldwide LHC Computing Grid. To meet this greater demand, event generation Monte Carlo was targeted for adaptation to run on Mira, the supercomputer at the Argonne Leadership Computing Facility. Alpgen is a Monte Carlo event generation application that is used by LHC experiments in the simulation of collisions that take place in the Large Hadron Collider. This paper details the process by which Alpgen was adapted from a single-processor serial-application to a large-scale parallel-application and the performance that was achieved.

  14. 2011 U.S. Small Wind Turbine Market Report

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    production is 100,000 to 130,000 kilowatt-hours per year, and the turbine offsets an energy rate of 10 cents to 12 cents per kilowatt-hour. The turbine is expected to generate...

  15. Interaction of an oblique shock wave with a pair of parallel vortices: Shock dynamics and mechanism of sound generation

    E-Print Network [OSTI]

    Zhang, Yong-Tao

    and the mechanism of sound generation in the interaction between an oblique shock wave and a pair of vortices. We ... Interaction of an oblique shock wave with a pair of parallel vortices: Shock dynamics and mechanism of sound generation Shuhai Zhang China Aerodynamics Research and Development Center, Mianyang, Sichuan

  16. A Development of Design and Control Methodology for Next Generation Parallel Hybrid Electric Vehicle 

    E-Print Network [OSTI]

    Lai, Lin

    2013-01-28

    as the conventional vehicle, and hybridizes with an electrical drive in parallel to improve the fuel economy and performance beyond the conventional cars. By analyzing the HEV fuel economy versus the increasing of the electrical drive power on typical driving...

  17. Energy Intensity Indicators: Electricity Generation Energy Intensity

    Broader source: Energy.gov [DOE]

    A kilowatt-hour (kWh) of electric energy delivered to the final user has an energy equivalent to 3,412 British thermal units (Btu). Figure E1, below, tracks how much energy was used by the various...
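
    The conversion factor quoted above can be applied directly. The following is a minimal sketch; the function name and the 100,000 kWh input are chosen here purely for illustration.

```python
BTU_PER_KWH = 3412  # energy equivalent of one delivered kWh, as quoted in the record

def kwh_to_btu(kwh):
    """Convert delivered electric energy in kWh to its Btu equivalent."""
    return kwh * BTU_PER_KWH

annual_btu = kwh_to_btu(100_000)  # hypothetical annual consumption in kWh
```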

  18. A Column Generation Based Decomposition Algorithm for a Parallel Machine Just-In-Time

    E-Print Network [OSTI]

    Powell, Warren B.

    = $\{1, 2, \ldots, n\}$ to be scheduled on m identical parallel machines $M = \{1, 2, \ldots, m\}$. Associated with each Problem Zhi-Long Chen Department of Systems Engineering University of Pennsylvania Philadelphia, PA 19104-6315 Email: zlchen@seas.upenn.edu Warren B. Powell Department of Civil Engineering & Operations Research

  19. A Generic and Efficient E-field Parallel Imaging Correlator for Next-Generation Radio Telescopes

    E-Print Network [OSTI]

    Thyagarajan, Nithyanandan; Bowman, Judd D; Morales, Miguel F

    2015-01-01

    Modern radio telescopes are favoring densely packed array layouts consisting of large numbers of antennas ($N_\textrm{a}\gtrsim 1000$). Since the complexity of traditional correlators scales as $\mathcal{O}(N_\textrm{a}^2)$, there will be a steep cost for realizing the full imaging potential of these powerful instruments. Through our generic and efficient E-field Parallel Imaging Correlator (EPIC), we present the first software demonstration of a generalized direct imaging algorithm known as the Modular Optimal Frequency Fourier (MOFF) imager. It takes advantage of the multiplication-convolution theorem of Fourier transforms. Not only does it bring down the cost for dense layouts to $\mathcal{O}(N_\textrm{a}\log_2 N_\textrm{a})$, but it can also image from irregularly arranged heterogeneous antenna arrays. EPIC is highly modular and parallelizable, implemented in object oriented Python, and publicly available. We have verified the images produced to be equivalent to those produced using traditional techniques. We...
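
    The multiplication-convolution theorem mentioned in this abstract can be checked numerically. The sketch below is not EPIC or MOFF code; it is a small standard-library illustration with made-up input sequences `a` and `b`, showing that a circular convolution computed term by term matches the inverse transform of the product of the two transforms. The naive `dft` here is itself quadratic; the cost reduction quoted in the abstract relies on a fast Fourier transform, which this sketch does not implement.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def circular_convolve(a, b):
    """Direct circular convolution: every output sample sums over all inputs."""
    n = len(a)
    return [sum(a[t] * b[(i - t) % n] for t in range(n)) for i in range(n)]

a = [1.0, 2.0, 0.0, -1.0]   # illustrative data only
b = [0.5, 0.0, 1.0, 0.0]

direct = circular_convolve(a, b)
product = [p * q for p, q in zip(dft(a), dft(b))]     # multiply in Fourier space
via_fourier = [v.real for v in dft(product, inverse=True)]
```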

  20. Parallel generation of quadripartite cluster entanglement in the optical frequency comb

    E-Print Network [OSTI]

    Matthew Pysher; Yoshichika Miwa; Reihaneh Shahrokhshahi; Russell Bloomer; Olivier Pfister

    2011-07-06

    Scalability and coherence are two essential requirements for the experimental implementation of quantum information and quantum computing. Here, we report a breakthrough toward scalability: the simultaneous generation of a record 15 quadripartite entangled cluster states over 60 consecutive cavity modes (Qmodes), in the optical frequency comb of a single optical parametric oscillator. The amount of observed entanglement was constant over the 60 Qmodes, thereby proving the intrinsic scalability of this system. The number of observable Qmodes was restricted by technical limitations, and we conservatively estimate the actual number of similar clusters to be at least three times larger. This result paves the way to the realization of large entangled states for scalable quantum information and quantum computing.

  1. PROCEEDINGS, Thirty-Fourth Workshop on Geothermal Reservoir Engineering Stanford University, Stanford, California, February 9-11, 2009

    E-Print Network [OSTI]

    Stanford University

    installed) as well as the operations-and-maintenance ("O&M") cost (¢ per kilowatt-hour generated spacing and injection rates that minimize the rate of decline in net generation with time. INTRODUCTION calls for minimizing the levelized cost of power (¢ per kilowatt-hour) over the project life. Minimizing

  2. Net Metering

    Broader source: Energy.gov [DOE]

    Net excess generation (NEG) is treated as a kilowatt-hour (kWh) credit or other compensation on the customer's following bill.* When an annual period ends, a utility will purchase unused credits...

  3. Tax Credits, Rebates & Savings | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Wind (All), Biomass, Wind (Small), Hydroelectric (Small) Net Metering Net excess generation (NEG) is treated as a kilowatt-hour (kWh) credit or other compensation on the...

  4. Life Cycle Greenhouse Gas Emissions of Coal-Fired Electricity Generation: Systematic Review and Harmonization

    SciTech Connect (OSTI)

    Whitaker, M.; Heath, G. A.; O'Donoughue, P.; Vorum, M.

    2012-04-01

    This systematic review and harmonization of life cycle assessments (LCAs) of utility-scale coal-fired electricity generation systems focuses on reducing variability and clarifying central tendencies in estimates of life cycle greenhouse gas (GHG) emissions. Screening 270 references for quality LCA methods, transparency, and completeness yielded 53 that reported 164 estimates of life cycle GHG emissions. These estimates for subcritical pulverized, integrated gasification combined cycle, fluidized bed, and supercritical pulverized coal combustion technologies vary from 675 to 1,689 grams CO₂-equivalent per kilowatt-hour (g CO₂-eq/kWh) (interquartile range [IQR] = 890-1,130 g CO₂-eq/kWh; median = 1,001), leading to confusion over reasonable estimates of life cycle GHG emissions from coal-fired electricity generation. By adjusting published estimates to common gross system boundaries and consistent values for key operational input parameters (most importantly, combustion carbon dioxide emission factor [CEF]), the meta-analytical process called harmonization clarifies the existing literature in ways useful for decision makers and analysts by significantly reducing the variability of estimates (approximately 53% in IQR magnitude) while maintaining a nearly constant central tendency (approximately 2.2% in median). Life cycle GHG emissions of a specific power plant depend on many factors and can differ from the generic estimates generated by the harmonization approach, but the tightness of distribution of harmonized estimates across several key coal combustion technologies implies that, for some purposes, first-order estimates of life cycle GHG emissions could be based on knowledge of the technology type, coal mine emissions, thermal efficiency, and CEF alone without requiring full LCAs. Areas where new research is necessary to ensure accuracy are also discussed.

  5. A Communication Backend for Parallel Language Compilers

    E-Print Network [OSTI]

    Shewchuk, Jonathan

    A Communication Backend for Parallel Language Compilers James M. Stichnoth and Thomas Gross Carnegie Mellon University Abstract. Generating good communication code is an important issue for all system usually implement the communication generation routines (e.g., message buffer packing

  6. Earthquake Ground Motion Modeling on Parallel Computers Hesheng Bao

    E-Print Network [OSTI]

    California at Berkeley, University of

    generator, as well as parallel numerical methods for applying seismic forces, incorporating absorbing generation, parallel unstructured PDE solvers, parallelizing compilers, seismic wave propagation, strong as necessary. Assessing the free-field ground motion to which a structure will be exposed during its lifetime

  7. Parallel MATLAB at VT: Parallel For Loops

    E-Print Network [OSTI]

    Crawford, T. Daniel

    discussed Matlab's Parallel Computing Toolbox (PCT), and the Distributed Computing Server (MDCS) that runs; FMINCON: Hidden Parallelism FMINCON is a popular Matlab function available in the Optimization Toolbox using FMINCON involves a boat trying to cross a river against a current. The boat is given 10 minutes

  8. Parallel Seismic Ray Tracing 

    E-Print Network [OSTI]

    Jain, Tarun K

    2013-12-09

    the idea of modeling ray tubes with an additional ray in the center to facilitate parallelism. The parallel wavefront construction algorithm is applied to wide range of models such as simple synthetic models that enable us to study various aspects...

  9. Parallel flow diffusion battery

    DOE Patents [OSTI]

    Yeh, H.C.; Cheng, Y.S.

    1984-01-01

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  10. Parallel flow diffusion battery

    DOE Patents [OSTI]

    Yeh, Hsu-Chi (Albuquerque, NM); Cheng, Yung-Sung (Albuquerque, NM)

    1984-08-07

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  11. The STAPL Parallel Container Framework 

    E-Print Network [OSTI]

    Tanase, Ilie Gabriel

    2012-02-14

    The Standard Template Adaptive Parallel Library (STAPL) is a parallel programming infrastructure that extends C++ with support for parallelism. STAPL provides a run-time system, a collection of distributed data structures (pContainers) and parallel...

  12. Parallel integrated thermal management

    DOE Patents [OSTI]

    Bennion, Kevin; Thornton, Matthew

    2014-08-19

    Embodiments discussed herein are directed to managing the heat content of two vehicle subsystems through a single coolant loop having parallel branches for each subsystem.

  13. Parallel phase model : a programming model for high-end parallel machines with manycores.

    SciTech Connect (OSTI)

    Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

    2009-04-01

    This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

  14. Parallel computing works

    SciTech Connect (OSTI)

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C³P), a five year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  15. Time parallel gravitational collapse simulation

    E-Print Network [OSTI]

    Kreienbuehl, Andreas; Ruprecht, Daniel; Krause, Rolf

    2015-01-01

    This article demonstrates the applicability of the parallel-in-time method Parareal to the numerical solution of the Einstein gravity equations for the spherical collapse of a massless scalar field. To account for the shrinking of the spatial domain in time, a tailored load balancing scheme is proposed and compared to load balancing based on number of time steps alone. The performance of Parareal is studied for both the sub-critical and black hole case; our experiments show that Parareal generates substantial speedup and, in the super-critical regime, can also reproduce the black hole mass scaling law.
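
    The Parareal iteration itself can be sketched on a much simpler problem than the one treated in the article. The example below assumes the scalar test equation y' = λy, explicit Euler for both the coarse and fine propagators, and illustrative parameter values; none of these choices come from the paper. The fine propagations inside each iteration are independent of one another, which is where the parallelism in time would come from, although this sketch runs them serially.

```python
def euler(y, t0, t1, steps, lam):
    """Integrate y' = lam * y from t0 to t1 with explicit Euler."""
    h = (t1 - t0) / steps
    for _ in range(steps):
        y = y + h * lam * y
    return y

def parareal(y0, t_end, slices, iterations, lam, fine_steps=100):
    """Return the Parareal approximation at the boundaries of the time slices."""
    ts = [t_end * i / slices for i in range(slices + 1)]

    def coarse(y, i):   # cheap propagator: one Euler step per slice
        return euler(y, ts[i], ts[i + 1], 1, lam)

    def fine(y, i):     # accurate propagator: many Euler steps per slice
        return euler(y, ts[i], ts[i + 1], fine_steps, lam)

    # Initial guess from a serial sweep of the coarse propagator.
    u = [y0]
    for i in range(slices):
        u.append(coarse(u[i], i))

    for _ in range(iterations):
        fine_vals = [fine(u[i], i) for i in range(slices)]      # parallelizable
        coarse_old = [coarse(u[i], i) for i in range(slices)]
        new = [y0]
        for i in range(slices):                                  # serial correction
            new.append(coarse(new[i], i) + fine_vals[i] - coarse_old[i])
        u = new
    return u

LAM, T_END, SLICES = -1.0, 2.0, 10   # illustrative values only

# Reference: the fine propagator applied serially over all slices.
serial = 1.0
for i in range(SLICES):
    serial = euler(serial, T_END * i / SLICES, T_END * (i + 1) / SLICES, 100, LAM)

coarse_only = parareal(1.0, T_END, SLICES, 0, LAM)
converged = parareal(1.0, T_END, SLICES, SLICES, LAM)
```

    After as many iterations as there are time slices, the result agrees with the serial fine solution up to rounding; any speedup depends on reaching acceptable accuracy in far fewer iterations.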

  16. Superconnections and Parallel Transport

    E-Print Network [OSTI]

    Dumitrescu, Florin

    2007-01-01

    This note addresses the construction of a notion of parallel transport along superpaths arising from the concept of a superconnection on a vector bundle over a manifold $M$. A superpath in $M$ is, loosely speaking, a path in $M$ together with an odd vector field in $M$ along the path. We also develop a notion of parallel transport associated with a connection (a.k.a. covariant derivative) on a vector bundle over a \emph{supermanifold} which is a direct generalization of the classical notion of parallel transport for connections over manifolds.

  17. Automatic Parallelization of Hand Written Automotive Engine Control

    E-Print Network [OSTI]

    Kasahara, Hironori

    Automatic Parallelization of Hand Written Automotive Engine Control Codes Using OSCAR Compiler Dan approach to realize the next-generation automobiles integrated control system. However, automotive-core processors for a long time. This paper proposes to parallelize an automotive engine crankshaft control

  18. Modeling of Electric Power Supply Chain Networks with Fuel Suppliers Variational Inequalities

    E-Print Network [OSTI]

    Nagurney, Anna

    participants have, in turn, fundamentally changed not only electricity trading patterns but also the structure and associated algorithmic tools. Moreover, the availability of fuels for electric power generation is a topic kilowatt hours of electric power were generated, with United States being the largest producer and consumer

  19. Office of the President AGENDA ITEM 301 September 7, 2011

    E-Print Network [OSTI]

    Capecchi, Mario R.

    of 85 million kilowatt-hours of green electricity (green.) certified renewable energy and solar panel options: renewable energy certificates, onsite generation and utility green power products. 3 and helps biologists better explore the data they are generating. 6. Mayor Ralph Becker presented the Book

  20. A Comprehensive Look at High Performance Parallel I/O

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    of calculations per second, generating a tsunami of data along the way. In this era of "big data," high performance parallel I/O, the way disk drives efficiently read and write...

  1. Parallel optical sampler

    DOE Patents [OSTI]

    Tauke-Pedretti, Anna; Skogen, Erik J; Vawter, Gregory A

    2014-05-20

    An optical sampler includes first and second 1×n optical beam splitters splitting an input optical sampling signal and an optical analog input signal into n parallel channels, respectively, a plurality of optical delay elements providing n parallel delayed input optical sampling signals, n photodiodes converting the n parallel optical analog input signals into n respective electrical output signals, and n optical modulators modulating the input optical sampling signal or the optical analog input signal by the respective electrical output signals, and providing n successive optical samples of the optical analog input signal. A plurality of output photodiodes and eADCs convert the n successive optical samples to n successive digital samples. The optical modulator may be a photodiode interconnected Mach-Zehnder Modulator. A method of sampling the optical analog input signal is disclosed.

  2. Parallel programming with Ada

    SciTech Connect (OSTI)

    Kok, J.

    1988-01-01

    To the human programmer the ease of coding distributed computing is highly dependent on the suitability of the employed programming language. But with a particular language it is also important whether the possibilities of one or more parallel architectures can efficiently be addressed by available language constructs. In this paper the possibilities are discussed of the high-level language Ada and in particular of its tasking concept as a descriptional tool for the design and implementation of numerical and other algorithms that allow execution of parts in parallel. Language tools are explained and their use for common applications is shown. Conclusions are drawn about the usefulness of several Ada concepts.

  3. Parallel programming with PCN

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  4. Learning in Parallel

    E-Print Network [OSTI]

    Vitter, Jeffrey Scott; Lin, Jyh-Han

    1992-01-01

    In this paper, we extend Valiant's sequential model of concept learning from examples [Valiant 1984] and introduce models for the efficient learning of concept classes from examples in parallel. We say that a concept class is NC-learnable if it can...

  5. Control Considerations in the Design of a Parallel Kinematic Machine with Separate Actuation and Metrology Mechanisms

    E-Print Network [OSTI]

    Florida, University of

    that generates optimized parallel kinematic mechanism geometry based on design criteria. This methodology is applied to two parallel kinematic mechanisms to develop a high payload machine that has direct manual and extends existing kinematic analysis of the parallel mechanism to achieve design goals. This methodology

  6. Using and Measuring the Combined Heat and Power Advantage 

    E-Print Network [OSTI]

    John, T.

    2011-01-01

    compared to other power generation systems. Fuel Charged to Power (FCP) is the fuel, net of credit for thermal output, required to produce a kilowatt-hour of electricity. This provides a metric that is used for comparison to the heat rate of other types...
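
    The definition above can be turned into a small calculation. The sketch below assumes that the credit for thermal output is the fuel a separate boiler would have burned to supply the same heat; that assumption, the function name, and all numbers are illustrative and are not taken from the paper.

```python
def fuel_charged_to_power(total_fuel_btu_per_h, thermal_output_btu_per_h,
                          net_power_kw, displaced_boiler_efficiency):
    """Return FCP in Btu/kWh: fuel net of the thermal credit, per kWh generated."""
    # Assumed credit: fuel a stand-alone boiler would need for the same heat.
    thermal_credit = thermal_output_btu_per_h / displaced_boiler_efficiency
    return (total_fuel_btu_per_h - thermal_credit) / net_power_kw

# Hypothetical plant: 100 MMBtu/h of fuel, 40 MMBtu/h of useful heat,
# 10 MW of net power, and an 80%-efficient boiler being displaced.
fcp = fuel_charged_to_power(100e6, 40e6, 10_000, 0.80)
```

    Because FCP is expressed in Btu per kWh, it can be set side by side with the heat rate of a conventional generator.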

  7. GreenCharge: Managing Renewable Energy in Smart Buildings

    E-Print Network [OSTI]

    Massachusetts at Amherst, University of

    and changing environmental conditions. Since the energy consumption density, in kilowatt-hours (kWh) per square foot, is higher than the energy generation density of solar and wind deployments at most locations on both the total number of participating consumers and the total amount of energy contributed per

  8. Accepted for Presentation at IEEE PES 2000 Winter Meeting, Singapore, January 2000 Assessment of Transmission Constraint Costs

    E-Print Network [OSTI]

    Analysis, Power System Visualization 1. Introduction Electricity markets throughout the world continue to model the expected optimal behavior for these markets. Deregulation of electric power generation cost of electricity to residences in New York in 1995 was 11.1 cents a kilowatt hour but was only 6

  9. Michael Klepinger, Extension Specialist Michigan State University

    E-Print Network [OSTI]

    electricity continues to rise. The average end-user price of electricity in the United States was 8 cents projects are voicing concerns to township, city and county officials. The most common concerns are about per kilowatt hour (kWh) in 2005 (EIA, 2006a). Since the early 1980s, the price of wind-generated elec

  10. On parallel machine scheduling 1

    E-Print Network [OSTI]

    Magdeburg, Universität

    On parallel machine scheduling 1 machines with setup times. The setup has to be performed by a single server. The objective is to minimize even for the case of two identical parallel machines. This paper presents a pseudopolynomial

  11. Standard Templates Adaptive Parallel Library 

    E-Print Network [OSTI]

    Arzu, Francisco Jose

    2000-01-01

    STAPL (Standard Templates Adaptive Parallel Library) is a parallel C++ library designed as a superset of the C++ Standard Template Library (STL), sequentially consistent for functions with the same name, and executed on uni- or multi-processor...

  12. Practical Structured Parallelism using BMF 

    E-Print Network [OSTI]

    Crooke, David

    This thesis concerns the use of the Bird-Meertens Formalism as a mechanism to control parallelism in an imperative programming language. One of the main reasons for the failure of parallelism to enter mainstream computing ...

  13. Descriptive Simplicity in Parallel Computing 

    E-Print Network [OSTI]

    Marr, Marcus

    The programming of parallel computers is recognised as being a difficult task and there exist a wide selection of parallel programming languages and environments. This thesis presents and examines the Hierarchical ...

  14. Ultrascalable petaflop parallel supercomputer

    DOE Patents [OSTI]

    Blumrich, Matthias A. (Ridgefield, CT); Chen, Dong (Croton On Hudson, NY); Chiu, George (Cross River, NY); Cipolla, Thomas M. (Katonah, NY); Coteus, Paul W. (Yorktown Heights, NY); Gara, Alan G. (Mount Kisco, NY); Giampapa, Mark E. (Irvington, NY); Hall, Shawn (Pleasantville, NY); Haring, Rudolf A. (Cortlandt Manor, NY); Heidelberger, Philip (Cortlandt Manor, NY); Kopcsay, Gerard V. (Yorktown Heights, NY); Ohmacht, Martin (Yorktown Heights, NY); Salapura, Valentina (Chappaqua, NY); Sugavanam, Krishnan (Mahopac, NY); Takken, Todd (Brewster, NY)

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  15. Parallel Transports in Webs

    E-Print Network [OSTI]

    Christian Fleischhack

    2003-07-17

    For connected reductive linear algebraic structure groups it is proven that every web is holonomically isolated. The possible tuples of parallel transports in a web form a Lie subgroup of the corresponding power of the structure group. This Lie subgroup is explicitly calculated and turns out to be independent of the chosen local trivializations. Moreover, explicit necessary and sufficient criteria for the holonomical independence of webs are derived. The results above can even be sharpened: Given an arbitrary neighbourhood of the base points of a web, then this neighbourhood contains some segments of the web whose parameter intervals coincide, but do not include 0 (that corresponds to the base points of the web), and whose parallel transports already form the same Lie subgroup as those of the full web do.

  16. Parallel grid population

    DOE Patents [OSTI]

    Wald, Ingo; Ize, Santiago

    2015-07-28

    Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
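
    The two phases described above can be sketched as follows. This is an illustrative sketch only, not the patented implementation: it assumes a one-dimensional grid of cells, objects given as inclusive cell intervals, and Python threads standing in for processors (CPython threads do not give true CPU parallelism). All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def portion_bounds(num_cells, n, k):
    """Half-open cell range [start, stop) of grid portion k out of n."""
    return k * num_cells // n, (k + 1) * num_cells // n


def bin_objects(object_set, num_cells, n):
    """Phase 1: for each object in one worker's set, find the portions it overlaps."""
    hits = []
    for obj_id, (lo, hi) in object_set:
        for k in range(n):
            start, stop = portion_bounds(num_cells, n, k)
            if lo < stop and hi >= start:  # object covers cells lo..hi inclusive
                hits.append((k, obj_id, lo, hi))
    return hits


def populate_portion(k, hits, num_cells, n):
    """Phase 2: one worker fills its own portion with the objects binned to it."""
    start, stop = portion_bounds(num_cells, n, k)
    cells = {c: [] for c in range(start, stop)}
    for portion, obj_id, lo, hi in hits:
        if portion != k:
            continue
        for c in range(max(lo, start), min(hi, stop - 1) + 1):
            cells[c].append(obj_id)
    return cells


def populate_grid(objects, num_cells, n):
    """Divide objects and grid into n parts and run both phases with n workers."""
    indexed = list(enumerate(objects))
    object_sets = [indexed[k::n] for k in range(n)]
    grid = {}
    with ThreadPoolExecutor(max_workers=n) as pool:
        binned = pool.map(lambda s: bin_objects(s, num_cells, n), object_sets)
        hits = [h for part in binned for h in part]
        filled = pool.map(lambda k: populate_portion(k, hits, num_cells, n), range(n))
        for cells in filled:
            grid.update(cells)
    return grid


grid = populate_grid([(0, 2), (2, 5), (7, 7)], num_cells=8, n=2)
print(grid)
```

    Since each worker writes only to cells in its own portion during the second phase, no two workers touch the same cell, which is the property the two-phase split is meant to provide.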

  17. Xyce parallel electronic simulator.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

  18. OSCAR Parallelizing Compiler and Its Performance for

    E-Print Network [OSTI]

    Kasahara, Hironori

    OSCAR Parallelizing Compiler and Its Performance for Embedded Applications Hironori Kasahara Supercomputers and servers Industry Capsule inner cameras Compiler, API Medical servers Heavy particle radiation productivity and reduce power OSCAR Parallelizing Compiler Multigrain Parallelization coarse-grain parallelism

  19. Parallel ptychographic reconstruction

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-12-19

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source.

  20. Parallel ptychographic reconstruction

    SciTech Connect (OSTI)

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-12-19

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source.

  1. 8/30/2001 Parallel Programming -Fall 2001 1 Models of Parallel Computation

    E-Print Network [OSTI]

    Browne, James C.

    8/30/2001 Parallel Programming - Fall 2001 1 Models of Parallel Computation Philosophy Parallel of parallel programming. 8/30/2001 Parallel Programming - Fall 2001 2 Models of Parallel Computation will discuss parallelism from the viewpoint of programming but with connections to other domains. 8/30/2001

  2. Global synchronization of parallel processors using clock pulse width modulation

    DOE Patents [OSTI]

    Chen, Dong; Ellavsky, Matthew R.; Franke, Ross L.; Gara, Alan; Gooding, Thomas M.; Haring, Rudolf A.; Jeanson, Mark J.; Kopcsay, Gerard V.; Liebsch, Thomas A.; Littrell, Daniel; Ohmacht, Martin; Reed, Don D.; Schenck, Brandon E.; Swetz, Richard A.

    2013-04-02

    A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and perform a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.

  3. Small file aggregation in a parallel computing system

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Grider, Gary; Zhang, Jingwang

    2014-09-02

    Techniques are provided for small file aggregation in a parallel computing system. An exemplary method for storing a plurality of files generated by a plurality of processes in a parallel computing system comprises aggregating the plurality of files into a single aggregated file; and generating metadata for the single aggregated file. The metadata comprises an offset and a length of each of the plurality of files in the single aggregated file. The metadata can be used to unpack one or more of the files from the single aggregated file.
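
    The offset-and-length metadata described above can be sketched in a few lines. This is a minimal in-memory sketch under stated assumptions, not the patented method: file contents are byte strings held in a dictionary, and the names `aggregate` and `unpack` are hypothetical.

```python
def aggregate(files):
    """Concatenate files into one blob and record (offset, length) per file.

    `files` maps a file name to its bytes; dictionary order sets the layout.
    """
    blob = bytearray()
    metadata = {}
    for name, data in files.items():
        metadata[name] = (len(blob), len(data))
        blob += data
    return bytes(blob), metadata


def unpack(blob, metadata, name):
    """Recover a single file from the aggregated blob using its metadata."""
    offset, length = metadata[name]
    return blob[offset:offset + length]


small_files = {"rank0.out": b"hello", "rank1.out": b"", "rank2.out": b"world!"}
blob, meta = aggregate(small_files)
print(meta)
print(unpack(blob, meta, "rank2.out"))
```

    Any one file can be read back without scanning the others, because its offset and length are enough to slice it out of the aggregated file.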

  4. An integrated approach to improving the parallel applications development process

    SciTech Connect (OSTI)

    Rasmussen, Craig E [Los Alamos National Laboratory; Watson, Gregory R [IBM; Tibbitts, Beth R [IBM

    2009-01-01

    The development of parallel applications is becoming increasingly important to a broad range of industries. Traditionally, parallel programming was a niche area that was primarily exploited by scientists trying to model extremely complicated physical phenomena. It is becoming increasingly clear, however, that continued hardware performance improvements through clock scaling and feature-size reduction are simply not going to be achievable for much longer. The hardware vendor's approach to addressing this issue is to employ parallelism through multi-processor and multi-core technologies. While there is little doubt that this approach produces scaling improvements, there are still many significant hurdles to be overcome before parallelism can be employed as a general replacement to more traditional programming techniques. The Parallel Tools Platform (PTP) Project was created in 2005 in an attempt to provide developers with new tools aimed at addressing some of the parallel development issues. Since then, the introduction of a new generation of peta-scale and multi-core systems has highlighted the need for such a platform. In this paper, we describe some of the challenges facing parallel application developers, present the current state of PTP, and provide a simple case study that demonstrates how PTP can be used to locate a potential deadlock situation in an MPI code.

  5. An efficient parallel algorithm for matrix-vector multiplication

    SciTech Connect (OSTI)

    Hendrickson, B.; Leland, R.; Plimpton, S.

    1993-03-01

    The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
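
    To make the communication-cost expression concrete, the sketch below simulates a generic two-dimensional block decomposition on a q x q grid of processors (p = q*q). It is not the hypercube algorithm of the paper: the serial loops only mimic which block and which slice of the vector each processor would touch, the base-2 logarithm is an assumption (the abstract gives no base), and all names are hypothetical.

```python
import math


def blocked_matvec(A, x, q):
    """Compute y = A x as a q x q grid of processors would, one block each."""
    n = len(x)
    if n % q != 0:
        raise ValueError("n must be divisible by q")
    b = n // q  # each processor owns a b x b block and needs only b entries of x
    y = [0.0] * n
    for pi in range(q):        # processor row
        for pj in range(q):    # processor column
            for i in range(pi * b, (pi + 1) * b):
                partial = sum(A[i][j] * x[j] for j in range(pj * b, (pj + 1) * b))
                y[i] += partial  # stands in for a sum-reduction along the processor row
    return y


def comm_cost_terms(n, p):
    """Evaluate n/sqrt(p) + log2(p), the terms inside the O(...) bound."""
    return n / math.sqrt(p) + math.log2(p)


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(blocked_matvec(A, [1, 1, 1, 1], q=2))
print(comm_cost_terms(1024, 16))
```

    Each simulated processor reads only n/q = n/√p entries of the vector, which is where the first term of the bound comes from in a decomposition of this kind.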

  6. Template based parallel checkpointing in a massively parallel computer system

    DOE Patents [OSTI]

    Archer, Charles Jens (Rochester, MN); Inglett, Todd Alan (Rochester, MN)

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
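
    The idea of storing only the blocks that differ from a template can be sketched as below. This is a simplified illustration, not the patented method: it compares fixed-offset blocks rather than using an rsync-style rolling checksum, leaves out the network broadcast entirely, and uses SHA-256 and zlib as stand-ins for the unspecified checksum and non-lossy compression. The block size and all names are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into consecutive blocks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksum(block):
    return hashlib.sha256(block).hexdigest()


def delta_against_template(node_data, template):
    """Return {block index: (checksum, compressed block)} for blocks that differ."""
    template_sums = [checksum(b) for b in split_blocks(template)]
    delta = {}
    for idx, block in enumerate(split_blocks(node_data)):
        digest = checksum(block)
        if idx >= len(template_sums) or template_sums[idx] != digest:
            delta[idx] = (digest, zlib.compress(block))
    return delta


def restore(template, delta, length):
    """Rebuild a node's checkpoint of `length` bytes from template plus delta."""
    n_blocks = -(-length // BLOCK_SIZE)  # ceiling division
    blocks = split_blocks(template)[:n_blocks]
    blocks += [b""] * (n_blocks - len(blocks))
    for idx, (digest, compressed) in delta.items():
        block = zlib.decompress(compressed)
        if checksum(block) != digest:
            raise ValueError(f"checksum mismatch in block {idx}")
        blocks[idx] = block
    return b"".join(blocks)[:length]


template = b"AAAABBBBCCCCDDDD"
node = b"AAAAXXXXCCCCDDDD"
delta = delta_against_template(node, template)
print(sorted(delta))
print(restore(template, delta, len(node)) == node)
```

    Only one of the four blocks ends up in the delta here, which illustrates how a shared template can reduce the data each node has to transmit and store.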

  7. A Parallel Graph Partitioner for STAPL 

    E-Print Network [OSTI]

    Castet, Nicolas

    2013-04-26

    high-level framework to develop parallel applications. One of the first steps of a parallel application is to partition and distribute the data throughout the system. An important data structure for parallel applications to store large amounts of data...

  8. The Parallel Evaluation of General Arithmetic Expressions RICHARD P. BRENT

    E-Print Network [OSTI]

    Bernstein, Daniel

    that apply if a fixed number of processors is available, see Section 5.) Kuck and Maruyama [12] have shown in time 2 log:n + O(1). Kuck [10], Maruyama [15], and Muraoka [20] have considered expressions be of practical value, for Kuck [11] has shown that an optimizing compiler for a parallel machine might generat

  9. Parallel mesh adaptation C. Dobrzynski and J.-F. Remacle

    E-Print Network [OSTI]

    Dobrzynski, Cécile

    Introduction Extending mesh adaptation algorithms to parallel is a recent trend in the field of mesh generation constraints on a part of the boundary and we use an existing remeshing software. During the remeshing phase and Engineering, 2003. in preparation. Examples 2d isotropic case: two Archimede's spirals. The following size

  10. Parallel Programming and Optimization for Intel Architecture

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Programming and Optimization for Intel Architecture. August 14, 2015, by Richard Gerber. Intel is...

  11. Writing parallel programs that work

    E-Print Network [OSTI]

    CERN. Geneva

    2012-01-01

    Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...

  12. Using true concurrency to model execution of parallel programs

    SciTech Connect (OSTI)

    Ben-Asher, Y.; Farchi, E.

    1994-08-01

    Parallel execution of a program R (intuitively regarded as a partial order) is usually modeled by sequentially executing one of the total orders (interleavings) into which it can be embedded. Our work deviates from this serialization principle by using true concurrency to model parallel execution. True concurrency is represented via completions of R to semi total orders, called time diagrams. These orders are characterized via a set of conditions (denoted by Ct), yielding orders of time diagrams which preserve some degree of the intended parallelism in R. Another way to express semi total orders is to use re-writing or derivation rules (denoted by Cx) which, for any program R, generate a set of semi-total orders. This paper includes a classification of parallel execution into three classes according to three different types of Ct conditions. For each class a suitable Cx is found and a proof of equivalence between the set of all time diagrams satisfying Ct and the set of all terminal Cx derivations of R is devised. This equivalence between time diagram conditions and derivation rules is used to define a novel notion of correctness for parallel programs. This notion is demonstrated by showing that a specific asynchronous program enforces synchronous execution, which always halts, showing that true concurrency can be useful in the context of parallel program verification.
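
    The conventional interleaving model that the abstract contrasts with true concurrency can be illustrated by enumerating every total order into which a partial order embeds. This sketch covers only that baseline, not the paper's time diagrams or its Ct and Cx constructions; the representation of a program as a set of events plus "must precede" pairs is an assumption, and the function name is hypothetical.

```python
def interleavings(events, before):
    """Yield every total order of `events` consistent with the partial order.

    `before` is a set of (a, b) pairs meaning event a must precede event b.
    """
    def extend(prefix, remaining):
        if not remaining:
            yield tuple(prefix)
            return
        for e in sorted(remaining):
            # e may run next only if none of its predecessors is still pending
            if all(a not in remaining for (a, b) in before if b == e):
                yield from extend(prefix + [e], remaining - {e})

    yield from extend([], frozenset(events))


# Program with one ordering constraint: a before b; c is independent of both.
for order in interleavings({"a", "b", "c"}, {("a", "b")}):
    print(order)
```

    The independent event c can be placed anywhere in the sequence, so the single partial order yields several serializations, and each of them hides the fact that c could have run at the same time as a or b.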

  13. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  14. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  15. High-quality draft assemblies of mammalian genomes from massively parallel sequence data

    E-Print Network [OSTI]

    Gnerre, Sante

    Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used ...

  16. X. Parallel and Distributed Scientific A Numerical Linear Algebra Problem Solving Environment

    E-Print Network [OSTI]

    Dongarra, Jack

    X. Parallel and Distributed Scientific Computing A Numerical Linear Algebra Problem Solving ... Quality, Reusable, Mathematical Software; 3. Automatic Generation of Tuned Numerical ...; 2. Numerical Linear Algebra Libraries

  17. Building the Next Generation of Parallel Applications: Co-Design

    Office of Scientific and Technical Information (OSTI)

  18. Tutorial: Parallel Simulation on Supercomputers

    SciTech Connect (OSTI)

    Perumalla, Kalyan S [ORNL

    2012-01-01

    This tutorial introduces typical hardware and software characteristics of extant and emerging supercomputing platforms, and presents issues and solutions in executing large-scale parallel discrete event simulation scenarios on such high performance computing systems. Covered topics include synchronization, model organization, example applications, and observed performance from illustrative large-scale runs.

  19. Designing a parallel simula machine

    SciTech Connect (OSTI)

    Papazoglou, M.P.; Georgiadis, P.I.; Maritsas, D.G.

    1983-10-01

    The parallel simula machine (PSM) architecture is based upon a master/slave topology, incorporating a master microprocessor. Interconnection circuitry between the master and slave processor modules uses a timesharing system bus and various programmable interrupt control units. Common and private memory modules reside in the PSM, and direct memory access transfers ease the master processor's workload. 5 references.

  20. Parallel HSL port Control Network

    E-Print Network [OSTI]

    Glück, Olivier

    Parallel HSL port FastHSL board HSL links Ethernet Control Network Node 1 PC mother board PCI Bus PCI-DDC Rcube PC mother board Node 3 PCI-DDC Rcube Node 2 PC mother board PCI-DDC Rcube THE MPC is the HSL network router, and PCI-DDC the network controller implementing the Direct Deposit State Less

  1. Parallel Algorithms for Medical Informatics on Data-Parallel Many-Core Processors

    E-Print Network [OSTI]

    Moazeni, Maryam

    2013-01-01

    and analysis of parallel algorithms." , Prentice Hall, Newsymposium on Discrete algorithms, pp. 271-280. Society forintroduction to parallel algorithms. Addison Wesley Longman

  2. The SpiceC Parallel Programming System

    E-Print Network [OSTI]

    Feng, Min

    2012-01-01

    6.5 Programming Speculative Parallel Loops on GPUs; 6.5.1 ...; 7.3 SpiceC Programming on Clusters; 7.4 ...; 8 Related Work; 8.1 Parallel Programming

  3. Parallelism Constraints Katrin Erk Joachim Niehren

    E-Print Network [OSTI]

    Paris-Sud XI, Université de

    Parallelism Constraints Katrin Erk Joachim Niehren Programming Systems Lab, Universität des Saarlandes, Saarbrücken, Germany www.ps.uni-sb.de/~{erk,niehren} Abstract. Parallelism constraints

  4. On-the-fly pipeline parallelism

    E-Print Network [OSTI]

    Lee, I-Ting Angelina

    Pipeline parallelism organizes a parallel program as a linear sequence of s stages. Each stage processes elements of a data stream, passing each processed data element to the next stage, and then taking on a new element ...
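
    The linear-pipeline structure described above can be sketched with one thread per stage and a queue between neighbouring stages. This is a static, thread-per-stage sketch under stated assumptions and does not reproduce the on-the-fly scheduling the paper's title refers to; the sentinel object and all function names are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the data stream


def run_stage(func, inbox, outbox):
    """Apply one stage to each element, then pass the result to the next stage."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def run_pipeline(stages, stream):
    """Run `stream` through a linear sequence of stage functions, one thread each."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in stream:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


print(run_pipeline([lambda x: x + 1, lambda x: x * 2, str], range(4)))
```

    While a later stage is working on one element, an earlier stage can already accept the next, and the FIFO queues keep the elements in stream order.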

  5. Massive Parallel Quantum Computer Simulator

    E-Print Network [OSTI]

    K. De Raedt; K. Michielsen; H. De Raedt; B. Trieu; G. Arnold; M. Richter; Th. Lippert; H. Watanabe; N. Ito

    2006-08-30

    We describe portable software to simulate universal quantum computers on massive parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as benchmark for testing high-end parallel computers.
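
    The memory figure quoted above is consistent with a full state-vector representation, as the short calculation below suggests. It is a back-of-the-envelope sketch: the assumption that each amplitude is one double-precision complex number (16 bytes) and the even split across processors are not stated in the abstract, and the function name is hypothetical.

```python
def state_vector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory for a full state vector: 2**n amplitudes of the given size."""
    return (2 ** num_qubits) * bytes_per_amplitude


total = state_vector_bytes(36)
per_processor = total // 4096  # assumes an even split over 4096 processors

print(total == 2 ** 40)        # 2**36 amplitudes * 16 bytes = 2**40 bytes
print(per_processor // 2 ** 20, "MiB per processor")
```

    Each additional qubit doubles the requirement, which is why the qubit count, rather than the processor count, sets the limit for a simulator of this kind.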

  6. Parallelization

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

  7. Solid oxide fuel cell generator

    DOE Patents [OSTI]

    Di Croce, A. Michael (Murrysville, PA); Draper, Robert (Churchill Boro, PA)

    1993-11-02

    A solid oxide fuel cell generator has a plenum containing at least two rows of spaced apart, annular, axially elongated fuel cells. An electrical conductor extending between adjacent rows of fuel cells connects the fuel cells of one row in parallel with each other and in series with the fuel cells of the adjacent row.

  8. Solid oxide fuel cell generator

    DOE Patents [OSTI]

    Di Croce, A.M.; Draper, R.

    1993-11-02

    A solid oxide fuel cell generator has a plenum containing at least two rows of spaced apart, annular, axially elongated fuel cells. An electrical conductor extending between adjacent rows of fuel cells connects the fuel cells of one row in parallel with each other and in series with the fuel cells of the adjacent row. 5 figures.

  9. The Economic Effects of Electricity Deregulation: An Empirical Analysis of Indian States

    E-Print Network [OSTI]

    Sen, A; Jamasb, Tooraj

    with hydroelectricity could benefit from lesser coal dependency and higher efficiency levels; it may also affect the extent of deregulation, as hydroelectric reserves are state-controlled. Finally, state GDP per capita is used to control for effects relating... in a state Kilowatt Hours PWDF Percentage Energy Deficit in states Percentage HYDRO1 Hydroelectric generation capacity in a state Percentage PCGDP Per capita state GDP; adjusted for inflation at constant (1993-94) prices Million Rupees...

  10. GRIDS: Grid-Scale Rampable Intermittent Dispatchable Storage

    SciTech Connect (OSTI)

    2010-09-01

    GRIDS Project: The 12 projects that comprise ARPA-E’s GRIDS Project, short for “Grid-Scale Rampable Intermittent Dispatchable Storage,” are developing storage technologies that can store renewable energy for use at any location on the grid at an investment cost less than $100 per kilowatt hour. Flexible, large-scale storage would create a stronger and more robust electric grid by enabling renewables to contribute to reliable power generation.

  11. Solid state pulsed power generator

    DOE Patents [OSTI]

    Tao, Fengfeng; Saddoughi, Seyed Gholamali; Herbon, John Thomas

    2014-02-11

    A power generator includes one or more full bridge inverter modules coupled to a semiconductor opening switch (SOS) through an inductive resonant branch. Each module includes a plurality of switches that are switched in a fashion causing the one or more full bridge inverter modules to drive the semiconductor opening switch SOS through the resonant circuit to generate pulses to a load connected in parallel with the SOS.

  12. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOE Patents [OSTI]

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.
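
    The directed-edge record described above can be sketched as a small data structure. This is an illustrative sketch only: the patent says each d-edge includes at least the edge's vertex descriptions and a description of one face, while the field names, the use of a face index as the face description, and the tetrahedron example are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DEdge:
    """One edge of a solid, paired with exactly one of the faces it bounds."""
    start: tuple  # (x, y, z) of the edge's first vertex
    end: tuple    # (x, y, z) of the edge's second vertex
    face: int     # index of the face this d-edge belongs to


def build_dedges(vertices, faces):
    """Walk each face's vertex loop and emit one d-edge per (edge, face) pair."""
    dedges = []
    for face_id, loop in enumerate(faces):
        for i, v in enumerate(loop):
            w = loop[(i + 1) % len(loop)]
            dedges.append(DEdge(vertices[v], vertices[w], face_id))
    return dedges


# Tetrahedron: 4 vertices and 4 triangular faces given as vertex-index loops.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
faces = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (0, 2, 3)]
dedges = build_dedges(vertices, faces)
print(len(dedges))
```

    In a system like the one described, each labeled processor would hold one such record. Every edge of the tetrahedron appears in two d-edges, one per adjacent face, so each record relates its edge to a single face.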

  13. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOE Patents [OSTI]

    Karasick, Michael S. (Ridgefield, CT); Strip, David R. (Albuquerque, NM)

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.

  14. Diophantine Generation,

    E-Print Network [OSTI]

    Shlapentokh, Alexandra

    Diophantine Generation, Horizontal and Vertical Problems, and the Weak Vertical Method Alexandra Shlapentokh Diophantine Sets, Definitions and Generation Diophantine Sets Diophantine Generation Properties of Diophantine Generation Diophantine Family of Z Diophantine Family of a Polynomial Ring Going Down Horizontal

  15. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)


  16. Parallel_HDF5.pptx

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)


  17. Xyce parallel electronic simulator design.

    SciTech Connect (OSTI)

    Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

    2010-09-01

    This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

  18. Parallel transport quantum logic gates with trapped ions

    E-Print Network [OSTI]

    de Clercq, Ludwig; Marinelli, Matteo; Nadlinger, David; Oswald, Robin; Negnevitsky, Vlad; Kienzler, Daniel; Keitch, Ben; Home, Jonathan P

    2015-01-01

    Quantum information processing will require combinations of gate operations and communication, with each applied in parallel to large numbers of quantum systems. These tasks are often performed sequentially, with gates implemented by pulsed fields and information transported either by moving the physical qubits or using photonic links. For trapped ions, an alternative approach is to implement quantum logic gates by transporting the ions through static laser beams, combining qubit operations with transport. This has significant advantages for scalability since the voltage waveforms required for transport can potentially be generated using micro-electronics integrated into the trap structure itself, while both optical and microwave control elements are significantly more bulky. Using a multi-zone ion trap, we demonstrate transport gates on a qubit encoded in the hyperfine structure of a beryllium ion. We show the ability to perform sequences of operations, and to perform parallel gates on two ions transported t...

  19. Buffered coscheduling for parallel programming and enhanced fault tolerance

    DOE Patents [OSTI]

    Petrini, Fabrizio (Los Alamos, NM); Feng, Wu-chun (Los Alamos, NM)

    2006-01-31

    A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors.
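    The buffer-then-strobe pattern described in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the patented method: the `Processor` class, the `buffer_send` and `strobe_exchange` names, and the use of plain per-destination counts as the control information are all assumptions.

```python
from collections import Counter


class Processor:
    """One processor that buffers control information during a time interval."""

    def __init__(self, rank: int):
        self.rank = rank
        self.outgoing = Counter()   # destination rank -> number of buffered jobs
        self.expected_incoming = 0  # filled in by the strobe exchange

    def buffer_send(self, dest: int) -> None:
        # During the time interval, only record the intent to send.
        self.outgoing[dest] += 1


def strobe_exchange(processors) -> None:
    """Global exchange at the end of an interval (the strobe).

    Afterwards every processor knows how many jobs it will receive
    in the subsequent interval, and its buffer is empty again.
    """
    incoming = Counter()
    for p in processors:
        for dest, count in p.outgoing.items():
            incoming[dest] += count
    for p in processors:
        p.expected_incoming = incoming[p.rank]
        p.outgoing.clear()


procs = [Processor(rank) for rank in range(3)]
procs[0].buffer_send(1)
procs[0].buffer_send(1)
procs[2].buffer_send(1)
procs[1].buffer_send(0)
strobe_exchange(procs)
```

    In a real system the exchange would be a collective network operation rather than a loop over in-memory objects; the sketch only shows what information each processor holds before and after the strobe.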

  20. Device for balancing parallel strings

    DOE Patents [OSTI]

    Mashikian, Matthew S. (Storrs, CT)

    1985-01-01

    A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.

  1. Petascale Parallelization of the Gyrokinetic Toroidal Code

    SciTech Connect (OSTI)

    Ethier, Stephane; Adams, Mark; Carter, Jonathan; Oliker, Leonid

    2010-05-01

    The Gyrokinetic Toroidal Code (GTC) is a global, three-dimensional particle-in-cell application developed to study microturbulence in tokamak fusion devices. The global capability of GTC is unique, allowing researchers to systematically analyze important dynamics such as turbulence spreading. In this work we examine a new radial domain decomposition approach to allow scalability onto the latest generation of petascale systems. Extensive performance evaluation is conducted on three high performance computing systems: the IBM BG/P, the Cray XT4, and an Intel Xeon Cluster. Overall results show that the radial decomposition approach dramatically increases scalability, while reducing the memory footprint - allowing for fusion device simulations at an unprecedented scale. After a decade where high-end computing (HEC) was dominated by the rapid pace of improvements to processor frequencies, the performance of next-generation supercomputers is increasingly differentiated by varying interconnect designs and levels of integration. Understanding the tradeoffs of these system designs is a key step towards making effective petascale computing a reality. In this work, we examine a new parallelization scheme for the Gyrokinetic Toroidal Code (GTC) micro-turbulence fusion application. Extensive scalability results and analysis are presented on three HEC systems: the IBM BlueGene/P (BG/P) at Argonne National Laboratory, the Cray XT4 at Lawrence Berkeley National Laboratory, and an Intel Xeon cluster at Lawrence Livermore National Laboratory. Overall results indicate that the new radial decomposition approach successfully attains unprecedented scalability to 131,072 BG/P cores by overcoming the memory limitations of the previous approach. The new version is well suited to utilize emerging petascale resources to access new regimes of physical phenomena.

  2. Parallel auto-correlative statistics with VTK.

    SciTech Connect (OSTI)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the auto-correlative statistics engine.

  3. Parallel distributed programming with Haskell + PVM

    E-Print Network [OSTI]

    Winstanley, N.; O'Donnell, J.T.

    Winstanley, N.; O'Donnell, J.T. Proceedings of Euro-Par'97: Parallel Processing. Volume No 1300, pp 670-677, Springer

  4. Cost hierarchies for abstract parallel machines

    E-Print Network [OSTI]

    O'Donnell, J.T.

    O'Donnell, J.T.; Rauber, T.; Ruenger, G. 13th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2000), LNCS, Springer

  5. The Parallel Landscape Part I. Preliminaries

    E-Print Network [OSTI]

    Kaminsky, Alan

    prediction, pharmaceutical drug design), geology (seismic data analysis, oil and mineral prospecting. The Parallel Landscape 1-3 · Computational finance: asset pricing, derivative pricing, market modeling

  6. Parallel computing in enterprise modeling.

    SciTech Connect (OSTI)

    Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

    2008-08-01

    This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent-based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

  7. Distributed Generation

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Electricity, US Data. 6. Distributed Generation: Standby Generation and Cogeneration Ozz Energy Solutions, Inc. February 28th, 2005. For more information about...

  8. Using Generative Design Patterns to Generate Parallel Code for a Distributed Memory Environment

    E-Print Network [OSTI]

    Schaeffer, Jonathan

    for developing sequential software. Many of these advances have quickly moved from academia to common practice, University of Alberta,Edmonton,AB, T6G2E8, Canada School of Computer Science, University of Waterloo. INTRODUCTION The past decade has seen enormous strides forward in software engineering methodologies and tools

  9. 3: Parallelism in Microprocessors. Course on "Scalable Computing". Vittorio Scarano

    E-Print Network [OSTI]

    Scarano, Vittorio

    -core Architectures From Multi-core to Many-core Power Management Some challenges ahead

  10. Roles of Parallelizing Compilers for Low Power Manycores

    E-Print Network [OSTI]

    Kasahara, Hironori

    Roles of Parallelizing Compilers for Low Power Manycores Hironori improve effective performance, cost-performance and Needs of Parallelizing Compilers for Manycores Reduction of consumed power by compiler control DVFS and Power gating with hardware

  11. Parallel 3-D Electromagnetic Particle code using High Performance Fortran: Parallel TRISTAN

    E-Print Network [OSTI]

    Nishikawa, Ken-Ichi

    :{cai, ytli, cjxiao, yxy }@is.tsukuba.ac.jp 2 National Space Science and Technology Center, 320 Sparkman Drive using High Performance Fortran (HPF) as a RPM (Real Parallel Machine). In the parallelized HPF code to realize the standard High Performance Fortran specification and can be installed on a number of parallel

  12. Parallel Application Software on High Performance Survey of Parallel Software Packages of potential

    E-Print Network [OSTI]

    Ferreira-Resende, António

    i Parallel Application Software on High Performance Computers Survey of Parallel Software Packages.Lockey Edition 3: 24th June 1996 Abstract Parallel software packages which may be of use in scientific, software packages, scientific applications. This report is available from http://www.dl.ac.uk/TCSC/HPCI/ c

  13. PARALLEL EVOLUTIONARY ALGORITHMS FOR UAV PATH PLANNING

    E-Print Network [OSTI]

    PARALLEL EVOLUTIONARY ALGORITHMS FOR UAV PATH PLANNING Dong Jia Post-Doctoral Research Associate vehicles (UAVs). Premature convergence prevents evolutionary-based algorithms from reaching global optimal. To overcome this problem, this paper presents a framework of parallel evolutionary algorithms for UAV path

  14. A Parallel Geometric Multifrontal Solver Using

    E-Print Network [OSTI]

    We report the detailed parallel performance results in Table III. We also show .... Basic Energy Sciences/Biological and Environmental Research/High Energy Physics/Fusion Energy Sciences/Nuclear Physics). The research of ... In Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on.

  15. Addendum to "Superconnections and Parallel Transport"

    E-Print Network [OSTI]

    Dumitrescu, Florin

    2011-01-01

    In this addendum to our article "Superconnections and Parallel Transport" we give an alternate construction of the parallel transport of a superconnection contained in Corollary 4.4 of [D1], which has the advantage of being independent of the various ways a superconnection splits as a connection plus a bundle endomorphism valued form.

  16. Parallel Communicating Grammar Systems Lila Santean

    E-Print Network [OSTI]

    Kari, Lila

    Parallel Communicating Grammar Systems Lila Santean Academy of Finland and Mathematics Department of massively parallel processing systems increased the importance of interprocessor communication in the new of modelling the process of communication [2]. They consist of a system of grammars working together to produce

  17. Embedding infinitely parallel computation in Newtonian kinematics

    E-Print Network [OSTI]

    Tucker, John V.

    Embedding infinitely parallel computation in Newtonian kinematics E.J. Beggs a,1 J.V. Tucker b,2 a, infinite parallelism 1 Email: e.j.beggs@swansea.ac.uk 2 Email: j.v.tucker@swansea.ac.uk Preprint submitted

  18. Communication Characteristics in the NAS Parallel Benchmarks

    E-Print Network [OSTI]

    Communication Characteristics in the NAS Parallel Benchmarks Ahmad Faraj Xin Yuan Department-- In this paper, we investigate the communication characteristics of the Message Passing Interface (MPI) implementation of the NAS parallel benchmarks and study the effectiveness of compiled communication for MPI

  19. Applications Parallel PIC plasma simulation through particle

    E-Print Network [OSTI]

    Vlad, Gregorio

    Applications Parallel PIC plasma simulation through particle decomposition techniques B. Di Martino 2000 Abstract Parallelization of a particle-in-cell (PIC) code has been accomplished through of these interactions can then be obtained by particle-in-cell (PIC) simulation techniques [2], which consist in fol

  20. IBM Watson, Nov. 2008 1 Parallel Scheduling

    E-Print Network [OSTI]

    Guestrin, Carlos

    IBM Watson, Nov. 2008 1 Parallel Scheduling Theory and Practice Guy Blelloch Carnegie Mellon University IBM Watson, Nov. 2008 2 Parallel Languages User Scheduled MPI, Pthreads (typical usage) System. IBM Watson, Nov. 2008 3 Example: Quicksort procedure QUICKSORT(S): if S contains at most one

  1. IBM Parallel Environment for Linux Introduction

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for Linux Introduction Version 4 Release 2 SA23-2218-00 IBM 2006) This edition applies to version 4, release 2, modification 0 of IBM Parallel Environment by a vertical line ( | ) to the left of the change. IBM welcomes your comments. A form for readers' comments may

  2. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Berg, Jeremy E. (Rochester, MN); Faraj, Ahmad A. (Rochester, MN)

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
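    The two steps named in this abstract, establishing a Hamiltonian path in one plane and broadcasting along it, can be sketched in Python as follows. The sketch is illustrative only: the serpentine ("snake") ordering is one simple way to obtain a Hamiltonian path on a rectangular mesh plane and is an assumption, as are the function names and the hop-by-hop forwarding in both directions from the root.

```python
def snake_path(width: int, height: int):
    """Return a Hamiltonian path over a width x height mesh plane.

    Rows are walked alternately left-to-right and right-to-left, so every
    consecutive pair of nodes is a pair of mesh neighbours.
    """
    path = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        path.extend((x, y) for x in xs)
    return path


def broadcast_along_path(path, root, message):
    """Forward the root's message hop by hop along the path.

    Returns a dict mapping each node to the message it received.
    """
    start = path.index(root)
    received = {root: message}
    # Forward from the root toward the end of the path.
    for i in range(start + 1, len(path)):
        received[path[i]] = received[path[i - 1]]
    # Forward from the root toward the beginning of the path.
    for i in range(start - 1, -1, -1):
        received[path[i]] = received[path[i + 1]]
    return received


plane = snake_path(4, 3)
delivered = broadcast_along_path(plane, root=(2, 1), message="hello")
```

    Because each hop connects adjacent nodes, the broadcast uses only point to point links, which matches the kind of network the abstract describes.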

  3. Evaluating parallel relational databases for medical data analysis.

    SciTech Connect (OSTI)

    Rintoul, Mark Daniel; Wilson, Andrew T.

    2012-03-01

    Hospitals have always generated and consumed large amounts of data concerning patients, treatment and outcomes. As computers and networks have permeated the hospital environment it has become feasible to collect and organize all of this data. This raises naturally the question of how to deal with the resulting mountain of information. In this report we detail a proof-of-concept test using two commercially available parallel database systems to analyze a set of real, de-identified medical records. We examine database scalability as data sizes increase as well as responsiveness under load from multiple users.

  4. Electron dynamics in parallel electric and magnetic fields

    E-Print Network [OSTI]

    Christian Bracher; Tobias Kramer; John B. Delos

    2005-10-13

    We examine the spatial distribution of electrons generated by a fixed energy point source in uniform, parallel electric and magnetic fields. This problem is simple enough to permit analytic quantum and semiclassical solution, and it harbors a rich set of features which find their interpretation in the unusual and interesting properties of the classical motion of the electrons: For instance, the number of interfering trajectories can be adjusted in this system, and the turning surfaces of classical motion contain a complex array of singularities. We perform a comprehensive analysis of both the semiclassical approximation and the quantum solution, and we make predictions that should serve as a guide for future photodetachment experiments.

  5. Parametric instabilities of large-amplitude parallel propagating Alfven waves: 2-D PIC simulation

    E-Print Network [OSTI]

    Yasuhiro Nariyuki; Shuichi Matsukiyo; Tohru Hada

    2008-04-25

    We discuss the parametric instabilities of large-amplitude parallel propagating Alfven waves using the 2-D PIC simulation code. First, we confirmed the results in the past study [Sakai et al, 2005] that the electrons are heated due to the modified two stream instability and that the ions are heated by the parallel propagating ion acoustic waves. However, although the past study argued that such parallel propagating longitudinal waves are excited by transverse modulation of parent Alfven wave, we consider these waves are more likely to be generated by the usual, parallel decay instability. Further, we performed other simulation runs with different polarization of the parent Alfven waves or the different ion thermal velocity. Numerical results suggest that the electron heating by the modified two stream instability due to the large amplitude Alfven waves is unimportant with most parameter sets.

  6. PARALLEL PRESS University of Wisconsin-Madison Libraries

    E-Print Network [OSTI]

    Sprott, Julien Clinton

    PARALLEL PRESS University of Wisconsin-Madison Libraries parallelpress.library.wisc.edu Parallel Press Catalog 2010-2011 Forthcoming Books from Parallel Press John Adams: Dutiful Patriot John Method Mary Alexandra Agner For ordering information, see page 24. PARALLEL PRESS 3 Parallel Press

  7. An efficient parallel set container for multicore architectures

    E-Print Network [OSTI]

    Fraguela, Basilio B.

    - tional sequential programs. The usage of parallel libraries is one of the best ways to facilitate on multicore systems. Keywords. Multicore architectures, parallel library, data containers, data parallelism 1. Motivation Parallel libraries are a good method to facilitate the expression of parallelism to programmers

  8. Tempest: A Substrate for Portable Parallel Programs

    E-Print Network [OSTI]

    Lipasti, Mikko H.

    Parallel Programs* Mark Hill, James Larus, David Wood Wisconsin Wind Tunnel Project University of Wisconsin http: A Substrate for Portable Parallel Programs Wisconsin Wind Tunnel Project Slide 2 Can Parallel Computing Become Ubiquitous? · Parallel hardware pyramid: A Substrate for Portable Parallel Programs Wisconsin Wind Tunnel Project Slide 3 Not Unless Parallel Software Improves · Parallel software morass

  9. Electrostatic generator/motor configurations

    DOE Patents [OSTI]

    Post, Richard F

    2014-02-04

    Electrostatic generator/motor designs are provided that generally may include a first cylindrical stator centered about a longitudinal axis; a second cylindrical stator centered about the axis; and a first cylindrical rotor centered about the axis and located between the first cylindrical stator and the second cylindrical stator. The first cylindrical stator, the second cylindrical stator and the first cylindrical rotor may be concentrically aligned. A magnetic field having field lines about parallel with the longitudinal axis is provided.

  10. Recharging U.S. Energy Policy: Advocating for a National Renewable Portfolio Standard

    E-Print Network [OSTI]

    Lunt, Robin J.

    2007-01-01

    $0.40/kilowatt-hour, and wind power cost $0.60/kilowatt-hour, then the marginal cost of wind power would be $0.20/... subsidizes the marginal cost of wind power in the case of

  11. Natural Language Generation for the Semantic Web: Unsupervised template extraction 

    E-Print Network [OSTI]

    Duma, Daniel

    2012-11-28

    I propose an architecture for a Natural Language Generation system that automatically learns sentence templates, together with statistical document planning, from parallel RDF data and text. To this end, I design, build ...

  12. Generating Code for High-Level Operations through Code Composition

    E-Print Network [OSTI]

    Generating Code for High-Level Operations through Code Composition James M. Stichnoth August 1997 of the authors and should not be interpreted as necessarily representing the official policies or endorsements: Compilers, code generation, parallelism, communication generation Abstract A traditional compiler

  13. Xyce parallel electronic simulator : users' guide.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
    As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  14. Distributed parallel messaging for multiprocessor systems

    DOE Patents [OSTI]

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  15. Parallel path aspects of transmission modeling

    SciTech Connect (OSTI)

    Kavicky, J.A. [Argonne National Lab., IL (United States); Shahidehpour, S.M. [Illinois Inst. of Tech., Chicago, IL (United States). Dept. of Electrical and Computer Engineering

    1996-11-01

    This paper examines the present methods and modeling techniques available to address the effects of parallel flows resulting from various firm and short-term energy transactions. A survey of significant methodologies is conducted to determine the present status of parallel flow transaction modeling. The strengths and weaknesses of these approaches are identified to suggest areas of further modeling improvements. The motivating force behind this research is to improve transfer capability assessment accuracy by suggesting a real-time modeling environment that adequately represents the influences of parallel flows while recognizing operational constraints and objectives.

  16. Competitive Parallel Disk Prefetching and Buffer Management

    E-Print Network [OSTI]

    Barve, Rakesh; Kallahalla, Mahesh; Varman, Peter J.; Vitter, Jeffrey Scott

    2000-01-01

    Competitive Parallel Disk Prefetching and Buffer Management Rakesh Barve Mahesh Kallahalla Peter J. Varman Jeffrey Scott Vitter rbarve@cs.duke.edu kalla@rice.edu pjv@rice.edu jsv@cs.duke.edu Dept. of CS Dept. of ECE Dept. of ECE Dept. of CS Duke... descriptions of I/O performance metrics, lookahead models, and parallel disk configurations are given in section 1.1. Our parallel prefetching algorithms NOM and GREED are described in section 1.2. In section 2, we discuss practical situations in which...

  17. Efficient Parallel Text Compression on GPUs 

    E-Print Network [OSTI]

    Zhang, Xiaoxi

    2012-02-14

    is the combination of LZ77 and arithmetic coding. The dictionary compressor produces a stream of literal symbols and phrase references, which encodes one symbol at a time by the range encoder, using a model to make a probability prediction of each bit. The GPU... with range coding. To speedup, we design and implement parallel range encoding on GPUs. Finally we copy the compressed data from device to host and output them to the compressed file. CHAPTER IV PARALLEL FINDER AND MERGER A. Parallel Match Finder We...

  18. OSCAR Parallelizing Compiler Cooperative Heterogeneous Multi-core Architecture

    E-Print Network [OSTI]

    Kasahara, Hironori

    OSCAR Parallelizing Compiler Cooperative Heterogeneous Multi-core Architecture Akihiro Hayashi, powerful parallelizing compiler for heterogeneous multi-core architectures is expected. Furthermore, cooperative work between parallelizing compiler and heterogeneous multi-core architectures is important

  19. A Parallelizing Compiler Cooperative Heterogeneous Multicore Processor Architecture

    E-Print Network [OSTI]

    Kasahara, Hironori

    A Parallelizing Compiler Cooperative Heterogeneous Multicore Processor Architecture Yasutaka Wada and a parallelizing compiler is important. This paper proposes a compiler cooperative heterogeneous multicore architecture and parallelizing compilation scheme for it. Performance of the proposed scheme is evaluated

  20. Subcontract Report NREL/SR-7A2-48318

    E-Print Network [OSTI]

    Wh kilowatt-hour LED light emitting diode MECO Maui Electric Company MWh megawatt-hour NAECA National

  1. The Daily Gazette Sunday, February 8, 2015 http://www.dailygazette.com/

    E-Print Network [OSTI]

    Radke, Rich

    on electrical energy costs unless they can get them with rebates." Kilowatt-hour prices also have something

  2. The cellular basis for parallel neural transmission of a high-frequency stimulus and its

    E-Print Network [OSTI]

    Benda, Jan

    The cellular basis for parallel neural transmission of a high-frequency stimulus and its low-frequency envelopes of high-frequency signals and also suggest that information about stimuli and their envelopes take EOD frequencies will generate a high-frequency envelope of their EOD that is referred

  3. PARALLEL HIGH THROUGHPUT SOFT-OUTPUT SPHERE DECODER Q. Qi, C. Chakrabarti

    E-Print Network [OSTI]

    Kambhampati, Subbarao

    - putation complexity of only the list generator. Recently, a high speed systolic-like soft-output sphere PARALLEL HIGH THROUGHPUT SOFT-OUTPUT SPHERE DECODER Q. Qi, C. Chakrabarti School of Electrical,chaitali}@asu.edu ABSTRACT Multiple-Input-Multiple-Output communication systems demand fast sphere decoding with high

  4. Tradeoffs Between Parallelism and Fill in Nested Dissection Claudson F. Bornstein 1

    E-Print Network [OSTI]

    Maggs, Bruce M.

    generate. In particular, we present a new "less parallel nested dissection" algorithm (LPND). We prove matrices, at the cost of a small reduction in the parallelism in the orders that it produces. We have also entry A ji , a multiple of the ith row of A is subtracted from the jth row of A. Hence, the entries

  5. New Parallel Randomized Algorithms for the Traveling Salesman Leyuan Shi 1 Sigurdur '

    E-Print Network [OSTI]

    Vázquez-Abad, Felisa J.

    it has many applications in such areas as routing robots through automatic warehouses and drilling holes way and it is highly matched to emerging massively parallel processing capabilities. In this paper, we method generates high quality solutions compared to well known heuristic methods and it can identify

  6. Parallel phase-sensitive three-dimensional imaging camera

    DOE Patents [OSTI]

    Smithpeter, Colin L. (Albuquerque, NM); Hoover, Eddie R. (Sandia Park, NM); Pain, Bedabrata (Los Angeles, CA); Hancock, Bruce R. (Altadena, CA); Nellums, Robert O. (Albuquerque, NM)

    2007-09-25

    An apparatus is disclosed for generating a three-dimensional (3-D) image of a scene illuminated by a pulsed light source (e.g. a laser or light-emitting diode). The apparatus, referred to as a phase-sensitive 3-D imaging camera, utilizes a two-dimensional (2-D) array of photodetectors to receive light that is reflected or scattered from the scene and processes an electrical output signal from each photodetector in the 2-D array in parallel using multiple modulators, each having inputs of the photodetector output signal and a reference signal, with the reference signal provided to each modulator having a different phase delay. The output from each modulator is provided to a computational unit which can be used to generate intensity and range information for use in generating a 3-D image of the scene. The 3-D camera is capable of generating a 3-D image using a single pulse of light, or alternatively can be used to generate subsequent 3-D images with each additional pulse of light.
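
    The record above says that modulator outputs taken at different reference phase delays are combined into intensity and range. A minimal sketch of one common way to do this follows; it is not taken from the patent. It assumes a sinusoidally modulated return, exactly four reference delays (0, 90, 180 and 270 degrees), and a hypothetical modulation frequency `f_mod`; the function names are illustrative only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def range_and_intensity(samples, f_mod):
    """Recover range (m) and modulation amplitude from four modulator outputs.

    Assumes samples[k] = offset + amp * cos(phi - k * pi / 2), i.e. reference
    phase delays of 0, 90, 180 and 270 degrees (an assumption, not from the patent).
    """
    c0, c1, c2, c3 = samples
    in_phase = c0 - c2      # equals 2 * amp * cos(phi)
    quadrature = c1 - c3    # equals 2 * amp * sin(phi)
    phi = math.atan2(quadrature, in_phase) % (2.0 * math.pi)
    amplitude = 0.5 * math.hypot(in_phase, quadrature)
    # Light travels out and back, so the round-trip phase is 4*pi*f_mod*d/c.
    distance = C * phi / (4.0 * math.pi * f_mod)
    return distance, amplitude


def simulate_samples(distance, f_mod, amp=1.0, offset=2.0):
    """Generate ideal, noise-free modulator outputs for a target at `distance`."""
    phi = 4.0 * math.pi * f_mod * distance / C
    return [offset + amp * math.cos(phi - k * math.pi / 2.0) for k in range(4)]


f_mod = 20e6  # hypothetical 20 MHz modulation frequency
d, a = range_and_intensity(simulate_samples(3.0, f_mod), f_mod)
```

    Under these assumptions the range is unambiguous only up to c / (2 f_mod), about 7.5 m at 20 MHz; a real device would also have to handle noise and the constant background term.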

  7. Nonlinear parameter estimation in parallel computing environments 

    E-Print Network [OSTI]

    Li, Jie

    1996-01-01

    Paragon supercomputer. We use a two-dimensional permeability estimation problem as the example to test and demonstrate the usage of the parallel PEST. An existing simulator program called US3D, which solves the three-dimensional groundwater flow...

  8. Asynchronous parallel pattern search for nonlinear optimization

    SciTech Connect (OSTI)

    P. D. Hough; T. G. Kolda; V. J. Torczon

    2000-01-01

    Parallel pattern search (PPS) can be quite useful for engineering optimization problems characterized by a small number of variables (say 10--50) and by expensive objective function evaluations such as complex simulations that take from minutes to hours to run. However, PPS, which was originally designed for execution on homogeneous and tightly-coupled parallel machines, is not well suited to the more heterogeneous, loosely-coupled, and even fault-prone parallel systems available today. Specifically, PPS is hindered by synchronization penalties and cannot recover in the event of a failure. The authors introduce a new asynchronous and fault tolerant parallel pattern search (APPS) method and demonstrate its effectiveness on both simple test problems as well as some engineering optimization problems.
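
    To make the synchronization penalty concrete, here is a minimal sketch of the synchronous variant only: a coordinate (compass) pattern search whose trial points are evaluated concurrently. It is not the authors' asynchronous method. The function names, the step-halving rule, and the quadratic test objective are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def pattern_search(f, x0, step=1.0, tol=1e-6, max_iter=1000):
    """Synchronous compass search: poll +/- step along each coordinate."""
    x = list(x0)
    fx = f(x)
    n = len(x)
    with ThreadPoolExecutor() as pool:
        for _ in range(max_iter):
            if step < tol:
                break
            # Build the 2n trial points of the coordinate pattern.
            trials = []
            for i in range(n):
                for sign in (1.0, -1.0):
                    t = list(x)
                    t[i] += sign * step
                    trials.append(t)
            # Evaluate all trial points concurrently. Collecting every result
            # here is the barrier that an asynchronous variant avoids.
            values = list(pool.map(f, trials))
            best = min(range(len(trials)), key=values.__getitem__)
            if values[best] < fx:
                x, fx = trials[best], values[best]   # accept the improvement
            else:
                step *= 0.5                          # no improvement: contract
    return x, fx


def sphere(x):
    """Cheap stand-in for an expensive simulation; minimum at (1, 1, ...)."""
    return sum((xi - 1.0) ** 2 for xi in x)


x_best, f_best = pattern_search(sphere, [4.0, -3.0])
```

    Each iteration waits for the slowest evaluation before it can move, so one slow or failed worker stalls the whole search. That is the weakness the abstract attributes to PPS.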

  9. Circuit Optimization Using Efficient Parallel Pattern Search 

    E-Print Network [OSTI]

    Narasimhan, Srinath S.

    2011-08-08

    evaluations and difficulty in getting explicit sensitivity information make these problems intractable to standard optimization methods. We propose to explore the recently developed asynchronous parallel pattern search (APPS) method for efficient driver size...

  10. Parallel VLSI Circuit Analysis and Optimization 

    E-Print Network [OSTI]

    Ye, Xiaoji

    2012-02-14

    The prevalence of multi-core processors in recent years has introduced new opportunities and challenges to Electronic Design Automation (EDA) research and development. In this dissertation, a few parallel Very Large Scale Integration (VLSI) circuit...

  11. Feature Clustering for Accelerating Parallel Coordinate Descent

    SciTech Connect (OSTI)

    Scherrer, Chad; Tewari, Ambuj; Halappanavar, Mahantesh; Haglin, David J.

    2012-12-06

    We demonstrate an approach for accelerating calculation of the regularization path for L1 sparse logistic regression problems. We show the benefit of feature clustering as a preconditioning step for parallel block-greedy coordinate descent algorithms.

  12. A Massively Parallel Solver for the Mechanical Harmonic Analysis...

    Office of Scientific and Technical Information (OSTI)

    Technical Report: A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities

  13. Chassis Dynamometer Testing of Parallel and Series Diesel Hybrid...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Chassis Dynamometer Testing of Parallel and Series Diesel Hybrid Buses. Emissions and fuel economy data were...

  14. The Manycore Revolution and Parallel Software Projects at NERSC

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    The Manycore Revolution and Parallel Software | Tags: Math & Computer Science. Key Challenges: A new software ecosystem is expected to...

  15. HOPSPACK: Hybrid Optimization Parallel Search Package.

    SciTech Connect (OSTI)

    Gray, Genetha A.; Kolda, Tamara G.; Griffin, Joshua; Taddy, Matt; Martinez-Canales, Monica

    2008-12-01

    In this paper, we describe the technical details of HOPSPACK (Hybrid Optimization Parallel Search Package), a new software platform which facilitates combining multiple optimization routines into a single, tightly-coupled, hybrid algorithm that supports parallel function evaluations. The framework is designed such that existing optimization source code can be easily incorporated with minimal code modification. By maintaining the integrity of each individual solver, the strengths and code sophistication of the original optimization package are retained and exploited.

  16. Automated Parallel Capillary Electrophoretic System

    DOE Patents [OSTI]

    Li, Qingbo (State College, PA); Kane, Thomas E. (State College, PA); Liu, Changsheng (State College, PA); Sonnenschein, Bernard (Brooklyn, NY); Sharer, Michael V. (Tyrone, PA); Kernan, John R. (Loganton, PA)

    2000-02-22

    An automated electrophoretic system is disclosed. The system employs a capillary cartridge having a plurality of capillary tubes. The cartridge has a first array of capillary ends projecting from one side of a plate. The first array of capillary ends are spaced apart in substantially the same manner as the wells of a microtitre tray of standard size. This allows one to simultaneously perform capillary electrophoresis on samples present in each of the wells of the tray. The system includes a stacked, dual carousel arrangement to eliminate cross-contamination resulting from reuse of the same buffer tray on consecutive executions from electrophoresis. The system also has a gel delivery module containing a gel syringe/a stepper motor or a high pressure chamber with a pump to quickly and uniformly deliver gel through the capillary tubes. The system further includes a multi-wavelength beam generator to generate a laser beam which produces a beam with a wide range of wavelengths. An off-line capillary reconditioner thoroughly cleans a capillary cartridge to enable simultaneous execution of electrophoresis with another capillary cartridge. The streamlined nature of the off-line capillary reconditioner offers the advantage of increased system throughput with a minimal increase in system cost.

  17. Architecture, implementation and parallelization of the software to search for periodic gravitational wave signals

    E-Print Network [OSTI]

    Gevorg Poghosyan; Sanchit Matta; Achim Streit; Michał Bejger; Andrzej Królak

    2014-10-14

    The parallelization, design and scalability of the \\sky code to search for periodic gravitational waves from rotating neutron stars is discussed. The code is based on an efficient implementation of the F-statistic using the Fast Fourier Transform algorithm. To perform an analysis of data from the advanced LIGO and Virgo gravitational wave detectors' network, which will start operating in 2015, hundreds of millions of CPU hours will be required; a code that utilizes the potential of massively parallel supercomputers is therefore mandatory. We have parallelized the code using the Message Passing Interface standard and implemented a mechanism for combining the searches at different sky-positions and frequency bands into one extremely scalable program. The parallel I/O interface is used to escape bottlenecks when writing the generated data to the file system. This allowed us to develop a highly scalable computation code, which would enable the data analysis at large scales on acceptable time scales. Benchmarking of the code on a Cray XE6 system was performed to show the efficiency of our parallelization concept and to demonstrate scaling up to 50 thousand cores in parallel.

  18. Center for Programming Models for Scalable Parallel Computing: Future Programming Models

    SciTech Connect (OSTI)

    Gao, Guang, R.

    2008-07-24

    The mission of the pmodel center project is to develop software technology to support scalable parallel programming models for terascale systems. The goal of the specific UD subproject is, in this context, developing an efficient and robust methodology and tools for HPC programming. More specifically, the focus is on developing new programming models which help programmers port their applications onto parallel high performance computing systems. During the course of the research in the past 5 years, the landscape of microprocessor chip architecture has witnessed a fundamental change: the emergence of multi-core/many-core chip architecture appears to be becoming the mainstream technology and will have a major impact on future generation parallel machines. The programming model for shared-address space machines is becoming critical to such multi-core architectures. Our research highlight is the in-depth study of proposed fine-grain parallelism/multithreading support on such future generation multi-core architectures. Our research has demonstrated the significant impact such a fine-grain multithreading model can have on the productivity of parallel programming models and their efficient implementation.

  19. An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications

    SciTech Connect (OSTI)

    Vydyanathan, Naga; Krishnamoorthy, Sriram; Sabin, Gerald M.; Catalyurek, Umit V.; Kurc, Tahsin; Sadayappan, Ponnuswamy; Saltz, Joel H.

    2009-08-01

    Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application-tasks with dependences. These applications exhibit both task- and data-parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task- and data-parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks and the inter-task data communication volumes. A locality conscious scheduling strategy is used to improve inter-task data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications as well as synthetic graphs shows that our algorithm consistently generates schedules with lower makespan as compared to CPR and CPA, two previously proposed scheduling algorithms. Our algorithm also produces schedules that have lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.

  20. Distributed generation

    SciTech Connect (OSTI)

    Ness, E.

    1999-09-02

    Distributed generation, locating electricity generators close to the point of consumption, provides some unique benefits to power companies and customers that are not available from centralized electricity generation. Photovoltaic (PV) technology is well suited to distributed applications and can, especially in concert with other distributed resources, provide a very close match to the customer demand for electricity, at a significantly lower cost than the alternatives. In addition to augmenting power from central-station generating plants, incorporating PV systems enables electric utilities to optimize the utilization of existing transmission and distribution.

  1. Parallel log structured file system collective buffering to achieve a compact representation of scientific and/or dimensional data

    DOE Patents [OSTI]

    Grider, Gary A.; Poole, Stephen W.

    2015-09-01

    Collective buffering and data pattern solutions are provided for storage, retrieval, and/or analysis of data in a collective parallel processing environment. For example, a method can be provided for data storage in a collective parallel processing environment. The method comprises receiving data to be written for a plurality of collective processes within a collective parallel processing environment, extracting a data pattern for the data to be written for the plurality of collective processes, generating a representation describing the data pattern, and saving the data and the representation.
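
    The record above describes extracting a data pattern from the writes of many processes and saving a compact representation of it. A minimal sketch of that idea follows; it assumes the simplest case, where every process writes a block of the same length at a constant stride. The tuple layout and the name `extract_pattern` are hypothetical and not drawn from the patent.

```python
def extract_pattern(writes):
    """Summarize (offset, length) writes as (start, stride, length, count).

    Returns None when the writes do not form a single regular strided
    pattern, in which case a caller would fall back to a per-write index.
    """
    if len(writes) < 2:
        return None
    ordered = sorted(writes)
    start, length = ordered[0]
    stride = ordered[1][0] - start
    for k, (offset, ln) in enumerate(ordered):
        # Every write must have the same length and sit on the stride grid.
        if ln != length or offset != start + k * stride:
            return None
    return (start, stride, length, len(ordered))


# Four processes each write 4 bytes, 8 bytes apart.
regular = [(16, 4), (0, 4), (8, 4), (24, 4)]
irregular = [(0, 4), (8, 4), (20, 4)]

compact = extract_pattern(regular)
fallback = extract_pattern(irregular)
```

    In this sketch one four-number tuple replaces one index entry per write, which is the kind of saving a pattern representation aims for when thousands of processes write in lockstep.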

  2. Fuel dissipater for pressurized fuel cell generators

    DOE Patents [OSTI]

    Basel, Richard A.; King, John E.

    2003-11-04

    An apparatus and method are disclosed for eliminating the chemical energy of fuel remaining in a pressurized fuel cell generator (10) when the electrical power output of the fuel cell generator is terminated during transient operation, such as a shutdown; where, two electrically resistive elements (two of 28, 53, 54, 55) at least one of which is connected in parallel, in association with contactors (26, 57, 58, 59), a multi-point settable sensor relay (23) and a circuit breaker (24), are automatically connected across the fuel cell generator terminals (21, 22) at two or more contact points, in order to draw current, thereby depleting the fuel inventory in the generator.

  3. IBM Parallel Environment for Linux Operation and Use

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for Linux Operation and Use Using the Parallel Operating Environment Version 4 Release 2 SA23-2217-00 IBM Parallel Environment for Linux Operation and Use Using) This edition applies to version 4, release 2, modification 0 of IBM Parallel Environment for Linux (product

  4. Tempest: A Substrate for Portable Parallel Programs

    E-Print Network [OSTI]

    Lipasti, Mikko H.

    Parallel Programs* Mark Hill, James Larus, David Wood Wisconsin Wind Tunnel Project University of Wisconsin http: A Substrate for Portable Parallel Programs Wisconsin Wind Tunnel Project Slide 2 Can Parallel Computing Become Ubiquitous? · Parallel hardware pyramid; Tempest: A Substrate for Portable Parallel Programs Wisconsin Wind Tunnel Project Slide 3 Not

  5. Machine Learning Based Online Performance Prediction for Runtime Parallelization and Task Scheduling

    SciTech Connect (OSTI)

    Li, J; Ma, X; Singh, K; Schulz, M; de Supinski, B R; McKee, S A

    2008-10-09

    With the emerging many-core paradigm, parallel programming must extend beyond its traditional realm of scientific applications. Converting existing sequential applications as well as developing next-generation software requires assistance from hardware, compilers and runtime systems to exploit parallelism transparently within applications. These systems must decompose applications into tasks that can be executed in parallel and then schedule those tasks to minimize load imbalance. However, many systems lack a priori knowledge about the execution time of all tasks to perform effective load balancing with low scheduling overhead. In this paper, we approach this fundamental problem using machine learning techniques first to generate performance models for all tasks and then applying those models to perform automatic performance prediction across program executions. We also extend an existing scheduling algorithm to use generated task cost estimates for online task partitioning and scheduling. We implement the above techniques in the pR framework, which transparently parallelizes scripts in the popular R language, and evaluate their performance and overhead with both a real-world application and a large number of synthetic representative test scripts. Our experimental results show that our proposed approach significantly improves task partitioning and scheduling, with maximum improvements of 21.8%, 40.3% and 22.1% and average improvements of 15.9%, 16.9% and 4.2% for LMM (a real R application) and synthetic test cases with independent and dependent tasks, respectively.

  6. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-11-12

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call site statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.

  7. 18.337J / 6.338J Applied Parallel Computing (SMA 5505), Spring 2005

    E-Print Network [OSTI]

    Edelman, Alan

    Applied Parallel Computing is an advanced interdisciplinary introduction to applied parallel computing on modern supercomputers.

  8. Parallel Transport on Principal Bundles over Stacks

    E-Print Network [OSTI]

    Brian Collier; Eugene Lerman; Seth Wolbert

    2015-09-16

    In this paper we introduce a notion of parallel transport for principal bundles with connections over differentiable stacks. We show that principal bundles with connections over stacks can be recovered from their parallel transport, thereby extending the results of Barrett, Caetano and Picken, and Schreiber and Waldorf from manifolds to stacks. In the process of proving our main result we simplify Schreiber and Waldorf's definition of a transport functor for principal bundles with connections over manifolds and provide a more direct proof of the correspondence between principal bundles with connections and transport functors.

  9. Parallel Implementation of Power System Dynamic Simulation

    SciTech Connect (OSTI)

    Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng; Wu, Di; Chen, Yousu

    2013-07-21

    Dynamic simulation of power system transient stability is important for planning, monitoring, operation, and control of electrical power systems. However, modeling the system dynamics and network involves the computationally intensive time-domain solution of numerous differential and algebraic equations (DAE). This results in a transient stability implementation that may not maintain the real-time constraints of an online security assessment. This paper presents a parallel implementation of the dynamic simulation on a high-performance computing (HPC) platform using parallel simulation algorithms and computation architectures. It enables the simulation to run even faster than real time, enabling the “look-ahead” capability of upcoming stability problems in the power grid.

  10. Regional weather modeling on parallel computers.

    SciTech Connect (OSTI)

    Baillie, C.; Michalakes, J.; Skalin, R.; Mathematics and Computer Science; NOAA Forecast Systems Lab.; Norwegian Meteorological Inst.

    1997-01-01

    This special issue on 'regional weather models' complements the October 1995 special issue on 'climate and weather modeling', which focused on global models. In this introduction we review the similarities and differences between regional and global atmospheric models. Next, the structure of regional models is described and we consider how the basic algorithms applied in these models influence the parallelization strategy. Finally, we give a brief overview of the eight articles in this issue and discuss some remaining challenges in the area of adapting regional weather models to parallel computers.

  11. Parallel I/O in Practice

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel I/O

  12. Next Generation Radioisotope Generators | Department of Energy

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Next Generation Radioisotope Generators. Advanced Stirling Radioisotope Generator (ASRG) - The ASRG is currently being developed as a high-efficiency RPS technology...

  13. Users manual for the Chameleon parallel programming tools

    SciTech Connect (OSTI)

    Gropp, W.; Smith, B.

    1993-06-01

    Message passing is a common method for writing programs for distributed-memory parallel computers. Unfortunately, the lack of a standard for message passing has hampered the construction of portable and efficient parallel programs. In an attempt to remedy this problem, a number of groups have developed their own message-passing systems, each with its own strengths and weaknesses. Chameleon is a second-generation system of this type. Rather than replacing these existing systems, Chameleon is meant to supplement them by providing a uniform way to access many of these systems. Chameleon's goals are to (a) be very lightweight (low overhead), (b) be highly portable, and (c) help standardize program startup and the use of emerging message-passing operations such as collective operations on subsets of processors. Chameleon also provides a way to port programs written using PICL or Intel NX message passing to other systems, including collections of workstations. Chameleon is tracking the Message-Passing Interface (MPI) draft standard and will provide both an MPI implementation and an MPI transport layer. Chameleon provides support for heterogeneous computing by using p4 and PVM. Chameleon's support for homogeneous computing includes the portable libraries p4, PICL, and PVM and vendor-specific implementation for Intel NX, IBM EUI (SP-1), and Thinking Machines CMMD (CM-5). Support for Ncube and PVM 3.x is also under development.

  14. Storing files in a parallel computing system based on user-specified parser function

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.
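
    The flow described above (obtain a parser from the application, keep only the files it accepts, and store extracted metadata for later searches) can be sketched as follows. This is an illustration under assumptions, not the patented implementation: in-memory dictionaries stand in for storage nodes, and `store_files`, `search`, `example_parser` and the `step=` header convention are all hypothetical.

```python
def store_files(files, parser):
    """Run the parser on each file; keep accepted files and their metadata."""
    stored = {}
    metadata_index = {}
    for name, content in files.items():
        metadata = parser(name, content)
        if metadata is None:          # parser rejected the file
            continue
        stored[name] = content
        metadata_index[name] = metadata
    return stored, metadata_index


def search(metadata_index, key, value):
    """Return the names of stored files whose metadata has key == value."""
    return sorted(n for n, m in metadata_index.items() if m.get(key) == value)


def example_parser(name, content):
    """Hypothetical parser: accept only files with a 'step=<int>' header line."""
    first_line = content.splitlines()[0] if content else ""
    if not first_line.startswith("step="):
        return None
    try:
        step = int(first_line[len("step="):])
    except ValueError:
        return None
    return {"step": step, "size": len(content)}


files = {
    "out.0": "step=1\n0.5 0.7",
    "out.1": "step=2\n0.6 0.9",
    "debug.log": "scratch data",
}
stored, index = store_files(files, example_parser)
matches = search(index, "step", 2)
```

    Here `debug.log` fails the parser's semantic check and is never stored, while the two accepted files can later be found by their extracted `step` value.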

  15. A mirror for lab-based quasi-monochromatic parallel x-rays

    SciTech Connect (OSTI)

    Nguyen, Thanhhai; Lu, Xun; Lee, Chang Jun; Jeon, Insu; Jung, Jin-Ho; Jin, Gye-Hwan; Kim, Sung Youb

    2014-09-15

    A multilayered parabolic mirror with six W/Al bilayers was designed and fabricated to generate monochromatic parallel x-rays using a lab-based x-ray source. Using this mirror, curved bright bands were obtained in x-ray images as reflected x-rays. The parallelism of the reflected x-rays was investigated using the shape of the bands. The intensity and monochromatic characteristics of the reflected x-rays were evaluated through measurements of the x-ray spectra in the band. High intensity, nearly monochromatic, and parallel x-rays, which can be used for high resolution x-ray microscopes and local radiation therapy systems, were obtained.

  16. The External Magnetic Field Created by the Superposition of Identical Parallel Finite Solenoids

    E-Print Network [OSTI]

    Lim, Melody Xuan

    2015-01-01

    Using superposition and numerical approximations of a published analytical expression for the magnetic field generated by a finite solenoid, we show that the magnetic field external to parallel identical solenoids can be nearly uniform and substantial, even when the solenoids have lengths that are large compared to their radii. We study two arrangements of solenoids---a ring of parallel solenoids whose surfaces are tangent to a common cylindrical surface and to nearest neighbours, and a large finite hexagonal array of parallel solenoids---and summarize how the magnitude and uniformity of the resultant external field depend on the solenoid length and distances between solenoids. We also report some novel results about single solenoids, e.g., that the energy stored in the internal magnetic field exceeds the energy stored in the spatially infinite external magnetic field for even short solenoids. These results should be broadly interesting to undergraduates learning about electricity and magnetism as novel examp...

  17. Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program. II: Wavelength Parallelization

    E-Print Network [OSTI]

    E. Baron; Peter H. Hauschildt

    1997-09-24

    We describe an important addition to the parallel implementation of our generalized NLTE stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors, and employed, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000--300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard MPI library calls and is fully portable between serial and parallel computers.

  18. Parallel processor-based raster graphics system architecture

    DOE Patents [OSTI]

    Littlefield, Richard J. (Seattle, WA)

    1990-01-01

    An apparatus for generating raster graphics images from the graphics command stream includes a plurality of graphics processors connected in parallel, each adapted to receive any part of the graphics command stream for processing the command stream part into pixel data. The apparatus also includes a frame buffer for mapping the pixel data to pixel locations and an interconnection network for interconnecting the graphics processors to the frame buffer. Through the interconnection network, each graphics processor may access any part of the frame buffer concurrently with another graphics processor accessing any other part of the frame buffer. The plurality of graphics processors can thereby transmit concurrently pixel data to pixel locations in the frame buffer.

  19. Parallel performance optimizations on unstructured mesh-based simulations

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; Huck, Kevin; Hollingsworth, Jeffrey; Malony, Allen; Williams, Samuel; Oliker, Leonid

    2015-06-01

    This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.

  20. CHAPTER 30 (in old edition) Parallel Algorithms

    E-Print Network [OSTI]

    Dragan, Feodor F.

    for Parallel Random Access Machine - Consists of p processors (PEs), P0, P1, P2, ... , Pp-1 connected can be assumed to be either synchronous or asynchronous. · When synchronous, all operations (e to be added to perform the required data movement on real machines. - However, the constant-time global data

  1. Automatic Loop Parallelization via Compiler Guided Refactoring

    E-Print Network [OSTI]

    . Lyngby, Denmark Email: {pl,ska}@imm.dtu.dk Computer Science Engineering Chalmers U. Technology, 412 96 by the computing industry today. Yet, applications are often written in ways that prevent automatic parallelization Gothenburg, Sweden Email: lidman@student.chalmers.se, mckee@chalmers.se IBM Haifa Research Labs Mount Carmel

  2. Parallel programming with PCN. Revision 2

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  3. Message passing with parallel queue traversal

    DOE Patents [OSTI]

    Underwood, Keith D. (Albuquerque, NM); Brightwell, Ronald B. (Albuquerque, NM); Hemmert, K. Scott (Albuquerque, NM)

    2012-05-01

    In message passing implementations, associative matching structures are used to permit list entries to be searched in parallel fashion, thereby avoiding the delay of linear list traversal. List management capabilities are provided to support list entry turnover semantics and priority ordering semantics.

  4. Parallel Performance of a Combustion Chemistry Simulation

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Skinner, Gregg; Eigenmann, Rudolf

    1995-01-01

    We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.

  5. Parallel Programming for High-Performance Applications

    E-Print Network [OSTI]

    Bal, Henri E.

    into a rather dull and routine occupation'' · S. Gill, Computer Journal, 1958 Why do we need parallel important Moore's law (1975) · Circuit complexity doubles every 18 months · Exponential transistor machines (Blue Gene) · 2000s: grid computing: combining resources world-wide (Globus) · Now: multicores

  6. Pictorial Representation of Parallel Programs Susan Stepney

    E-Print Network [OSTI]

    Stepney, Susan

    of its tools, GRAIL. 1 Introduction Parallel programs have considerably more complicated structures than been developed as part of the Alvey ParSiFal project, for use by one of its tools, GRAIL [1 design. 3 Two-Dimensional Display The pictorial representation used in GRAIL is two dimensional

  7. Parallelism for quantum computation with qudits

    SciTech Connect (OSTI)

    O'Leary, Dianne P. [Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland 20742, USA and Mathematical and Computational Sciences Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899 (United States); IDA Center for Computing Sciences, 17100 Science Drive, Bowie, Maryland 20715-4300 (United States); Brennen, Gavin K. [Institute for Quantum Optics and Quantum Information of the Austrian Academy of Sciences, A-6020, Innsbruck (Austria); Bullock, Stephen S. [IDA Center for Computing Sciences, 17100 Science Drive, Bowie, Maryland 20715-4300 (United States)

    2006-09-15

    Robust quantum computation with d-level quantum systems (qudits) poses two requirements: fast, parallel quantum gates and high-fidelity two-qudit gates. We first describe how to implement parallel single-qudit operations. It is by now well known that any single-qudit unitary can be decomposed into a sequence of Givens rotations on two-dimensional subspaces of the qudit state space. Using a coupling graph to represent physically allowed couplings between pairs of qudit states, we then show that the logical depth (time) of the parallel gate sequence is equal to the height of an associated tree. The implementation of a given unitary can then optimize the tradeoff between gate time and resources used. These ideas are illustrated for qudits encoded in the ground hyperfine states of the alkali-metal atoms ⁸⁷Rb and ¹³³Cs. Second, we provide a protocol for implementing parallelized nonlocal two-qudit gates using the assistance of entangled qubit pairs. Using known protocols for qubit entanglement purification, this offers the possibility of high-fidelity two-qudit gates.
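
    The decomposition into Givens rotations mentioned in this abstract can be sketched numerically. The sketch below is an illustration rather than the authors' construction: it assumes a chain coupling graph in which only adjacent levels (i-1, i) may be coupled, it applies the rotations strictly one after another (no parallel scheduling), and the function names givens_reduce and dft are hypothetical. Each rotation is a 2x2 unitary acting on two rows; after at most d(d-1)/2 of them, a d x d unitary is reduced to a diagonal matrix of phases.

```python
import cmath
import math


def givens_reduce(u):
    """Reduce a unitary (list of rows) to a diagonal of phases.

    Returns (reduced_matrix, rotations), where each rotation is
    (i - 1, i, g) and g is the 2x2 unitary applied to rows i-1 and i.
    """
    d = len(u)
    m = [list(row) for row in u]
    rotations = []
    for j in range(d - 1):                 # clear column j below the diagonal
        for i in range(d - 1, j, -1):      # work upwards from the bottom row
            a, b = m[i - 1][j], m[i][j]
            if abs(b) < 1e-15:
                continue                   # entry is already zero
            r = math.hypot(abs(a), abs(b))
            g = ((a.conjugate() / r, b.conjugate() / r),
                 (-b / r, a / r))
            top, bot = m[i - 1], m[i]
            m[i - 1] = [g[0][0] * x + g[0][1] * y for x, y in zip(top, bot)]
            m[i] = [g[1][0] * x + g[1][1] * y for x, y in zip(top, bot)]
            rotations.append((i - 1, i, g))
    return m, rotations


def dft(d):
    """Discrete Fourier transform matrix, used here only as a test unitary."""
    w = cmath.exp(2j * cmath.pi / d)
    return [[w ** (r * c) / math.sqrt(d) for c in range(d)] for r in range(d)]


if __name__ == "__main__":
    reduced, rots = givens_reduce(dft(3))
    print(len(rots), [round(abs(reduced[k][k]), 6) for k in range(3)])
```

    Undoing the recorded rotations in reverse order and then applying the diagonal phases reproduces the original unitary; how the rotations can be grouped into parallel layers for a given coupling graph is the subject of the paper itself.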

  8. PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY

    E-Print Network [OSTI]

    Batory, Don

    PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY MARTIN D. SCHATZ, ROBERT A. VAN DE GEIJN- trix matrix multiplication algorithms. The journey starts with a description of how matrices implementation of matrix-vector multiplication and rank-1 update, continues on to reveal a family of matrix-matrix

  9. Effecting Parallel Graph Eigensolvers Through Library Composition

    E-Print Network [OSTI]

    Lumsdaine, Andrew

    is not possible in general. Conventional linear algebra libraries cannot operate on graph data types. Likewise exploitation of this duality. Graph libraries and matrix libraries use different data types, and despite Effecting Parallel Graph Eigensolvers Through Library Composition Alex Breuer, Peter Gottschling

  10. WEBPIE: A WEB-SCALE PARALLEL INFERENCE

    E-Print Network [OSTI]

    WEBPIE: A WEB-SCALE PARALLEL INFERENCE ENGINE Jacopo Urbani, Spyros Kotoulas, Jason Maassen, Niels Amsterdam Monday 10 May 2010 The Semantic Web The Semantic Web is an extension of the current Web where the semantics is defined Basically the idea is to move from Web of Documents (Traditional Web) Web of data

  11. Particle injection and cosmic ray acceleration at collisionless parallel shocks

    SciTech Connect (OSTI)

    Quest, K.B.

    1987-01-01

    The structure of collisionless parallel shocks is studied using one-dimensional hybrid simulations, with emphasis on particle injection into the first-order Fermi acceleration process. It is argued that for sufficiently high Mach number shocks, and in the absence of wave turbulence, the fluid firehose marginal stability condition will be exceeded at the interface between the upstream, unshocked, plasma and the heated plasma downstream. As a consequence, nonlinear, low-frequency, electromagnetic waves are generated and act to slow the plasma and provide dissipation for the shock. It is shown that large amplitude waves at the shock ramp scatter a small fraction of the upstream ions back into the upstream medium. These ions, in turn, resonantly generate the electromagnetic waves that are swept back into the shock. As these waves propagate through the shock they are compressed and amplified, allowing them to non-resonantly scatter the bulk of the plasma. Moreover, the compressed waves back-scatter a small fraction of the upstream ions, maintaining the shock structure in a quasi-steady state. The back-scattered ions are accelerated during the wave generation process to 2 to 4 times the ram energy and provide a likely seed population for cosmic rays. 49 refs., 7 figs.

  12. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution of a compute node, including specification of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications instruction, the instruction characterized by instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  13. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-29

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  14. VIP-FS: A Virtual, Parallel file System for High Performance Parallel and Distributed Computing *

    E-Print Network [OSTI]

    -passing libraries only provide part of the support necessary for most high performance distributed computing applications - support for high speed parallel I/O is still lacking. In this paper, we

  15. Microelectromechanical power generator and vibration sensor

    DOE Patents [OSTI]

    Roesler, Alexander W. (Tijeras, NM); Christenson, Todd R. (Albuquerque, NM)

    2006-11-28

    A microelectromechanical (MEM) apparatus is disclosed which can be used to generate electrical power in response to an external source of vibrations, or to sense the vibrations and generate an electrical output voltage in response thereto. The MEM apparatus utilizes a meandering electrical pickup located near a shuttle which holds a plurality of permanent magnets. Upon movement of the shuttle in response to vibrations coupled thereto, the permanent magnets move in a direction substantially parallel to the meandering electrical pickup, and this generates a voltage across the meandering electrical pickup. The MEM apparatus can be fabricated by LIGA or micromachining.

  16. Adaptive parallelism mapping in dynamic environments using machine learning 

    E-Print Network [OSTI]

    Emani, Murali Krishna

    2015-06-29

    Modern day hardware platforms are parallel and diverse, ranging from mobiles to data centers. Mainstream parallel applications execute in the same system competing for resources. This resource contention may lead to a ...

  17. Frame: An Imperative Coordination Language for Parallel Programming 

    E-Print Network [OSTI]

    Cole, Murray

    2000-01-01

    We present Frame, a simple language which facilitates structured expression of imperative parallelism. Programs are described at two levels. The top level captures the main parallel algorithmic structure (which may be nested) and is independent...

  18. Characterizing the parallelism in rule-based expert systems

    SciTech Connect (OSTI)

    Douglass, R.J.

    1984-01-01

    A brief review of two classes of rule-based expert systems is presented, followed by a detailed analysis of potential sources of parallelism at the production or rule level, the subrule level (including match, select, and act parallelism), and at the search level (including AND, OR, and stream parallelism). The potential amount of parallelism from each source is discussed and characterized in terms of its granularity, inherent serial constraints, efficiency, speedup, dynamic behavior, and communication volume, frequency, and topology. Subrule parallelism will yield, at best, two- to tenfold speedup, and rule level parallelism will yield a modest speedup on the order of 5 to 10 times. Rule level can be combined with OR, AND, and stream parallelism in many instances to yield further parallel speedups.

  19. Kalman Filter Tracking on Parallel Architectures

    E-Print Network [OSTI]

    Cerati, Giuseppe; Lantz, Steven; McDermott, Kevin; Riley, Dan; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi

    2015-01-01

    Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors, but the future will be even more exciting. In order to stay within the power density limits but still obtain Moore's Law performance/price gains, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Example technologies today include Intel's Xeon Phi and GPGPUs. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High Luminosity LHC, for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques including Cellular Automata or returning to Hough Transform. The most common track finding techniques in use today are however those based on the Kalman Filter. Significant experience has...

  20. Locating hardware faults in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-04-13

    Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.

  1. Parallel machine architecture for production rule systems

    DOE Patents [OSTI]

    Allen, Jr., John D. (Knoxville, TN); Butler, Philip L. (Knoxville, TN)

    1989-01-01

    A parallel processing system for production rule programs utilizes a host processor for storing production rule right hand sides (RHS) and a plurality of rule processors for storing left hand sides (LHS). The rule processors operate in parallel in the Recognize phase of the system Recognize-Act cycle to match their respective LHSs against a stored list of working memory elements (WMEs) in order to find a self-consistent set of WMEs. The list of WMEs is dynamically varied during the Act phase of the system, in which the host executes, or fires, the RHSs of those rules for which a self-consistent set has been found by the rule processors. The host transmits instructions for creating or deleting working memory elements as dictated by the rule firings until the rule processors are unable to find any further self-consistent working memory element sets, at which time the production rule system is halted.
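
    The Recognize-Act cycle described in this abstract can be sketched in a few lines. This is a simplified, sequential illustration and not the patented architecture: the Rule type, the string-valued working memory elements, and the "fire the first matching rule" conflict resolution are all assumptions made for the example. In the patent the LHS matches are carried out by separate rule processors in parallel; here the independent matches are simply evaluated in a list comprehension.

```python
from typing import FrozenSet, NamedTuple


class Rule(NamedTuple):
    name: str
    lhs: FrozenSet[str]      # conditions that must all be in working memory
    add: FrozenSet[str]      # RHS: working memory elements to create
    delete: FrozenSet[str]   # RHS: working memory elements to delete


def recognize(rules, wm):
    """Recognize phase: each LHS match is independent of the others."""
    return [r for r in rules if r.lhs <= wm]


def run(rules, working_memory, max_cycles=100):
    """Repeat Recognize and Act until no rule matches (or a cycle limit)."""
    wm = set(working_memory)
    fired = []
    for _ in range(max_cycles):
        matched = recognize(rules, wm)
        if not matched:
            break                    # halt: no satisfied LHS remains
        rule = matched[0]            # assumed conflict resolution: first match
        wm -= rule.delete            # Act phase: the host applies the RHS
        wm |= rule.add
        fired.append(rule.name)
    return wm, fired


if __name__ == "__main__":
    rules = [
        Rule("boil", frozenset({"water", "heat"}),
             frozenset({"steam"}), frozenset({"water"})),
        Rule("condense", frozenset({"steam", "cold"}),
             frozenset({"droplets"}), frozenset({"steam"})),
    ]
    print(run(rules, {"water", "heat", "cold"}))
```

    The cycle limit is a guard for the sketch only; rules that never delete their own triggers would otherwise match forever.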

  2. Stochastic Particle Acceleration in Parallel Relativistic Shocks

    E-Print Network [OSTI]

    Joni J. P. Virtanen; Rami Vainio

    2005-03-03

    We present results of test-particle simulations on both the first- and the second-order Fermi acceleration for relativistic parallel shock waves. Our studies suggest that the role of the second-order mechanism in the turbulent downstream of a relativistic shock may have been underestimated in the past, and that the stochastic mechanism may have significant effects on the form of the particle spectra and its time evolution.

  3. Parallel Exact Inference Yinglong Xia1

    E-Print Network [OSTI]

    Hwang, Kai

    using p processors, we show an execution time of O(nk2 m + n2w + (nw2 + wN log n + rwwN + rwN log N)/p of the classical technique of converting a Bayesian network to a junction tree before computing inference. We propose a parallel algorithm for constructing potential tables for a junction tree and explore

  4. Alexandru Iosup Parallel and Distributed Systems Group

    E-Print Network [OSTI]

    Langendoen, Koen

    - the Netherlands - Europe founded 13th century pop: 100,000 pop.: 100,000 pop: 16.5 M pop: 100,000 founded 1842 pop: 13,000 pop.: 100,000 (We are here) The Parallel and Distributed Systems Group at TU Delft Johan Challenges and High Quality Time - A. Iosup 5 Online Gaming used to be art, may now be computing

  5. Parallel Heuristics for Scalable Community Detection

    SciTech Connect (OSTI)

    Lu, Howard; Kalyanaraman, Anantharaman; Halappanavar, Mahantesh; Choudhury, Sutanay

    2014-05-17

    Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability to problems that can be solved on desktops. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose multiple heuristics that are designed to break the sequential barrier. Our heuristics are agnostic to the underlying parallel architecture. For evaluation purposes, we implemented our heuristics on shared memory (OpenMP) and distributed memory (MapReduce-MPI) machines, and tested them over real world graphs derived from multiple application domains (internet, biological, natural language processing). Experimental results demonstrate the ability of our heuristics to converge to high modularity solutions comparable to those output by the serial algorithm in nearly the same number of iterations, while also drastically reducing time to solution.
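
    As background for this abstract, the quantity that the Louvain method optimizes can be written down directly. The sketch below computes the standard (Newman) modularity of a given partition of an undirected, unweighted graph; it shows only the objective function, not the authors' parallel heuristics, and the function name and input format are assumptions made for the example.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges:        iterable of (u, v) pairs, each edge listed once
    community_of: dict mapping every vertex to a community label

    Q = sum over communities c of  e_c / m - (d_c / (2 m)) ** 2,
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the sum of the degrees of the vertices in c.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    inside = defaultdict(int)    # e_c
    degree = defaultdict(int)    # d_c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            inside[cu] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, split))
```

    For the two-triangle graph, the natural split scores 5/14 (about 0.357), while placing every vertex in one community scores 0, which is why a modularity-maximizing heuristic prefers the split.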

  6. Kalman Filter Tracking on Parallel Architectures

    E-Print Network [OSTI]

    Giuseppe Cerati; Peter Elmer; Steven Lantz; Kevin McDermott; Dan Riley; Matevž Tadel; Peter Wittich; Frank Würthwein; Avi Yagil

    2015-05-18

    Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors, but the future will be even more exciting. In order to stay within the power density limits but still obtain Moore's Law performance/price gains, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Example technologies today include Intel's Xeon Phi and GPGPUs. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High Luminosity LHC, for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques including Cellular Automata or returning to Hough Transform. The most common track finding techniques in use today are however those based on the Kalman Filter. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. They are known to provide high physics performance, are robust and are exactly those being used today for the design of the tracking system for HL-LHC. Our previous investigations showed that, using optimized data structures, track fitting with Kalman Filter can achieve large speedup both with Intel Xeon and Xeon Phi. We report here our further progress towards an end-to-end track reconstruction algorithm fully exploiting vectorization and parallelization techniques in a realistic simulation setup.

  7. Xyce parallel electronic simulator : reference guide.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.

  8. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    SciTech Connect (OSTI)

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are briefly discussed. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  9. Magnetic Braiding and Parallel Electric Fields

    E-Print Network [OSTI]

    A. L. Wilmot-Smith; G. Hornig; D. I. Pontin

    2008-10-08

    The braiding of the solar coronal magnetic field via photospheric motions, with subsequent relaxation and magnetic reconnection, is one of the most widely debated ideas of solar physics. We readdress the theory in the light of developments in three-dimensional magnetic reconnection theory. It is known that the integrated parallel electric field along field lines is the key quantity determining the rate of reconnection, in contrast with the two-dimensional case where the electric field itself is the important quantity. We demonstrate that this difference becomes crucial for sufficiently complex magnetic field structures. A numerical method is used to relax a braided magnetic field to an ideal force-free equilibrium; that equilibrium is found to be smooth, with only large-scale current structures. However, the equilibrium is shown to have a highly filamentary integrated parallel current structure with extremely short length scales. An analytical model is developed to show that, in a coronal situation, the length scales associated with the integrated parallel current structures will rapidly decrease with increasing complexity, or degree of braiding, of the magnetic field. Analysis shows the decrease in these length scales will, for any finite resistivity, eventually become inconsistent with the stability of a force-free field. Thus the inevitable consequence of the magnetic braiding process is shown to be a loss of equilibrium of the coronal field, probably via magnetic reconnection events.

  10. An XYZ Parallel-Kinematic Flexure Mechanism With Geometrically

    E-Print Network [OSTI]

    Awtar, Shorya

    An XYZ Parallel-Kinematic Flexure Mechanism With Geometrically Decoupled Degrees of Freedom Shorya of Michigan, Ann Arbor, MI 48109 A novel parallel-kinematic flexure mechanism that provides highly decoupled parallel-kinematic flexure mechanism. The proposed concept is inherently free of geometric overconstraints

  11. Exploiting Visualization and Direct Manipulation to Make Parallel Tools More

    E-Print Network [OSTI]

    Pancake, Cherri M.

    Exploiting Visualization and Direct Manipulation to Make Parallel Tools More Communicative Cherri M@cs.orst.edu http://www.cs.orst.edu/ pancake Abstract. Parallel tools rely on graphical techniques to improve be exploited in parallel tools, in order to improve the naturalness with which the user interacts

  12. IBM Parallel Environment for AIX 5L Introduction

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for AIX 5L Introduction Version 4 Release 2, Modification 2 SA22-7947-04 IBM Parallel Environment for AIX 5L Introduction Version 4 Release 2, Modification 2 SA22 of IBM Parallel Environment for AIX 5L (product number 5765-F83) and to all subsequent releases

  13. IBM Parallel Environment for AIX 5L MPI Programming Guide

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for AIX 5L MPI Programming Guide Version 4 Release 2, Modification 2 SA22-7945-04 IBM Parallel Environment for AIX 5L MPI Programming Guide Version 4 Release 2, Modification 2, Modification 2 of IBM Parallel Environment for AIX 5L (product number 5765-F83) and to all subsequent releases

  14. IBM Parallel Environment for Linux MPI Programming Guide

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for Linux MPI Programming Guide Version 4 Release 2 SA23-2219-00 IBM Parallel Environment for Linux MPI Programming Guide Version 4 Release 2 SA23-2219-00 Note. First Edition (April 2006) This edition applies to version 4, release 2, modification 0 of IBM Parallel

  15. Photon generator

    DOE Patents [OSTI]

    Srinivasan-Rao, Triveni (Shoreham, NY)

    2002-01-01

    A photon generator includes an electron gun for emitting an electron beam, a laser for emitting a laser beam, and an interaction ring wherein the laser beam repetitively collides with the electron beam for emitting a high energy photon beam therefrom in the exemplary form of x-rays. The interaction ring is a closed loop, sized and configured for circulating the electron beam with a period substantially equal to the period of the laser beam pulses for effecting repetitive collisions.

  16. Cluster generator

    DOE Patents [OSTI]

    Donchev, Todor I. (Urbana, IL); Petrov, Ivan G. (Champaign, IL)

    2011-05-31

    Described herein is an apparatus and a method for producing atom clusters based on a gas discharge within a hollow cathode. The hollow cathode includes one or more walls. The one or more walls define a sputtering chamber within the hollow cathode and include a material to be sputtered. A hollow anode is positioned at an end of the sputtering chamber, and atom clusters are formed when a gas discharge is generated between the hollow anode and the hollow cathode.

  17. Electric generator

    DOE Patents [OSTI]

    Foster, Jr., John S. (Pleasanton, CA); Wilson, James R. (Livermore, CA); McDonald, Jr., Charles A. (Danville, CA)

    1983-01-01

    1. In an electrical energy generator, the combination comprising a first elongated annular electrical current conductor having at least one bare surface extending longitudinally and facing radially inwards therein, a second elongated annular electrical current conductor disposed coaxially within said first conductor and having an outer bare surface area extending longitudinally and facing said bare surface of said first conductor, the contiguous coaxial areas of said first and second conductors defining an inductive element, means for applying an electrical current to at least one of said conductors for generating a magnetic field encompassing said inductive element, and explosive charge means disposed concentrically with respect to said conductors including at least the area of said inductive element, said explosive charge means including means disposed to initiate an explosive wave front in said explosive advancing longitudinally along said inductive element, said wave front being effective to progressively deform at least one of said conductors to bring said bare surfaces thereof into electrically conductive contact to progressively reduce the inductance of the inductive element defined by said conductors and transferring explosive energy to said magnetic field effective to generate an electrical potential between undeformed portions of said conductors ahead of said explosive wave front.

  18. OS and Runtime Support for Efficiently Managing Cores in Parallel Applications

    E-Print Network [OSTI]

    Klues, Kevin Alan

    2015-01-01

    “Composing parallel software efficiently with Lithe.” In: ... Runtime Support for Efficiently Managing Cores in Parallel ...

  19. Hybrid Parallelism for Volume Rendering on Large, Multi-core Systems

    E-Print Network [OSTI]

    Howison, Mark

    2010-01-01

    W., and Childs, H. (2010) MPI-hybrid parallelism for volume ... Hybrid Parallelism for Volume Rendering on Large, Multi-core ... characteristics of “hybrid” parallel programming and

  20. Hybrid Parallelism for Volume Rendering on Large, Multi-core Systems

    E-Print Network [OSTI]

    Howison, Mark

    2010-01-01

    Hybrid Parallelism for Volume Rendering on Large, Multi-core ... findings indicate that the hybrid-parallel implementation, at ... passing against a “hybrid” parallel implementation, which

  1. Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program. III: A parallel algorithm for direct opacity sampling

    E-Print Network [OSTI]

    Peter H. Hauschildt; David K. Lowenthal; E. Baron

    2001-04-16

    We describe two parallel algorithms for line opacity calculations based on a local file and on a global file approach. The performance and scalability of both approaches is discussed for different test cases and very different parallel computing systems. The results show that a global file approach is more efficient on high-performance parallel supercomputers with dedicated parallel I/O subsystem whereas the local file approach is very useful on farms of workstations, e.g., cheap PC clusters.

  2. Enhancement by noise in parallel arrays of sensors with power-law characteristics François Chapeau-Blondeau and David Rousseau

    E-Print Network [OSTI]

    Chapeau-Blondeau, François

    for a potentially useful generation of "intelligent" sensors, owing to their response to noise. For the transmission ... Enhancement by noise in parallel arrays of sensors with power-law characteristics François Chapeau ... ; published 22 December 2004) An optimally tuned power-law sensor is shown capable of amplifying the signal

  3. IEEE TRANSACTIONS ON INDUSTRY APPLICATIONS, VOL. 41, NO. 4, JULY/AUGUST 2005 1099 Dynamic Simulation and Analysis of Parallel

    E-Print Network [OSTI]

    Simões, Marcelo Godoy

    mathematical model to describe the transient behavior of a system of self-excited induction generators (SEIGs) operating in parallel and supplying a common load is proposed. Wind turbines with SEIGs are increasingly ... model. An aggregated model of a small wind power system is also proposed. This model was applied

  4. The Parallel and Distributed Algorithms This chapter contains a brief review of both parallel and distributed computing,

    E-Print Network [OSTI]

    Goddard III, William A.

    Chapter VI The Parallel and Distributed Algorithms ... Abstract This chapter contains ... of concurrent computing. Based on the resonance method and program design described in Chapter V, an algorithm ... algorithm lends itself to effective parallelization. The main requirement for effective parallelization

  5. Parallel Dimers and Anti-parallel Tetramers Formed by Epidermal Growth Factor Receptor Pathway Substrate Clone 15 (EPS15)*

    E-Print Network [OSTI]

    Kirchhausen, Tomas

    Parallel Dimers and Anti-parallel Tetramers Formed by Epidermal Growth Factor Receptor Pathway ... -dependent endocytic traffic. We report here that Eps15 forms dimers and tetramers of distinct shape. The Eps ... tetramer has a "dumbbell" shape, 31 nm in length; it is formed by the anti-parallel association of two Eps

  6. Towards next generation ocean models : novel discontinuous Galerkin schemes for 2D unsteady biogeochemical models

    E-Print Network [OSTI]

    Ueckermann, Mattheus Percy

    2009-01-01

    A new generation of efficient parallel, multi-scale, and interdisciplinary ocean models is required for better understanding and accurate predictions. The purpose of this thesis is to quantitatively identify promising ...

  7. Effect on Non-Uniform Heat Generation on Thermionic Reactions

    SciTech Connect (OSTI)

    Schock, Alfred

    2012-01-19

    The penalty resulting from non-uniform heat generation in a thermionic reactor is examined. Operation at sub-optimum cesium pressure is shown to reduce this penalty, but at the risk of a condition analogous to burnout. For high pressure diodes, a simple empirical correlation between current, voltage and heat flux is developed and used to analyze the performance penalty associated with two different heat flux profiles, for series- and parallel-connected converters. The results demonstrate that series-connected converters require much finer power flattening than parallel converters. For example, a ±10% variation in heat generation across a series array can result in a 25 to 50% power penalty.

  8. A brief parallel I/O tutorial.

    SciTech Connect (OSTI)

    Ward, H. Lee

    2010-03-01

    This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. 
The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about the I/O transfer per request.
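
The guidance above (fewer interactions with the I/O system, more data per call) can be illustrated with a minimal sketch. It is not taken from the tutorial: `CountingSink`, `write_unbuffered`, and `write_aggregated` are hypothetical names, and the sink is an in-memory stand-in that only counts calls rather than a real parallel file system.

```python
class CountingSink:
    """In-memory stand-in for a file; counts write calls (hypothetical helper)."""

    def __init__(self):
        self.calls = 0
        self.data = bytearray()

    def write(self, chunk):
        self.calls += 1
        self.data += bytes(chunk)
        return len(chunk)


def write_unbuffered(sink, records):
    """One call to the sink per record: many small requests."""
    for record in records:
        sink.write(record)


def write_aggregated(sink, records, chunk_size):
    """Collect records in memory and issue one large write per chunk."""
    buf = bytearray()
    for record in records:
        buf += record
        if len(buf) >= chunk_size:
            sink.write(buf)
            buf = bytearray()
    if buf:  # flush whatever is left over
        sink.write(buf)


records = [b"x" * 64 for _ in range(10_000)]

small = CountingSink()
write_unbuffered(small, records)

big = CountingSink()
write_aggregated(big, records, chunk_size=64 * 1024)

print(small.calls, big.calls)
```

Both sinks end up holding the same bytes; only the number of requests differs. On a shared file system the second pattern is the one the guidance favors, although the best chunk size depends on the system and is an assumption here.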

  9. Carbothermic reduction with parallel heat sources

    DOE Patents [OSTI]

    Troup, Robert L. (Murrysville, PA); Stevenson, David T. (Washington Township, Washington County, PA)

    1984-12-04

    Disclosed are apparatus and method of carbothermic direct reduction for producing an aluminum alloy from a raw material mix including aluminum oxide, silicon oxide, and carbon wherein parallel heat sources are provided by a combustion heat source and by an electrical heat source at essentially the same position in the reactor, e.g., such as at the same horizontal level in the path of a gravity-fed moving bed in a vertical reactor. The present invention includes providing at least 79% of the heat energy required in the process by the electrical heat source.

  10. Parallel State Estimation Assessment with Practical Data

    SciTech Connect (OSTI)

    Chen, Yousu; Jin, Shuangshuang; Rice, Mark J.; Huang, Zhenyu

    2014-10-31

    This paper presents a full-cycle parallel state estimation (PSE) implementation using a preconditioned conjugate gradient algorithm. The developed code is able to solve large-size power system state estimation within 5 seconds using real-world data, comparable to the Supervisory Control And Data Acquisition (SCADA) rate. This achievement allows the operators to know the system status much faster to help improve grid reliability. Case study results of the Bonneville Power Administration (BPA) system with real measurements are presented. The benefits of fast state estimation are also discussed.
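
As a rough illustration of the solver family named in this abstract, the following is a minimal serial sketch of a conjugate gradient iteration with a Jacobi (diagonal) preconditioner. The choice of preconditioner, the function name `pcg`, and the toy 3×3 matrix are assumptions for illustration; the abstract does not state which preconditioner or linear system the authors use, and their code is parallel.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Uses a Jacobi preconditioner (inverse of the diagonal), which is an
    assumption made for this sketch.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                       # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy symmetric positive-definite system standing in for the estimator's equations.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b)
print(x)
```

The matrix-vector product and the dot products dominate each iteration, and they are the operations that are usually distributed across processors in a parallel implementation.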

  11. Parallel heater system for subsurface formations

    DOE Patents [OSTI]

    Harris, Christopher Kelvin (Houston, TX); Karanikas, John Michael (Houston, TX); Nguyen, Scott Vinh (Houston, TX)

    2011-10-25

    A heating system for a subsurface formation is disclosed. The system includes a plurality of substantially horizontally oriented or inclined heater sections located in a hydrocarbon containing layer in the formation. At least a portion of two of the heater sections are substantially parallel to each other. The ends of at least two of the heater sections in the layer are electrically coupled to a substantially horizontal, or inclined, electrical conductor oriented substantially perpendicular to the ends of the at least two heater sections.

  12. Parallel Programming and Optimization for Intel Architecture

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)


  13. A 60 mW per Lane, 4 × 23-Gb/s 2^7 − 1 PRBS Generator

    E-Print Network [OSTI]

    Voinigescu, Sorin Petre

    ) generators and checkers are widely used for testing the correct functionality of high speed digital circuits ... reports an ultra-low-power 2^7 − 1 PRBS generator with 4, appropriately delayed, parallel output streams ... circuitry. The 4-channel PRBS generator consumes 235 mW from 2.5 V, which results in only 60 mW per output
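
For readers unfamiliar with the term, a 2^7 − 1 pseudo-random bit sequence can be modelled in software with a 7-bit linear feedback shift register. The sketch below is an assumption-laden illustration, not the circuit from the paper: it uses the feedback taps commonly written as x^7 + x^6 + 1, and the 32-bit spacing between the four lanes is a hypothetical choice, since the snippet only says the streams are "appropriately delayed".

```python
def prbs7(n, seed=0x7F):
    """Return n bits from a 7-bit LFSR (taps assumed: x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of the two oldest bits
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F       # shift in the feedback bit
    return bits


PERIOD = 2 ** 7 - 1          # 127 bits before the sequence repeats
LANE_DELAY = 32              # hypothetical spacing between lanes, in bits

sequence = prbs7(PERIOD)

# Four lanes carrying the same sequence, each delayed relative to the previous one.
lanes = [
    [sequence[(i - k * LANE_DELAY) % PERIOD] for i in range(PERIOD)]
    for k in range(4)
]

print("".join(str(bit) for bit in sequence[:32]))
```

A software model like this is typically used as the reference pattern on the checker side; the hardware in the paper generates the streams at 23 Gb/s per lane.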

  14. Engineering innovation to reduce wind power COE

    SciTech Connect (OSTI)

    Ammerman, Curtt Nelson

    2011-01-10

    There are enough wind resources in the US to provide 10 times the electric power we currently use, however wind power only accounts for 2% of our total electricity production. One of the main limitations to wind use is cost. Wind power currently costs 5-to-8 cents per kilowatt-hour, which is more than twice the cost of electricity generated by burning coal. Our Intelligent Wind Turbine LDRD Project is applying LANL's leading-edge engineering expertise in modeling and simulation, experimental validation, and advanced sensing technologies to challenges faced in the design and operation of modern wind turbines.

  15. Electric power monthly, February 1999 with data for November 1998

    SciTech Connect (OSTI)

    1999-02-01

    The Electric Power Monthly presents monthly electricity statistics for a wide audience including Congress, Federal and State agencies, the electric utility industry, and the general public. The purpose of this publication is to provide energy decision makers with accurate and timely information that may be used in forming various perspectives on electric issues that lie ahead. Statistics are provided for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt-hour of electricity sold.

  16. Switch for serial or parallel communication networks

    DOE Patents [OSTI]

    Crosette, D.B.

    1994-07-19

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination. 9 figs.

  17. Switch for serial or parallel communication networks

    DOE Patents [OSTI]

    Crosette, Dario B. (DeSoto, TX)

    1994-01-01

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination.

  18. Efficient Algorithms for Parallel Excitation and Parallel Imaging with Large Arrays 

    E-Print Network [OSTI]

    Feng, Shuo

    2013-08-12

    in reconstructions. 2.3 Parallel Excitation The field strength of current clinical scanners is advancing to 3 Tesla or even 7 Tesla, which can tremendously improve the imaging quality. However, many high field related problems remain unsolved, for example...

  19. Sub-Second Parallel State Estimation

    SciTech Connect (OSTI)

    Chen, Yousu; Rice, Mark J.; Glaesemann, Kurt R.; Wang, Shaobu; Huang, Zhenyu

    2014-10-31

    This report describes the performance of the Pacific Northwest National Laboratory (PNNL) sub-second parallel state estimation (PSE) tool using utility data from the Bonneville Power Administration (BPA) and discusses the benefits of the fast computational speed for power system applications. The test data were provided by BPA. They are two days’ worth of hourly snapshots that include power system data and measurement sets in a commercial tool format. These data are extracted from the commercial tool box and fed into the PSE tool. With the help of advanced solvers, the PSE tool is able to solve each BPA hourly state estimation problem within one second, which is more than 10 times faster than today’s commercial tool. This improved computational performance can help increase the reliability value of state estimation in many aspects: (1) the shorter the time required for execution of state estimation, the more time remains for operators to take appropriate actions, and/or to apply automatic or manual corrective control actions. This increases the chances of arresting or mitigating the impact of cascading failures; (2) the SE can be executed multiple times within the time allowance. Therefore, the robustness of SE can be enhanced by repeating the execution of the SE with adaptive adjustments, including removing bad data and/or adjusting different initial conditions to compute a better estimate within the same time as a traditional state estimator’s single estimate. There are other benefits with the sub-second SE: for example, the PSE results can potentially be used in local and/or wide-area automatic corrective control actions that are currently dependent on raw measurements, which minimizes the impact of bad measurements and provides opportunities to enhance power grid reliability and efficiency.
PSE also can enable other advanced tools that rely on SE outputs and could be used to further improve operators’ actions and automated controls to mitigate effects of severe events on the grid. The power grid continues to grow and the number of measurements is increasing at an accelerated rate due to the variety of smart grid devices being introduced. A parallel state estimation implementation will have better performance than traditional, sequential state estimation by utilizing the power of high performance computing (HPC). This increased performance positions parallel state estimators as valuable tools for operating the increasingly complex power grid.

  20. Numerically Efficient Parallel Algorithms Prepared for the

    E-Print Network [OSTI]

    Simulation using High Performance Computing Prepared by New Mexico Tech New Mexico Institute of Mining ... of solutions is found by taking the randomness of wind generation and loads into consideration. A new method

  1. Substantially parallel flux uncluttered rotor machines

    DOE Patents [OSTI]

    Hsu, John S.

    2012-12-11

    A permanent magnet-less and brushless synchronous system includes a stator that generates a magnetic rotating field when sourced by polyphase alternating currents. An uncluttered rotor is positioned within the magnetic rotating field and is spaced apart from the stator. An excitation core is spaced apart from the stator and the uncluttered rotor and magnetically couples the uncluttered rotor. The brushless excitation source generates a magnet torque by inducing magnetic poles near an outer peripheral surface of the uncluttered rotor, and the stator currents also generate a reluctance torque by a reaction of the difference between the direct and quadrature magnetic paths of the uncluttered rotor. The system can be used either as a motor or a generator.

  2. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Faraj, Daniel A

    2014-11-18

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
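
The forwarding rules in this abstract can be traced with a small simulation. The sketch below is an interpretation under stated assumptions, not the patented implementation: the tree is given as a hypothetical `parent` dictionary in which the physical root maps to `None`, "directly connected" is read as the parent plus the children of a node, and messages are delivered through a simple queue.

```python
from collections import deque


def broadcast(parent, logical_root):
    """Simulate the broadcast; returns {node: node it received the message from}."""
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    received_from = {}
    queue = deque()

    # The logical root transmits to every node directly connected to it.
    first_hops = list(children[logical_root])
    if parent[logical_root] is not None:
        first_hops.append(parent[logical_root])
    for target in first_hops:
        queue.append((logical_root, target))

    while queue:
        sender, node = queue.popleft()
        received_from[node] = sender
        if parent[node] is None:
            # Physical root: forward to all children except the sender.
            targets = [c for c in children[node] if c != sender]
        elif parent[node] == sender:
            # Received from the parent: forward to all children (none for a leaf).
            targets = list(children[node])
        else:
            # Received from a child: forward to the other children and to the parent.
            targets = [c for c in children[node] if c != sender] + [parent[node]]
        for target in targets:
            queue.append((node, target))
    return received_from


# Seven-node binary tree; node 0 is the physical root, leaf 4 is the logical root.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
routes = broadcast(parent, logical_root=4)
print(routes)
```

In this example the message climbs from leaf 4 to the physical root and fans out on the way, so every other node receives it exactly once without the tree being re-rooted.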

  3. Internode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.

  4. Optimized data communications in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2014-08-19

    A parallel computer includes nodes that include a network adapter that couples the node in a point-to-point network and supports communications in opposite directions of each dimension. Optimized communications include: receiving, by a network adapter of a receiving compute node, a packet--from a source direction--that specifies a destination node and deposit hints. Each hint is associated with a direction within which the packet is to be deposited. If a hint indicates the packet to be deposited in the opposite direction: the adapter delivers the packet to an application on the receiving node; forwards the packet to a next node in the opposite direction if the receiving node is not the destination; and forwards the packet to a node in a direction of a subsequent dimension if the hints indicate that the packet is to be deposited in the direction of the subsequent dimension.

  5. Optimized data communications in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A

    2014-10-21

    A parallel computer includes nodes that include a network adapter that couples the node in a point-to-point network and supports communications in opposite directions of each dimension. Optimized communications include: receiving, by a network adapter of a receiving compute node, a packet--from a source direction--that specifies a destination node and deposit hints. Each hint is associated with a direction within which the packet is to be deposited. If a hint indicates the packet to be deposited in the opposite direction: the adapter delivers the packet to an application on the receiving node; forwards the packet to a next node in the opposite direction if the receiving node is not the destination; and forwards the packet to a node in a direction of a subsequent dimension if the hints indicate that the packet is to be deposited in the direction of the subsequent dimension.

  6. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Jones, Terry R.; Koenig, Gregory A.

    2014-04-30

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming applications and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.

  7. Link failure detection in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J. (Rochester, MN); Blocksome, Michael A. (Rochester, MN); Megerian, Mark G. (Rochester, MN); Smith, Brian E. (Rochester, MN)

    2010-11-09

    Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
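
The grouping step in this abstract can be sketched for a two-dimensional mesh. The code below is a simplified model under assumptions that are not in the abstract: groups are assigned by coordinate parity (a checkerboard colouring, which guarantees adjacent nodes land in different groups), links are modelled as directed `(sender, receiver)` pairs, and failures are injected through a hypothetical `failed_links` set.

```python
def neighbors(x, y, width, height):
    """Yield the mesh neighbours of (x, y) that lie inside the grid."""
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def detect_failed_links(width, height, failed_links):
    """Return {second-group node: list of neighbours whose test message never arrived}."""
    delivered = set()
    # First group (even parity) sends a test message to each adjacent node.
    for x in range(width):
        for y in range(height):
            if (x + y) % 2 == 0:
                for nb in neighbors(x, y, width, height):
                    if ((x, y), nb) not in failed_links:
                        delivered.add(((x, y), nb))

    # Second group (odd parity) checks which expected messages are missing.
    reports = {}
    for x in range(width):
        for y in range(height):
            if (x + y) % 2 == 1:
                reports[(x, y)] = [
                    nb for nb in neighbors(x, y, width, height)
                    if (nb, (x, y)) not in delivered
                ]
    return reports


# 3 x 3 mesh with one broken link from (0, 0) to (1, 0).
reports = detect_failed_links(3, 3, failed_links={((0, 0), (1, 0))})
print({node: missing for node, missing in reports.items() if missing})
```

This covers only the first-group-to-second-group direction described in the abstract; testing the reverse links would repeat the same procedure with the roles of the two groups swapped.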

  8. Intranode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

    2014-01-07

    Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.
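
The key idea in this abstract, sending to a process that may not exist yet, can be modelled in a few lines. The sketch below is a single-threaded toy under stated assumptions: `SharedRegion` is a hypothetical class standing in for the shared-memory region, Python lists stand in for message buffers, and a list reference plays the role of the pointer.

```python
class SharedRegion:
    """Toy model of a shared-memory region on one compute node (hypothetical)."""

    def __init__(self, num_processes):
        # The first process sets up one buffer per process that will be initialized.
        self.buffers = {rank: [] for rank in range(num_processes)}
        self.initialized = set()

    def send(self, dest_rank, message):
        # No check of whether dest_rank has been initialized yet.
        self.buffers[dest_rank].append(message)

    def init_process(self, rank):
        """Initialize a process and return any messages that arrived early."""
        self.initialized.add(rank)
        buffer_ref = self.buffers[rank]   # stands in for the retrieved pointer
        pending = list(buffer_ref)
        buffer_ref.clear()
        return pending


region = SharedRegion(num_processes=2)
region.init_process(0)                    # first process sets up and starts
region.send(1, "hello from rank 0")       # rank 1 does not exist yet
early = region.init_process(1)            # rank 1 starts and drains its buffer
print(early)
```

Because the buffers exist before their owners do, the sender never has to wait or poll for the receiver to start, which appears to be the point of pre-allocating them.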

  9. Internode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Miller, Douglas R.; Parker, Jeffrey J.; Ratterman, Joseph D.; Smith, Brian E.

    2013-09-03

    Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.

  10. Intranode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

    2013-07-23

    Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.

  11. Parallel detecting, spectroscopic ellipsometers/polarimeters

    DOE Patents [OSTI]

    Furtak, Thomas E. (15927 W. Ellsworth, Golden, CO 80401)

    2002-01-01

    The parallel detecting spectroscopic ellipsometer/polarimeter sensor has no moving parts and operates in real-time for in-situ monitoring of the thin film surface properties of a sample within a processing chamber. It includes a multi-spectral source of radiation for producing a collimated beam of radiation directed towards the surface of the sample through a polarizer. The thus polarized collimated beam of radiation impacts and is reflected from the surface of the sample, thereby changing its polarization state due to the intrinsic material properties of the sample. The light reflected from the sample is separated into four separate polarized filtered beams, each having individual spectral intensities. Data about said four individual spectral intensities is collected within the processing chamber, and is transmitted into one or more spectrometers. The data of all four individual spectral intensities is then analyzed using transformation algorithms, in real-time.

  12. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Jones, Terry R.; Koenig, Gregory A.

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming applications and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.

  13. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Faraj, Ahmad A

    2013-04-16

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
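
    The forwarding rules in this abstract can be traced with a small simulation. This is a sketch under stated assumptions, not the patented method: the tree is given as a hypothetical `parent` dictionary, the physical root is the node whose parent is `None`, and the function name `broadcast` is invented here.

```python
from collections import deque


def broadcast(parent, logical_root):
    """Simulate the broadcast and return how often each node receives the message.

    `parent` maps every node to its parent in the physical tree
    (None marks the physical root).
    """
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    received = {node: 0 for node in parent}
    queue = deque()  # pending (sender, receiver) transmissions

    # The logical root transmits to every directly connected node.
    for child in children[logical_root]:
        queue.append((logical_root, child))
    if parent[logical_root] is not None:
        queue.append((logical_root, parent[logical_root]))

    while queue:
        sender, node = queue.popleft()
        received[node] += 1
        if parent[node] == sender:
            # Received from the parent: forward to all children
            # (a leaf has none, so it forwards nothing).
            targets = list(children[node])
        else:
            # Received from a child: forward to the other children and,
            # unless this node is the physical root, to the parent.
            targets = [c for c in children[node] if c != sender]
            if parent[node] is not None:
                targets.append(parent[node])
        for target in targets:
            queue.append((node, target))
    return received


# Physical tree: A is the physical root; D (a leaf) acts as logical root.
tree = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B"}
counts = broadcast(tree, "D")
```

    Because the network is a tree, every node other than the logical root receives the message exactly once, and the logical root never receives its own message back.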

  14. Evaluating and Utilizing Compute Capabilities of Parallel CPU...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    in a single machine. It is therefore important to use benchmarks that can evaluate relative performance of these architectures in exploiting different types of parallelism....

  15. Parallel machine match-up scheduling with manufacturing cost considerations

    E-Print Network [OSTI]

    Aktürk, M. Selim; Atamtürk, Alper; Gürel, Sinan

    2010-01-01

    approach for the single machine scheduling problem. Journal decisions on parallel CNC machines: -constraint approach. mechanism for the CNC machine scheduling problems with

  16. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and...

    Office of Scientific and Technical Information (OSTI)

    Technical Report: PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes Citation Details In-Document Search...

  17. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and...

    Office of Scientific and Technical Information (OSTI)

    PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes Lichtner, Peter OFM Research; Karra, Satish Los...

  18. Mesoscale Simulations of Particulate Flows with Parallel Distributed...

    Office of Scientific and Technical Information (OSTI)

    Mesoscale Simulations of Particulate Flows with Parallel Distributed Lagrange Multiplier Technique Citation Details In-Document Search Title: Mesoscale Simulations of Particulate...

  19. Mesoscale simulations of particulate flows with parallel distributed...

    Office of Scientific and Technical Information (OSTI)

    Journal Article: Mesoscale simulations of particulate flows with parallel distributed Lagrange multiplier technique Citation Details In-Document Search Title: Mesoscale simulations...

  20. Compiling array computations for the Fresh Breeze Parallel Processor

    E-Print Network [OSTI]

    Ginzburg, Igor Arkadiy

    2007-01-01

    Fresh Breeze is a highly parallel architecture currently under development, which strives to provide high performance scientific computing with simple programmability. The architecture provides for multithreaded determinate ...

  1. A set of parallel, implicit methods for a reconstructed discontinuous...

    Office of Scientific and Technical Information (OSTI)

    methods for a reconstructed discontinuous Galerkin method for compressible flows on 3D hybrid grids Citation Details In-Document Search Title: A set of parallel, implicit methods...

  2. Parallel Large-Neighborhood Search Techniques for LNG Inventory ...

    E-Print Network [OSTI]

    Apr 17, 2014 ... Parallel Large-Neighborhood Search Techniques for LNG Inventory Routing. Badrinarayanan Velamur Asokan(badri.velamur.asokan ***at*** ...

  3. Massively parallel DNA sequencing: the new frontier in biogeography

    E-Print Network [OSTI]

    Rocha, Luiz A.; Bernal, Moisés A.; Gaither, Michelle R.; Alfaro, Michael E.

    2013-01-01

    2007) Population genomics: whole-genome analysis of evolutionary scales. BMC Genomics, 13, 403. Bickford, 2012) Population genomics of parallel adaptation in

  4. Massively Parallel Simulations of Solar Flares and Plasma Turbulence

    E-Print Network [OSTI]

    Grauer, Rainer

    in space- and astrophysical plasma systems include solar flares and hydro- or magnetohydrodynamic turbulence a pure MPI parallelization, which, however requires a careful optimization of the multi

  5. Hybrid MPI/OpenMP parallel support vector machine training

    E-Print Network [OSTI]

    Kristian Woodsend

    2009-01-12

    Jan 12, 2009 ... A parallel implementation of Support Vector Machine training has been developed, using a combination of MPI and OpenMP. Using an interior ...

  6. De Novo Ultrascale Atomistic Simulations On High-End Parallel...

    Office of Scientific and Technical Information (OSTI)

    model transitioning assisted by graph-based event tracking. A tunable hierarchical cellular decomposition parallelization framework then maps the O(N) EDC algorithms onto...

  7. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-09-16

    Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.

  8. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Davis, Kristan D.; Faraj, Daniel A.

    2014-07-22

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and ranges of message sizes so that each algorithm is associated with a separate range of message sizes; receiving in an origin endpoint of the PAMI a data communications instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint, the data communications message characterized by a message size; selecting, from among the associated algorithms and ranges, a data communications algorithm in dependence upon the message size; and transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
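
    Selecting an algorithm from non-overlapping message-size ranges, as this abstract describes, amounts to a sorted-boundary lookup. The sketch below assumes a hypothetical table; the boundary values and the algorithm names `eager`, `rendezvous`, and `dma_bulk` are illustrative and do not come from the patent or from the PAMI library.

```python
import bisect

# Hypothetical table: algorithm i handles sizes up to and including
# UPPER_BOUNDS[i]; the last algorithm handles everything larger.
UPPER_BOUNDS = [256, 65536]                       # bytes
ALGORITHMS = ["eager", "rendezvous", "dma_bulk"]  # one more entry than bounds


def select_algorithm(message_size):
    """Return the algorithm whose size range contains `message_size`."""
    if message_size < 0:
        raise ValueError("message size must be non-negative")
    # bisect_left finds the first bound that is >= message_size, so each
    # size falls into exactly one range.
    index = bisect.bisect_left(UPPER_BOUNDS, message_size)
    return ALGORITHMS[index]


choices = [select_algorithm(size) for size in (0, 256, 257, 65536, 65537)]
```

    Keeping the boundaries sorted makes each lookup logarithmic in the number of ranges and guarantees that the ranges are separate, as the abstract requires.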

  9. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-09-02

    Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.

  10. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-11-18

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (`RTS`) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.

  11. Data communications for a collective operation in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A

    2013-07-16

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and bit masks; receiving in an origin endpoint of the PAMI a collective instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint; constructing a bit mask for the received collective instruction; selecting, from among the associated algorithms and bit masks, a data communications algorithm in dependence upon the constructed bit mask; and executing the collective instruction, transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
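
    A bit-mask lookup of the kind outlined in this abstract might look like the following sketch. The abstract does not say which properties are encoded or how masks are compared, so the `Trait` flags, the 1024-byte threshold, the algorithm names, and the "first entry whose required bits are all set" matching rule are all assumptions.

```python
from enum import IntFlag


class Trait(IntFlag):
    """Hypothetical properties of a collective instruction, one bit each."""
    BROADCAST = 1
    REDUCE = 2
    SMALL_MESSAGE = 4
    CONTIGUOUS = 8


# Hypothetical association of required bit masks with algorithm names,
# ordered from most to least specific.
ALGORITHM_TABLE = [
    (Trait.BROADCAST | Trait.SMALL_MESSAGE, "binomial_tree_broadcast"),
    (Trait.BROADCAST, "pipelined_broadcast"),
    (Trait.REDUCE | Trait.CONTIGUOUS, "ring_reduce"),
]


def build_mask(kind, message_size, contiguous):
    """Construct the bit mask for one collective instruction."""
    mask = Trait(0)
    if kind == "broadcast":
        mask |= Trait.BROADCAST
    elif kind == "reduce":
        mask |= Trait.REDUCE
    if message_size <= 1024:
        mask |= Trait.SMALL_MESSAGE
    if contiguous:
        mask |= Trait.CONTIGUOUS
    return mask


def select_algorithm(mask):
    """Return the first algorithm whose required bits are all present in `mask`."""
    for required, name in ALGORITHM_TABLE:
        if mask & required == required:
            return name
    return "generic_fallback"


small_bcast = select_algorithm(build_mask("broadcast", 64, True))
large_bcast = select_algorithm(build_mask("broadcast", 10**6, False))
sparse_reduce = select_algorithm(build_mask("reduce", 10**6, False))
```

    Encoding the instruction's properties as bits reduces each table comparison to a single AND and equality test.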

  12. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A; Mamidala, Amith R

    2014-02-11

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  13. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Davis, Kristan D; Faraj, Daniel A

    2013-07-09

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and ranges of message sizes so that each algorithm is associated with a separate range of message sizes; receiving in an origin endpoint of the PAMI a data communications instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint, the data communications message characterized by a message size; selecting, from among the associated algorithms and ranges, a data communications algorithm in dependence upon the message size; and transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.

  14. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2013-09-03

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  15. Coiled transmission line pulse generators

    DOE Patents [OSTI]

    McDonald, Kenneth Fox (Columbia, MO)

    2010-11-09

    Methods and apparatus are provided for fabricating and constructing solid dielectric "Coiled Transmission Line" pulse generators in radial or axial coiled geometries. The pour and cure fabrication process enables a wide variety of geometries and form factors. The volume between the conductors is filled with liquid blends of monomers, polymers, oligomers, and/or cross-linkers and dielectric powders; and then cured to form high field strength and high dielectric constant solid dielectric transmission lines that intrinsically produce ideal rectangular high voltage pulses when charged and switched into matched impedance loads. Voltage levels may be increased by Marx and/or Blumlein principles incorporating spark gap or, preferentially, solid state switches (such as optically triggered thyristors) which produce reliable, high repetition rate operation. Moreover, these Marxed pulse generators can be DC charged and do not require additional pulse forming circuitry, pulse forming lines, transformers, or a high voltage spark gap output switch. The apparatus accommodates a wide range of voltages, impedances, pulse durations, pulse repetition rates, and duty cycles. The resulting mobile or flight platform friendly cylindrical geometric configuration is much more compact, light-weight, and robust than conventional linear geometries, or pulse generators constructed from conventional components. Installing additional circuitry may accommodate optional pulse shape improvements. The Coiled Transmission Lines can also be connected in parallel to decrease the impedance, or in series to increase the pulse length.

  16. An improved RNS generator kn 2 based on

    E-Print Network [OSTI]

    Sousa, Leonel

    An improved RNS generator kn ±2 based on threshold logic line 2: name Affiliation (Author) Abstract of the Residue Number System (RNS) offers the potential for high-speed and parallel arithmetic. RNS is a carry systems [1]. RNS has shown significant efficiency in implementing different types of Digital Signal

  17. Current parallel I/O limitations to scalable data analysis.

    SciTech Connect (OSTI)

    Mascarenhas, Ajith Arthur; Pebay, Philippe Pierre

    2011-07-01

    This report describes the limitations to parallel scalability which we have encountered when applying our otherwise optimally scalable parallel statistical analysis tool kit to large data sets distributed across the parallel file system of the current premier DOE computational facility. This report describes our study to evaluate the effect of parallel I/O on the overall scalability of a parallel data analysis pipeline using our scalable parallel statistics tool kit [PTBM11]. To this end, we tested it using the Jaguar-pf DOE/ORNL peta-scale platform on a large combustion simulation data set under a variety of process counts and domain decomposition scenarios. In this report we have recalled the foundations of the parallel statistical analysis tool kit which we have designed and implemented, with the specific double intent of reproducing typical data analysis workflows, and achieving optimal design for scalable parallel implementations. We have briefly reviewed those earlier results and publications which allow us to conclude that we have achieved both goals. However, in this report we have further established that, when used in conjunction with a state-of-the-art parallel I/O system, as can be found on the premier DOE peta-scale platform, the scaling properties of the overall analysis pipeline comprising parallel data access routines degrade rapidly. This finding is problematic and must be addressed if peta-scale data analysis is to be made scalable, or even possible. In order to attempt to address these parallel I/O limitations, we will investigate the use of the Adaptable IO System (ADIOS) [LZL+10] to improve I/O performance, while maintaining flexibility for a variety of IO options, such as MPI IO and POSIX IO. This system is developed at ORNL and other collaborating institutions, and is being tested extensively on Jaguar-pf. Simulation code being developed on these systems will also use ADIOS to output the data, thereby making it easier for other systems, such as ours, to process that data.

  18. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    SciTech Connect (OSTI)

    NONE

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  19. Generators, Recursion, and Fractals 1 Generators

    E-Print Network [OSTI]

    Verschelde, Jan

    Generators, Recursion, and Fractals 1 Generators computing a list of Fibonacci numbers defining a generator with yield putting yield in the function fib 2 Recursive Functions computing factorials, 24 April 2015 Intro to Computer Science (MCS 260) generators and recursion L-41 24 April 2015 1 / 36
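
    The two topics named in this lecture outline, a Fibonacci generator built with `yield` and a recursive factorial, can be sketched as follows. The function name `fib` comes from the snippet; the `limit` parameter, the `factorial` name, and the function bodies are assumptions, since the lecture's own code is not shown.

```python
def fib(limit):
    """Yield the first `limit` Fibonacci numbers, one at a time."""
    a, b = 0, 1
    for _ in range(limit):
        yield a          # execution pauses here until the next value is requested
        a, b = b, a + b


def factorial(n):
    """Compute n! recursively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative integers")
    if n <= 1:
        return 1         # base case ends the recursion
    return n * factorial(n - 1)


first_ten = list(fib(10))
five_factorial = factorial(5)
```

    Unlike a function that builds and returns a full list, the generator produces each value only when asked, so a caller can stop early without computing the rest.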

  20. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-18

    Methods, apparatuses, and computer program products for endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface (`PAMI`) of a parallel computer are provided. Embodiments include establishing by a parallel application a data communications geometry, the geometry specifying a set of endpoints that are used in collective operations of the PAMI, including associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry. Embodiments also include registering in each endpoint in the geometry a dispatch callback function for a collective operation and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  1. Computing Nash Equilibria for Scheduling on Restricted Parallel Links

    E-Print Network [OSTI]

    Mavronicolas, Marios

    Computing Nash Equilibria for Scheduling on Restricted Parallel Links Martin Gairing Thomas L of assigning n jobs to m parallel machines. In a pure Nash equilibrium, no user may improve its own main result, we introduce a polynomial time algorithm to compute from any given assignment a pure Nash

  2. Parallelization of DQMC Simulation for Strongly Correlated Electron Systems

    E-Print Network [OSTI]

    California at Davis, University of

    with novelty by presenting a hybrid granularity parallelization (HGP) scheme that combines algorithmic, the HGP scheme explores the parallelism on different levels and maps the underlying algorithms onto and load balancing are also considered in the proposed HGP scheme. We have implemented the DQMC simulation

  3. Abstract State Machines Capture Parallel Andreas Blass Yuri Gurevich y

    E-Print Network [OSTI]

    Blass, Andreas R.

    Abstract State Machines Capture Parallel Algorithms Andreas Blass Yuri Gurevich y Technical Microsoft Way Redmond, WA 98052 Abstract We give an axiomatic description of parallel, synchronous algo state machine with a background that provides for multisets. Partially supported by NSF grant DMS

  4. Partitioning strategies for parallel KIVA-4 engine simulations

    SciTech Connect (OSTI)

    Torres, D J [Los Alamos National Laboratory; Kong, S C [IOWA STATE UNIV

    2008-01-01

    Parallel KIVA-4 is described and simulated in four different engine geometries. The Message Passing Interface (MPI) was used to parallelize KIVA-4. Partitioning strategies are assessed in light of the fact that cells can become deactivated and activated during the course of an engine simulation, which will affect the load balance between processors.

  5. Can Users Play an Effective Role in Parallel Tools Research?

    E-Print Network [OSTI]

    Pancake, Cherri M.

    Can Users Play an Effective Role in Parallel Tools Research? Cherri M. Pancake Department is a cost-effective way of improving both the quality and the acceptability of tool products. In this paper the foundation for parallel tool design. Integrating users changes the basic nature of the software process

  6. THE INHERENT QUEUING DELAY OF PARALLEL PACKET SWITCHES

    E-Print Network [OSTI]

    Attiya, Hagit

    THE INHERENT QUEUING DELAY OF PARALLEL PACKET SWITCHES (Extended Abstract) Hagit Attiya and David {hagit,hdavid}@cs.technion.ac.il Abstract The parallel packet switch (PPS) is extensively used as the core of contemporary commercial switches. This paper investigates the inherent queuing delay

  7. The Inherent Queuing Delay of Parallel Packet Switches

    E-Print Network [OSTI]

    Hay, David

    The Inherent Queuing Delay of Parallel Packet Switches Hagit Attiya and David Hay Abstract--The parallel packet switch (PPS) extends the inverse multiplexing architecture and is widely used as the core of contemporary commercial switches. This paper investigates the inherent queuing delay introduced by the PPS

  8. 2-SATISFIABILITY AND DIAGNOSING FAULTY PROCESSORS IN MASSIVELY PARALLEL COMPUTING SYSTEMS

    E-Print Network [OSTI]

    Servatius, Brigitte

    2-SATISFIABILITY AND DIAGNOSING FAULTY PROCESSORS IN MASSIVELY PARALLEL COMPUTING SYSTEMS ANSUMAN the number of processors in the system. 1. Introduction In a massively parallel computing system, such as CM be used more often. In multi-processor computers there are two different system modes, a normal operation

  9. Robust Resource Allocations in Parallel Computing Systems: Model and Heuristics

    E-Print Network [OSTI]

    Maciejewski, Anthony A. "Tony"

    Robust Resource Allocations in Parallel Computing Systems: Model and Heuristics Vladimir Shestak1 in parallel computer systems (including heterogeneous clusters) should be allocated to the computational was supported by the Colorado State University Center for Robustness in Computer Systems (funded by the Colorado

  10. Study of Stability Regions in Parallel Connected Boost Converters

    E-Print Network [OSTI]

    Tse, Chi K. "Michael"

    Study of Stability Regions in Parallel Connected Boost Converters Yuehui Huang and Chi K. Tse attractors of parallel connected boost switching converters under a master-slave current sharing scheme. We boost converters. Under the master-slave scheme, one of the converters is the master and the other

  11. Probabilistic Adaptive Load Balancing for Parallel Daniel M. Yellin #1

    E-Print Network [OSTI]

    Paton, Norman

    , Jerusalem, Israel 96951 1 dmy@us.ibm.com Departmento de Computaci´on, CINVESTAV-IPN Av. Inst. Pol. Nal. 2508 D.F, M´exico 07360 2 jbuenabad@cs.cinvestav.mx + School of Computer Science, University partitioned or pipelined parallelism. Partitioned parallelism has the potential to provide scaleable query

  12. System Support for Implicitly Parallel Programming Matthew I. Frank

    E-Print Network [OSTI]

    Frank, Matthew I.

    System Support for Implicitly Parallel Programming Matthew I. Frank Coordinated Science Laboratory Electrical and Computer Engineering University of Illinois at Urbana-Champaign Abstract Implicit sequential semantics, e.g., the C programming language. System tools convert the parallel algorithms

  13. Application of Parallel Imaging to Murine Magnetic Resonance Imaging 

    E-Print Network [OSTI]

    Chang, Chieh-Wei 1980-

    2012-09-21

    . This dissertation describes foundational level work to enable parallel imaging of mice on a 4.7 Tesla/40 cm bore research scanner. Reducing the size of the hardware setup associated with typical parallel imaging was an integral part of achieving the work, as animal...

  14. Parallel Picoliter RT-PCR Assays Using Microfluidics

    E-Print Network [OSTI]

    Quake, Stephen R.

    Parallel Picoliter RT-PCR Assays Using Microfluidics Joshua S. Marcus,, W. French Anderson The development of microfluidic tools for high-throughput nucleic acid analysis has become a burgeoning area of research in the post-genome era. Here, we have developed a microfluidic chip to perform 72 parallel 450-p

  15. Analyzing Parallelism and Domain Similarities in the MAREC Patent Corpus

    E-Print Network [OSTI]

    Riezler, Stefan

    Analyzing Parallelism and Domain Similarities in the MAREC Patent Corpus Katharina W}@cl.uni-heidelberg.de Abstract. Statistical machine translation of patents requires large amounts of sentence-parallel data. Translations of patent text often exist for parts of the patent document, namely title, abstract and claims

  16. On the Interplay of Parallelization, Program Performance, and Energy Consumption

    E-Print Network [OSTI]

    Scarano, Vittorio

    On the Interplay of Parallelization, Program Performance, and Energy Consumption Sangyeun Cho to either minimize the total energy consumption or minimize the energy-delay product. The impact of static through parallel execution of applications, suppressing the power and energy consumption remains an even

  17. A Taxonomy of Parallel Prefix Networks David Harris

    E-Print Network [OSTI]

    Harris, David Money

    A Taxonomy of Parallel Prefix Networks David Harris Harvey Mudd College / Sun Microsystems of logic levels, fanout, and wiring tracks. This paper presents a three-dimensional taxonomy that not only for wide adders. This paper develops a taxonomy of parallel prefix networks based on stages, fanout

  18. A Parallel Visualization Pipeline for Terascale Earthquake Simulations

    E-Print Network [OSTI]

    Ma, Kwan-Liu

    A Parallel Visualization Pipeline for Terascale Earthquake Simulations Hongfeng Yu Kwan-Liu Ma welling@psc.edu ABSTRACT This paper presents a parallel visualization pipeline implemented earthquake and reduce its risk to the general population. The 0 0-7695-2153-3/04 $20.00 (c)2004 IEEE

  19. The Parallel BGL: A Generic Library for Distributed Graph Computations

    E-Print Network [OSTI]

    Lumsdaine, Andrew

    ] and written in a style similar to the C++ Standard Template Library (STL) [38, 46], 1 data types provided The Parallel BGL: A Generic Library for Distributed Graph Computations Douglas Gregor and Andrew,lums}@osl.iu.edu Abstract This paper presents the Parallel BGL, a generic C++ library for distributed graph computation

  20. A Parallel Jacobi Method for the Takagi Factorization Xiaohong Wang

    E-Print Network [OSTI]

    Qiao, Sanzheng

    A Parallel Jacobi Method for the Takagi Factorization Xiaohong Wang Department of Computing-symmetric matrix. We present a multithreading parallel Jacobi algorithm for computing the Takagi factorization Jacobi method, Multithreading. 1 Introduction A matrix A of order n is symmetric if A = A T . When

  1. An Overview of Parallel Computing Marc Moreno Maza

    E-Print Network [OSTI]

    Moreno Maza, Marc

    (Canada) CS2101 Plan 1 Hardware 2 Types of Parallelism 3 Concurrency Platforms: Three Examples Cilk CUDA MPI Hardware Plan 1 Hardware 2 Types of Parallelism 3 Concurrency Platforms: Three Examples Cilk CUDA MPI Hardware von Neumann Architecture In 1945, the Hungarian mathematician John von

  2. A Hierarchical and Parallel Method for Training Support Vector Machines

    E-Print Network [OSTI]

    Lu, Bao-Liang

    handled by many modules. After training, all the trained modules are integrated into a modular system [4A Hierarchical and Parallel Method for Training Support Vector Machines Yimin Wen1,2 and Baoliang sequential methods need long training time, and some of parallel methods lead to generalization accuracy

  3. Is sex categorization from faces really parallel to face recognition?

    E-Print Network [OSTI]

    Rossion, Bruno

    Is sex categorization from faces really parallel to face recognition? Bruno Rossion Department of face processing (Bruce & Young, 1986), sex processing on faces is a parallel function to individual face recognition. One consequence of the model is thus that sex categorization on faces

  4. Highly Available, Fault-Tolerant, Parallel Dataflows Mehul A. Shah

    E-Print Network [OSTI]

    Hellerstein, Joseph M.

    Highly Available, Fault-Tolerant, Parallel Dataflows Mehul A. Shah U.C. Berkeley mashah@cs.berkeley.edu Joseph M. Hellerstein U.C. Berkeley Intel Research, Berkeley jmh@cs.berkeley.edu Eric Brewer U.C. This delicate integration allows us to tolerate failures of portions of a parallel dataflow without

  5. Visualization in a Parallel Processing Environment Robert Haimes

    E-Print Network [OSTI]

    Peraire, Jaime

    is becoming US industry's supercomputer. Computational simulations are being developed today that run supercomputer class hardware (which includes parallel machines or a high-speed network of computational methodologies employing visualization and parallel processing for the extraction of information

  6. Tsunami: Massively Parallel Homomorphic Hashing on Many-core GPUs

    E-Print Network [OSTI]

    Li, Zongpeng

    1 Tsunami: Massively Parallel Homomorphic Hashing on Many-core GPUs Xiaowen Chu Department. In this paper, we present a massively parallel solution, named Tsunami, by exploiting the widely available many-core Graphic Processing Units (GPUs). Tsunami includes the following optimization techniques to achieve

  7. SWAMP+: multiple subsequence alignment using associative massive parallelism

    SciTech Connect (OSTI)

    Steinfadt, Shannon Irene [Los Alamos National Laboratory; Baker, Johnnie W [KENT STATE UNIV.

    2010-10-18

    A new parallel algorithm SWAMP+ incorporates the Smith-Waterman sequence alignment on an associative parallel model known as ASC. It is a highly sensitive parallel approach that expands traditional pairwise sequence alignment. This is the first parallel algorithm to provide multiple non-overlapping, non-intersecting subsequence alignments with the accuracy of Smith-Waterman. The efficient algorithm provides multiple alignments similar to BLAST while creating a better workflow for the end users. The parallel portions of the code run in O(m+n) time using m processors. When m = n, the algorithmic analysis becomes O(n) with a coefficient of two, yielding a linear speedup. Implementation of the algorithm on the SIMD ClearSpeed CSX620 confirms this theoretical linear speedup with real timings.
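    The Smith-Waterman recurrence that the record above builds on can be illustrated with a small sequential sketch. This is not SWAMP+ and does not model the ASC associative machine; the function name, the linear gap penalty, and the scoring values are assumptions chosen for illustration. The loop sweeps the score matrix by anti-diagonals, since every cell on one anti-diagonal depends only on earlier anti-diagonals. For non-empty sequences of lengths m and n there are m + n - 1 anti-diagonals, which is one way to see why a parallel sweep can finish in O(m+n) steps.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best local alignment score, number of anti-diagonals swept).

    Illustrative only: linear gap penalty, score only, no traceback.
    """
    m, n = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1] and b[j-1]
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    # Anti-diagonal d holds the cells with i + j == d. Cells on the same
    # anti-diagonal are independent, so a parallel machine could update
    # them all in one step; here they are visited one after another.
    for d in range(2, m + n + 1):
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + s,  # extend with a match or mismatch
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best, m + n - 1


score, steps = smith_waterman_score("ACGT", "ACGT")
```

    With the assumed scores, two identical sequences of length 4 give a score of 8 over 7 anti-diagonal steps, while two sequences with no characters in common score 0.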

  8. Broadcasting collective operation contributions throughout a parallel computer

    DOE Patents [OSTI]

    Faraj, Ahmad (Rochester, MN)

    2012-02-21

    Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.

  9. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Olivier, Stephen L.; de Supinski, Bronis R.; Schulz, Martin; Prins, Jan F.

    2013-01-01

    Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.

  10. Transparent runtime parallelization of the R scripting language

    SciTech Connect (OSTI)

    Yoginath, Srikanth B [ORNL

    2011-01-01

    Scripting languages such as R and Matlab are widely used in scientific data processing. As the data volume and the complexity of analysis tasks both grow, sequential data processing using these tools often becomes the bottleneck in scientific workflows. We describe pR, a runtime framework for automatic and transparent parallelization of the popular R language used in statistical computing. Recognizing scripting languages' interpreted nature and data analysis codes' use pattern, we propose several novel techniques: (1) applying parallelizing compiler technology to runtime, whole-program dependence analysis of scripting languages, (2) incremental code analysis assisted with evaluation results, and (3) runtime parallelization of file accesses. Our framework does not require any modification to either the source code or the underlying R implementation. Experimental results demonstrate that pR can exploit both task and data parallelism transparently and overall has better performance as well as scalability compared to an existing parallel R package that requires code modification.

  11. Parallel architecture for real-time simulation. Master's thesis

    SciTech Connect (OSTI)

    Cockrell, C.D.

    1989-01-01

    This thesis is concerned with the development of a very fast and highly efficient parallel computer architecture for real-time simulation of continuous systems. Currently, several parallel processing systems exist that may be capable of executing a complex simulation in real-time. These systems are examined and the pros and cons of each system discussed. The thesis then introduces a custom-designed parallel architecture based upon The University of Alabama's OPERA architecture. Each component of this system is discussed and rationale presented for its selection. The problem selected, real-time simulation of the Space Shuttle Main Engine for the test and evaluation of the proposed architecture, is explored, identifying the areas where parallelism can be exploited and parallel processing applied. Results from the test and evaluation phase are presented and compared with the results of the same problem that has been processed on a uniprocessor system.

  12. Generation and analysis of digital terrain models with parallel guidance systems for

    E-Print Network [OSTI]

    · Conclusions and outlook Rostock University, Chair for Geodesy and GeoInformatics 2 Introduction and GeoInformatics · Beside the guiding function, GPS also provides the altitude which may be used.000 Rostock University, Chair for Geodesy and GeoInformatics Suitable for high precision DEM No Yes Yes

  13. Compiler Transformations to Generate Reentrant C Programs to Assist Software Parallelization

    E-Print Network [OSTI]

    Smith, Adam

    2009-06-16

    -consuming, and error-prone. In this paper we describe a system to provide a semi-automated mechanism for users to still be able to use statics and globals in their programs, and to let the compiler automatically convert them into their semantically-equivalent reentrant...

  14. Stochastic Acceleration in Relativistic Parallel Shocks

    E-Print Network [OSTI]

    Joni J. P. Virtanen; Rami Vainio

    2004-11-08

    (abridged) We present results of test-particle simulations on both the first and the second order Fermi acceleration at relativistic parallel shock waves. We consider two scenarios for particle injection: (i) particles injected at the shock front, then accelerated at the shock by the first order mechanism and subsequently by the stochastic process in the downstream region; and (ii) particles injected uniformly throughout the downstream region to the stochastic process. We show that regardless of the injection scenario, depending on the magnetic field strength, plasma composition, and the employed turbulence model, the stochastic mechanism can have considerable effects on the particle spectrum on temporal and spatial scales too short to be resolved in extragalactic jets. Stochastic acceleration is shown to be able to produce spectra that are significantly flatter than the limiting case of particle energy spectral index -1 of the first order mechanism. Our study also reveals a possibility of re-acceleration of the stochastically accelerated spectrum at the shock, as particles at high energies become more and more mobile as their mean free path increases with energy. Our findings suggest that the role of the second order mechanism in the turbulent downstream of a relativistic shock with respect to the first order mechanism at the shock front has been underestimated in the past, and that the second order mechanism may have significant effects on the form of the particle spectra and its evolution.

  15. Understanding and Managing Generation Y

    E-Print Network [OSTI]

    Wallace, Kevin

    2007-12-14

    There are four generations in the workplace today; they consist of the Silent Generation, Baby Boom Generation, Generation X, and Generation Y. Generation Y, being the newest generation, is the least understood generation although marketers...

  16. Enhancing the Performance of a Multiplayer Game by Using a Parallelizing Compiler

    E-Print Network [OSTI]

    Kasahara, Hironori

    Enhancing the Performance of a Multiplayer Game by Using a Parallelizing Compiler Yasir I. M. Al performance enhancement in Video Games when using parallelizing compilers and the difficulties involved parallelizing compilers in extracting parallelism. Next, the program is compiled using a parallelizing compiler

  17. Development of Large Scale High Performance Applications with a Parallelizing Compiler

    E-Print Network [OSTI]

    Vlad, Gregorio

    Development of Large Scale High Performance Applications with a Parallelizing Compiler B. DI parallel computations, and lack of robustness of parallelizing HPF compilers in handling large sized codes directives, into explicitly parallel code, by means of parallelizing compilers. This method is not only

  18. A Parallel Coiled-Coil Tetramer with Offset Helices

    SciTech Connect (OSTI)

    Liu,J.; Deng, Y.; Zheng, Q.; Cheng, C.; Kallenbach, N.; Lu, M.

    2006-01-01

    Specific helix-helix interactions are fundamental in assembling the native state of proteins and in protein-protein interfaces. Coiled coils afford a unique model system for elucidating principles of molecular recognition between α helices. The coiled-coil fold is specified by a characteristic seven amino acid repeat containing hydrophobic residues at the first (a) and fourth (d) positions. Nonpolar side chains spaced three and four residues apart are referred to as the 3-4 hydrophobic repeat. The presence of apolar amino acids at the e or g positions (corresponding to a 3-3-1 hydrophobic repeat) can provide new possibilities for close-packing of α-helices that includes examples such as the lac repressor tetramerization domain. Here we demonstrate that an unprecedented coiled-coil interface results from replacement of three charged residues at the e positions in the dimeric GCN4 leucine zipper by nonpolar valine side chains. Equilibrium circular dichroism and analytical ultracentrifugation studies indicate that the valine-containing mutant forms a discrete α-helical tetramer with a significantly higher stability than the parent leucine-zipper molecule. The 1.35 Å resolution crystal structure of the tetramer reveals a parallel four-stranded coiled coil with a three-residue interhelical offset. The local packing geometry of the three hydrophobic positions in the tetramer conformation is completely different from that seen in classical tetrameric structures yet bears resemblance to that in three-stranded coiled coils. These studies demonstrate that distinct van der Waals interactions beyond the a and d side chains can generate a diverse set of helix-helix interfaces and three-dimensional supercoil structures.

  19. Performance of a 3D Spectral code on the Cray T3D and IBM SP2 parallel supercomputers

    E-Print Network [OSTI]

    Brummell, Nic

    Performance of a 3D Spectral code on the Cray T3D and IBM SP2 parallel supercomputers Clive F of the new generation of distributed memory supercomputers, in particular the Cray T3D and IBM SP2, we of 256^3 and 512^3. The first two tables are for the Cray T3D and the other two for the IBM SP2

  20. Xyce Parallel Electronic Simulator : users' guide, version 4.1.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2009-02-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  1. Xyce parallel electronic simulator : users' guide. Version 5.1.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2009-11-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

  2. Parallel Hall effect from 3D single-component metamaterials

    E-Print Network [OSTI]

    Kern, Christian; Wegener, Martin

    2015-01-01

    We propose a class of three-dimensional metamaterial architectures composed of a single doped semiconductor (e.g., n-Si) in air or vacuum that lead to unusual effective behavior of the classical Hall effect. Using an anisotropic structure, we numerically demonstrate a Hall voltage that is parallel---rather than orthogonal---to the external static magnetic-field vector ("parallel Hall effect"). The sign of this parallel Hall voltage can be determined by a structure parameter. Together with the previously demonstrated positive or negative orthogonal Hall voltage, we demonstrate four different sign combinations

  3. TECA: A Parallel Toolkit for Extreme Climate Analysis

    SciTech Connect (OSTI)

    Prabhat, Mr; Ruebel, Oliver; Byna, Surendra; Wu, Kesheng; Li, Fuyu; Wehner, Michael; Bethel, E. Wes

    2012-03-12

    We present TECA, a parallel toolkit for detecting extreme events in large climate datasets. Modern climate datasets expose parallelism across a number of dimensions: spatial locations, timesteps and ensemble members. We design TECA to exploit these modes of parallelism and demonstrate a prototype implementation for detecting and tracking three classes of extreme events: tropical cyclones, extra-tropical cyclones and atmospheric rivers. We process a modern TB-sized CAM5 simulation dataset with TECA, and demonstrate good runtime performance for the three case studies.

  4. Generator stator core vent duct spacer posts

    DOE Patents [OSTI]

    Griffith, John Wesley (Schenectady, NY); Tong, Wei (Clifton Park, NY)

    2003-06-24

    Generator stator cores are constructed by stacking many layers of magnetic laminations. Ventilation ducts may be inserted between these layers by inserting spacers into the core stack. The ventilation ducts allow for the passage of cooling gas through the core during operation. The spacers or spacer posts are positioned between groups of the magnetic laminations to define the ventilation ducts. The spacer posts are secured with longitudinal axes thereof substantially parallel to the core axis. With this structure, core tightness can be assured while maximizing ventilation duct cross section for gas flow and minimizing magnetic loss in the spacers.

  5. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-11

    Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  6. Earthquake Ground Motion Modeling on Parallel Computers Hesheng Bao

    E-Print Network [OSTI]

    California at Berkeley, University of

    numerical methods for applying seismic forces, incorporating absorbing boundaries, and solving unstructured PDE solvers, parallelizing compilers, seismic wave propagation, strong ground motion. 1 be designed to resist earthquakes and existing structures be retrofitted as necessary. Assessing the free

  7. Parallel Performance of Some Two-Level ASPIN Algorithms

    E-Print Network [OSTI]

    Cai, Xiao-Chuan

    Parallel Performance of Some Two-Level ASPIN Algorithms Leszek Marcinkowski1 and Xiao-Chuan Cai2 1 Marcinkowski and Xiao-Chuan Cai preconditioned inexact Newton methods (ASPIN) were recently proposed in Cai

  8. Control system design for a parallel hybrid electric vehicle 

    E-Print Network [OSTI]

    Buntin, David Leighton

    1994-01-01

    This thesis addresses the design of control systems for a parallel hybrid electric drive train which is an alternative to conventional passenger vehicles. The principle components of the drive train are a small internal combustion engine...

  9. Distributed Point Objects: A new concept for parallel finite elements

    E-Print Network [OSTI]

    Wieners, Christian

    , S. Diebels, W. Ehlers: Parallel 3-d simulations for porous media models in soil mechanics, Comput, porous media simulation, soil mechanics Preprint submitted to Elsevier Science 28 February 2003

  10. Reducing Concurrency Bottlenecks in Parallel I/O Workloads

    SciTech Connect (OSTI)

    Manzanares, Adam C. [Los Alamos National Laboratory; Bent, John M. [Los Alamos National Laboratory; Wingate, Meghan [Los Alamos National Laboratory

    2011-01-01

    To enable high performance parallel checkpointing we introduced the Parallel Log Structured File System (PLFS). PLFS is middleware interposed on the file system stack to transform concurrent writing of one application file into many non-concurrently written component files. The promising effectiveness of PLFS makes it important to examine its performance for workloads other than checkpoint capture, notably the different ways that state snapshots may be later read, to make the case for using PLFS in the Exascale I/O stack. Reading a PLFS file involved reading each of its component files. In this paper we identify performance limitations on broader workloads in an early version of PLFS, specifically the need to build and distribute an index for the overall file, and the pressure on the underlying parallel file system's metadata server, and show how PLFS's decomposed components architecture can be exploited to alleviate bottlenecks in the underlying parallel file system.
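    The decomposition described in the record above, where concurrent writes to one logical file become separate per-writer component files that are located through an index, can be sketched with a toy in-memory model. This is not the PLFS API: the class name, the method names, and the index tuple layout are hypothetical, and byte arrays stand in for component files.

```python
class LogStructuredFile:
    """Toy model: one logical file backed by one append-only log per writer."""

    def __init__(self):
        self.logs = {}   # writer id -> bytearray standing in for a component file
        self.index = []  # entries: (logical_offset, length, writer, physical_offset)

    def write(self, writer, logical_offset, data):
        # Each writer appends only to its own log, so writers never
        # contend for the same component file.
        log = self.logs.setdefault(writer, bytearray())
        self.index.append((logical_offset, len(data), writer, len(log)))
        log.extend(data)

    def read_all(self):
        # Reading the logical file means consulting the index and
        # visiting every component log it refers to.
        size = max((off + n for off, n, _, _ in self.index), default=0)
        out = bytearray(size)
        for off, n, writer, phys in self.index:
            # Later index entries overwrite earlier ones at the same offsets.
            out[off:off + n] = self.logs[writer][phys:phys + n]
        return bytes(out)


f = LogStructuredFile()
f.write(writer=1, logical_offset=4, data=b"efgh")
f.write(writer=0, logical_offset=0, data=b"abcd")
contents = f.read_all()
```

    The sketch makes the read-side cost visible: `read_all` needs the whole index and touches every component log, which corresponds to the index construction and distribution cost the abstract identifies.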

  11. Some applications of pipelining techniques in parallel scientific computing 

    E-Print Network [OSTI]

    Deng, Yuanhua

    1996-01-01

    columnwise partitioning schemes. For chasing algorithms, in addition to the pipelining, we apply block-cyclic partitioning, group message-passing techniques to enhance the performance of the pipelined parallel algorithms. The numerical results for the use...

  12. Scioto: A Framework for Global-View Task Parallelism

    SciTech Connect (OSTI)

    Dinan, James S.; Krishnamoorthy, Sriram; Larkins, D. B.; Nieplocha, Jaroslaw; Sadayappan, Ponnuswamy

    2008-09-09

    We introduce Scioto, Shared Collections of Task Objects, a framework for supporting task-parallelism in one-sided and global-view parallel programming models. Scioto provides lightweight, locality aware dynamic load balancing and interoperates with existing parallel models including MPI, SHMEM, CAF, and Global Arrays. Through task parallelism, the Scioto framework provides a solution for overcoming load imbalance and heterogeneity as well as dynamic mapping of computation onto emerging multicore architectures. In this paper, we present the design and implementation of the Scioto framework and demonstrate its effectiveness on the Unbalanced Tree Search (UTS) benchmark and two quantum chemistry codes: the closed shell Self-Consistent Field (SCF) method and a sparse tensor contraction kernel extracted from a coupled cluster computation. We explore the efficiency and scalability of Scioto through these sample applications and demonstrate that it offers low overhead, achieves good performance on heterogeneous and multicore clusters, and scales to hundreds of processors.

  13. A parallel hypothesis method of autonomous underwater vehicle navigation

    E-Print Network [OSTI]

    LaPointe, Cara Elizabeth Grupe

    2009-01-01

    This research presents a parallel hypothesis method for autonomous underwater vehicle navigation that enables a vehicle to expand the operating envelope of existing long baseline acoustic navigation systems by incorporating ...

  14. Scheduling on the MasPar SIMD parallel computer 

    E-Print Network [OSTI]

    Perkins, Keith Douglas

    1995-01-01

    This thesis studies the feasibility of a task scheduler for a parallel operating system. After analyzing several task scheduling algorithms, the highest level first algorithm was chosen. This algorithm has been empirically found to build schedules...

  15. Design and evaluation of the Hamal parallel computer

    E-Print Network [OSTI]

    Grossman, J. P., 1973-

    2003-01-01

    Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new ...

  16. Design and Evaluation of the Hamal Parallel Computer

    E-Print Network [OSTI]

    Grossman, J.P.

    2002-12-05

    Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new ...

  17. Introduction of static load balancing in incremental parallel programming

    E-Print Network [OSTI]

    Goodman, J.

    Goodman, J.; O'Donnell, J.T. Proceedings of Euro-Par 2001 Parallel Processing, Lecture Notes in Computer Science, vol. 2150, pp. 535-539, Springer

  18. A network based model for heterogeneous parallel computation 

    E-Print Network [OSTI]

    Sathye, Adwait B.

    1993-01-01

    The computational requirements of science and engineering demand computational resources orders of magnitude beyond those of current sequential machines. Most of the research effort has been concentrated upon the creation of parallel algorithms...

  19. Speeding up Parallel Graph Coloring Assefaw H. Gebremedhin1,

    E-Print Network [OSTI]

    Manne, Fredrik

    Speeding up Parallel Graph Coloring Assefaw H. Gebremedhin1, , Fredrik Manne2 , and Tom Woods2 1. Gebremedhin, Fredrik Manne, and Tom Woods However, in practice greedy sequential coloring heuristics have been

  20. Natural convection flows in parallel connected vertical channels with boiling

    E-Print Network [OSTI]

    Eselgroth, Peter Ward

    1967-01-01

    The steady-state flow configuration in an array of parallel heated channels is examined with the objective of predicting the behavior of a reactor during a loss of flow accident. A method of combining the results of single ...

  1. Optimized control studies of a parallel hybrid electric vehicle 

    E-Print Network [OSTI]

    Bougler, Benedicte Bernadette

    1995-01-01

    This thesis addresses the development of a control scheme to maximize automobile fuel economy and battery state-of-charge (SOC) while meeting exhaust emission standards for parallel hybrid electric vehicles, which are an alternative to conventional...

  2. Parallel Processing Letters © World Scientific Publishing Company

    E-Print Network [OSTI]

    Mavronicolas, Marios

    Parallel Processing Letters © World Scientific Publishing Company THE PRICE OF ANARCHY of Computer Science, Electrical Engineering and Mathematics, University of Paderborn, D-33102 Paderborn, Germany § Department of Computer Science, University of Cyprus, Nicosia CY-1678, Cyprus Received (received

  3. Parallel adaptive numerical schemes for hyperbolic systems of ...

    E-Print Network [OSTI]

    1910-50-80

    We describe a parallel implementation of the algorithm on the ... methods that use piecewise linear functions as approximations to solutions of evolution equations: ..... execution of a system disk management routine for a fraction of a second.

  4. Scheduling of real-time communication network for parallel processing 

    E-Print Network [OSTI]

    Li, Hung

    1995-01-01

    As real-time applications become more and more complicated their demands of processing capacity can hardly be satisfied. Massively parallel computers, such as Intel Paragon, with their scalable architecture and tremendous ...

  5. Automatic Thread-Level Parallelization in the Chombo AMR Library

    E-Print Network [OSTI]

    Christen, Matthias

    2012-01-01

    Automatic Thread-Level Parallelization in the Chombo AMR used target language for an automatic migration of the large macros a perfect target for automatic fine-grained loop-level

  6. Large-Scale Molecular Dynamics Simulations for Highly Parallel Infrastructures

    E-Print Network [OSTI]

    Pazúriková, Jana

    2014-01-01

    Computational chemistry allows researchers to experiment in silico by running computer simulations of biological or chemical processes of interest. Molecular dynamics with a molecular mechanics model of interactions simulates the N-body problem of atoms: it computes the movements of atoms according to Newtonian physics and empirical descriptions of atomic electrostatic interactions. These simulations require high performance computing resources, as evaluations within each step are computationally demanding and billions of steps are needed to reach interesting timescales. Current methods decompose the spatial domain of the problem and calculate on parallel/distributed infrastructures. Even the methods with the highest strong scaling hit the limit at half a million cores: they are not able to cut the time to result if provided with more processors. At the dawn of exascale computing with massively parallel computational resources, we want to increase the level of parallelism by incorporating parallel-in-time comput...

  7. Design algorithms for parallel transmission in magnetic resonance imaging

    E-Print Network [OSTI]

    Setsompop, Kawin

    2008-01-01

    The focus of this dissertation is on the algorithm design, implementation, and validation of parallel transmission technology in Magnetic Resonance Imaging (MRI). Novel algorithms are proposed which yield excellent excitation ...

  8. Towards Energy Aware Scheduling for Precedence Constrained Parallel

    E-Print Network [OSTI]

    Towards Energy Aware Scheduling for Precedence Constrained Parallel Tasks: Jong Youl Cho Community Grids Lab, Indiana University ... Outlook · Background; Energy model ... Cluster model ... Job model DAG model: T = (J, E) ... Job

  9. Provably good race detection that runs in parallel

    E-Print Network [OSTI]

    Fineman, Jeremy T

    2005-01-01

    A multithreaded parallel program that is intended to be deterministic may exhibit nondeterminism due to bugs called determinacy races. A key capability of race detectors is to determine whether one thread executes logically ...

  10. Parallel Lossless Image Compression Using Huffman and Arithmetic Coding

    E-Print Network [OSTI]

    Howard, Paul G.; Vitter, Jeffrey Scott

    1996-01-01

    We show that high-resolution images can be encoded and decoded efficiently in parallel. We present an algorithm based on the hierarchical MLP method, used either with Huffman coding or with a new variant of arithmetic coding ...

  11. Embracing diversity : improving performance for parallel storage systems built with heterogeneous disks

    E-Print Network [OSTI]

    Bruno, Gregory DuVall

    2008-01-01

    Figure I.2: Parallel Storage System Architecture ... Figure ... Heterogeneous Parallel Storage Systems ... B. Model ... disks on a multimedia storage system with random data

  12. MEMS-based Massively-parallelized Mechanoporation Instrumentation for Ultrahigh Throughput Cellular Manipulation

    E-Print Network [OSTI]

    Zhang, Yanyan

    2012-01-01

    OF CALIFORNIA RIVERSIDE MEMS-based Massively-parallelized ... OF THE DISSERTATION MEMS-based Massively-parallelized ... a massively-parallelized MEMS-based platform for passively

  13. Generation gaps in engineering?

    E-Print Network [OSTI]

    Kim, David J. (David Jinwoo)

    2008-01-01

    There is much enthusiastic debate on the topic of generation gaps in the workplace today; what the generational differences are, how to address the apparent challenges, and if the generations themselves are even real. ...

  14. Shift: A Massively Parallel Monte Carlo Radiation Transport Package

    SciTech Connect (OSTI)

    Pandya, Tara M [ORNL; Johnson, Seth R [ORNL; Davidson, Gregory G [ORNL; Evans, Thomas M [ORNL; Hamilton, Steven P [ORNL

    2015-01-01

    This paper discusses the massively-parallel Monte Carlo radiation transport package, Shift, developed at Oak Ridge National Laboratory. It reviews the capabilities, implementation, and parallel performance of this code package. Scaling results demonstrate very good strong and weak scaling behavior of the implemented algorithms. Benchmark results from various reactor problems show that Shift results compare well to other contemporary Monte Carlo codes and experimental results.

  15. Allinea DDT as a Parallel Debugging Alternative to Totalview

    SciTech Connect (OSTI)

    Antypas, K.B.

    2007-03-05

    Totalview, from the Etnus Corporation, is a sophisticated and feature-rich software debugger for parallel applications. As Totalview has gained in popularity and market share, its price has increased to the point where it is often prohibitively expensive for massively parallel supercomputers. Additionally, many of Totalview's advanced features are not used by members of the scientific computing community. For these reasons, supercomputing centers have begun to search for a basic parallel debugging tool which can be used as an alternative to Totalview. DDT (Distributed Debugging Tool) from Allinea Software is a relatively new parallel debugging tool which aims to provide much of the same functionality as Totalview. This review outlines the basic features and limitations of DDT to determine if it can be a reasonable substitute for Totalview. DDT was tested on the NERSC platforms Bassi, Seaborg, Jacquard and Davinci with Fortran90, C, and C++ codes using MPI and OpenMP for parallelism.

  16. Sort-First, Distributed Memory Parallel Visualization and Rendering

    SciTech Connect (OSTI)

    Bethel, E. Wes; Humphreys, Greg; Paul, Brian; Brederson, J. Dean

    2003-07-15

    While commodity computing and graphics hardware has increased in capacity and dropped in cost, it is still quite difficult to make effective use of such systems for general-purpose parallel visualization and graphics. We describe the results of a recent project that provides a software infrastructure suitable for general-purpose use by parallel visualization and graphics applications. Our work combines and extends two technologies: Chromium, a stream-oriented framework that implements the OpenGL programming interface; and OpenRM Scene Graph, a pipelined-parallel scene graph interface for graphics data management. Using this combination, we implement a sort-first, distributed memory, parallel volume rendering application. We describe the performance characteristics in terms of bandwidth requirements and highlight key algorithmic considerations needed to implement the sort-first system. We characterize system performance using a distributed memory parallel volume rendering application, and present performance gains realized by using scene specific knowledge to accelerate rendering through reduced network bandwidth. The contribution of this work is an exploration of general-purpose, sort-first architecture performance characteristics as applied to distributed memory, commodity hardware, along with a description of the algorithmic support needed to realize parallel, sort-first implementations.

  17. Small Generator Aggregation (Maine)

    Broader source: Energy.gov [DOE]

    This section establishes requirements for electricity providers to purchase electricity from small generators, with the goal of ensuring that small electricity generators (those with a nameplate...

  18. DOE Office of Indian Energy Project Development and Finance Course...

    Broader source: Energy.gov (indexed) [DOE]

    renewable energy based on the electrical output of the project in kilowatt hours 10 PV - photovoltaic. This is a solar resource converter to electricity. R Remaining Life - the...

  19. Thermoelectrics Combined with Solar Concentration for Electrical and Thermal Cogeneration

    E-Print Network [OSTI]

    Jackson, Philip Robert

    2012-01-01

    of energy are incident on the Earth per square foot per ... square feet per square mile, translating to 12.2 billion kilowatt-hours of energy

  20. Oregon State Energy-Efficiency Appliance Rebate Program Helps...

    Broader source: Energy.gov (indexed) [DOE]

    estimated 1,140,000 kilowatt hours of electricity per year-equivalent to the annual electricity consumption of more than 70 homes. The Building Technologies Office (BTO)...

  1. Lighting in the Library

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    by your library lights E Kilowatt-hours consumed by your library lights F Annual cost of operating your library lights H Current lighting index for your library ...

  2. Technical Report NREL/TP-7A2-48267

    E-Print Network [OSTI]

    -conditioning KIUC Kauai Island Utility Cooperative kWh kilowatt-hour LCOE levelized cost of energy M&V measurement

  3. NREL is a national laboratory of the U.S. Department of Energy, Office of Energy Efficiency & Renewable Energy, operated by the Alliance for Sustainable Energy, LLC.

    E-Print Network [OSTI]

    and amortization ERCOT Electric Reliability Council of Texas kW kilowatt kWh kilowatt-hour LCOE levelized cost

  4. Energy Use in China: Sectoral Trends and Future Outlook

    E-Print Network [OSTI]

    2008-01-01

    Total Variable: Urban: Useful Energy Intensity (Megajoule ... Use Variable: Office: Useful Energy Intensity (Kilowatt-Hour ... Cooling Variable: Retail: Useful Energy Intensity (Kilowatt-

  5. Project Profile: Innovative Thermal Energy Storage for Baseload...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    lower system costs. Approach Existing thermal energy storage (TES) concepts cost about $27 per kilowatt hour thermal (kWht). The University of South Florida proposes a...

  6. Parallel object-oriented data mining system

    DOE Patents [OSTI]

    Kamath, Chandrika; Cantu-Paz, Erick

    2004-01-06

    A data mining system uncovers patterns, associations, anomalies and other statistically significant structures in data. Data files are read and displayed. Objects in the data files are identified. Relevant features for the objects are extracted. Patterns among the objects are recognized based upon the features. Data from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) sky survey was used to search for bent doubles. This test was conducted on data from the Very Large Array in New Mexico which seeks to locate a special type of quasar (radio-emitting stellar object) called bent doubles. The FIRST survey has generated more than 32,000 images of the sky to date. Each image is 7.1 megabytes, yielding more than 100 gigabytes of image data in the entire data set.

  7. Manzanita Hybrid Power system Project Final Report

    SciTech Connect (OSTI)

    Trisha Frank

    2005-03-31

    The Manzanita Indian Reservation is located in southeastern San Diego County, California. The Tribe has long recognized that the Reservation has an abundant wind resource that could be commercially utilized to its benefit, and in 1995 the Tribe established the Manzanita Renewable Energy Office. Through the U.S. Department of Energy's Tribal Energy Program the Band received funds to install a hybrid renewable power system to provide electricity to one of the tribal community buildings, the Manzanita Activities Center (MAC building). The project began September 30, 1999 and was completed March 31, 2005. The system was designed and the equipment supplied by Northern Power Systems, Inc, an engineering company with expertise in renewable hybrid system design and development. Personnel of the National Renewable Energy Laboratory provided technical assistance in system design, and continued to provide technical assistance in system monitoring. The grid-connected renewable hybrid wind/photovoltaic system provides a demonstration of a solar/wind energy hybrid power-generating project on Manzanita Tribal land. During the system design phase, the National Renewable Energy Lab estimated that the wind turbine would produce 10,000 kilowatt-hours per year and the solar array 2,000 kilowatt-hours per year. The hybrid system was designed to provide approximately 80 percent of the electricity used annually in the MAC building. The project proposed to demonstrate that this kind of system design would provide highly reliable renewable power for community uses.
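As a rough check on the figures in the abstract above, the estimated wind and solar output can be combined with the 80 percent design target to infer the building's annual electricity use. The sketch below only restates numbers quoted in the abstract; the variable names are illustrative, and the implied consumption is a back-of-envelope inference, not a figure taken from the report.

```python
# Figures quoted in the abstract (kilowatt-hours per year).
wind_kwh_per_year = 10_000
solar_kwh_per_year = 2_000

# Design target: the hybrid system supplies about 80 percent of annual use.
design_percent = 80

# Combined estimated output of the hybrid system.
hybrid_kwh_per_year = wind_kwh_per_year + solar_kwh_per_year

# Annual building consumption implied by the design target (an inference,
# assuming the 80 percent figure refers to the combined estimated output).
implied_building_kwh = hybrid_kwh_per_year * 100 / design_percent

print(f"Hybrid output:    {hybrid_kwh_per_year} kWh/year")
print(f"Implied building: {implied_building_kwh:.0f} kWh/year")
```

Under these assumptions the wind turbine accounts for five sixths of the estimated output and the solar array for the remaining sixth.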

  8. Gamma ray generator

    DOE Patents [OSTI]

    Firestone, Richard B; Reijonen, Jani

    2014-05-27

    An embodiment of a gamma ray generator includes a neutron generator and a moderator. The moderator is coupled to the neutron generator. The moderator includes a neutron capture material. In operation, the neutron generator produces neutrons and the neutron capture material captures at least some of the neutrons to produce gamma rays. An application of the gamma ray generator is as a source of gamma rays for calibration of gamma ray detectors.

  9. Generation to Generation: The Heart of Family Medicine

    E-Print Network [OSTI]

    Winter, Robin O

    2012-01-01

    Ageism in the Workplace. Generations Spring, 5. Westman, ... of caring for multiple generations simultaneously. Strongly ... Generation to Generation: The Heart of Family Medicine

  10. Parallel Computation of Nash Equilibria in N-Player Games Jonathan Widger

    E-Print Network [OSTI]

    Grosu, Daniel

    Parallel Computation of Nash Equilibria in N-Player Games Jonathan Widger Department of Computer@cs.wayne.edu Abstract--We propose a parallel algorithm for finding Nash equilibria in n-player noncooperative games to show the performance of the parallel algorithm. Keywords-game theory; Nash equilibrium; parallel algo

  11. Mobile Agents Based Collective Communication: An Application to a Parallel Plasma Simulation

    E-Print Network [OSTI]

    Vlad, Gregorio

    to communicate by Internet. In high performance computing it represents a parallel programming paradigm

  12. Automatic Parallelization of Classification Systems based on Support Vector Machines: Comparison and Application to JET Database

    E-Print Network [OSTI]

    Automatic Parallelization of Classification Systems based on Support Vector Machines: Comparison and Application to JET Database

  13. IBM Parallel Environment for AIX 5L Operation and Use, Volume 1

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for AIX 5L Operation and Use, Volume 1 Using the Parallel Operating Environment Version 4 Release 2, Modification 2 SA22-7948-04 IBM Parallel Environment for AIX 5L, modification 2 of IBM Parallel Environment for AIX 5L (product number 5765-F83) and to all subsequent releases

  14. IBM Parallel Environment for AIX 5L Operation and Use, Volume 2

    E-Print Network [OSTI]

    Hickman, Mark

    IBM Parallel Environment for AIX 5L Operation and Use, Volume 2 Using the Parallel Operating Environment Version 4 Release 2, Modification 2 SA22-7949-04 IBM Parallel Environment for AIX 5L, Modification 2 of IBM Parallel Environment for AIX 5L (product number 5765-F83) and to all subsequent releases

  15. Parallel Processing of Large Datasets from NanoLC-FTICR-MS Measurements

    E-Print Network [OSTI]

    van Nieuwpoort, Rob V.

    with the massively parallel processing approach described here allows the scientist to reprocess data

  16. PARALLEL ACTIVITY ROADMAPS Daniel Citron, Dror G. Feitelson, and Iaakov Exman

    E-Print Network [OSTI]

    Feitelson, Dror

    PARALLEL ACTIVITY ROADMAPS Daniel Citron, Dror G. Feitelson, and Iaakov Exman Institute@cs.huji.ac.il Parallel Roadmaps are simple visual constructs, useful for displaying the evolution of large-scale parallel programs with dynamic parallelism. Roadmaps remain intelligible and provide invaluable debugging clues even

  17. Parallel Breadth-First Search on Distributed Memory Systems

    SciTech Connect (OSTI)

    Computational Research Division; Buluc, Aydin; Madduri, Kamesh

    2011-04-15

    Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned parallel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix-partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex-based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny-Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.
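The level-synchronous strategy with vertex-based partitioning mentioned in the abstract above can be sketched in a single process. This is an illustration, not the authors' implementation: the modulo ownership rule, the function name `level_synchronous_bfs`, and the in-memory "outbox" standing in for inter-process messages are all assumptions made for the example.

```python
def level_synchronous_bfs(adj, source, num_parts):
    """Simulate a 1D vertex-partitioned, level-synchronous BFS.

    adj maps each integer vertex to a list of neighbours. Each vertex is
    owned by one part (hypothetical rule: vertex id modulo num_parts).
    """
    def owner(v):
        return v % num_parts

    level = {source: 0}
    frontiers = [[] for _ in range(num_parts)]
    frontiers[owner(source)].append(source)
    depth = 0

    while any(frontiers):
        # Step 1: every part scans its local frontier and buckets the
        # discovered neighbours by owning part (stands in for a send).
        outbox = [[] for _ in range(num_parts)]
        for part in range(num_parts):
            for u in frontiers[part]:
                for v in adj.get(u, ()):
                    outbox[owner(v)].append(v)

        # Step 2: all parts finish the level before the next one starts.
        # Each owner keeps only vertices it has not visited yet.
        depth += 1
        frontiers = [[] for _ in range(num_parts)]
        for part in range(num_parts):
            for v in outbox[part]:
                if v not in level:
                    level[v] = depth
                    frontiers[part].append(v)

    return level


# Small undirected example graph.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
levels = level_synchronous_bfs(graph, source=0, num_parts=2)
print(levels)
```

In a real distributed run the outbox exchange would be a collective communication step, which is the overhead the two-dimensional partitioning in the abstract aims to reduce.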

  18. A Parallel Ghosting Algorithm for The Flexible Distributed Mesh Database

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Mubarak, Misbah; Seol, Seegyoung; Lu, Qiukai; Shephard, Mark S.

    2013-01-01

    Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to avoid parallel performance degradation when the neighbors are on different processors. This article presents a parallel algorithm for creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation purposes while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during the ghost creation process thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n number of ghost layers up to a point where the whole partitioned mesh can be ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures up to a core count of 32,768 processors. The algorithm also leads to scalable results when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
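The idea of building n ghost layers around one part of a partitioned mesh, as described in the abstract above, can be illustrated with a small serial sketch. This is not the algorithm from the article: the adjacency dictionary, the function name `ghost_layers`, and the path-shaped example mesh are assumptions chosen to keep the example short.

```python
def ghost_layers(adjacency, owned, num_layers):
    """Return the ghost entities around a part, one set per layer.

    adjacency maps each mesh entity to its neighbours; owned is the set of
    entities the part already holds. Layer k contains entities that are k
    adjacency steps away from the owned set.
    """
    known = set(owned)
    frontier = set(owned)
    layers = []
    for _ in range(num_layers):
        # Neighbours of the current frontier that the part does not hold yet.
        next_layer = {n for e in frontier for n in adjacency[e]} - known
        if not next_layer:
            break  # the whole mesh is already owned or ghosted
        layers.append(next_layer)
        known |= next_layer
        frontier = next_layer
    return layers


# A 1D mesh of six elements in a row; this part owns elements 0, 1 and 2.
mesh = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
two_layers = ghost_layers(mesh, owned={0, 1, 2}, num_layers=2)
all_layers = ghost_layers(mesh, owned={0, 1, 2}, num_layers=10)
print(two_layers, all_layers)
```

Requesting more layers than the mesh can supply stops once every remaining element has been ghosted, which mirrors the limit the abstract mentions.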

  19. Cylindrical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo (Hercules, CA)

    2009-12-29

    A cylindrical neutron generator is formed with a coaxial RF-driven plasma ion source and target. A deuterium (or deuterium and tritium) plasma is produced by RF excitation in a cylindrical plasma ion generator using an RF antenna. A cylindrical neutron generating target is coaxial with the ion generator, separated by plasma and extraction electrodes which contain many slots. The plasma generator emanates ions radially over 360° and the cylindrical target is thus irradiated by ions over its entire circumference. The plasma generator and target may be as long as desired. The plasma generator may be in the center and the neutron target on the outside, or the plasma generator may be on the outside and the target on the inside. In a nested configuration, several concentric targets and plasma generating regions are nested to increase the neutron flux.

  20. Cylindrical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo (Hercules, CA)

    2008-04-22

    A cylindrical neutron generator is formed with a coaxial RF-driven plasma ion source and target. A deuterium (or deuterium and tritium) plasma is produced by RF excitation in a cylindrical plasma ion generator using an RF antenna. A cylindrical neutron generating target is coaxial with the ion generator, separated by plasma and extraction electrodes which contain many slots. The plasma generator emanates ions radially over 360° and the cylindrical target is thus irradiated by ions over its entire circumference. The plasma generator and target may be as long as desired. The plasma generator may be in the center and the neutron target on the outside, or the plasma generator may be on the outside and the target on the inside. In a nested configuration, several concentric targets and plasma generating regions are nested to increase the neutron flux.

  1. Cylindrical neutron generator

    DOE Patents [OSTI]

    Leung, Ka-Ngo

    2005-06-14

    A cylindrical neutron generator is formed with a coaxial RF-driven plasma ion source and target. A deuterium (or deuterium and tritium) plasma is produced by RF excitation in a cylindrical plasma ion generator using an RF antenna. A cylindrical neutron generating target is coaxial with the ion generator, separated by plasma and extraction electrodes which contain many slots. The plasma generator emanates ions radially over 360° and the cylindrical target is thus irradiated by ions over its entire circumference. The plasma generator and target may be as long as desired. The plasma generator may be in the center and the neutron target on the outside, or the plasma generator may be on the outside and the target on the inside. In a nested configuration, several concentric targets and plasma generating regions are nested to increase the neutron flux.

  2. Parallel Scaling Characteristics of Selected NERSC User Project Codes

    SciTech Connect (OSTI)

    Skinner, David; Verdier, Francesca; Anand, Harsh; Carter, Jonathan; Durst, Mark; Gerber, Richard

    2005-03-05

    This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on Seaborg, a 6080-CPU IBM SP and the largest parallel computer at NERSC. The scale in terms of concurrency and problem size of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers, we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.

  3. Parallel vacuum arc discharge with microhollow array dielectric and anode

    SciTech Connect (OSTI)

    Feng, Jinghua; Zhou, Lin; Fu, Yuecheng; Zhang, Jianhua; Xu, Rongkun; Chen, Faxin; Li, Linbo; Meng, Shijian

    2014-07-15

    An electrode configuration with microhollow array dielectric and anode was developed to obtain parallel vacuum arc discharge. Compared with the conventional electrodes, more than 10 parallel microhollow discharges were ignited for the new configuration, which increased the discharge area significantly and made the cathode erode more uniformly. The vacuum discharge channel number could be increased effectively by decreasing the distances between holes or increasing the arc current. Experimental results revealed that plasmas ejected from the adjacent hollow and the relatively high arc voltage were two key factors leading to the parallel discharge. The characteristics of plasmas in the microhollow were investigated as well. The spectral line intensity and electron density of plasmas in the microhollow increased markedly with the decrease of the microhollow diameter.

  4. Adaptive, multiresolution visualization of large data sets using parallel octrees.

    SciTech Connect (OSTI)

    Freitag, L. A.; Loy, R. M.

    1999-06-10

    The interactive visualization and exploration of large scientific data sets is a challenging and difficult task; their size often far exceeds the performance and memory capacity of even the most powerful graphics workstations. To address this problem, we have created a technique that combines hierarchical data reduction methods with parallel computing to allow interactive exploration of large data sets while retaining full-resolution capability. The hierarchical representation is built in parallel by strategically inserting field data into an octree data structure. We provide functionality that allows the user to interactively adapt the resolution of the reduced data sets so that resolution is increased in regions of interest without sacrificing local graphics performance. We describe the creation of the reduced data sets using a parallel octree, the software architecture of the system, and the performance of this system on the data from a Rayleigh-Taylor instability simulation.

  5. Explicit spatial scattering for load balancing in conservatively synchronized parallel discrete-event simulations

    SciTech Connect (OSTI)

    Thulasidasan, Sunil [Los Alamos National Laboratory; Kasiviswanathan, Shiva [Los Alamos National Laboratory; Eidenbenz, Stephan [Los Alamos National Laboratory; Romero, Philip [Los Alamos National Laboratory

    2010-01-01

    We re-examine the problem of load balancing in conservatively synchronized parallel, discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic 'hot-spots' - regions that generate significantly more simulation events than others. Examples of such domains include simulation of urban regions, transportation networks and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations, the speed of execution of the simulation is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies in achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead.
Further, even if hot-spots change over the course of the simulation, if the underlying feature of spatial clustering is retained, load continues to be balanced with spatial scattering, leading us to the observation that spatial scattering can often obviate the need for dynamic load balancing.
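The contrast between conventional spatial partitioning and scatter partitioning described in the abstract above can be shown with a toy example. The sketch assumes a one-dimensional row of entities, a contiguous block split as the "conventional" scheme, and a round-robin assignment as the scattering scheme; these choices, the load values, and all function names are illustrative and do not come from the paper. It ignores messaging overhead entirely.

```python
def block_partition(num_entities, num_procs):
    """Assign contiguous runs of neighbouring entities to the same processor."""
    return [i * num_procs // num_entities for i in range(num_entities)]


def scatter_partition(num_entities, num_procs):
    """Assign neighbouring entities to different processors (round-robin)."""
    return [i % num_procs for i in range(num_entities)]


def max_load(loads, assignment, num_procs):
    """Load of the most heavily loaded processor, which sets the pace."""
    totals = [0] * num_procs
    for load, proc in zip(loads, assignment):
        totals[proc] += load
    return max(totals)


# Sixteen entities with a hot spot: entities 4..7 generate ten times the events.
loads = [1] * 16
loads[4:8] = [10] * 4
procs = 4

block_max = max_load(loads, block_partition(16, procs), procs)
scatter_max = max_load(loads, scatter_partition(16, procs), procs)
print(block_max, scatter_max)
```

In this toy case the block split places the whole hot spot on one processor, while scattering divides it evenly, so the slowest process carries roughly a third of the load it would otherwise.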

  6. Methods for operating parallel computing systems employing sequenced communications

    DOE Patents [OSTI]

    Benner, R.E.; Gustafson, J.L.; Montry, G.R.

    1999-08-10

    A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.

  7. Methods for operating parallel computing systems employing sequenced communications

    DOE Patents [OSTI]

    Benner, Robert E. (Albuquerque, NM); Gustafson, John L. (Albuquerque, NM); Montry, Gary R. (Albuquerque, NM)

    1999-01-01

    A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.

  8. Parallel matrix computations. Interim report, April 1984-April 1985

    SciTech Connect (OSTI)

    Stewart, G.W.; O'Leary, D.P.

    1985-04-01

    This project concerns the design and analysis of algorithms to be run in a processor-rich environment. It focuses primarily on algorithms that require no global control and that can be run on systems with only local connections among processors. The properties of these algorithms both theoretically and experimentally are investigated. The experimental work is done on the ZMOB, a working parallel computer operated by the Laboratory for Parallel Computation of the Computer Science Department at the University of Maryland. The emphasis is on two areas: 1) Dense problems from numerical linear algebra; and 2) The iterative and direct solution of sparse linear systems.

  9. Data-flow algorithms for parallel matrix computations

    SciTech Connect (OSTI)

    O'Leary, D.P.; Stewart, G.W.

    1985-08-01

    This document develops some algorithms and tools for solving matrix problems on parallel-processing computers. Operations are synchronized through data-flow alone, which makes global synchronization unnecessary and enables the algorithms to be implemented on machines with very simple operating systems and communication protocols. As examples, the authors present algorithms that form the main modules for solving Liapounov matrix equations. They compare this approach to wave-front array processors and systolic arrays, and note its advantages in handling mis-sized problems, in evaluating variations of algorithms or architectures, in moving algorithms from system to system, and in debugging parallel algorithms on sequential machines.

  10. Parallel matrix computations. Interim report, April 1985-April 1986

    SciTech Connect (OSTI)

    Stewart, G.W.; O'Leary, D.P.

    1986-05-12

    This project concerns the design and analysis of algorithms to be run in a processor-rich environment. The authors focus primarily on algorithms that require no global control and that can be run on systems with only local connections among processors. They investigate the properties of these algorithms both theoretically and experimentally. The experimental work is done on the ZMOB, a working parallel computer operated by the Laboratory for Parallel Computation of the Computer Science Department at the University of Maryland. To give this work direction, they focused on two areas: Dense problems from numerical linear algebra; and The iterative and direct solution of sparse linear systems.

  11. Parallel Simulation of Electron Cooling Physics and Beam Transport

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)


  12. PARABOLOIDAL DISH SOLAR CONCENTRATORS FOR MULTI-MEGAWATT POWER GENERATION Keith Lovegrove , Tui Taumoefolau, Sawat Paitoonsurikarn, Piya Siangsukone, Greg Burgess, Andreas Luzzi,

    E-Print Network [OSTI]

    PARABOLOIDAL DISH SOLAR CONCENTRATORS FOR MULTI-MEGAWATT POWER GENERATION Keith Lovegrove , Tui of distributed dish, central generation solar thermal power systems using either direct steam generation-dish, steam-based, solar thermal power station in White Cliffs (Kaneff 1991). A parallel line

  13. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    O'Keefe, Matthew; Parr, Terence; Edgar, B. Kevin; Anderson, Steve; Woodward, Paul; Dietz, Hank

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  14. New wave generation

    E-Print Network [OSTI]

    Mercier, Matthieu J.

    We present the results of a combined experimental and numerical study of the generation of internal waves using the novel internal wave generator design of Gostiaux et al. (Exp. Fluids, vol. 42, 2007, pp. 123–130). This ...

  15. features Utility Generator

    E-Print Network [OSTI]

    Chang, Shih-Fu

    features function utility Training Pool Utility Generator Per-frame function content utility classes utility classes utility Tree Decision Generator Module Utility Clustering Adaptive

  16. Generating Functions Introduction

    E-Print Network [OSTI]

    Gould, Ron

    CHAPTER 10 Ordinary Generating Functions Introduction We'll begin this chapter by introducing the notion of ordinary generating functions and discussing the basic techniques for manipulating them must master these basic ideas before reading further. In Section 2, we apply generating functions

  17. The Clemson First Generation

    E-Print Network [OSTI]

    Stuart, Steven J.

    The Clemson First Generation Success Program A First-Rate Experience College is an experience college. First-generation college students are students whose parents do not hold a degree from a four-year college or university. Clemson is proud of its first-generation students and is committed

  18. Superconducting Power Generation

    E-Print Network [OSTI]

    Rabinowitz, Mario

    2003-02-20

    The superconducting ac generator has the greatest potential for large-scale commercial application of superconductivity that can benefit the public. Electric power is a vital ingredient of modern society, and generation may be considered to be the vital ingredient of a power system. This article gives background on, and insight into, the physics and engineering of superconducting power generation.

  19. Mesh Generator Matthew Hanlon

    E-Print Network [OSTI]

    Nebel, Jean-Christophe

    1 Mesh Generator Matthew Hanlon 9804817 hanlonmj@dsc.gla.ac.uk Class CS4H Session 2002 from two dimensional slices. Medical data stored as sets of slices can be used to generate a three was developed with the following requirements: · Load a set of slices into the system · Generate a mesh for each

  20. RGG: Reactor geometry (and mesh) generator

    SciTech Connect (OSTI)

    Jain, R.; Tautges, T.

    2012-07-01

    The reactor geometry (and mesh) generator RGG takes advantage of information about repeated structures in both assembly and core lattices to simplify the creation of geometry and mesh. It is released as open source software as a part of the MeshKit mesh generation library. The methodology operates in three stages. First, assembly geometry models of various types are generated by a tool called AssyGen. Next, the assembly model or models are meshed by using MeshKit tools or the CUBIT mesh generation tool-kit, optionally based on a journal file output by AssyGen. After one or more assembly model meshes have been constructed, a tool called CoreGen uses a copy/move/merge process to arrange the model meshes into a core model. In this paper, we present the current state of tools and new features in RGG. We also discuss the parallel-enabled CoreGen, which in several cases achieves super-linear speedups since the problems fit in available RAM at higher processor counts. Several RGG applications - 1/6 VHTR model, 1/4 PWR reactor core, and a full-core model for Monju - are reported. (authors)

  1. Building a Parallel Cloud Storage System using OpenStack’s Swift Object Store and Transformative Parallel I/O

    SciTech Connect (OSTI)

    Burns, Andrew J.; Lora, Kaleb D.; Martinez, Esteban; Shorter, Martel L.

    2012-07-30

    Our project consists of bleeding-edge research into replacing traditional storage archives with a parallel, cloud-based storage solution. We used OpenStack's Swift Object Store cloud software and benchmarked Swift for write speed and scalability. Our project is unique because Swift is typically used for reads and we are mostly concerned with write speeds. Cloud storage is a viable archive solution because: (1) Container management for larger parallel archives might ease the migration workload; (2) Many tools that are written for cloud storage could be utilized for a local archive; and (3) Current large cloud storage practices in industry could be utilized to manage a scalable archive solution.

  2. Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines

    E-Print Network [OSTI]

    Rodrigues, Antonio Wendell De Oliveira; Dekeyser, Jean-Luc; Menach, Yvonnick Le

    2011-01-01

    Electrical and electronic engineering has used parallel programming to solve its large-scale complex problems for performance reasons. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we propose an approach to generate code for hybrid architectures (e.g. CPU + GPU) using OpenCL, an open standard for parallel programming of heterogeneous systems. This approach is based on Model Driven Engineering (MDE) and the MARTE profile, a standard proposed by the Object Management Group (OMG). The aim is to provide resources to non-specialists in parallel programming to implement their applications. Moreover, thanks to model reuse capacity, we can add/change functionalities or the target architecture. Consequently, this approach helps industries to achieve their time-to-market constraints, and experimental tests confirm performance improvements using multi-GPU environmen...

  3. A Study of Successive Over-relaxation Method Parallelization over Modern HPC Languages

    SciTech Connect (OSTI)

    Mittal, Sparsh [ORNL

    2014-01-01

    Successive over-relaxation (SOR) is a computationally intensive, yet extremely important iterative solver for linear systems. Due to recent trends of exponential growth in the amount of data generated and increasing problem sizes, serial platforms have proved to be insufficient in providing the required computational power. In this paper, we present parallel implementations of the red-black SOR method using three modern programming languages, namely Chapel, D and Go. We employ the SOR method for solving a 2D steady-state heat conduction problem. We discuss the optimizations incorporated and the features of these languages which are crucial for improving the program performance. Experiments have been performed using 2, 4, and 8 threads and performance results are compared with serial execution. The analysis of results provides important insights into the working of the SOR method.
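For readers unfamiliar with the red-black ordering, the following is a minimal serial sketch in Python, not the paper's Chapel, D or Go code. It assumes the steady-state heat problem reduces to the Laplace equation on a rectangular grid with fixed boundary temperatures, discretized with the standard five-point stencil; the function name `red_black_sor` and the parameters `omega`, `tol` and `max_sweeps` are illustrative choices, not taken from the paper.

```python
def red_black_sor(grid, omega=1.5, tol=1e-8, max_sweeps=10_000):
    """Relax the interior of `grid` in place; boundary rows/columns stay fixed.

    Returns the number of sweeps performed. Hypothetical helper for
    illustration only.
    """
    rows, cols = len(grid), len(grid[0])
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for colour in (0, 1):  # 0 = "red" points, 1 = "black" points
            for i in range(1, rows - 1):
                # First interior column j with (i + j) % 2 == colour.
                start = 1 + (i + 1 + colour) % 2
                for j in range(start, cols - 1, 2):
                    # Gauss-Seidel value: average of the four neighbours,
                    # all of which have the opposite colour.
                    average = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                      + grid[i][j - 1] + grid[i][j + 1])
                    change = omega * (average - grid[i][j])
                    grid[i][j] += change
                    max_change = max(max_change, abs(change))
        if max_change < tol:
            return sweep
    return max_sweeps


# Square plate: top edge held at 100, the other three edges at 0.
plate = [[0.0] * 7 for _ in range(7)]
plate[0] = [100.0] * 7
sweeps_used = red_black_sor(plate)
```

Within one colour, every update reads only points of the other colour, so the inner loops of a half-sweep are independent of each other. That independence is what makes it possible to split each half-sweep across threads.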

  4. A New Achievable Rate for the Gaussian Parallel Relay Channel

    E-Print Network [OSTI]

    Waterloo, University of

    emulate distributed transmit antennas to combat the multi-path fading effect and increase the physical to understand how to efficiently utilize the available power and bandwidth resources. The parallel relay was introduced for the first time by Van der Meulen in 1971 [1]. The most important capacity result of the relay

  5. Algorithms for VLSI Circuit Optimization and GPU-Based Parallelization 

    E-Print Network [OSTI]

    Liu, Yifang

    2010-07-14

    [Table of contents excerpt] ... 3. Min-cost Solution; F. Conclusion and Future Work; V. GPU-BASED PARALLELIZATION FOR FAST CIRCUIT OPTIMIZATION; A... [List of tables excerpt] ... for 200 nets; VI. Min-cost solution results for 200 nets; VII. Comparison on power (??) and runtime (seconds). All solutions satisfy timing constraints...

  6. Hardware packet pacing using a DMA in a parallel computer

    DOE Patents [OSTI]

    Chen, Dong; Heidelberger, Phillip; Vranas, Pavlos

    2013-08-13

    Method and system for hardware packet pacing using a direct memory access controller in a parallel computer which, in one aspect, keeps track of a total number of bytes put on the network as a result of a remote get operation, using a hardware token counter.
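The abstract does not spell out the pacing rule, so the following Python sketch is only a software model under a stated assumption: a counter tracks the bytes injected into the network on behalf of a remote get, and new packets are held back while that count would exceed a fixed budget. The names `TokenCounter` and `paced_remote_get`, and the budget and chunk sizes, are hypothetical and do not come from the patent.

```python
from collections import deque


class TokenCounter:
    """Hypothetical byte counter: tracks bytes currently on the network."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.outstanding = 0

    def try_acquire(self, nbytes):
        # Refuse the injection if it would push the count over the budget.
        if self.outstanding + nbytes > self.budget:
            return False
        self.outstanding += nbytes
        return True

    def release(self, nbytes):
        # Called when previously injected bytes have left the network.
        self.outstanding -= nbytes


def paced_remote_get(total_bytes, chunk_bytes, counter):
    """Split one remote get into chunks and return the event log."""
    if chunk_bytes > counter.budget:
        raise ValueError("a single chunk must fit within the byte budget")
    in_flight = deque()
    log = []
    remaining = total_bytes
    while remaining > 0 or in_flight:
        size = min(chunk_bytes, remaining)
        if remaining > 0 and counter.try_acquire(size):
            in_flight.append(size)
            remaining -= size
            log.append(("inject", size))
        else:
            # Budget exhausted (or nothing left to send): retire the oldest chunk.
            done = in_flight.popleft()
            counter.release(done)
            log.append(("drain", done))
    return log


events = paced_remote_get(4096, 512, TokenCounter(1024))
```

In this model the number of bytes in flight never exceeds the budget, which is the effect a pacing mechanism is meant to have; how the actual hardware decides when to return tokens is not stated in the abstract.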

  7. High-Resolution Simulations of Parallel Blade-Vortex Interactions

    E-Print Network [OSTI]

    Alonso, Juan J.

    to that encountered in the simulation of realistic helicopter blade-vortex interaction, but the computational costs aeroacoustics rotor tests [2,3]. These tests were performed on a Mach-scaled Bo-105 rotor and the blade loads High-Resolution Simulations of Parallel Blade-Vortex Interactions Alasdair Thom University

  8. Parallel heat transport in integrable and chaotic magnetic fields

    SciTech Connect (OSTI)

    Del-Castillo-Negrete, Diego B [ORNL; Chacon, Luis [ORNL

    2012-01-01

    The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion, space plasmas, and astrophysics research. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), $\chi_\parallel$, and the perpendicular, $\chi_\perp$, conductivities ($\chi_\parallel/\chi_\perp$ may exceed $10^{10}$ in fusion plasmas); (ii) Magnetic field lines chaos which in general complicates (and may preclude) the construction of magnetic field line coordinates; and (iii) Nonlocal parallel transport in the limit of small collisionality. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields in arbitrary geometry. The method avoids by construction the numerical pollution issues of grid-based algorithms. The potential of the approach is demonstrated with nontrivial applications to integrable (magnetic island chain), weakly chaotic (devil's staircase), and fully chaotic magnetic field configurations. For the latter, numerical solutions of the parallel heat transport equation show that the effective radial transport, with local and non-local closures, is non-diffusive, thus casting doubts on the appropriateness of the applicability of quasilinear diffusion descriptions. General conditions for the existence of non-diffusive, multivalued flux-gradient relations in the temperature evolution are derived.

  9. Parallel Implementation of a Vehicle-Tire-Terrain Interaction Model

    E-Print Network [OSTI]

    Negrut, Dan

    (VTTIM) · Three components o Vehicle o Tire o Terrain/Soil mechanics · Two interfaces o Vehicle support for ANCF `tire' Types of Soil Mechanics Models · Empirical Methods o WES numerics, Bekker of Tire Models · Rigid o Simple to implement in parallel o Only accurate if deformation of soil is much

  10. A Library Hierarchy for Implementing Scalable Parallel Search Algorithms

    E-Print Network [OSTI]

    Ralphs, Ted

    of Mathematical Sciences, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, ladanyi@us.ibm for performing large-scale parallel search in distributed-memory computing environments. To support the devel a hierarchy implementing additional functionality needed for specific applications. Department of Industrial

  11. Computational Experience with a Software Framework for Parallel Integer Programming

    E-Print Network [OSTI]

    Ralphs, Ted

    from NSF grant DMI-0522796 Department of Mathematical Sciences, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, ladanyi@us.ibm.com § Department of Mathematical Sciences, Clemson University Computational Experience with a Software Framework for Parallel Integer Programming Y. Xu T. K

  12. A Library Hierarchy for Implementing Scalable Parallel Search Algorithms

    E-Print Network [OSTI]

    Ralphs, Ted

    Partnership Award y Department of Mathematical Sciences, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, ladanyi@us.ibm.com z Department of Mathematical Sciences, Clemson University, Clemson, SC scalable algorithms for performing large-scale parallel search in distributed-memory computing environments

  13. elastic wave propagation in media with parallel fractures and ...

    E-Print Network [OSTI]

    Schoenberg, M.; Douma, J.

    2002-02-14

    A model of parallel slip interfaces simulates the behaviour of a fracture system composed of large, closely .... Note that when the ith constituent layer is isotropic, $c_{44i} = c_{66i} = \mu_i$, $c_{11i} = c_{33i} = \lambda_i + 2\mu_i$ and .... Thus (14) becomes. Define the .... system's characteristic properties, such as crack size, crack density or the contents of.

  14. An intercalation-locked parallel-stranded DNA tetraplex

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Tripathi, S.; Zhang, D.; Paukstelis, P. J.

    2015-01-27

    DNA has proved to be an excellent material for nanoscale construction because complementary DNA duplexes are programmable and structurally predictable. However, in the absence of Watson–Crick pairings, DNA can be structurally more diverse. Here, we describe the crystal structures of d(ACTCGGATGAT) and the brominated derivative, d(ACBrUCGGABrUGAT). These oligonucleotides form parallel-stranded duplexes with a crystallographically equivalent strand, resulting in the first examples of DNA crystal structures that contain four different symmetric homo base pairs. Two of the parallel-stranded duplexes are coaxially stacked in opposite directions and locked together to form a tetraplex through intercalation of the 5'-most A–A base pairs between adjacent G–G pairs in the partner duplex. The intercalation region is a new type of DNA tertiary structural motif with similarities to the i-motif. 1H–1H nuclear magnetic resonance and native gel electrophoresis confirmed the formation of a parallel-stranded duplex in solution. Finally, we modified specific nucleotide positions and added d(GAY) motifs to oligonucleotides and were readily able to obtain similar crystals. This suggests that this parallel-stranded DNA structure may be useful in the rational design of DNA crystals and nanostructures.

  15. Predicting Performance of Parallel Garbage Collectors on Shared Memory Multiprocessors

    E-Print Network [OSTI]

    Predicting Performance of Parallel Garbage Collectors on Shared Memory Multiprocessors Toshio ENDO Enterprise 10000 Origin 2000 1 (GC) GC GC [4] GC (SMP) (DSM) GC GC Cilk [2] [6] Sun Enterprise 10000 (SMP ) Sl 20 180(ns), 260 500(ns) 3.6.2 Enterprise 10000 Sun Enterprise 10000 ( E10000) 250 MHz Ultra

  16. Butterfly Project Report DARPA Parallel Architecture Benchmark Study

    E-Print Network [OSTI]

    Scott, Michael L.

    Butterfly Project Report 13 DARPA Parallel Architecture Benchmark Study C. Brown, R. Fowler, T. Le Butterfly Project Report 11 5. Hough Transformation Butterfly Project Report 10 6. Geometrical Constructions In this document 7. Visibility Calculations In this document 8. Graph Matching Butterfly Project Report 14 9

  17. HIGH-PERFORMANCE PARALLEL ADDITION USING HYBRID WAVE-PIPELINING

    E-Print Network [OSTI]

    Nyathi, Jabulani

    HIGH-PERFORMANCE PARALLEL ADDITION USING HYBRID WAVE-PIPELINING James Levy, Jabulani Nyathi, jabu, jdelgado)@eecs.wsu.edu Abstract-- Pipelining digital systems has been shown to provide significant performance gains over non-pipelined systems and remains a standard in microprocessor design

  18. A parallel algorithm for 3D dislocation dynamics

    SciTech Connect (OSTI)

    Wang Zhiqiang [University of California - Los Angeles, Los Angeles, CA 90095-1597 (United States)]. E-mail: zhiqiang@lanl.gov; Ghoniem, Nasr [University of California - Los Angeles, Los Angeles, CA 90095-1597 (United States); Swaminarayan, Sriram [University of California, Los Alamos National Laboratory, Los Alamos, NM 87545 (United States); LeSar, Richard [University of California, Los Alamos National Laboratory, Los Alamos, NM 87545 (United States)

    2006-12-10

    Dislocation dynamics (DD), a discrete dynamic simulation method in which dislocations are the fundamental entities, is a powerful tool for investigation of plasticity, deformation and fracture of materials at the micron length scale. However, severe computational difficulties arising from complex, long-range interactions between these curvilinear line defects limit the application of DD in the study of large-scale plastic deformation. We present here the development of a parallel algorithm for accelerated computer simulations of DD. By representing dislocations as a 3D set of dislocation particles, we show here that the problem of an interacting ensemble of dislocations can be converted to a problem of a particle ensemble, interacting with a long-range force field. A grid using binary space partitioning is constructed to keep track of node connectivity across domains. We demonstrate the computational efficiency of the parallel micro-plasticity code and discuss how O(N) methods map naturally onto the parallel data structure. Finally, we present results from applications of the parallel code to deformation in single crystal fcc metals.

  19. CX: A Scalable, Robust Network for Parallel Computing

    E-Print Network [OSTI]

    Cappello, Peter

    CX: A Scalable, Robust Network for Parallel Computing}@cs.ucsb.edu telephone: 805.893.4383; fax: 805.893.853 Abstract CX, a network are distributed throughout the server network via a simple "diffusion" process. CX is intended as a test

  20. Geometric Characterization of Series-Parallel Variable Resistor Networks

    E-Print Network [OSTI]

    Tygar, Doug

    simultaneously by an algorithm of complexity O(nk). Key Words: Worst case analysis, linear circuits, series-parallel networks, projective geometry. 1. Introduction The task of worst case circuit analysis [7] involves a method for performing a worst case analysis of a variable linear resistor network by casting

  1. The Average Case Complexity of the Parallel Prefix Problem

    E-Print Network [OSTI]

    Reischuk, Rüdiger

    double logarithmic delay while keeping the circuit size linear. The analysis and results are illustrated this can be done in parallel using only linear circuit size [LF80]. Snir has obtained exact bounds­fanin circuits and classify semigroups according to the property of having linear size prefix circuits
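As background for the snippet above, the parallel prefix problem asks for all partial products x1, x1∘x2, ..., x1∘...∘xn of an associative operation. The Python sketch below shows one standard recursive construction that uses a linear number of operations and logarithmic depth; it is offered as an illustration of the problem, not as the circuits analysed in this paper, and the function name `parallel_prefix` is an assumption made here.

```python
def parallel_prefix(xs, op):
    """Return [xs[0], op(xs[0], xs[1]), ...] for an associative `op`.

    Each list comprehension / loop level corresponds to one layer of
    independent operations that could run in parallel.
    """
    n = len(xs)
    if n <= 1:
        return list(xs)
    # Layer 1: combine adjacent pairs (about n/2 independent operations).
    pairs = [op(xs[i], xs[i + 1]) for i in range(0, n - 1, 2)]
    # Recurse on the half-size problem.
    pair_prefix = parallel_prefix(pairs, op)
    # Final layer: odd positions are already done; even positions need
    # one more operation each (again independent of one another).
    out = [xs[0]]
    for i in range(1, n):
        if i % 2 == 1:
            out.append(pair_prefix[i // 2])
        else:
            out.append(op(pair_prefix[i // 2 - 1], xs[i]))
    return out


running_sums = parallel_prefix([1, 2, 3, 4, 5, 6, 7, 8], lambda a, b: a + b)
```

Only associativity is required, so the same routine works for non-commutative operations such as string concatenation, as long as operand order is preserved.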

  2. Geometric Characterization of Series-Parallel Variable Resistor Networks

    E-Print Network [OSTI]

    Bryant, Randal E.

    simultaneously by an algorithm of complexity O(nk). Key Words: Worst case analysis, linear circuits, series-parallel networks, projective geometry. 1. Introduction The task of worst case circuit analysis [7] involves. In his book on circuit theory [2], Calahan describes a method for performing a worst case analysis

  3. Object Oriented Parallel Computation for Plasma Charles D. Norton

    E-Print Network [OSTI]

    Bystroff, Chris

    Organization]: Multiple Data Stream Architectures---parallel processors; D.1.5 [Software]: Programming the software design and programming process by providing an application oriented view of programming while facilitating modification and reuse. Since the software design crisis is particularly acute

  4. Parallel and Adaptive Simulation of Fuel Cells in 3d

    E-Print Network [OSTI]

    Münster, Westfälische Wilhelms-Universität

    Parallel and Adaptive Simulation of Fuel Cells in 3d Robert Klöfkorn1 , Dietmar Kröner1 , Mario) fuel cells. Hereby, we focus on the simulation done in 3d using modern techniques like higher order and the transport of species in the cathodic gas diffusion layer of the fuel cell. Therefore, from the detailed

  5. Evaluating Memory Energy Efficiency in Parallel I/O Workloads

    E-Print Network [OSTI]

    Zhu, Yifeng

    Evaluating Memory Energy Efficiency in Parallel I/O Workloads Jianhui Yue, Yifeng Zhu, Zhao Cai the ever-widening gap between disk and processor speeds, memory energy efficiency becomes an increasingly management policies heavily influence the overall memory energy efficiency. In particular, under the same

  6. An Approach for Energy Efficient Execution of Hybrid Parallel Programs

    E-Print Network [OSTI]

    Teo, Yong-Meng

    An Approach for Energy Efficient Execution of Hybrid Parallel Programs Lavanya Ramapantulu system. One of the key challenges for energy efficient execution of hybrid programs is to determine time and energy efficient hardware configurations among a large system configuration space. Given a hybrid program

  7. Dynamic Algorithm Selection in Parallel GAMESS Calculations Nurzhan Ustemirov

    E-Print Network [OSTI]

    Sosonkina, Masha

    and Molecular Electronic Structure System (GAMESS) used for ab initio molecular quantum chemistry calculations Dynamic Algorithm Selection in Parallel GAMESS Calculations Nurzhan Ustemirov Masha Sosonkina, network, or disk I/O. For large-scale scientific applications, dynamic adjustments to a computationally

  8. A BSP performance prediction model for parallel multigrid algorithms

    E-Print Network [OSTI]

    Osoba, B.O.; Rabhi, F.A.; Ould-Khaoua, M.

    Osoba,B.O. Rabhi,F.A. Ould-Khaoua,M. Proceedings IEEE 7th Int. Conf. Electronics, Circuits and Systems, Special workshop on Formal Methods for Engineering Special-Purpose Parallel Systems, Kaslik, Lebanon, December 2000. pp 403-406 IEEE

  9. Energy-Efficient Sensing and Communication of Parallel Gaussian Sources

    E-Print Network [OSTI]

    Erkip, Elza

    Energy-Efficient Sensing and Communication of Parallel Gaussian Sources Xi Liu, Osvaldo Simeone to be operated in an energy-efficient manner in order to attain a satisfactory lifetime. Energy consumption efficiency [2] [3]. We refer to the energy cost associated with measurements and compression of information

  10. Parallel Seismic Ray Tracing in a Global Earth Model

    E-Print Network [OSTI]

    Genaud, Stéphane

    from the hypocenter (source) to one station. The final objective of the seismic tomography process 1 Parallel Seismic Ray Tracing in a Global Earth Model Marc Grunberg * , Stéphane Genaud of the Earth interior, and seismic tomography is a means to improve knowledge in this field. In order

  11. A MICROFLUIDIC BIOCHIP DEDICATED TO HIGHLY PARALLELIZED ELECTROFUSION

    E-Print Network [OSTI]

    Paris-Sud XI, Université de

    0065 A MICROFLUIDIC BIOCHIP DEDICATED TO HIGHLY PARALLELIZED ELECTROFUSION F. Hamdi1, 2 , O: Microfluidics, Biochip, Electrofusion, Cell trapping INTRODUCTION The electrofusion between a dendritic i) the trapping of cells flowing in the microfluidic channel ii) their pairing prior to fusion, iii

  12. A massively parallel fractional step solver for incompressible flows

    SciTech Connect (OSTI)

    Houzeaux, G.; Vazquez, M.; Aubry, R.; Cela, J.M.

    2009-09-20

    This paper presents a parallel implementation of fractional solvers for the incompressible Navier-Stokes equations using an algebraic approach. Under this framework, predictor-corrector and incremental projection schemes are seen as sub-classes of the same class, making apparent their differences and similarities. An additional advantage of this approach is to set a common basis for a parallelization strategy, which can be extended to other split techniques or to compressible flows. The predictor-corrector scheme consists of solving the momentum equation and a modified 'continuity' equation (namely a simple iteration for the pressure Schur complement) consecutively in order to converge to the monolithic solution, thus avoiding fractional errors. On the other hand, the incremental projection scheme solves only one iteration of the predictor-corrector per time step and adds a correction equation to fulfill the mass conservation. As shown in the paper, these two schemes are very well suited for massively parallel implementation. In fact, when compared with monolithic schemes, simpler solvers and preconditioners can be used to solve the non-symmetric momentum equations (GMRES, Bi-CGSTAB) and to solve the symmetric continuity equation (CG, Deflated CG). This gives the algorithm good speedup properties. The implementation of the mesh partitioning technique is presented, as well as the parallel performances and speedups for thousands of processors.

  13. FUTURE POWER GRID INITIATIVE GridPACK: Grid Parallel Advanced

    E-Print Network [OSTI]

    FUTURE POWER GRID INITIATIVE GridPACK: Grid Parallel Advanced Computational Kernels OBJECTIVE The U Pacific Northwest National Laboratory (509) 375-3899 bruce.palmer@pnnl.gov ABOUT FPGI The Future Power and ensure a more secure, efficient and reliable future grid. Building on the Electricity Infrastructure

  14. USING INTRADISK PARALLELISM TO BUILD ENERGY-EFFICIENT

    E-Print Network [OSTI]

    Gurumurthi, Sudhanva

    INTRADISK PARALLELISM TO BUILD ENERGY-EFFICIENT STORAGE SYSTEMS SERVER STORAGE SYSTEMS USE NUMEROUS DISKS TO ACHIEVE HIGH PERFORMANCE, THEREBY CONSUMING A SIGNIFICANT it to process and deliver content to users. In addition to storage capacity, storage systems within data centers

  15. Job Management Requirements for NAS Parallel Systems and Clusters

    E-Print Network [OSTI]

    Feitelson, Dror

    that are becoming increasingly important in the high performance computing industry. Newer job management systems in high performance computing for NASA. This paper focuses on job management requirements for two 1 Job Management Requirements for NAS Parallel Systems and Clusters William Saphir1, Leigh Ann

  16. Advanced Research Computing Parallel Programming with

    E-Print Network [OSTI]

    Crawford, T. Daniel

    Advanced Research Computing Parallel Programming with MPI Advanced Research Computing Outline · Message · Collective Communication

  17. A PLANAR PARALLEL MANIPULATOR WITH HOLONOMIC HIGHER PAIRS: INVERSE KINEMATICS

    E-Print Network [OSTI]

    Hayes, John

    kinematic analysis. Very little literature on such planar mechanisms was found. The effects of initial A PLANAR PARALLEL MANIPULATOR WITH HOLONOMIC HIGHER PAIRS: INVERSE KINEMATICS Matthew John D. HAYES of Mechanical Engineering 817 r. Sherbrooke O., Rm 454, Montreal, Quebec, H3A 2K6 Canada, Tel: (514) 398

  18. Opportunities to Parallelize Path Planning Algorithms for Autonomous Underwater Vehicles

    E-Print Network [OSTI]

    Kremer, Ulrich

    , Germany Email: mike.eichhorn@tu-ilmenau.de Ulrich Kremer Dept. of Computer Science Rutgers University. energy tradeoffs that can be exploited in energy- constrained environments such as battery operated 2005, when the series production of multi core processors started. Before then, writing parallel

  19. PARALLEL PROCESSING: A RELATIONSHIP BETWEEN RETINAL AND FOURIER OPTICS TECHNIQUES

    E-Print Network [OSTI]

    Moore, John Barratt

    PARALLEL PROCESSING: A RELATIONSHIP BETWEEN RETINAL AND FOURIER OPTICS TECHNIQUES by D.J.H. Moore P ABSTRACT Two methods of describing and processing visual data by replacing the pattern a retinal-machine approach. Modifications are suggested for the Fourier-optics approach which greatly

  20. Motor/generator

    DOE Patents [OSTI]

    Hickam, Christopher Dale (Glasford, IL)

    2008-05-13

    A motor/generator is provided for connecting between a transmission input shaft and an output shaft of a prime mover. The motor/generator may include a motor/generator housing, a stator mounted to the motor/generator housing, a rotor mounted at least partially within the motor/generator housing and rotatable about a rotor rotation axis, and a transmission-shaft coupler drivingly coupled to the rotor. The transmission-shaft coupler may include a clamp, which may include a base attached to the rotor and a plurality of adjustable jaws.

  1. Xyce Parallel Electronic Simulator Users Guide Version 6.2.

    SciTech Connect (OSTI)

    Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Schiek, Richard; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason; Baur, David Gregory

    2014-09-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state of the art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; (2) A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; (3) Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence Design Systems, Inc. Microsoft, Windows and Windows 7 are registered trademarks of Microsoft Corporation. Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation. Amtec and TecPlot are trademarks of Amtec Engineering, Inc. Xyce's expression library is based on that inside Spice 3F5 developed by the EECS Department at the University of California. The EKV3 MOSFET model was developed by the EKV Team of the Electronics Laboratory-TUC of the Technical University of Crete. All other trademarks are property of their respective owners. Contacts: Bug Reports (Sandia only): http://joseki.sandia.gov/bugzilla and http://charleston.sandia.gov/bugzilla; World Wide Web: http://xyce.sandia.gov and http://charleston.sandia.gov/xyce (Sandia only); Email: xyce@sandia.gov (outside Sandia) and xyce-sandia@sandia.gov (Sandia only)

  2. Combined fuel and air staged power generation system

    SciTech Connect (OSTI)

    Rabovitser, Iosif K; Pratapas, John M; Boulanov, Dmitri

    2014-05-27

    A method and apparatus for generation of electric power employing fuel and air staging in which a first stage gas turbine and a second stage partial oxidation gas turbine are operated in parallel. A first portion of fuel and oxidant are provided to the first stage gas turbine which generates a first portion of electric power and a hot oxidant. A second portion of fuel and oxidant are provided to the second stage partial oxidation gas turbine which generates a second portion of electric power and a hot syngas. The hot oxidant and the hot syngas are provided to a bottoming cycle employing a fuel-fired boiler by which a third portion of electric power is generated.

  3. SAMPLING-BASED ROADMAP OF TREES FOR PARALLEL MOTION PLANNING 1 Sampling-Based Roadmap of Trees for Parallel

    E-Print Network [OSTI]

    Chen, Brian Y.

    SAMPLING-BASED ROADMAP OF TREES FOR PARALLEL MOTION PLANNING 1 Sampling-Based Roadmap of Trees for multiple query motion planning (Probabilistic Roadmap Method - PRM) with sampling-based tree methods algorithms, roadmap, tree, PRM, EST, RRT, SRT. I. INTRODUCTION HIGH-DIMENSIONAL problems such as those

  4. Method of grid generation

    DOE Patents [OSTI]

    Barnette, Daniel W. (Veguita, NM)

    2002-01-01

    The present invention provides a method of grid generation that uses the geometry of the problem space and the governing relations to generate a grid. The method can generate a grid with minimized discretization errors, and with minimal user interaction. The method of the present invention comprises assigning grid cell locations so that, when the governing relations are discretized using the grid, at least some of the discretization errors are substantially zero. Conventional grid generation is driven by the problem space geometry; grid generation according to the present invention is driven by problem space geometry and by governing relations. The present invention accordingly can provide two significant benefits: more efficient and accurate modeling since discretization errors are minimized, and reduced cost grid generation since less human interaction is required.

  5. Steam generator support system

    DOE Patents [OSTI]

    Moldenhauer, James E. (Simi Valley, CA)

    1987-01-01

    A support system for connection to an outer surface of a J-shaped steam generator for use with a nuclear reactor or other liquid metal cooled power source. The J-shaped steam generator is mounted with the bent portion at the bottom. An arrangement of elongated rod members provides both horizontal and vertical support for the steam generator. The rod members are interconnected to the steam generator assembly and a support structure in a manner which provides for thermal distortion of the steam generator without the transfer of bending moments to the support structure and in a like manner substantially minimizes forces being transferred between the support structure and the steam generator as a result of seismic disturbances.

  6. Steam generator support system

    DOE Patents [OSTI]

    Moldenhauer, J.E.

    1987-08-25

    A support system for connection to an outer surface of a J-shaped steam generator for use with a nuclear reactor or other liquid metal cooled power source is disclosed. The J-shaped steam generator is mounted with the bent portion at the bottom. An arrangement of elongated rod members provides both horizontal and vertical support for the steam generator. The rod members are interconnected to the steam generator assembly and a support structure in a manner which provides for thermal distortion of the steam generator without the transfer of bending moments to the support structure and in a like manner substantially minimizes forces being transferred between the support structure and the steam generator as a result of seismic disturbances. 4 figs.

  7. Parallelization and checkpointing of GPU applications through program transformation

    SciTech Connect (OSTI)

    Solano-Quinde, Lizandro Damián [Ames Laboratory]

    2012-11-15

    GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for running on GPUs tractable has consolidated GPUs as an alternative for accelerating general-purpose applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) Industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running in multi-GPU systems. Furthermore, multi-GPU systems help to solve the GPU memory limitation for applications with a large application memory footprint. Parallelizing single-GPU applications has been approached by libraries that distribute the workload at runtime; however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. As with any computing engine of today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems presents new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed.
The goal of this work is to exploit higher levels of parallelism and to develop support for application-level fault tolerance in applications using multiple GPUs. Our techniques reduce the burden of enhancing single-GPU applications to support these features. To achieve our goal, this work designs and implements a framework for enhancing a single-GPU OpenCL application through application transformation.

  8. Electric power monthly, September 1996, with data for June 1996

    SciTech Connect (OSTI)

    1996-09-01

    The Coal and Electric Data and Renewables Division; Office of Coal, Nuclear, Electric and Alternate Fuels, Energy Information Administration (EIA), Department of Energy prepares the EPM. This publication provides monthly statistics at the State, Census division, and U.S. levels for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt hour of electricity sold. In addition, data on net generation, fuel consumption, fuel stocks, quantity and cost of fossil fuels are also displayed for the North American Electric Reliability Council (NERC) regions. The EIA publishes statistics in the EPM on net generation by energy source; consumption, stocks, quantity, quality, and cost of fossil fuels; and capability of new generating units by company and plant.

  9. Electric power monthly, July 1999, with data for April 1999

    SciTech Connect (OSTI)

    1999-07-01

    The Electric Power Division, Office of Coal, Nuclear, Electric and Alternate Fuels, Energy Information Administration (EIA), Department of Energy prepares the Electric Power Monthly (EPM). This publication provides monthly statistics at the State, Census division, and US levels for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt hour of electricity sold. In addition, data on net generation, fuel consumption, fuel stocks, quantity and cost of fossil fuels are also displayed for the North American Electric Reliability Council (NERC) regions. The EIA publishes statistics in the EPM on net generation by energy source; consumption, stocks, quantity, quality, and cost of fossil fuels; and capability of new generating units by company and plant. 1 fig., 64 tabs.

  10. Electric power monthly, December 1996 with data for September 1996

    SciTech Connect (OSTI)

    1996-12-01

    The report presents monthly electricity statistics for a wide audience including Congress, Federal and State agencies, the electric utility industry, and the general public. The purpose of this publication is to provide energy decisionmakers with accurate and timely information that may be used in forming various perspectives on electric issues that lie ahead. This publication provides monthly statistics at the State, Census division, and US levels for net generation, fossil fuel consumption and stocks, quantity and quality of fossil fuels, cost of fossil fuels, electricity retail sales, associated revenue, and average revenue per kilowatt hour of electricity sold. In addition, data on net generation, fuel consumption, fuel stocks, quantity and cost of fossil fuels are also displayed for the North American Electric Reliability Council (NERC) regions. The EIA publishes statistics on net generation by energy source; consumption, stocks, quantity, quality, and cost of fossil fuels; and capability of new generating units by company and plant. 57 tabs.

  11. Thermophotovoltaic energy generation

    DOE Patents [OSTI]

    Celanovic, Ivan; Chan, Walker; Bermel, Peter; Yeng, Adrian Y. X.; Marton, Christopher; Ghebrebrhan, Michael; Araghchini, Mohammad; Jensen, Klavs F.; Soljacic, Marin; Joannopoulos, John D.; Johnson, Steven G.; Pilawa-Podgurski, Robert; Fisher, Peter

    2015-08-25

    Inventive systems and methods for the generation of energy using thermophotovoltaic cells are described. Also described are systems and methods for selectively emitting electromagnetic radiation from an emitter for use in thermophotovoltaic energy generation systems. In at least some of the inventive energy generation systems and methods, a voltage applied to the thermophotovoltaic cell (e.g., to enhance the power produced by the cell) can be adjusted to enhance system performance. Certain embodiments of the systems and methods described herein can be used to generate energy relatively efficiently.

  12. Fourth Generation Majorana Neutrinos

    E-Print Network [OSTI]

    Alexander Lenz; Heinrich Päs; Dario Schalla

    2012-05-02

    We investigate the possibility of a fourth sequential generation in the lepton sector. Assuming neutrinos to be Majorana particles and starting from a recent - albeit weak - evidence for a non-zero admixture of a fourth generation neutrino from fits to weak lepton and meson decays we discuss constraints from neutrinoless double beta decay, radiative lepton decay and like-sign dilepton production at hadron colliders. Also an idea for fourth generation neutrino mass model building is briefly outlined. Here we soften the large hierarchy of the neutrino masses within an extradimensional model that locates each generation on different lepton number violating branes without large hierarchies.

  13. SNE TRAFIC GENERATOR

    Energy Science and Technology Software Center (OSTI)

    003027MLTPL00 Network Traffic Generator for Low-rate Small Network Equipment Software  http://eln.lbl.gov/sne_traffic_gen.html 

  14. Renewable Electricity Generation

    SciTech Connect (OSTI)

    2012-09-01

    This document highlights DOE's Office of Energy Efficiency and Renewable Energy's advancements in renewable electricity generation technologies including solar, water, wind, and geothermal.

  15. Talkin’ Bout Wind Generation

    Office of Energy Efficiency and Renewable Energy (EERE)

    The amount of electricity generated by the wind industry started to grow back around 1999, and since 2007 has been increasing at a rapid pace.

  16. Next-generation transcriptome assembly

    E-Print Network [OSTI]

    Martin, Jeffrey A.

    2012-01-01

    technologies - the next generation. Nat Rev Genet 11, 31-... algorithms for next-generation sequencing data. Genomics... assembly from next-generation sequencing data. Genome Res

  17. ENERGY RECOVERY COUNCIL WEEKLY UPDATE

    E-Print Network [OSTI]

    apply to calendar year 2009 sales of kilowatt hours of electricity produced in the United States or one-loop biomass, geothermal energy, and solar energy; and 1.1 cent per kilowatt hour on the sale of electricity the House Education and Labor Committee where he served as Senior Labor Policy Advisor for Health and Safety

  18. Optimizing Parallel Access to the BaBar Database System Using...

    Office of Scientific and Technical Information (OSTI)

    Technical Report: Optimizing Parallel Access to the BaBar Database System Using CORBA Servers Citation Details In-Document Search Title: Optimizing Parallel Access to the BaBar...

  19. An evaluation of a parallel-resonant current-source converter for an electrothermal thruster 

    E-Print Network [OSTI]

    Tchamdjou, Aristide-Marie

    1996-01-01

    The Parallel-Resonant Current-Source Converter promises highly efficient DC-DC power conversion. It uses zero-voltage switching to reduce the losses and improve the converter efficiency. The Parallel-Resonant Current-Source Converter has been...

  20. FULL ABSTRACTION FOR A SIMPLE PARALLEL PROGRAMMING LANGUAGE M.C.B. Hennessy

    E-Print Network [OSTI]

    Plotkin, Gordon

    construction, but for (certain kinds) of cpos. For example the power-domain~(S±) of the flat cpo Si, formed with parallelism was given, treating parallelism in terms of non-deterministic merging of uninterruptible actions

  1. Numerical field simulation for parallel transmission in MRI at 7 tesla

    E-Print Network [OSTI]

    Bernier, Jessica A. (Jessica Ashley)

    2011-01-01

    Parallel transmission (pTx) is a promising improvement to coil design that has been demonstrated to mitigate B1+ inhomogeneity, manifest as center brightening, for high-field magnetic resonance imaging (MRI). Parallel ...

  2. J. Parallel Distrib. Comput. ( ) Contents lists available at SciVerse ScienceDirect

    E-Print Network [OSTI]

    Hwu, Wen-mei W.

    J. Parallel Distrib. Comput. ( ) ­ Contents lists available at SciVerse ScienceDirect J. Parallel to water diffusion. MRI image reconstruction consists of solving a large linear system relating

  3. KINEMATIC, DYNAMIC AND WORKSPACE ANALYSIS OF A NOVEL 6-DOF PARALLEL MANIPULATOR

    E-Print Network [OSTI]

    Krovi, Venkat

    KINEMATIC, DYNAMIC AND WORKSPACE ANALYSIS OF A NOVEL 6-DOF PARALLEL MANIPULATOR by Hrishi L. Shah ... SUNY BUFFALO ... 3.1. Kinematic Analysis ...

  4. Fast Algorithms for Image Reconstruction with Application to Partially Parallel MR Imaging

    E-Print Network [OSTI]

    Yin, Wotao

    Fast Algorithms for Image Reconstruction with Application to Partially Parallel MR Imaging Yunmei. Key words. Image reconstruction, Variable splitting, TV denoising, Nonlinear optimization 1 from an emerging magnetic resonance (MR) medical imaging technique known as partially parallel imaging

  5. A Parallel Statistical Learning Approach to the Prediction of Building Energy Consumption Based on Large Datasets

    E-Print Network [OSTI]

    Paris-Sud XI, Université de

    A Parallel Statistical Learning Approach to the Prediction of Building Energy Consumption Based consumption of buildings based on historical performances is an important approach to achieve energy (SVMs), Prediction, Model, Energy Efficiency, Parallel Computing. 1. INTRODUCTION Building energy

  6. Astrocytes generate Na+-mediated metabolic waves Yann Bernardinelli*, Pierre J. Magistretti*

    E-Print Network [OSTI]

    Newman, Eric A.

    Astrocytes generate Na+-mediated metabolic waves Yann Bernardinelli*, Pierre J. Magistretti waves. Here we show that intercellular Na+ waves are also evoked by activation of single cultured cortical mouse astrocytes in parallel with Ca2+ waves; however, there are spatial and temporal differences

  7. Parallel resistivity and ohmic heating of laboratory dipole plasmas

    SciTech Connect (OSTI)

    Fox, W.

    2012-08-15

    The parallel resistivity is calculated in the long-mean-free-path regime for the dipole plasma geometry; this is shown to be a neoclassical transport problem in the limit of a small number of circulating electrons. In this regime, the resistivity is substantially higher than the Spitzer resistivity due to the magnetic trapping of a majority of the electrons. This suggests that heating the outer flux surfaces of the plasma with low-frequency parallel electric fields can be substantially more efficient than might be naively estimated. Such a skin-current heating scheme is analyzed by deriving an equation for diffusion of skin currents into the plasma, from which quantities such as the resistive skin-depth, lumped-circuit impedance, and power deposited in the plasma can be estimated. Numerical estimates indicate that this may be a simple and efficient way to couple power into experiments in this geometry.

  8. Center for Programming Models for Scalable Parallel Computing

    SciTech Connect (OSTI)

    John Mellor-Crummey

    2008-02-29

    Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) design and implementation of cafc, the first multi-platform CAF compiler for distributed and shared-memory machines, (2) performance studies of the efficiency of programs written using the CAF and UPC programming models, (3) a novel technique to analyze explicitly-parallel SPMD programs that facilitates optimization, (4) design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading to simplify development of high-performance codes in CAF, and (5) a synchronization strength reduction transformation for automatically replacing barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.

  9. Parallel matrix computations. Interim report, 1983-1984

    SciTech Connect (OSTI)

    Stewart, G.W.; O'Leary, D.P.

    1985-03-28

    This project concerns the design and analysis of algorithms to be run in a processor-rich environment. Focus is primarily on algorithms that require no global control and that can be run on systems with only local connections among processors. Properties of these algorithms are investigated both theoretically and experimentally. The experimental work is done on the ZMOB, a working parallel computer operated by the Laboratory for Parallel Computation of the Computer Science Department at the University of Maryland. To give the work direction, the authors focused on two areas: 1. dense problems from numerical linear algebra; and 2. the iterative and direct solution of sparse linear systems. The ZMOB hardware and the research projects pursued under this grant support are discussed.

  10. Administering truncated receive functions in a parallel messaging interface

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-12-09

    Administering truncated receive functions in a parallel messaging interface (`PMI`) of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.
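
    The receive-everything, deliver-part behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not the patented interface: the class name `TruncatedReceiver`, its fields, and the choice to keep a leading prefix of the data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TruncatedReceiver:
    """Hypothetical stand-in for the messaging layer on the destination node."""

    keep_bytes: int      # portion the application asked to receive (assumed: a prefix)
    discarded: int = 0   # running count of bytes dropped by the messaging layer

    def deliver(self, payload: bytes) -> bytes:
        # The messaging layer accepts the whole payload from the network...
        kept = payload[: self.keep_bytes]
        # ...but hands the application only the requested portion and drops the rest.
        self.discarded += len(payload) - len(kept)
        return kept


receiver = TruncatedReceiver(keep_bytes=4)
delivered = receiver.deliver(b"ABCDEFGH")
```

    In this sketch the sender is unaware of the truncation; the split between kept and discarded data is decided entirely on the destination side.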

  11. Parallel Algorithms for Graph Optimization using Tree Decompositions

    SciTech Connect (OSTI)

    Sullivan, Blair D; Weerapurage, Dinesh P; Groer, Christopher S

    2012-06-01

    Although many $\mathcal{NP}$-hard graph optimization problems can be solved in polynomial time on graphs of bounded tree-width, the adoption of these techniques into mainstream scientific computation has been limited due to the high memory requirements of the necessary dynamic programming tables and excessive runtimes of sequential implementations. This work addresses both challenges by proposing a set of new parallel algorithms for all steps of a tree decomposition-based approach to solve the maximum weighted independent set problem. A hybrid OpenMP/MPI implementation includes a highly scalable parallel dynamic programming algorithm leveraging the MADNESS task-based runtime, and computational results demonstrate scaling. This work enables a significant expansion of the scale of graphs on which exact solutions to maximum weighted independent set can be obtained, and forms a framework for solving additional graph optimization problems with similar techniques.
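
    As a rough illustration of the dynamic programming idea, the sketch below solves maximum weighted independent set in the simplest bounded tree-width case, where the input graph is itself a tree. It is a sequential toy, not the parallel algorithm of the record above; the function name and the adjacency-dictionary input format are assumptions chosen for the example.

```python
def max_weight_independent_set(tree, weights, root=0):
    """Return the maximum total weight of an independent set in a tree.

    tree: dict mapping each vertex to a list of its neighbours.
    weights: dict mapping each vertex to its weight.
    """
    # Orient the tree away from the root; appending while iterating yields BFS order.
    parent = {root: None}
    order = [root]
    for v in order:
        for u in tree[v]:
            if u != parent[v]:
                parent[u] = v
                order.append(u)

    take = {}  # best weight in the subtree of v when v is in the set
    skip = {}  # best weight in the subtree of v when v is not in the set
    for v in reversed(order):  # children are processed before their parents
        children = [u for u in tree[v] if u != parent[v]]
        take[v] = weights[v] + sum(skip[u] for u in children)
        skip[v] = sum(max(take[u], skip[u]) for u in children)
    return max(take[root], skip[root])


# Path 0 - 1 - 2 - 3 with weights 3, 2, 5, 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
best = max_weight_independent_set(path, {0: 3, 1: 2, 2: 5, 3: 4})
```

    On a general tree decomposition each table entry would range over subsets of a bag rather than a single take/skip choice, which is where the memory pressure mentioned in the abstract comes from.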

  12. Ultrafast stimulated Raman parallel adiabatic passage by shaped pulses

    SciTech Connect (OSTI)

    Dridi, G.; Guerin, S.; Hakobyan, V.; Jauslin, H. R.; Eleuch, H.

    2009-10-15

    We present a general and versatile technique of population transfer based on parallel adiabatic passage by femtosecond shaped pulses. Their amplitude and phase are specifically designed to optimize the adiabatic passage corresponding to parallel eigenvalues at all times. We show that this technique allows the robust adiabatic population transfer in a Raman system with the total pulse area as low as $3\pi$, corresponding to a fluence of one order of magnitude below the conventional stimulated Raman adiabatic passage process. This process of short duration, typically picosecond and subpicosecond, is easily implementable with the modern pulse shaper technology and opens the possibility of ultrafast robust population transfer with interesting applications in quantum information processing.

  13. Methodology for Augmenting Existing Paths with Additional Parallel Transects

    SciTech Connect (OSTI)

    Wilson, John E.

    2013-09-30

    Visual Sample Plan (VSP) is sample planning software that is used, among other purposes, to plan transect sampling paths to detect areas that were potentially used for munition training. This module was developed for application on a large site where existing roads and trails were to be used as primary sampling paths. Gap areas between these primary paths needed to be found and covered with parallel transect paths. These gap areas represent areas on the site that are more than a specified distance from a primary path. These added parallel paths needed to optionally be connected together into a single path, the shortest path possible. The paths also needed to optionally be attached to existing primary paths, again with the shortest possible path. Finally, the process must be repeatable and predictable so that the same inputs (primary paths, specified distance, and path options) will result in the same set of new paths every time. This methodology was developed to meet those specifications.

  14. Beam Dynamics Studies of Parallel-Bar Deflecting Cavities

    SciTech Connect (OSTI)

    S. Ahmed, G. Krafft, K. Detrick, S. Silva, J. Delayen, M. Spata, M. Tiefenback, A. Hofler, K. Beard

    2011-03-01

    We have performed three-dimensional simulations of beam dynamics for parallel-bar transverse electromagnetic mode (TEM) type RF separators: normal- and super-conducting. The compact size of these cavities as compared to conventional TM$_{110}$ type structures is more attractive particularly at low frequency. Highly concentrated electromagnetic fields between the parallel bars provide strong electrical stability to the beam for any mechanical disturbance. An array of six 2-cell normal conducting cavities or a one- or two-cell superconducting structure are enough to produce the required vertical displacement at the Lambertson magnet. Both the normal and super-conducting structures show very small emittance dilution due to the vertical kick of the beam.

  15. Laser Safety Method For Duplex Open Loop Parallel Optical Link

    DOE Patents [OSTI]

    Baumgartner, Steven John (Zumbro Falls, MN); Hedin, Daniel Scott (Rochester, MN); Paschal, Matthew James (Rochester, MN)

    2003-12-02

    A method and apparatus are provided to ensure that laser optical power does not exceed a "safe" level in an open loop parallel optical link in the event that a fiber optic ribbon cable is broken or otherwise severed. A duplex parallel optical link includes a transmitter and receiver pair and a fiber optic ribbon that includes a designated number of channels that cannot be split. The duplex transceiver includes a corresponding transmitter and receiver that are physically attached to each other and cannot be detached therefrom, so as to ensure safe, laser optical power in the event that the fiber optic ribbon cable is broken or severed. Safe optical power is ensured by redundant current and voltage safety checks.

  16. Coupled Serial and Parallel Non-uniform SQUIDs

    SciTech Connect (OSTI)

    Longhini, Patrick; In, Visarath; Berggren, Susan; Palacios, Antonio; Leese de Escobar, Anna

    2011-04-19

    In this work we numerically model series and parallel non-uniform superconducting quantum interference device (SQUID) arrays. Previous work has shown that for a series SQUID array constructed with a random distribution of loop sizes (i.e. different areas for each SQUID loop), there exists a unique 'anti-peak' at zero magnetic field in the voltage versus applied magnetic field (V-B) response. Similar results extend to a parallel SQUID array, where the difference lies in the arrangement of the Josephson junctions. Other system parameters such as bias current, the number of loops, and mutual inductances are varied to demonstrate the change in dynamic range and linearity of the V-B response. Application of the SQUID array as a low noise amplifier (LNA) would increase link margins and affect the entire communication system. For unmanned aerial vehicles (UAVs), where size, weight and power are limited, the SQUID array would allow use of practical 'electrically small' antennas that provide acceptable gain.

  17. Performance evaluation of a parallel sparse lattice Boltzmann solver

    SciTech Connect (OSTI)

    Axner, L.; Bernsdorf, J.; Zeiser, T.; Lammers, P.; Linxweiler, J.; Hoekstra, A.G.

    2008-05-01

    We develop a performance prediction model for a parallelized sparse lattice Boltzmann solver and present performance results for simulations of flow in a variety of complex geometries. A special focus is on partitioning and memory/load balancing strategy for geometries with a high solid fraction and/or complex topology such as porous media, fissured rocks and geometries from medical applications. The topology of the lattice nodes representing the fluid fraction of the computational domain is mapped on a graph. Graph decomposition is performed with both multilevel recursive-bisection and multilevel k-way schemes based on modified Kernighan-Lin and Fiduccia-Mattheyses partitioning algorithms. Performance results and optimization strategies are presented for a variety of platforms, showing a parallel efficiency of almost 80% for the largest problem size. A good agreement between the performance model and experimental results is demonstrated.

  18. Kaleu: a general-purpose parton-level phase space generator

    E-Print Network [OSTI]

    A. van Hameren

    2010-03-25

    Kaleu is an independent, true phase space generator. After providing it with some information about the field theory and the particular multi-particle scattering process under consideration, it returns importance sampled random phase space points. Providing it also with the total weight of each generated phase space point, it further adapts to the integration problem on the fly. It is written in Fortran, such that it can independently deal with several scattering processes in parallel.

  19. Xyce parallel electronic simulator reference guide, version 6.1

    SciTech Connect (OSTI)

    Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason C.; Baur, David Gregory

    2014-03-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide [1]. The focus of this document is to list (to the extent possible) exhaustively the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide [1].

  20. Electronically commutated serial-parallel switching for motor windings

    DOE Patents [OSTI]

    Hsu, John S. (Oak Ridge, TN)

    2012-03-27

    A method and a circuit for controlling an ac machine comprises controlling a full bridge network of commutation switches which are connected between a multiphase voltage source and the phase windings to switch the phase windings between a parallel connection and a series connection while providing commutation discharge paths for electrical current resulting from inductance in the phase windings. This provides extra torque for starting a vehicle from lower battery current.

  1. Xyce Parallel Electronic Simulator : reference guide, version 4.1.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2009-02-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list (to the extent possible) exhaustively the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

  2. Xyce parallel electronic simulator reference guide, version 6.0.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Warrender, Christina E.; Baur, David G.

    2013-08-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide [1]. The focus of this document is to list exhaustively (to the extent possible) the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide [1].

  3. Massive Data-Parallel Swarm Simulation and Visualisation using CUDA

    E-Print Network [OSTI]

    Hinze, Thomas

    for the project was to employ CUDA for simulating natural movement and behaviour of a fish swarm. As a fish swarm by calculating the state of a fish at time t_{i+1} as a function of the state of the whole swarm at time t_i. Since this function is equal for all fishes, the state of the whole swarm can easily be calculated in parallel
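
    The update scheme in this snippet (each fish's state at t_{i+1} depends only on the whole swarm's state at t_i) can be illustrated with a minimal Python sketch. It is not the authors' CUDA code. The state layout (a 2-D position), the function names `next_state` and `step`, and the centroid-attraction rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Fish = Tuple[float, float]  # assumed state: a 2-D position only


def next_state(fish: Fish, swarm: List[Fish], pull: float = 0.1) -> Fish:
    """Hypothetical rule: move a fraction `pull` of the way to the swarm centroid."""
    cx = sum(x for x, _ in swarm) / len(swarm)
    cy = sum(y for _, y in swarm) / len(swarm)
    return (fish[0] + pull * (cx - fish[0]), fish[1] + pull * (cy - fish[1]))


def step(swarm: List[Fish]) -> List[Fish]:
    """Compute the swarm at t_{i+1} from a read-only snapshot at t_i."""
    # Every new state reads only the old swarm, so the iterations are
    # independent of one another; on a GPU each could be one thread.
    return [next_state(fish, swarm) for fish in swarm]
```

    Because `step` builds a new list instead of updating in place, no fish ever sees a partially updated swarm. That independence is what lets the same function be applied to every fish in parallel.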

  4. Task and instruction scheduling in parallel multithreaded processors 

    E-Print Network [OSTI]

    Mishra, Amitabh

    1996-01-01

    them in the central IQ each cycle. Instructions are then decoded in the decoder, and logical registers are mapped into physical registers at this stage. Register renaming logic is implemented by the reorder buffer in order to remove false... of simultaneous, or parallel, multithreading, the context switch overhead (associated with a conventional multithreaded processor) is zero, since we have a central IQ of instructions from which we put instructions into the instruction window, and we do...

  5. Matched filters for coalescing binaries detection on massively parallel computers

    E-Print Network [OSTI]

    Enrico Calzavarini; Laura Sartori; Fabio Schifano; Raffaele Tripiccione; Andrea Viceré

    2002-07-18

    We discuss some computational problems associated with matched filtering of experimental signals from gravitational wave interferometric detectors in a parallel-processing environment. We then specialize our discussion to the use of the APEmille and apeNEXT processors for this task. Finally, we accurately estimate the performance of an APEmille system on a computational load appropriate for the LIGO and VIRGO experiments, and extrapolate our results to apeNEXT.

  6. Journal of Parallel and Distributed Computing 00 (2015) 116 Distributed

    E-Print Network [OSTI]

    Zhang, Minjie

    2015-01-01

    ]. Typically, bulk generation is the only energy resource to a distribution network and the direction of power flow is strictly from the central generation to downstream electric components [3]. Recently overload the networks. Second, it is very difficult to balance electricity demand with generation from

  7. Parallel Database Systems: The Future of High Performance Database Processing1

    E-Print Network [OSTI]

    Cafarella, Michael J.

    . 94105-2403 dewitt @ cs.wisc.edu Gray @ SFbay.enet.dec.com January 1992 Abstract: Parallel database

  8. Massively-parallel electrical-conductivity imaging of hydrocarbons using the Blue Gene/L supercomputer

    E-Print Network [OSTI]

    2008-01-01

    MASSIVELY-PARALLEL ELECTRICAL-CONDUCTIVITY IMAGING OF ... considerable attention for electrical conductivity mapping ... and anisotropic (blue) electrical conductivity.

  9. Identifying failure in a tree network of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J. (Rochester, MN); Pinnow, Kurt W. (Rochester, MN); Wallenfelt, Brian P. (Eden Prairie, MN)

    2010-08-24

    Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and testing individually potential problem nodes and links to potential problem nodes.
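
    The selection loop in this abstract can be sketched in Python as follows. The abstract does not give the formula for the current test value, the way test sets are chosen, or how potential problem nodes are picked from a test set. The formula in `current_test_value`, the fixed-size slicing of nodes, and the choice to return the whole test set for individual testing are therefore assumptions, and all names are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def current_test_value(io_measure: float, test_measures: List[float],
                       expected_io: float) -> float:
    # Assumed formula: the abstract says only that the value depends on the
    # measured I/O node performance, the measured test-node performance, and
    # a predetermined value for I/O node performance.
    return (io_measure / expected_io) * mean(test_measures)


def find_potential_problem_nodes(compute_measures: Dict[str, float],
                                 io_measure: float, expected_io: float,
                                 threshold: float, subset_size: int) -> List[str]:
    """Walk through test sets of compute nodes as the abstract describes."""
    nodes = sorted(compute_measures)
    for start in range(0, len(nodes), subset_size):
        test_nodes = nodes[start:start + subset_size]
        value = current_test_value(
            io_measure, [compute_measures[n] for n in test_nodes], expected_io)
        if value < threshold:
            continue  # below the threshold: select another set of test nodes
        # Not below the threshold: hand this set over for individual testing
        # of nodes and links (assumed: every node in the set is a candidate).
        return test_nodes
    return []
```

    For example, with measurements `{"n0": 1.0, "n1": 1.0, "n2": 5.0, "n3": 5.0}`, an I/O measurement equal to its predetermined value, a threshold of 3.0, and test sets of two nodes, the first set is skipped and the second is returned for individual testing.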

  10. Non-signalling parallel repetition using de Finetti reductions

    E-Print Network [OSTI]

    Rotem Arnon-Friedman; Renato Renner; Thomas Vidick

    2014-11-06

    In the context of multiplayer games, the parallel repetition problem can be phrased as follows: given a game $G$ with optimal winning probability $1-\alpha$ and its repeated version $G^n$ (in which $n$ games are played together, in parallel), can the players use strategies that are substantially better than ones in which each game is played independently? This question is relevant in physics for the study of correlations and plays an important role in computer science in the context of complexity and cryptography. In this work the case of multiplayer non-signalling games is considered, i.e., the only restriction on the players is that they are not allowed to communicate during the game. For complete-support games (games where all possible combinations of questions have non-zero probability to be asked) with any number of players we prove a threshold theorem stating that the probability that non-signalling players win more than a fraction $1-\alpha+\beta$ of the $n$ games is exponentially small in $n\beta^2$, for every $0\leq \beta \leq \alpha$. For games with incomplete support we derive a similar statement, for a slightly modified form of repetition. The result is proved using a new technique, based on a recent de Finetti theorem, which allows us to avoid central technical difficulties that arise in standard proofs of parallel repetition theorems.
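
    Written as a formula, the threshold statement for complete-support games has roughly the following shape. The constants $C$ and $c$ are placeholders introduced here. The abstract states only that the bound is exponentially small in $n\beta^2$ and does not give the constants or what they depend on.

```latex
\Pr\left[\text{players win more than } (1-\alpha+\beta)\,n \text{ of the } n \text{ games}\right]
  \;\le\; C \exp\left(-c\, n \beta^{2}\right),
  \qquad 0 \le \beta \le \alpha .
```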

  11. Freezing of parallel hard cubes with rounded edges

    E-Print Network [OSTI]

    Matthieu Marechal; Urs Zimmermann; Hartmut Löwen

    2012-02-09

    The freezing transition in a classical three-dimensional system of parallel hard cubes with rounded edges is studied by computer simulation and fundamental-measure density functional theory. By switching the rounding parameter s from zero to one, one can smoothly interpolate between cubes with sharp edges and hard spheres. The equilibrium phase diagram of rounded parallel hard cubes is computed as a function of their volume fraction and the rounding parameter s. The second-order freezing transition known for oriented cubes at s = 0 is found to be persistent up to s = 0.65. The fluid freezes into a simple-cubic crystal which exhibits a large vacancy concentration. Upon a further increase of s, the continuous freezing is replaced by a first-order transition into either a sheared simple cubic lattice or a deformed face-centered cubic lattice with two possible unit cells: body-centered orthorhombic or base-centered monoclinic. In principle, a system of parallel cubes could be realized in experiments on colloids using advanced synthesis techniques and a combination of external fields.

  12. Parallel Computing Environments and Methods for Power Distribution System Simulation

    SciTech Connect (OSTI)

    Lu, Ning; Taylor, Zachary T.; Chassin, David P.; Guttromson, Ross T.; Studham, Scott S.

    2005-11-10

    The development of cost-effective, high-performance parallel computing on multi-processor supercomputers makes it attractive to port excessively time-consuming simulation software from personal computers (PCs) to supercomputers. The power distribution system simulator (PDSS) takes a bottom-up approach and simulates load at the appliance level, where detailed thermal models for appliances are used. This approach works well for a small power distribution system consisting of a few thousand appliances. When the number of appliances increases, the simulation uses up the PC memory and its run time increases to a point where the approach is no longer feasible for modeling a practical large power distribution system. This paper presents an effort to port a PC-based PDSS to a 128-processor shared-memory supercomputer. The paper offers an overview of the parallel computing environment and a description of the modifications made to the PDSS model. The performance of the PDSS running on a standalone PC and on the supercomputer is compared. Future research directions for utilizing parallel computing in power distribution system simulation are also addressed.

  13. Parallel partitioning strategies for the adaptive solution of conservation laws

    SciTech Connect (OSTI)

    Devine, K.D.; Flaherty, J.E.; Loy, R.M. [Rensselaer Polytechnic Institute, Troy, NY (United States)] [and others]

    1995-12-31

    We describe and examine the performance of adaptive methods for solving hyperbolic systems of conservation laws on massively parallel computers. The differential system is approximated by a discontinuous Galerkin finite element method with a hierarchical Legendre piecewise polynomial basis for the spatial discretization. Fluxes at element boundaries are computed by solving an approximate Riemann problem; a projection limiter is applied to keep the average solution monotone; time discretization is performed by Runge-Kutta integration; and a p-refinement-based error estimate is used as an enrichment indicator. Adaptive order (p-) and mesh (h-) refinement algorithms are presented and demonstrated. Using an element-based dynamic load balancing algorithm called tiling and adaptive p-refinement, parallel efficiencies of over 60% are achieved on a 1024-processor nCUBE/2 hypercube. We also demonstrate a fast, tree-based parallel partitioning strategy for three-dimensional octree-structured meshes. This method produces partition quality comparable to recursive spectral bisection at a greatly reduced cost.

  14. Laser beam generating apparatus

    DOE Patents [OSTI]

    Warner, B.E.; Duncan, D.B.

    1994-02-15

    Laser beam generating apparatus including a septum segment disposed longitudinally within the tubular structure of the apparatus is described. The septum provides for radiatively dissipating heat buildup within the tubular structure and for generating relatively uniform laser beam pulses so as to minimize or eliminate radial pulse delays (the chevron effect). 7 figures.

  15. Solid aerosol generator

    DOE Patents [OSTI]

    Prescott, D.S.; Schober, R.K.; Beller, J.

    1992-03-17

    An improved solid aerosol generator used to produce a gas borne stream of dry, solid particles of predetermined size and concentration is disclosed. The improved solid aerosol generator nebulizes a feed solution of known concentration with a flow of preheated gas and dries the resultant wet heated aerosol in a grounded, conical heating chamber, achieving high recovery and flow rates. 2 figs.

  16. Improved solid aerosol generator

    DOE Patents [OSTI]

    Prescott, D.S.; Schober, R.K.; Beller, J.

    1988-07-19

    An improved solid aerosol generator used to produce a gas borne stream of dry, solid particles of predetermined size and concentration is disclosed. The improved solid aerosol generator nebulizes a feed solution of known concentration with a flow of preheated gas and dries the resultant wet heated aerosol in a grounded, conical heating chamber, achieving high recovery and flow rates. 2 figs.

  17. Wroclaw neutrino event generator

    E-Print Network [OSTI]

    Jaroslaw A. Nowak

    2006-07-07

    A neutrino event generator developed by the Wroclaw Neutrino Group is described. The physical models included in the generator are discussed and illustrated with the results of simulations. The considered processes are quasi-elastic scattering and pion production modelled by combining the $\Delta$ resonance excitation and deep inelastic scattering.

  18. Internal split field generator

    DOE Patents [OSTI]

    Thundat, Thomas George (Knoxville, TN); Van Neste, Charles W. (Kingston, TN); Vass, Arpad Alexander (Oak Ridge, TN)

    2012-01-03

    A generator includes a coil of conductive material. A stationary magnetic field source applies a stationary magnetic field to the coil. An internal magnetic field source is disposed within a cavity of the coil to apply a moving magnetic field to the coil. The stationary magnetic field interacts with the moving magnetic field to generate an electrical energy in the coil.

  19. Laser beam generating apparatus

    DOE Patents [OSTI]

    Warner, B.E.; Duncan, D.B.

    1993-12-28

    Laser beam generating apparatus including a septum segment disposed longitudinally within the tubular structure of the apparatus is described. The septum provides for radiatively dissipating heat buildup within the tubular structure and for generating relatively uniform laser beam pulses so as to minimize or eliminate radial pulse delays (the chevron effect). 11 figures.

  20. Language Support for Synchronous Parallel Critical Sections Christoph W. Keßler Helmut Seidl

    E-Print Network [OSTI]

    Kessler, Christoph

    elegant and efficient programs for synchronous shared memory MIMD machines (also known as PRAM's). PRAM ... Language Support for Synchronous Parallel Critical Sections Christoph W. Keßler Helmut Seidl ... @psi.uni-trier.de Abstract We introduce a new parallel programming paradigm, namely synchronous parallel critical sections