National Library of Energy BETA

Sample records for kilowatt-hour parallel generation

  1. NREL Finds Up to 6-cent per Kilowatt-Hour Extra Value with Concentrate...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    the relative value of CSP. CSP could also allow greater penetration of PV by making the grid more flexible and reducing curtailment of PV by generating energy after the sun sets. ...

  2. Communication Graph Generator for Parallel Programs

    Energy Science and Technology Software Center (OSTI)

    2014-04-08

    Graphator is a collection of relatively simple sequential programs that generate communication graphs/matrices for commonly occurring patterns in parallel programs. Currently, there is support for five communication patterns: two-dimensional 4-point stencil, four-dimensional 8-point stencil, all-to-alls over sub-communicators, random near-neighbor communication, and near-neighbor communication.
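
    As a rough illustration of the first pattern listed above, the sketch below builds a communication matrix for a two-dimensional 4-point stencil. It is not Graphator's code: the function name, the row-major rank numbering, and the optional periodic wrap-around are assumptions made for this example.

```python
def stencil_2d_comm_matrix(rows, cols, periodic=False):
    """Return an N x N 0/1 matrix (N = rows * cols) where entry [i][j] is 1
    if rank i exchanges data with rank j in a 2D 4-point stencil."""
    n = rows * cols
    matrix = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            src = r * cols + c  # row-major rank numbering (an assumption)
            # North, south, west, and east neighbors.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if periodic:
                    nr %= rows
                    nc %= cols
                elif not (0 <= nr < rows and 0 <= nc < cols):
                    continue  # no neighbor beyond a non-periodic boundary
                dst = nr * cols + nc
                if dst != src:
                    matrix[src][dst] = 1
    return matrix


# Example: a 3 x 3 process grid without wrap-around.
grid = stencil_2d_comm_matrix(3, 3)
```

    In this sketch an interior rank has four partners and a corner rank has two; the matrix is symmetric because each exchange is mutual.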

  3. Massively parallel mesh generation for physics codes

    SciTech Connect (OSTI)

    Hardin, D.D.

    1996-06-01

    Massively parallel processors (MPPs) will soon enable realistic 3-D physical modeling of complex objects and systems. Work is planned or presently underway to port many of LLNL's physical modeling codes to MPPs. LLNL's DSI3D electromagnetics code already can solve 40+ million zone problems on the 256-processor Meiko. However, the author lacks the software necessary to generate and manipulate the large meshes needed to model many complicated 3-D geometries. State-of-the-art commercial mesh generators run on workstations and have a practical limit of several hundred thousand elements. In the foreseeable future MPPs will solve problems with a billion mesh elements. The objective of the Parallel Mesh Generation (PMESH) Project is to develop a unique mesh generation system that can construct large 3-D meshes (up to a billion elements) on MPPs. Such a capability will remove a critical roadblock to unleashing the power of MPPs for physical analysis and will put LLNL at the forefront of mesh generation technology. PMESH will "front-end" a variety of LLNL 3-D physics codes, including those in the areas of electromagnetics, structural mechanics, thermal analysis, and hydrodynamics. The DSI3D and DYNA3D codes are already running on MPPs. The primary goal of the PMESH project is to provide the robust generation of large meshes for complicated 3-D geometries through the appropriate distribution of the generation task between the user's workstation and the MPP. Secondary goals are to support the unique features of LLNL physics codes (e.g., unusual elements) and to minimize the user effort required to generate different meshes for the same geometry. PMESH's capabilities are essential because mesh generation is presently a major limiting factor in simulating larger and more complex 3-D geometries. PMESH will significantly enhance LLNL's capabilities in physical simulation by advancing the state-of-the-art in large mesh generation by 2 to 3 orders of magnitude.

  4. SPRNG Scalable Parallel Random Number Generator Library

    Energy Science and Technology Software Center (OSTI)

    2010-03-16

    This revision corrects some errors in SPRNG 1. Users of newer SPRNG versions can obtain the corrected files and build their version with it. This version also improves the scalability of some of the application-based tests in the SPRNG test suite. It also includes an interface to a parallel Mersenne Twister, so that if users install the Mersenne Twister, then they can test this generator with the SPRNG test suite and also use some SPRNG features with that generator.

  5. Building the Next Generation of Parallel Applications: Co-Design...

    Office of Scientific and Technical Information (OSTI)

    Applications: Co-Design Opportunities and Challenges. Citation Details In-Document Search Title: Building the Next Generation of Parallel Applications: Co-Design Opportunities and ...

  6. Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

    SciTech Connect (OSTI)

    Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

    1997-03-01

    Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two-dimensional and three-dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.

  7. Generating unstructured nuclear reactor core meshes in parallel

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Jain, Rajeev; Tautges, Timothy J.

    2014-10-24

    Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on standalone desktop computers because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Japanese MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.

  8. Generating unstructured nuclear reactor core meshes in parallel

    SciTech Connect (OSTI)

    Jain, Rajeev; Tautges, Timothy J.

    2014-10-24

    Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on standalone desktop computers because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Japanese MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.

  9. Asynchronous parallel generating set search for linearly-constrained optimization.

    SciTech Connect (OSTI)

    Lewis, Robert Michael; Griffin, Joshua D.; Kolda, Tamara Gibson

    2006-08-01

    Generating set search (GSS) is a family of direct search methods that encompasses generalized pattern search and related methods. We describe an algorithm for asynchronous linearly-constrained GSS, which has some complexities that make it different from both the asynchronous bound-constrained case as well as the synchronous linearly-constrained case. The algorithm has been implemented in the APPSPACK software framework and we present results from an extensive numerical study using CUTEr test problems. We discuss the results, both positive and negative, and conclude that GSS is a reliable method for solving small-to-medium sized linearly-constrained optimization problems without derivatives.

  10. Generation of quasi-monoenergetic carbon ions accelerated parallel to the plane of a sandwich target

    SciTech Connect (OSTI)

    Wang, J. W.; Murakami, M.; Weng, S. M.; Xu, H.; Ju, J. J.; Luan, S. X.; Yu, W.

    2014-12-15

    A new ion acceleration scheme, namely, target parallel Coulomb acceleration, is proposed in which a carbon plate sandwiched between gold layers is irradiated with intense linearly polarized laser pulses. The high electrostatic field generated by the gold ions efficiently accelerates the embedded carbon ions parallel to the plane of the target. The ion beam is found to be collimated by the concave-shaped Coulomb potential. As a result, a quasi-monoenergetic and collimated C{sup 6+}-ion beam with an energy exceeding 10 MeV/nucleon is produced at a laser intensity of 5 × 10{sup 19} W/cm{sup 2}.

  11. Bit error rate tester using fast parallel generation of linear recurring sequences

    DOE Patents [OSTI]

    Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.

    2003-05-06

    A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
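
    As a rough illustration of the companion-matrix idea in this abstract (not the patented circuit), the sketch below builds a companion matrix over GF(2) for the primitive polynomial x^4 + x + 1 and checks that raising it to a power advances the register by that many steps in a single multiplication. The polynomial, the bit ordering of the state, the function names, and the values n = 2 and k = 3 are all assumptions chosen for this example.

```python
def companion_matrix(coeffs):
    """Companion matrix over GF(2) for the recurrence
    a[t+d] = sum(coeffs[i] * a[t+i]) mod 2, acting on the state
    [a[t], ..., a[t+d-1]] (this state ordering is an assumption)."""
    d = len(coeffs)
    rows = [[int(j == i + 1) for j in range(d)] for i in range(d - 1)]
    rows.append(list(coeffs))  # last row computes the new feedback bit
    return rows


def mat_mul(a, b):
    """Multiply two square 0/1 matrices modulo 2."""
    size = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(size)) & 1
             for j in range(size)] for i in range(size)]


def mat_pow(m, exponent):
    """Raise a square 0/1 matrix to a non-negative power modulo 2."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = m
    while exponent:
        if exponent & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    return result


def mat_vec(m, v):
    """Apply a 0/1 matrix to a 0/1 state vector modulo 2."""
    return [sum(x & y for x, y in zip(row, v)) & 1 for row in m]


# x^4 + x + 1 gives a[t+4] = a[t] + a[t+1] (mod 2), i.e. coeffs = [1, 1, 0, 0].
C = companion_matrix([1, 1, 0, 0])
n_bits, k_generators = 2, 3                 # illustrative values only
D = mat_pow(C, n_bits * k_generators)       # "decimation" matrix C^(n*k)

state = [1, 0, 0, 0]
stepped = state
for _ in range(n_bits * k_generators):      # advance one step at a time
    stepped = mat_vec(C, stepped)
jumped = mat_vec(D, state)                  # advance n*k steps at once
```

    Here `jumped` and `stepped` hold the same state, which is the property that lets each generator skip ahead by n*k positions per update. How sparse the power of the matrix turns out to be depends on the polynomial and matrix form chosen, which this sketch does not attempt to optimize.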

  12. Feasibility Study of Biomass Electrical Generation on Tribal Lands

    SciTech Connect (OSTI)

    Tom Roche; Richard Hartmann; Joohn Luton; Warren Hudelson; Roger Blomguist; Jan Hacker; Colene Frye

    2005-03-29

    The goals of the St. Croix Tribe are to develop economically viable energy production facilities using readily available renewable biomass fuel sources at an acceptable cost per kilowatt hour ($/kWh), to provide new and meaningful permanent employment, retain and expand existing employment (logging) and provide revenues for both producers and sellers of the finished product. This is a feasibility study including an assessment of available biomass fuel, technology assessment, site selection, economic viability given the foreseeable fuel and generation costs, as well as an assessment of the potential markets for renewable energy.

  13. Experimental and cost analyses of a one kilowatt-hour/day domestic refrigerator-freezer

    SciTech Connect (OSTI)

    Vineyard, E.A.; Sand, J.R.

    1997-05-01

    Over the past ten years, government regulations for energy standards, coupled with the utility industry's promotion of energy-efficient appliances, have prompted appliance manufacturers to reduce energy consumption in refrigerator-freezers by approximately 40%. Global concerns over ozone depletion have also required the appliance industry to eliminate CFC-12 and CFC-11 while concurrently improving energy efficiency to reduce greenhouse emissions. In response to expected future regulations that will be more stringent, several design options were investigated for improving the energy efficiency of a conventionally designed, domestic refrigerator-freezer. The options, such as cabinet and door insulation improvements and a high-efficiency compressor, were incorporated into a prototype refrigerator-freezer cabinet and refrigeration system. Baseline energy consumption of the original 1996 production refrigerator-freezer, along with cabinet heat load and compressor calorimeter test results, was extensively documented to provide a firm basis for experimentally measured energy savings. The goal for the project was to achieve an energy consumption that is 50% below the 1993 National Appliance Energy Conservation Act (NAECA) standard for 20 ft{sup 3} (570 l) units. Based on discussions with manufacturers to determine the most promising energy-saving options, a laboratory prototype was fabricated and tested to experimentally verify the energy consumption of a unit with vacuum insulation around the freezer, increased door thicknesses, a high-efficiency compressor, a low wattage condenser fan, a larger counterflow evaporator, and adaptive defrost control.

  14. Fridge of the future: Designing a one-kilowatt-hour/day domestic refrigerator-freezer

    SciTech Connect (OSTI)

    Vineyard, E.A.; Sand, J.R.

    1998-03-01

    An industry/government Cooperative Research and Development Agreement (CRADA) was established to evaluate and test design concepts for a domestic refrigerator-freezer unit that represents approximately 60% of the US market. The goal of the CRADA was to demonstrate advanced technologies which reduce, by 50 percent, the 1993 NAECA standard energy consumption for a 20 ft{sup 3} (570 l) top-mount, automatic-defrost, refrigerator-freezer. For a unit this size, the goal translated to an energy consumption of 1.003 kWh/d. The general objective of the research was to facilitate the introduction of cost-efficient technologies by demonstrating design changes that can be effectively incorporated into new products. A 1996 model refrigerator-freezer was selected as the baseline unit for testing. Since the unit was required to meet the 1993 NAECA standards, the energy consumption was quite low (1.676 kWh/d), thus making further reductions in energy consumption very challenging. Among the energy saving features incorporated into the original design of the baseline unit were a low-wattage evaporator fan, increased insulation thicknesses, and liquid line flange heaters.
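
    The two figures quoted in this abstract imply a few further numbers, worked out in the short sketch below. Only the 1.003 kWh/d goal, the 1.676 kWh/d baseline, and the 50 percent target come from the abstract; the function names and the derived values are illustrative arithmetic, not results reported by the authors.

```python
GOAL_KWH_PER_DAY = 1.003      # from the abstract: 50% below the 1993 NAECA standard
BASELINE_KWH_PER_DAY = 1.676  # from the abstract: 1996 baseline unit
TARGET_REDUCTION = 0.50       # from the abstract


def implied_standard(goal=GOAL_KWH_PER_DAY, reduction=TARGET_REDUCTION):
    """Standard consumption implied by the goal and the reduction target."""
    return goal / (1.0 - reduction)


def fraction_below(value, reference):
    """Fraction by which `value` sits below `reference`."""
    return 1.0 - value / reference


standard = implied_standard()                                     # about 2.006 kWh/d
baseline_margin = fraction_below(BASELINE_KWH_PER_DAY, standard)  # about 0.16
remaining_cut = fraction_below(GOAL_KWH_PER_DAY, BASELINE_KWH_PER_DAY)  # about 0.40
```

    Under these assumptions the baseline unit already sits roughly 16% below the implied standard, so reaching the goal requires cutting the baseline's consumption by a further 40% or so.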

  15. Hydropower generation management under uncertainty via scenario analysis and parallel computation

    SciTech Connect (OSTI)

    Escudero, L.F.; Garcia, C.; Fuente, J.L. de la; Prieto, F.J.

    1996-05-01

    The authors present a modeling framework for the robust solution of hydroelectric power management problems with uncertainty in the values of the water inflows and outflows. A deterministic treatment of the problem provides unsatisfactory results, except for very short time horizons. The authors describe a model based on scenario analysis that allows a satisfactory treatment of uncertainty in the model data for medium and long-term planning problems. Their approach results in a huge model with a network submodel per scenario plus coupling constraints. The size of the problem and the structure of the constraints are adequate for the use of decomposition techniques and parallel computation tools. The authors present computational results for both sequential and parallel implementation versions of the codes, running on a cluster of workstations. The codes have been tested on data obtained from the reservoir network of Iberdrola, a power utility owning 50% of the total installed hydroelectric capacity of Spain, and generating 40% of the total energy demand.

  16. Hydropower generation management under uncertainty via scenario analysis and parallel computation

    SciTech Connect (OSTI)

    Escudero, L.F.; Garcia, C.; Fuente, J.L. de la; Prieto, F.J.

    1995-12-31

    The authors present a modeling framework for the robust solution of hydroelectric power management problems with uncertainty in the values of the water inflows and outflows. A deterministic treatment of the problem provides unsatisfactory results, except for very short time horizons. The authors describe a model based on scenario analysis that allows a satisfactory treatment of uncertainty in the model data for medium and long-term planning problems. This approach results in a huge model with a network submodel per scenario plus coupling constraints. The size of the problem and the structure of the constraints are adequate for the use of decomposition techniques and parallel computation tools. The authors present computational results for both sequential and parallel implementation versions of the codes, running on a cluster of workstations. The codes have been tested on data obtained from the reservoir network of Iberdrola, a power utility owning 50% of the total installed hydroelectric capacity of Spain, and generating 40% of the total energy demand.

  17. Parallel Application Performance on Two Generations of Intel Xeon HPC Platforms

    SciTech Connect (OSTI)

    Chang, Christopher H.; Long, Hai; Sides, Scott; Vaidhynathan, Deepthi; Jones, Wesley

    2015-10-15

    Two next-generation node configurations hosting the Haswell microarchitecture were tested with a suite of microbenchmarks and application examples, and compared with a current Ivy Bridge production node on NREL's Peregrine high-performance computing cluster. A primary conclusion from this study is that the additional cores are of little value to individual task performance--limitations to application parallelism, or resource contention among concurrently running but independent tasks, limits effective utilization of these added cores. Hyperthreading generally impacts throughput negatively, but can improve performance in the absence of detailed attention to runtime workflow configuration. The observations offer some guidance to procurement of future HPC systems at NREL. First, raw core count must be balanced with available resources, particularly memory bandwidth. Balance-of-system will determine value more than processor capability alone. Second, hyperthreading continues to be largely irrelevant to the workloads that are commonly seen, and were tested here, at NREL. Finally, perhaps the most impactful enhancement to productivity might occur through enabling multiple concurrent jobs per node. Given the right type and size of workload, more may be achieved by doing many slow things at once, than fast things in order.

  18. Parallel octree-based hexahedral mesh generation for eulerian to lagrangian conversion.

    SciTech Connect (OSTI)

    Staten, Matthew L.; Owen, Steven James

    2010-09-01

    Computational simulation must often be performed on domains where materials are represented as scalar quantities or volume fractions at cell centers of an octree-based grid. Common examples include bio-medical, geotechnical or shock physics calculations where interface boundaries are represented only as discrete statistical approximations. In this work, we introduce new methods for generating Lagrangian computational meshes from Eulerian-based data. We focus specifically on shock physics problems that are relevant to ASC codes such as CTH and Alegra. New procedures for generating all-hexahedral finite element meshes from volume fraction data are introduced. A new primal-contouring approach is introduced for defining a geometric domain. New methods for refinement, node smoothing, resolving non-manifold conditions and defining geometry are also introduced as well as an extension of the algorithm to handle tetrahedral meshes. We also describe new scalable MPI-based implementations of these procedures. We describe a new software module, Sculptor, which has been developed for use as an embedded component of CTH. We also describe its interface and its use within the mesh generation code, CUBIT. Several examples are shown to illustrate the capabilities of Sculptor.

  19. Property:PotentialHydropowerGeneration | Open Energy Information

    Open Energy Info (EERE)

    for a particular place. Use this type to express a quantity of energy. The default unit for energy on OpenEI is the Kilowatt hour (kWh), which is 3,600,000 Joules. http:...
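
    The conversion stated in this snippet (1 kWh = 3,600,000 J, that is, 1,000 W sustained for 3,600 s) can be written as a small helper. The function names below are hypothetical and are not part of OpenEI.

```python
JOULES_PER_KWH = 1_000 * 3_600  # 1 kW for one hour = 3,600,000 J


def kwh_to_joules(kwh):
    """Convert an energy quantity from kilowatt hours to joules."""
    return kwh * JOULES_PER_KWH


def joules_to_kwh(joules):
    """Convert an energy quantity from joules to kilowatt hours."""
    return joules / JOULES_PER_KWH
```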

  20. Property:PotentialOnshoreWindGeneration | Open Energy Information

    Open Energy Info (EERE)

    onshore wind in a place. Use this type to express a quantity of energy. The default unit for energy on OpenEI is the Kilowatt hour (kWh), which is 3,600,000 Joules. http:...

  1. Property:PotentialBiopowerSolidGeneration | Open Energy Information

    Open Energy Info (EERE)

    for a particular place. Use this type to express a quantity of energy. The default unit for energy on OpenEI is the Kilowatt hour (kWh), which is 3,600,000 Joules. http:...

  2. Localized parallel parametric generation of spin waves in a Ni{sub 81}Fe{sub 19} waveguide by spatial variation of the pumping field

    SciTech Connect (OSTI)

    Brächer, T.; Pirro, P.; Heussner, F.; Serga, A. A.; Hillebrands, B.

    2014-03-03

    We present the experimental observation of localized parallel parametric generation of spin waves in a transversally in-plane magnetized Ni{sub 81}Fe{sub 19} magnonic waveguide. The localization is realized by combining the threshold character of parametric generation with a spatially confined enhancement of the amplifying microwave field. The latter is achieved by modulating the width of the microstrip transmission line which is used to provide the pumping field. By employing microfocussed Brillouin light scattering spectroscopy, we analyze the spatial distribution of the generated spin waves and compare it with numerical calculations of the field distribution along the Ni{sub 81}Fe{sub 19} waveguide. This provides a local spin-wave excitation in transversally in-plane magnetized waveguides for a wide wave-vector range which is not restricted by the size of the generation area.

  3. EERE Success Story—Nevada: Geothermal Brine Brings Low-Cost Power with Big Potential

    Broader source: Energy.gov [DOE]

    Utilizing EERE funds, ElectraTherm developed a geothermal technology that will generate electricity for less than $0.06 per kilowatt hour.

  4. Nevada: Geothermal Brine Brings Low-Cost Power with Big Potential

    Broader source: Energy.gov [DOE]

    Utilizing EERE funds, ElectraTherm developed a geothermal technology that will generate electricity for less than $0.06 per kilowatt hour.

  5. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Recommended Reading & Resources Parallel Computing Summer Research Internship Creates next-generation leaders in HPC research and applications development Contacts Program Co-Lead ...

  6. TRANSIMS Parallelization

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    TRANSIMS Parallelization Contacts: TRACC Director; Associate Computational Transportation Engineer. Background: TRANSIMS was originally developed by Los Alamos National Laboratory to run exclusively on a Linux cluster environment. In this initial version, the only parallelized component was the microsimulator. It worked

  7. Parallel Batch Scripts

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Batch Scripts. Parallel Environments on Genepool: You can run parallel jobs that use MPI or OpenMP on Genepool as long as you make the appropriate changes to your submission script! To investigate the parallel environments that are available on Genepool, you can use:

      Command              Description
      qconf -sp <pename>   Show the configuration for the specified parallel environment.
      qconf -spl           Show a list of all currently configured parallel environments.

    Basic Parallel

  8. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Recommended Reading & Resources Parallel Computing Summer Research Internship Creates next-generation leaders in HPC research and applications development Contacts Program Co-Lead Robert (Bob) Robey Email Program Co-Lead Gabriel Rockefeller Email Program Co-Lead Hai Ah Nam Email Professional Staff Assistant Nickole Aguilar Garcia (505) 665-3048 Email Recommended Reading & References The Parallel Computing Summer Research Internship covers a broad range of topics that you may not have

  9. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Computing Summer Research Internship Creates next-generation leaders in HPC research and applications development Contacts Program Co-Lead Robert (Bob) Robey Email Program Co-Lead Gabriel Rockefeller Email Program Co-Lead Hai Ah Nam Email Professional Staff

  10. Special parallel processing workshop

    SciTech Connect (OSTI)

    1994-12-01

    This report contains viewgraphs from the Special Parallel Processing Workshop. These viewgraphs deal with topics such as parallel processing performance, message passing, queue structure, and other basic concepts dealing with parallel processing.

  11. Parallel Python GDB

    Energy Science and Technology Software Center (OSTI)

    2012-08-05

    PGDB is a lightweight parallel debugger software product. It utilizes the open source gdb debugger inside of a parallel Python framework.

  12. Super Bowl of Energy: Solar Smashes Records | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    MetLife Stadium, the site of yesterday's Super Bowl, features a ring of 1,350 solar panels that can generate 350,000 kilowatt hours of electricity annually. The number of ...

  13. EERE Success Story-Nevada: Geothermal Brine Brings Low-Cost Power...

    Broader source: Energy.gov (indexed) [DOE]

    Utilizing a $1 million EERE investment, heat from geothermal fluids, a byproduct of gold mining, will be generating electricity this year for less than $0.06 per kilowatt hour with ...

  14. Halfway There But Far From Done: SunShot Surges Ahead on Path...

    Office of Environmental Management (EM)

    the SunShot goals by the end of the decade, with ... kilowatt hours of solar electricity may be more valuable ... energy generation and consumption on the grid, will ...

  15. Net Metering

    Broader source: Energy.gov [DOE]

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At the end of a 12-month...

  16. Net Metering

    Broader source: Energy.gov [DOE]

    Net excess generation (NEG) is treated as a kilowatt-hour (kWh) credit or other compensation on the customer's following bill.* At the beginning of the calendar year, a utility will purchase any...

  17. Tax Credits, Rebates & Savings | Department of Energy

    Broader source: Energy.gov (indexed) [DOE]

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At the end of...

  18. Tax Credits, Rebates & Savings | Department of Energy

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    Metering Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At...

  19. Tax Credits, Rebates & Savings | Department of Energy

    Broader source: Energy.gov (indexed) [DOE]

    Customer net excess generation (NEG) is carried forward at the utility's retail rate (i.e., as a kilowatt-hour credit) to a customer's next bill for up to 12 months. At the...

  20. Life Cycle Greenhouse Gas Emissions of Coal-Fired Electricity Generation: Systematic Review and Harmonization

    SciTech Connect (OSTI)

    Whitaker, M.; Heath, G. A.; O'Donoughue, P.; Vorum, M.

    2012-04-01

    This systematic review and harmonization of life cycle assessments (LCAs) of utility-scale coal-fired electricity generation systems focuses on reducing variability and clarifying central tendencies in estimates of life cycle greenhouse gas (GHG) emissions. Screening 270 references for quality LCA methods, transparency, and completeness yielded 53 that reported 164 estimates of life cycle GHG emissions. These estimates for subcritical pulverized, integrated gasification combined cycle, fluidized bed, and supercritical pulverized coal combustion technologies vary from 675 to 1,689 grams CO{sub 2}-equivalent per kilowatt-hour (g CO{sub 2}-eq/kWh) (interquartile range [IQR] = 890-1,130 g CO{sub 2}-eq/kWh; median = 1,001), leading to confusion over reasonable estimates of life cycle GHG emissions from coal-fired electricity generation. By adjusting published estimates to common gross system boundaries and consistent values for key operational input parameters (most importantly, combustion carbon dioxide emission factor [CEF]), the meta-analytical process called harmonization clarifies the existing literature in ways useful for decision makers and analysts by significantly reducing the variability of estimates (approximately 53% in IQR magnitude) while maintaining a nearly constant central tendency (approximately 2.2% in median). Life cycle GHG emissions of a specific power plant depend on many factors and can differ from the generic estimates generated by the harmonization approach, but the tightness of distribution of harmonized estimates across several key coal combustion technologies implies, for some purposes, first-order estimates of life cycle GHG emissions could be based on knowledge of the technology type, coal mine emissions, thermal efficiency, and CEF alone without requiring full LCAs. Areas where new research is necessary to ensure accuracy are also discussed.

  1. Applications of Parallel Computers

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Computers Applications of Parallel Computers UCB CS267 Spring 2015 Tuesday & Thursday, 9:30-11:00 Pacific Time Applications of Parallel Computers, CS267, is a graduate-level course...

  2. Parallel flow diffusion battery

    DOE Patents [OSTI]

    Yeh, H.C.; Cheng, Y.S.

    1984-01-01

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  3. Parallel flow diffusion battery

    DOE Patents [OSTI]

    Yeh, Hsu-Chi; Cheng, Yung-Sung

    1984-08-07

    A parallel flow diffusion battery for determining the mass distribution of an aerosol has a plurality of diffusion cells mounted in parallel to an aerosol stream, each diffusion cell including a stack of mesh wire screens of different density.

  4. PISTON (Portable Data Parallel Visualization and Analysis)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    in a data-parallel way. By using nVidia's freely downloadable Thrust library and our own tools, we can generate executable codes for different acceleration hardware architectures...

  5. Parallel Atomistic Simulations

    SciTech Connect (OSTI)

    HEFFELFINGER,GRANT S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed: the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories: those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed, and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains, are discussed.
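
    The spatial decomposition named above can be sketched as follows. This is a minimal illustration, not code from the reviewed work: the cubic box, the regular grid of processor regions, and the names `spatial_decomposition`, `box_length`, and `procs_per_dim` are all assumptions made for the example.

```python
from collections import defaultdict


def spatial_decomposition(positions, box_length, procs_per_dim):
    """Map each atom to the processor rank that owns its region of a cubic box.

    Hypothetical helper: the box is cut into procs_per_dim**3 equal sub-boxes,
    and each sub-box is owned by one rank.
    """
    cell = box_length / procs_per_dim
    owned = defaultdict(list)
    for atom_id, coords in enumerate(positions):
        # Clamp the index so atoms lying exactly on the upper face stay in range.
        ix, iy, iz = (min(int(c / cell), procs_per_dim - 1) for c in coords)
        rank = (ix * procs_per_dim + iy) * procs_per_dim + iz
        owned[rank].append(atom_id)
    return dict(owned)


atoms = [(0.5, 0.5, 0.5), (9.5, 0.5, 0.5), (0.5, 9.5, 9.5), (9.9, 9.9, 9.9)]
assignment = spatial_decomposition(atoms, box_length=10.0, procs_per_dim=2)
```

    In a replicated data decomposition, by contrast, every processor would hold all atom positions and only the work of computing forces would be divided.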

  6. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Mentors Parallel Computing Summer Research Internship Creates next-generation leaders in HPC research and applications development Contacts Program Co-Lead Robert (Bob) Robey Email Program Co-Lead Gabriel Rockefeller Email Program Co-Lead Hai Ah Nam Email Professional Staff Assistant Nickole Aguilar Garcia (505) 665-3048 Email 2016: Mentors Bob Robey Bob Robey XCP-2: EULERIAN CODES Bob Robey is a Research Scientist in the Eulerian Applications group at Los Alamos National Laboratory. He is the

  7. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Students Parallel Computing Summer Research Internship Creates next-generation leaders in HPC research and applications development Contacts Program Co-Lead Robert (Bob) Robey Email Program Co-Lead Gabriel Rockefeller Email Program Co-Lead Hai Ah Nam Email Professional Staff Assistant Nickole Aguilar Garcia (505) 665-3048 Email 2016: Students Peter Ahrens Peter Ahrens Electrical Engineering & Computer Science BS UC Berkeley Jenniffer Estrada Jenniffer Estrada Computer Science MS Youngstown

  8. Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Guide to Los Alamos Parallel Computing Summer Research Internship Creates next-generation leaders in HPC research and applications development Contacts Program Co-Lead Robert (Bob) Robey Email Program Co-Lead Gabriel Rockefeller Email Program Co-Lead Hai Ah Nam Email Professional Staff Assistant Nickole Aguilar Garcia (505) 665-3048 Email Guide to Los Alamos During your 10-week internship, we hope you have the opportunity to explore and enjoy Los Alamos and the surrounding area. Here are some

  9. Parallel integrated thermal management

    DOE Patents [OSTI]

    Bennion, Kevin; Thornton, Matthew

    2014-08-19

    Embodiments discussed herein are directed to managing the heat content of two vehicle subsystems through a single coolant loop having parallel branches for each subsystem.

  10. Optimize Parallel Pumping Systems

    Broader source: Energy.gov [DOE]

    This tip sheet describes how to optimize the performance of multiple pumps operating continuously as part of a parallel pumping system.

  11. Life Cycle Greenhouse Gas Emissions of Nuclear Electricity Generation: Systematic Review and Harmonization

    SciTech Connect (OSTI)

    Warner, E. S.; Heath, G. A.

    2012-04-01

    A systematic review and harmonization of life cycle assessment (LCA) literature of nuclear electricity generation technologies was performed to determine causes of and, where possible, reduce variability in estimates of life cycle greenhouse gas (GHG) emissions to clarify the state of knowledge and inform decision making. LCA literature indicates that life cycle GHG emissions from nuclear power are a fraction of traditional fossil sources, but the conditions and assumptions under which nuclear power is deployed can have a significant impact on the magnitude of life cycle GHG emissions relative to renewable technologies. Screening 274 references yielded 27 that reported 99 independent estimates of life cycle GHG emissions from light water reactors (LWRs). The published median, interquartile range (IQR), and range for the pool of LWR life cycle GHG emission estimates were 13, 23, and 220 grams of carbon dioxide equivalent per kilowatt-hour (g CO{sub 2}-eq/kWh), respectively. After harmonizing methods to use consistent gross system boundaries and values for several important system parameters, the same statistics were 12, 17, and 110 g CO{sub 2}-eq/kWh, respectively. Harmonization (especially of performance characteristics) clarifies the estimation of central tendency and variability. To explain the remaining variability, several additional, highly influential consequential factors were examined using other methods. These factors included the primary source energy mix, uranium ore grade, and the selected LCA method. For example, a scenario analysis of future global nuclear development examined the effects of a decreasing global uranium market-average ore grade on life cycle GHG emissions. Depending on conditions, median life cycle GHG emissions could be 9 to 110 g CO{sub 2}-eq/kWh by 2050.
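
    The effect of harmonization on the statistics quoted above can be checked with a few lines of arithmetic. The sketch uses only the six numbers given in the abstract; the dictionary and function names are illustrative.

```python
# Published and harmonized statistics from the abstract, in g CO2-eq/kWh.
published = {"median": 13, "IQR": 23, "range": 220}
harmonized = {"median": 12, "IQR": 17, "range": 110}


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


reductions = {
    name: round(percent_reduction(published[name], harmonized[name]), 1)
    for name in published
}
# reductions == {"median": 7.7, "IQR": 26.1, "range": 50.0}
```

    On these figures, harmonization narrows the spread of the estimates (IQR and range) far more than it shifts the median.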

  12. Eclipse Parallel Tools Platform

    Energy Science and Technology Software Center (OSTI)

    2005-02-18

    Designing and developing parallel programs is an inherently complex task. Developers must choose from the many parallel architectures and programming paradigms that are available, and face a plethora of tools that are required to execute, debug, and analyze parallel programs in these environments. Few, if any, of these tools provide any degree of integration, or indeed any commonality in their user interfaces at all. This further complicates the parallel developer's task, hampering software engineering practices, and ultimately reducing productivity. One consequence of this complexity is that best practice in parallel application development has not advanced to the same degree as more traditional programming methodologies. The result is that there is currently no open-source, industry-strength platform that provides a highly integrated environment specifically designed for parallel application development. Eclipse is a universal tool-hosting platform that is designed to provide a robust, full-featured, commercial-quality, industry platform for the development of highly integrated tools. It provides a wide range of core services for tool integration that allow tool producers to concentrate on their tool technology rather than on platform specific issues. The Eclipse Integrated Development Environment is an open-source project that is supported by over 70 organizations, including IBM, Intel and HP. The Eclipse Parallel Tools Platform (PTP) plug-in extends the Eclipse framework by providing support for a rich set of parallel programming languages and paradigms, and a core infrastructure for the integration of a wide variety of parallel tools. The first version of the PTP is a prototype that only provides minimal functionality for parallel tool integration, support for a small number of parallel architectures

  13. Parallel computing works

    SciTech Connect (OSTI)

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  14. Parallel programming with PCN

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  15. UPC (Unified Parallel C)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    UPC (Unified Parallel C) Description Unified Parallel C is a partitioned global address space (PGAS) language and an extension of the C programming language. Availability UPC is available on Edison and Hopper via both the Cray compilers, as well as through Berkeley UPC, a portable high-performance UPC compiler and runtime implementation. Using UPC To compile a UPC source file using the Cray compilers, you must first swap the Cray compiler with the default compiler. On Hopper: % module swap

  16. Parallel phase model : a programming model for high-end parallel machines with manycores.

    SciTech Connect (OSTI)

    Wu, Junfeng; Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

    2009-04-01

    This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

  17. Parallel time integration software

    Energy Science and Technology Software Center (OSTI)

    2014-07-01

    This package implements an optimal-scaling multigrid solver for the (non)linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techniques is limited to spatial parallelism. However, current trends in computer architectures are leading towards systems with more, but not faster, processors. Therefore, faster compute speeds must come from greater parallelism. One approach to achieve parallelism in time is with multigrid, but extending classical multigrid methods for elliptic operators to this setting is a significant achievement. In this software, we implement a non-intrusive, optimal-scaling time-parallel method based on multigrid reduction techniques. The examples in the package demonstrate optimality of our multigrid-reduction-in-time algorithm (MGRIT) for solving a variety of parabolic equations in two and three spatial dimensions. These examples can also be used to show that MGRIT can achieve significant speedup in comparison to sequential time marching on modern architectures.
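
    As a rough illustration of parallelism in time, the sketch below implements parareal, a simpler two-level relative of multigrid-reduction-in-time. It is not the package's MGRIT algorithm or its API. The scalar test problem u' = -lam * u, the backward Euler propagators, and every function name are assumptions chosen for the example.

```python
def fine(u, dt, lam, m):
    """Advance u across one coarse interval with m small backward Euler steps."""
    for _ in range(m):
        u = u / (1.0 + lam * dt / m)
    return u


def coarse(u, dt, lam):
    """Advance u across one coarse interval with a single backward Euler step."""
    return u / (1.0 + lam * dt)


def serial_fine(u0, lam, T, n, m):
    """Reference: sequential time marching with the fine propagator."""
    dt = T / n
    U = [u0]
    for k in range(n):
        U.append(fine(U[k], dt, lam, m))
    return U


def parareal(u0, lam, T, n, m, iters):
    """Parareal iteration over n coarse intervals of [0, T]."""
    dt = T / n
    # Initial guess: cheap sequential sweep with the coarse propagator.
    U = [u0]
    for k in range(n):
        U.append(coarse(U[k], dt, lam))
    for _ in range(iters):
        # The fine solves are independent per interval, so they could run
        # concurrently; they are evaluated one after another here.
        F = [fine(U[k], dt, lam, m) for k in range(n)]
        G_old = [coarse(U[k], dt, lam) for k in range(n)]
        new = [u0]
        for k in range(n):
            # Sequential coarse sweep plus the fine-minus-coarse correction.
            new.append(coarse(new[k], dt, lam) + F[k] - G_old[k])
        U = new
    return U
```

    With as many iterations as coarse intervals, the parareal result agrees with sequential fine time marching up to rounding error; any speedup comes from stopping after far fewer iterations while the fine solves run in parallel.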

  18. Parallel optical sampler

    DOE Patents [OSTI]

    Tauke-Pedretti, Anna; Skogen, Erik J; Vawter, Gregory A

    2014-05-20

    An optical sampler includes a first and second 1×n optical beam splitters splitting an input optical sampling signal and an optical analog input signal into n parallel channels, respectively, a plurality of optical delay elements providing n parallel delayed input optical sampling signals, n photodiodes converting the n parallel optical analog input signals into n respective electrical output signals, and n optical modulators modulating the input optical sampling signal or the optical analog input signal by the respective electrical output signals, and providing n successive optical samples of the optical analog input signal. A plurality of output photodiodes and eADCs convert the n successive optical samples to n successive digital samples. The optical modulator may be a photodiode interconnected Mach-Zehnder Modulator. A method of sampling the optical analog input signal is disclosed.

  19. Parallel programming with Ada

    SciTech Connect (OSTI)

    Kok, J.

    1988-01-01

    To the human programmer the ease of coding distributed computing is highly dependent on the suitability of the employed programming language. But with a particular language it is also important whether the possibilities of one or more parallel architectures can efficiently be addressed by available language constructs. In this paper the possibilities are discussed of the high-level language Ada and in particular of its tasking concept as a descriptional tool for the design and implementation of numerical and other algorithms that allow execution of parts in parallel. Language tools are explained and their use for common applications is shown. Conclusions are drawn about the usefulness of several Ada concepts.

  20. Parallel Multigrid Equation Solver

    Energy Science and Technology Software Center (OSTI)

    2001-09-07

    Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved problems of up to 76 million degrees of freedom, problems in linear elasticity on the ASCI Blue Pacific and ASCI Red machines.

  1. Parallel programming with PCN

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  2. Parallel Dislocation Simulator

    Energy Science and Technology Software Center (OSTI)

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  3. Parallel Total Energy

    Energy Science and Technology Software Center (OSTI)

    2004-10-21

    This is a total energy electronic structure code using the Local Density Approximation (LDA) of density functional theory. It uses plane waves as the wave function basis set. It can use both norm-conserving pseudopotentials and ultrasoft pseudopotentials. It can relax the atomic positions according to the total energy. It is a parallel code using MPI.

  4. Ultrascalable petaflop parallel supercomputer

    DOE Patents [OSTI]

    Blumrich, Matthias A.; Chen, Dong; Chiu, George; Cipolla, Thomas M.; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Hall, Shawn; Haring, Rudolf A.; Heidelberger, Philip; Kopcsay, Gerard V.; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  5. Parallel grid population

    DOE Patents [OSTI]

    Wald, Ingo; Ize, Santiago

    2015-07-28

    Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
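
    The two-phase method described above can be sketched as follows. This is a minimal one-dimensional illustration under stated assumptions, not the patented implementation: objects are intervals (lo, hi), the grid is cut into n equal slabs along one axis, threads stand in for processors, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def overlapped_portions(lo, hi, n, width):
    """Indices of the grid portions that at least partially bound [lo, hi]."""
    first = max(0, min(n - 1, int(lo // width)))
    last = max(0, min(n - 1, int(hi // width)))
    return range(first, last + 1)


def populate_grid(objects, n, extent):
    """Populate n grid portions covering [0, extent] using n workers."""
    width = extent / n
    indexed = list(enumerate(objects))
    # Divide the objects into n distinct sets, one per worker.
    object_sets = [indexed[p::n] for p in range(n)]

    def classify(object_set):
        # Phase 1: each worker finds the portions bounding its own objects.
        return [(portion, oid)
                for oid, (lo, hi) in object_set
                for portion in overlapped_portions(lo, hi, n, width)]

    def fill(portion, classified):
        # Phase 2: each worker gathers the objects classified for its portion.
        return sorted(oid for pairs in classified
                      for p, oid in pairs if p == portion)

    with ThreadPoolExecutor(max_workers=n) as pool:
        classified = list(pool.map(classify, object_sets))
        grid = list(pool.map(lambda p: fill(p, classified), range(n)))
    return grid


intervals = [(0.5, 1.0), (1.5, 4.5), (6.0, 7.5), (3.0, 3.5)]
grid = populate_grid(intervals, n=4, extent=8.0)
# grid == [[0, 1], [1, 3], [1], [2]]
```

    Because each worker writes only to its own grid portion in the second phase, no two workers update the same portion, which is the property the division into distinct portions is meant to provide.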

  6. Hybrid Optimization Parallel Search PACKage

    Energy Science and Technology Software Center (OSTI)

    2009-11-10

    HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.
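
    The ideas of derivative-free search, parallel function evaluation, and a cache of saved evaluations can be illustrated with a small compass search, one of the simplest generating set search methods. The sketch is not HOPSPACK code and does not use its API; the function `compass_search`, its parameters, and the plain dictionary used as a cache are assumptions made for the example, and the Pending Cache is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=1000):
    """Minimize f without derivatives by polling the +/- coordinate directions."""
    cache = {}

    def evaluate(point):
        # Reuse a saved evaluation when the same point is requested again.
        if point not in cache:
            cache[point] = f(point)
        return cache[point]

    x = tuple(float(v) for v in x0)
    fx = evaluate(x)
    with ThreadPoolExecutor() as pool:
        for _ in range(max_iter):
            if step < tol:
                break
            trials = []
            for i in range(len(x)):
                for sign in (1.0, -1.0):
                    t = list(x)
                    t[i] += sign * step
                    trials.append(tuple(t))
            # Trial points are independent, so they are evaluated concurrently.
            values = list(pool.map(evaluate, trials))
            best_value, best_point = min(zip(values, trials))
            if best_value < fx:
                x, fx = best_point, best_value
            else:
                step *= 0.5  # no improvement: contract the step
    return x, fx, len(cache)


def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, value, n_evals = compass_search(bowl, (0, 0))
```

    Threads are used here only to keep the example self-contained; the abstract states that the actual framework may use MPI or multithreading.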

  7. Xyce parallel electronic simulator.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users' Guide. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users' Guide.

  8. Exploiting Network Parallelism

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Exploiting Network Parallelism for Improving Data Transfer Performance Dan Gunter ∗ , Raj Kettimuthu † , Ezra Kissel ‡ , Martin Swany ‡ , Jun Yi § , Jason Zurawski ¶ ∗ Advanced Computing for Science Department, Lawrence Berkeley National Laboratory, Berkeley, CA † Mathematics and Computer Science Division, Argonne National Laboratory Argonne, IL ‡ School of Informatics and Computing, Indiana University, Bloomington, IN § Computation Institute, University of Chicago/Argonne

  9. Hybrid Parallel Programming with MPI and Unified Parallel C | Argonne

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Leadership Computing Facility Parallel Programming with MPI and Unified Parallel C Authors: Dinan, J., Balaji, P., Lusk, E., Sadayappan, P., Thakur, R. The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) are growing in popularity

  10. Parallel Harness for Informatic Stream Hashing

    Energy Science and Technology Software Center (OSTI)

    2012-09-11

    PHISH is a lightweight framework which a set of independent processes can use to exchange data as they run on the same desktop machine, on processors of a parallel machine, or on different machines across a network. This enables them to work in a coordinated parallel fashion to perform computations on either streaming, archived, or self-generated data. The PHISH distribution includes a simple, portable library for performing data exchanges in useful patterns either via MPI message-passing or ZMQ sockets. PHISH input scripts are used to describe a data-processing algorithm, and additional tools provided in the PHISH distribution convert the script into a form that can be launched as a parallel job.

  11. Applied Parallel Metadata Indexing

    SciTech Connect (OSTI)

    Jacobi, Michael R

    2012-08-01

    The GPFS Archive is a parallel archive used by hundreds of users in the Turquoise collaboration network. It houses 4+ petabytes of data in more than 170 million files. Currently, users must navigate the file system to retrieve their data, requiring them to remember file paths and names. A better solution might allow users to tag data with meaningful labels and search the archive using standard and user-defined metadata, while maintaining security. Last summer, I developed the backend to a tool that adheres to these design goals. The backend works by importing GPFS metadata into a MongoDB cluster, which is then indexed on each attribute. This summer, the author implemented security and developed the user interface for the search tool. To meet security requirements, each database table is associated with a single user, which only stores records that the user may read, and requires a set of credentials to access. The interface to the search tool is implemented using FUSE (Filesystem in USErspace). FUSE is an intermediate layer that intercepts file system calls and allows the developer to redefine how those calls behave. In the case of this tool, FUSE interfaces with MongoDB to issue queries and populate output. A FUSE implementation is desirable because it allows users to interact with the search tool using commands they are already familiar with. These security and interface additions are essential for a usable product.

  12. Parallel ptychographic reconstruction

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-12-19

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source.

  13. Parallel ptychographic reconstruction

    SciTech Connect (OSTI)

    Nashed, Youssef S. G.; Vine, David J.; Peterka, Tom; Deng, Junjing; Ross, Rob; Jacobsen, Chris

    2014-12-19

    Ptychography is an imaging method whereby a coherent beam is scanned across an object, and an image is obtained by iterative phasing of the set of diffraction patterns. It is able to be used to image extended objects at a resolution limited by scattering strength of the object and detector geometry, rather than at an optics-imposed limit. As technical advances allow larger fields to be imaged, computational challenges arise for reconstructing the correspondingly larger data volumes, yet at the same time there is also a need to deliver reconstructed images immediately so that one can evaluate the next steps to take in an experiment. Here we present a parallel method for real-time ptychographic phase retrieval. It uses a hybrid parallel strategy to divide the computation between multiple graphics processing units (GPUs) and then employs novel techniques to merge sub-datasets into a single complex phase and amplitude image. Results are shown on a simulated specimen and a real dataset from an X-ray experiment conducted at a synchrotron light source.

  14. Unified Parallel Software

    Energy Science and Technology Software Center (OSTI)

    2003-12-01

    UPS (Unified Parallel Software) is a collection of software tools (libraries, scripts, executables) that assist in parallel programming. This consists of:
    o libups.a: C/Fortran callable routines for message passing (utilities written on top of MPI) and file IO (utilities written on top of HDF).
    o libuserd-HDF.so: EnSight user-defined reader for visualizing data files written with UPS File IO.
    o ups_libuserd_query, ups_libuserd_prep.pl, ups_libuserd_script.pl: Executables/scripts to get information from data files and to simplify the use of EnSight on those data files.
    o ups_io_rm/ups_io_cp: Manipulate data files written with UPS File IO.
    These tools are portable to a wide variety of Unix platforms.

  15. Parallelization and automatic data distribution for nuclear reactor simulations

    SciTech Connect (OSTI)

    Liebrock, L.M.

    1997-07-01

    Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.

  16. Discontinuous Methods for Accurate, Massively Parallel Quantum...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Investigator for Discontinuous Methods for Accurate, Massively Parallel Quantum Molecular Dynamics. Discontinuous Methods for Accurate, Massively Parallel Quantum...

  17. Multilingual interfaces for parallel coupling in multiphysics and multiscale systems.

    SciTech Connect (OSTI)

    Ong, E. T.; Larson, J. W.; Norris, B.; Jacob, R. L.; Tobis, M.; Steder, M.; Mathematics and Computer Science; Univ. of Wisconsin; Australian National Univ.; Univ. of Chicago

    2007-01-01

    Multiphysics and multiscale simulation systems are emerging as a new grand challenge in computational science, largely because of increased computing power provided by the distributed-memory parallel programming model on commodity clusters. These systems often present a parallel coupling problem in their intercomponent data exchanges. Another potential problem in these coupled systems is language interoperability between their various constituent codes. In anticipation of combined parallel coupling/language interoperability challenges, we have created a set of interlanguage bindings for a successful parallel coupling library, the Model Coupling Toolkit. We describe the method used for automatically generating the bindings using the Babel language interoperability tool, and illustrate with short examples how MCT can be used from the C++ and Python languages. We report preliminary performance reports for the MCT interpolation benchmark. We conclude with a discussion of the significance of this work to the rapid prototyping of large parallel coupled systems.

  18. Parallel processing for control applications

    SciTech Connect (OSTI)

    Telford, J. W.

    2001-01-01

    Parallel processing has been a topic of discussion in computer science circles for decades. Using more than one computer to control a process has many advantages that compensate for the additional cost. Initially, multiple computers were used to attain higher speeds. A single CPU could not perform all of the operations necessary for real-time operation. As technology progressed and CPUs became faster, the speed issue became less significant. The additional processing capabilities, however, continue to make high speeds an attractive element of parallel processing. Another reason for multiple processors is reliability. For the purpose of this discussion, reliability and robustness will be the focal point. Most contemporary conceptions of parallel processing include visions of hundreds of single computers networked to provide 'computing power'. Indeed, our own teraflop machines are built from large numbers of computers configured in a network (and thus limited by the network). There are many approaches to parallel configurations, and this presentation offers something slightly different from the contemporary networked model. In the world of embedded computers, which is a pervasive force in contemporary computer controls, there are many single chip computers available. If one backs away from the PC based parallel computing model and considers the possibilities of a parallel control device based on multiple single chip computers, a new area of possibilities becomes apparent. This study will look at the use of multiple single chip computers in a parallel configuration with emphasis placed on maximum reliability.

  19. Template based parallel checkpointing in a massively parallel computer system

    DOE Patents [OSTI]

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.

  20. Small file aggregation in a parallel computing system

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Grider, Gary; Zhang, Jingwang

    2014-09-02

    Techniques are provided for small file aggregation in a parallel computing system. An exemplary method for storing a plurality of files generated by a plurality of processes in a parallel computing system comprises aggregating the plurality of files into a single aggregated file; and generating metadata for the single aggregated file. The metadata comprises an offset and a length of each of the plurality of files in the single aggregated file. The metadata can be used to unpack one or more of the files from the single aggregated file.
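
    The aggregation idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names `aggregate` and `unpack`, the in-memory byte strings, and the sample file names are all assumptions made for illustration. It only shows how per-file (offset, length) metadata lets one file be recovered from a single aggregated blob.

```python
import io

def aggregate(files):
    """Pack {name: bytes} into one blob plus {name: (offset, length)} metadata."""
    blob = io.BytesIO()
    metadata = {}
    for name, data in files.items():
        # Record where this file starts in the aggregate and how long it is.
        metadata[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), metadata

def unpack(blob, metadata, name):
    """Recover a single file from the aggregate using its offset and length."""
    offset, length = metadata[name]
    return blob[offset:offset + length]

# Hypothetical per-process output files (names are illustrative only).
files = {"rank0.out": b"alpha", "rank1.out": b"", "rank2.out": b"gamma-delta"}
blob, metadata = aggregate(files)
print(unpack(blob, metadata, "rank2.out"))
```

    Because the metadata is kept separate from the payload, any one file can be extracted without scanning the others.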

  1. Global synchronization of parallel processors using clock pulse width modulation

    SciTech Connect (OSTI)

    Chen, Dong; Ellavsky, Matthew R.; Franke, Ross L.; Gara, Alan; Gooding, Thomas M.; Haring, Rudolf A.; Jeanson, Mark J.; Kopcsay, Gerard V.; Liebsch, Thomas A.; Littrell, Daniel; Ohmacht, Martin; Reed, Don D.; Schenck, Brandon E.; Swetz, Richard A.

    2013-04-02

    A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and perform a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.

  2. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  3. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  4. An integrated approach to improving the parallel applications development process

    SciTech Connect (OSTI)

    Rasmussen, Craig E; Watson, Gregory R; Tibbitts, Beth R

    2009-01-01

    The development of parallel applications is becoming increasingly important to a broad range of industries. Traditionally, parallel programming was a niche area that was primarily exploited by scientists trying to model extremely complicated physical phenomena. It is becoming increasingly clear, however, that continued hardware performance improvements through clock scaling and feature-size reduction are simply not going to be achievable for much longer. The hardware vendors' approach to addressing this issue is to employ parallelism through multi-processor and multi-core technologies. While there is little doubt that this approach produces scaling improvements, there are still many significant hurdles to be overcome before parallelism can be employed as a general replacement for more traditional programming techniques. The Parallel Tools Platform (PTP) Project was created in 2005 in an attempt to provide developers with new tools aimed at addressing some of the parallel development issues. Since then, the introduction of a new generation of peta-scale and multi-core systems has highlighted the need for such a platform. In this paper, we describe some of the challenges facing parallel application developers, present the current state of PTP, and provide a simple case study that demonstrates how PTP can be used to locate a potential deadlock situation in an MPI code.

  5. An efficient parallel algorithm for matrix-vector multiplication

    SciTech Connect (OSTI)

    Hendrickson, B.; Leland, R.; Plimpton, S.

    1993-03-01

    The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is $O(n/\sqrt{p} + \log p)$, independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
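
    As a rough illustration of how a matrix-vector product can be divided among processors, the sketch below simulates a plain two-dimensional block decomposition in serial Python. It is not the hypercube algorithm of the paper; the function name `block_matvec`, the grid parameter `q` (so that p = q * q simulated processors), and the sample data are assumptions. Each simulated processor touches only n/q entries of the vector, which is one plausible way to see where a term of order n/sqrt(p) could come from.

```python
def block_matvec(A, x, q):
    """Compute y = A x by splitting A into a q-by-q grid of square blocks."""
    n = len(x)
    assert n % q == 0, "this sketch assumes q divides n"
    b = n // q
    y = [0.0] * n
    for i in range(q):        # block row index
        for j in range(q):    # block column index: one simulated processor (i, j)
            # Local work: a b-by-b block of A times a b-long slice of x.
            partial = [
                sum(A[r][c] * x[c] for c in range(j * b, (j + 1) * b))
                for r in range(i * b, (i + 1) * b)
            ]
            # Adding partial results along a block row stands in for the
            # reduction a real parallel code would perform by communication.
            for k, value in enumerate(partial):
                y[i * b + k] += value
    return y

A = [[1, 2, 0, 0],
     [0, 1, 0, 3],
     [4, 0, 1, 0],
     [0, 0, 2, 1]]
x = [1, 2, 3, 4]
y = block_matvec(A, x, q=2)
print(y)
```

    In a real distributed code the inner loop body would run concurrently on each processor, and only the slices of x and the partial sums would travel over the network.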

  6. Parallel Programming and Optimization for Intel Architecture

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Programming and Optimization for Intel Architecture Parallel Programming and Optimization for Intel Architecture August 14, 2015 by Richard Gerber Intel is sponsoring a ...

  7. Computing contingency statistics in parallel.

    SciTech Connect (OSTI)

    Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre

    2010-09-01

    Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and χ² independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
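
    The map-reduce pattern mentioned in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' open source implementation: the helper names `local_table`, `merge`, and `pointwise_mutual_information` and the sample data are assumptions. Each data chunk is counted independently (map), the per-chunk tables are summed (reduce), and one derived statistic is read off the merged table.

```python
from collections import Counter
from math import log

def local_table(pairs):
    """Map step: count (x, y) co-occurrences within one chunk of data."""
    return Counter(pairs)

def merge(tables):
    """Reduce step: sum the per-chunk contingency tables."""
    total = Counter()
    for table in tables:
        total.update(table)
    return total

def pointwise_mutual_information(table, x, y):
    """PMI(x, y) = log(p(x, y) / (p(x) p(y))), estimated from the counts."""
    n = sum(table.values())
    p_xy = table[(x, y)] / n
    p_x = sum(c for (a, _), c in table.items() if a == x) / n
    p_y = sum(c for (_, b), c in table.items() if b == y) / n
    return log(p_xy / (p_x * p_y))

# Two chunks, as if held by two different processes.
chunks = [
    [("a", 0), ("a", 0), ("b", 1)],
    [("a", 0), ("b", 1), ("b", 0)],
]
table = merge(local_table(chunk) for chunk in chunks)
pmi = pointwise_mutual_information(table, "a", 0)
print(table, pmi)
```

    The size of each table passed to `merge` grows with the number of distinct (x, y) pairs in the data, which is the communication cost the abstract contrasts with fixed-size moment-based statistics.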

  8. BAGEL: New-generation parallel quantum chemistry program | Argonne...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Event Sponsor: Mathematics and Computer Science Division Seminar Start Date: Apr 5 2016 - ... Shiozaki has been a recipient of Japan Society for the Promotion of Science Fellowship ...

  9. Designing a parallel simula machine

    SciTech Connect (OSTI)

    Papazoglou, M.P.; Georgiadis, P.I.; Maritsas, D.G.

    1983-10-01

    The parallel simula machine (PSM) architecture is based upon a master/slave topology, incorporating a master microprocessor. Interconnection circuitry between the master and slave processor modules uses a timesharing system bus and various programmable interrupt control units. Common and private memory modules reside in the PSM, and direct memory access transfers ease the master processor's workload. 5 references.

  10. Parallel, Distributed Scripting with Python

    SciTech Connect (OSTI)

    Miller, P J

    2002-05-24

    Parallel computers used to be, for the most part, one-of-a-kind systems which were extremely difficult to program portably. With SMP architectures, the advent of the POSIX thread API and OpenMP gave developers ways to portably exploit on-the-box shared memory parallelism. Since these architectures didn't scale cost-effectively, distributed memory clusters were developed. The associated MPI message passing libraries gave these systems a portable paradigm too. Having programmers effectively use this paradigm is a somewhat different question. Distributed data has to be explicitly transported via the messaging system in order for it to be useful. In high level languages, the MPI library gives access to data distribution routines in C, C++, and FORTRAN. But we need more than that. Many reasonable and common tasks are best done in (or as extensions to) scripting languages. Consider sysadm tools such as password crackers, file purgers, etc. These are simple to write in a scripting language such as Python (an open source, portable, and freely available interpreter). But these tasks beg to be done in parallel. Consider a password checker that checks an encrypted password against a 25,000 word dictionary. This can take around 10 seconds in Python (6 seconds in C). It is trivial to parallelize if you can distribute the information and co-ordinate the work.
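
    The dictionary check described at the end of this abstract can be sketched as follows. Several things here are assumptions and not taken from the report: SHA-256 stands in for whatever password encryption was used, a thread pool from the Python standard library stands in for distributed MPI processes, and the names `digest`, `search_chunk`, and `parallel_check` are hypothetical. The threads only illustrate how the word list is distributed and the results co-ordinated; they are not expected to give a real speedup for this workload in CPython.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def digest(word):
    """Stand-in for the password encryption step (assumption: SHA-256)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()

def search_chunk(chunk, target):
    """Check one worker's share of the dictionary against the target digest."""
    for word in chunk:
        if digest(word) == target:
            return word
    return None

def parallel_check(words, target, workers=4):
    """Deal the dictionary out round-robin and collect the first hit, if any."""
    chunks = [words[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for hit in pool.map(search_chunk, chunks, [target] * workers):
            if hit is not None:
                return hit
    return None

words = ["apple", "banana", "cherry", "dragonfruit", "elderberry"]
target = digest("cherry")
found = parallel_check(words, target)
print(found)
```

    Only the target digest and a slice of the dictionary need to reach each worker, and only a single word (or nothing) comes back, which is why the task distributes so easily.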

  11. Parallelization

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    ... that many intermediate calculations are done in place rather than saving values in memory. ... many computer cores to leverage more computing power. 3 1.2 Magnetohydrodynamics A ...

  12. Cooperative storage of shared files in a parallel computing system with dynamic block size

    DOE Patents [OSTI]

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-11-10

    Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
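
    The block-size rule quoted in this abstract (total data divided by the number of processes) can be worked through with a small Python sketch. It is not the patented method and does not model PLFS; the function names `plan_blocks` and `surplus` and the sample sizes are assumptions. It computes the block size, the byte range of the shared object each process would write, and how much data each process would have to give away or receive to reach that block size.

```python
def plan_blocks(sizes):
    """Return the block size and the global byte range each process writes."""
    nprocs = len(sizes)
    total = sum(sizes)
    block = -(-total // nprocs)  # ceiling division so every byte is covered
    ranges = [
        (min(rank * block, total), min((rank + 1) * block, total))
        for rank in range(nprocs)
    ]
    return block, ranges

def surplus(sizes, ranges):
    """Positive: bytes a process must send away; negative: bytes it must receive."""
    return [size - (hi - lo) for size, (lo, hi) in zip(sizes, ranges)]

# Hypothetical amounts of data (in bytes) generated by four processes.
sizes = [10, 2, 6, 6]
block, ranges = plan_blocks(sizes)
moves = surplus(sizes, ranges)
print(block, ranges, moves)
```

    In this example the first process holds four bytes more than one block and the second holds four bytes fewer, so a single exchange between them would leave every process with an equal-sized block to write.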

  13. Fuel cell generator

    DOE Patents [OSTI]

    Isenberg, Arnold O.

    1983-01-01

    High temperature solid oxide electrolyte fuel cell generators which allow controlled leakage among plural chambers in a sealed housing. Depleted oxidant and fuel are directly reacted in one chamber to combust remaining fuel and preheat incoming reactants. The cells are preferably electrically arranged in a series-parallel configuration.

  14. Compact Mesh Generator

    Energy Science and Technology Software Center (OSTI)

    2007-02-02

    The CMG is a small, lightweight, structured mesh generation code. It features a simple text input parser that allows setup of various meshes via a small set of text commands. Mesh generation data can be output to text, the silo file format, or the API can be directly queried by applications. It can run serially or in parallel via MPI. The CMG includes the ability to specify various initial conditions on a mesh via mesh tags.

  15. Parallel multiplex laser feedback interferometry

    SciTech Connect (OSTI)

    Zhang, Song; Tan, Yidong; Zhang, Shulian

    2013-12-15

    We present a parallel multiplex laser feedback interferometer based on spatial multiplexing which avoids the signal crosstalk in the former feedback interferometer. The interferometer outputs two close parallel laser beams, whose frequencies are shifted by two acousto-optic modulators by 2Ω simultaneously. A static reference mirror is inserted into one of the optical paths as the reference optical path. The other beam impinges on the target as the measurement optical path. Phase variations of the two feedback laser beams are simultaneously measured through heterodyne demodulation with two different detectors. Their subtraction accurately reflects the target displacement. Under typical room conditions, experimental results show a resolution of 1.6 nm and accuracy of 7.8 nm within the range of 100 μm.

  16. Parallel Power Grid Simulation Toolkit

    Energy Science and Technology Software Center (OSTI)

    2015-09-14

    ParGrid is a 'wrapper' that integrates a coupled Power Grid Simulation toolkit consisting of a library to manage the synchronization and communication of independent simulations. The included library code in ParGrid, named FSKIT, is intended to support the coupling of multiple continuous and discrete event parallel simulations. The code is designed using modern object oriented C++ methods utilizing C++11 and current Boost libraries to ensure compatibility with multiple operating systems and environments.

  17. Scalable Parallel Algebraic Multigrid Solvers

    SciTech Connect (OSTI)

    Bank, R; Lu, S; Tong, C; Vassilevski, P

    2005-03-23

    The authors propose a parallel algebraic multilevel algorithm (AMG), which has the novel feature that the subproblem residing in each processor is defined over the entire partition domain, although the vast majority of unknowns for each subproblem are associated with the partition owned by the corresponding processor. This feature ensures that a global coarse description of the problem is contained within each of the subproblems. The advantages of this approach are that interprocessor communication is minimized in the solution process while an optimal order of convergence rate is preserved; and the speed of local subproblem solvers can be maximized using the best existing sequential algebraic solvers.

  18. Xyce parallel electronic simulator design.

    SciTech Connect (OSTI)

    Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.

    2010-09-01

    This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to ensure a high level of code quality and robustness is essential. Version control, issue tracking, customer support, C++ style guidelines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, and the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathematicians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.

  19. Efficient parallel global garbage collection on massively parallel computers

    SciTech Connect (OSTI)

    Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori

    1994-12-31

    On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient Garbage Collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimum pause time of ongoing computations, and (4) has been shown to scale up to 1024 node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts numbers of messages and the other uses network 'bulldozing.' Performance evaluation in actual implementations on a multicomputer with 32-1024 nodes, Fujitsu AP1000, reveals various favorable properties of the algorithm.

  20. Parallel computation with adaptive methods for elliptic and hyperbolic systems

    SciTech Connect (OSTI)

    Benantar, M.; Biswas, R.; Flaherty, J.E.; Shephard, M.S.

    1990-01-01

    We consider the solution of two dimensional vector systems of elliptic and hyperbolic partial differential equations on a shared memory parallel computer. For elliptic problems, the spatial domain is discretized using a finite quadtree mesh generation procedure and the differential system is discretized by a finite element-Galerkin technique with a piecewise linear polynomial basis. Resulting linear algebraic systems are solved using the conjugate gradient technique with element-by-element and symmetric successive over-relaxation preconditioners. Stiffness matrix assembly and linear system solutions are processed in parallel with computations scheduled on noncontiguous quadrants of the tree in order to minimize process synchronization. Determining noncontiguous regions by coloring the regular finite quadtree structure is far simpler than coloring elements of the unstructured mesh that the finite quadtree procedure generates. We describe linear-time complexity coloring procedures that use six and eight colors.

  1. Apply for the Parallel Computing Summer Research Internship

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    How to Apply: Apply for the Parallel Computing Summer Research Internship. Creating next-generation leaders in HPC research and applications development. Program Co-Leads: Robert (Bob) Robey, Gabriel Rockefeller, Hai Ah Nam. Professional Staff Assistant: Nicole Aguilar Garcia, (505) 665-3048. Current application deadline is February 5, 2016 with notification by early March 2016. Who can apply? Upper division undergraduate students and early graduate

  2. High voltage pulse generator

    DOE Patents [OSTI]

    Fasching, George E.

    1977-03-08

    An improved high-voltage pulse generator has been provided which is especially useful in ultrasonic testing of rock core samples. An N number of capacitors are charged in parallel to V volts and at the proper instant are coupled in series to produce a high-voltage pulse of N times V volts. Rapid switching of the capacitors from the paralleled charging configuration to the series discharging configuration is accomplished by using silicon-controlled rectifiers which are chain self-triggered following the initial triggering of a first one of the rectifiers connected between the first and second of the plurality of charging capacitors. A timing and triggering circuit is provided to properly synchronize triggering pulses to the first SCR at a time when the charging voltage is not being applied to the parallel-connected charging capacitors. Alternate circuits are provided for controlling the application of the charging voltage from a charging circuit to be applied to the parallel capacitors which provides a selection of at least two different intervals in which the charging voltage is turned "off" to allow the SCR's connecting the capacitors in series to turn "off" before recharging begins. The high-voltage pulse-generating circuit including the N capacitors and corresponding SCR's which connect the capacitors in series when triggered "on" further includes diodes and series-connected inductors between the parallel-connected charging capacitors which allow sufficiently fast charging of the capacitors for a high pulse repetition rate and yet allow considerable control of the decay time of the high-voltage pulses from the pulse-generating circuit.

  3. Device for balancing parallel strings

    DOE Patents [OSTI]

    Mashikian, Matthew S.

    1985-01-01

    A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.

  4. Information hiding in parallel programs

    SciTech Connect (OSTI)

    Foster, I.

    1992-01-30

    A fundamental principle in program design is to isolate difficult or changeable design decisions. Application of this principle to parallel programs requires identification of decisions that are difficult or subject to change, and the development of techniques for hiding these decisions. We experiment with three complex applications, and identify mapping, communication, and scheduling as areas in which decisions are particularly problematic. We develop computational abstractions that hide such decisions, and show that these abstractions can be used to develop elegant solutions to programming problems. In particular, they allow us to encode common structures, such as transforms, reductions, and meshes, as software cells and templates that can be reused in different applications. An important characteristic of these structures is that they do not incorporate mapping, communication, or scheduling decisions: these aspects of the design are specified separately, when composing existing structures to form applications. This separation of concerns allows the same cells and templates to be reused in different contexts.

  5. Parallel computing in enterprise modeling.

    SciTech Connect (OSTI)

    Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

    2008-08-01

    This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.

  6. Graphical representation of parallel algorithmic processes. Master's thesis

    SciTech Connect (OSTI)

    Williams, E.M.

    1990-12-01

    Algorithm animation is a visualization method used to enhance understanding of the functioning of an algorithm or program. Visualization is used for many purposes, including education, algorithm research, performance analysis, and program debugging. This research applies algorithm animation techniques to programs developed for parallel architectures, with specific focus on the Intel iPSC/2 hypercube. While both P-time and NP-time algorithms can potentially benefit from using visualization techniques, the set of NP-complete problems provides fertile ground for developing parallel applications, since the combinatoric nature of the problems makes finding the optimum solution impractical. The primary goals for this visualization system are: Data should be displayed as it is generated. The interface to the target program should be transparent, allowing the animation of existing programs. Flexibility - the system should be able to animate any algorithm. The resulting system incorporates and extends two AFIT products: the AFIT Algorithm Animation Research Facility (AAARF) and the Parallel Resource Analysis Software Environment (PRASE). AAARF is an algorithm animation system developed primarily for sequential programs, but is easily adaptable for use with parallel programs. PRASE is an instrumentation package that extracts system performance data from programs on the Intel hypercubes. Since performance data is an essential part of analyzing any parallel program, views of the performance data are provided as an elementary part of the system. Custom software is designed to interface these systems and to display the program data. The program chosen as the example for this study is a member of the NP-complete problem set; it is a parallel implementation of a general.

  7. Parallel Programming and Optimization for Intel Architecture

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Programming and Optimization for Intel Architecture. August 14, 2015, by Richard Gerber. Intel is sponsoring a series of webinars entitled "Parallel Programming and Optimization for Intel Architecture." Here's the schedule for August (registration link: https://attendee.gotowebinar.com/register/6325131222429932289). Mon, August 17 - "Hello world from Intel Xeon Phi coprocessors". Overview of architecture,

  8. CASL - The Michigan Parallel Characteristics Transport Code

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    The Michigan Parallel Characteristics Transport Code Verification of MPACT: The Michigan Parallel Characteristics Transport Code Benjamin Collins, Brendan Kochunas, Daniel Jabbay, Thomas Downar, William Martin Department of Nuclear Engineering and Radiological Sciences University of Michigan Andrew Godfrey Oak Ridge National Laboratory MPACT (Michigan PArallel Characteristics Transport Code) is a new reactor analysis tool being developed at the University of Michigan as an advanced pin-resolved

  9. Parallel auto-correlative statistics with VTK.

    SciTech Connect (OSTI)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.

  10. Petascale Parallelization of the Gyrokinetic Toroidal Code

    SciTech Connect (OSTI)

    Ethier, Stephane; Adams, Mark; Carter, Jonathan; Oliker, Leonid

    2010-05-01

    The Gyrokinetic Toroidal Code (GTC) is a global, three-dimensional particle-in-cell application developed to study microturbulence in tokamak fusion devices. The global capability of GTC is unique, allowing researchers to systematically analyze important dynamics such as turbulence spreading. In this work we examine a new radial domain decomposition approach to allow scalability onto the latest generation of petascale systems. Extensive performance evaluation is conducted on three high performance computing systems: the IBM BG/P, the Cray XT4, and an Intel Xeon Cluster. Overall results show that the radial decomposition approach dramatically increases scalability, while reducing the memory footprint - allowing for fusion device simulations at an unprecedented scale. After a decade where high-end computing (HEC) was dominated by the rapid pace of improvements to processor frequencies, the performance of next-generation supercomputers is increasingly differentiated by varying interconnect designs and levels of integration. Understanding the tradeoffs of these system designs is a key step towards making effective petascale computing a reality. In this work, we examine a new parallelization scheme for the Gyrokinetic Toroidal Code (GTC) [?] micro-turbulence fusion application. Extensive scalability results and analysis are presented on three HEC systems: the IBM BlueGene/P (BG/P) at Argonne National Laboratory, the Cray XT4 at Lawrence Berkeley National Laboratory, and an Intel Xeon cluster at Lawrence Livermore National Laboratory. Overall results indicate that the new radial decomposition approach successfully attains unprecedented scalability to 131,072 BG/P cores by overcoming the memory limitations of the previous approach. The new version is well suited to utilize emerging petascale resources to access new regimes of physical phenomena.

  11. Composing Data Parallel Code for a SPARQL Graph Engine

    SciTech Connect (OSTI)

    Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste; Haglin, David J.; Feo, John

    2013-09-08

    Big data analytics process large amounts of data to extract knowledge from them. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph-based representation provides several benefits, such as the possibility to perform in-memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which require more than basic graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 Shared-memory Multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.

  12. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOE Patents [OSTI]

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.

  13. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOE Patents [OSTI]

    Karasick, Michael S.; Strip, David R.

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.

  14. Final Report: Center for Programming Models for Scalable Parallel Computing

    SciTech Connect (OSTI)

    Mellor-Crummey, John

    2011-09-13

    As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

  15. Buffered coscheduling for parallel programming and enhanced fault tolerance

    DOE Patents [OSTI]

    Petrini, Fabrizio; Feng, Wu-chun

    2006-01-31

    A computer-implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors.
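
    The strobe-time exchange described above can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the patented method: the matrix `buffered` and the function name `strobe_exchange` are hypothetical, and the "global exchange" is reduced to a column sum computed in one place.

```python
def strobe_exchange(buffered):
    """Simulate the global exchange performed at a strobe.

    buffered[p][q] is the number of jobs processor p buffered for
    processor q during the interval that just ended (hypothetical layout).
    Returns, for each processor q, the total number of incoming jobs it
    should expect in the next interval.
    """
    n = len(buffered)
    return [sum(buffered[p][q] for p in range(n)) for q in range(n)]


# Hypothetical counts for three processors over one time interval.
buffered = [
    [0, 2, 1],  # processor 0 buffered 2 jobs for processor 1, 1 for processor 2
    [3, 0, 0],
    [1, 4, 0],
]
incoming = strobe_exchange(buffered)
```

    Here `incoming[q]` is what processor q would learn from all the other processors before the next interval starts.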

  16. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Berg, Jeremy E.; Faraj, Ahmad A.

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point-to-point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
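
    A minimal sketch of the idea, assuming a rectangular two-dimensional plane of nodes: a row-by-row "snake" ordering is one simple Hamiltonian path over such a plane, and the message is forwarded hop by hop along it. The abstract does not say how the path is established, so the snake ordering, the function names, and placing the logical root at the start of the path are all assumptions made for illustration.

```python
def snake_path(rows, cols):
    """Return a Hamiltonian path over a rows x cols mesh plane.

    Rows are traversed alternately left-to-right and right-to-left, so
    consecutive nodes on the path are always mesh neighbours.
    """
    path = []
    for r in range(rows):
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in columns)
    return path


def broadcast_along_path(path, message):
    """Forward the message hop by hop; path[0] plays the logical root."""
    received = {path[0]: message}
    for prev, node in zip(path, path[1:]):
        # Each hop is a point-to-point transfer between adjacent nodes.
        assert abs(prev[0] - node[0]) + abs(prev[1] - node[1]) == 1
        received[node] = received[prev]
    return received


path = snake_path(3, 4)
received = broadcast_along_path(path, "hello")
```

    Every node appears exactly once on the path, so each one receives the root's message through a single point-to-point hop from its predecessor.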

  17. Differences Between Distributed and Parallel Systems

    SciTech Connect (OSTI)

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  18. Automated Parallel Capillary Electrophoretic System

    DOE Patents [OSTI]

    Li, Qingbo; Kane, Thomas E.; Liu, Changsheng; Sonnenschein, Bernard; Sharer, Michael V.; Kernan, John R.

    2000-02-22

    An automated electrophoretic system is disclosed. The system employs a capillary cartridge having a plurality of capillary tubes. The cartridge has a first array of capillary ends projecting from one side of a plate. The first array of capillary ends are spaced apart in substantially the same manner as the wells of a microtitre tray of standard size. This allows one to simultaneously perform capillary electrophoresis on samples present in each of the wells of the tray. The system includes a stacked, dual carousel arrangement to eliminate cross-contamination resulting from reuse of the same buffer tray on consecutive executions from electrophoresis. The system also has a gel delivery module containing a gel syringe/a stepper motor or a high pressure chamber with a pump to quickly and uniformly deliver gel through the capillary tubes. The system further includes a multi-wavelength beam generator to generate a laser beam which produces a beam with a wide range of wavelengths. An off-line capillary reconditioner thoroughly cleans a capillary cartridge to enable simultaneous execution of electrophoresis with another capillary cartridge. The streamlined nature of the off-line capillary reconditioner offers the advantage of increased system throughput with a minimal increase in system cost.

  19. Xyce parallel electronic simulator : users' guide.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques; (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique

  20. Parallel Climate Analysis Toolkit (ParCAT)

    SciTech Connect (OSTI)

    Smith, Brian Edward

    2013-06-30

    The parallel analysis toolkit (ParCAT) provides parallel statistical processing of large climate model simulation datasets. ParCAT provides parallel point-wise average calculations, frequency distributions, sum/differences of two datasets, and difference-of-average and average-of-difference for two datasets for arbitrary subsets of simulation time. ParCAT is a command-line utility that can be easily integrated in scripts or embedded in other applications. ParCAT supports CMIP5 post-processed datasets as well as non-CMIP5 post-processed datasets. ParCAT reads and writes standard netCDF files.
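
    The point-wise statistics named above can be sketched in a few lines of serial Python. This is not ParCAT's code or interface: the function names and the tiny in-memory "datasets" (one list of grid-point values per time step) are hypothetical, and no netCDF I/O or parallelism is shown.

```python
def pointwise_average(steps):
    """Average equally sized fields (one list per time step) point by point."""
    n = len(steps)
    return [sum(values) / n for values in zip(*steps)]


def difference(a, b):
    """Point-wise difference of two equally sized fields."""
    return [x - y for x, y in zip(a, b)]


# Two hypothetical runs, each with two time steps on a two-point grid.
run_a = [[1.0, 2.0], [3.0, 6.0]]
run_b = [[0.0, 1.0], [2.0, 3.0]]

# Difference-of-average: average each run over time, then subtract.
diff_of_avg = difference(pointwise_average(run_a), pointwise_average(run_b))

# Average-of-difference: subtract step by step, then average over time.
avg_of_diff = pointwise_average(
    [difference(a, b) for a, b in zip(run_a, run_b)]
)
```

    With the same time steps in both runs the two quantities coincide, because averaging is linear; they can differ when the runs are averaged over different time subsets.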

  1. Distributed parallel messaging for multiprocessor systems

    DOE Patents [OSTI]

    Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka

    2013-06-04

    A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit includes a switch interface for reading from the memory system when injecting packets into the network.

  2. Solid oxide fuel cell generator

    DOE Patents [OSTI]

    Di Croce, A.M.; Draper, R.

    1993-11-02

    A solid oxide fuel cell generator has a plenum containing at least two rows of spaced apart, annular, axially elongated fuel cells. An electrical conductor extending between adjacent rows of fuel cells connects the fuel cells of one row in parallel with each other and in series with the fuel cells of the adjacent row. 5 figures.

  3. Solid oxide fuel cell generator

    DOE Patents [OSTI]

    Di Croce, A. Michael; Draper, Robert

    1993-11-02

    A solid oxide fuel cell generator has a plenum containing at least two rows of spaced apart, annular, axially elongated fuel cells. An electrical conductor extending between adjacent rows of fuel cells connects the fuel cells of one row in parallel with each other and in series with the fuel cells of the adjacent row.
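
    The series/parallel arrangement can be made concrete with a short calculation. It assumes ideal, identical cells, so that currents add within a row (parallel) and voltages add across rows (series); the function name and the numbers are hypothetical and are not taken from the patent.

```python
def stack_output(rows, cells_per_row, cell_voltage, cell_current):
    """Return (voltage, current, power) for an idealised generator.

    Cells within a row are connected in parallel, so their currents add;
    rows are connected in series, so their voltages add.
    """
    voltage = rows * cell_voltage
    current = cells_per_row * cell_current
    return voltage, current, voltage * current


# Hypothetical example: 2 rows of 3 cells, each cell at 0.5 V and 4 A.
voltage, current, power = stack_output(
    rows=2, cells_per_row=3, cell_voltage=0.5, cell_current=4.0
)
```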

  4. Stochastic Parallel PARticle Kinetic Simulator

    Energy Science and Technology Software Center (OSTI)

    2008-07-01

    SPPARKS is a kinetic Monte Carlo simulator which implements kinetic and Metropolis Monte Carlo solvers in a general way so that they can be hooked to applications of various kinds. Specific applications are implemented in SPPARKS as physical models which generate events (e.g. a diffusive hop or chemical reaction) and execute them one-by-one. Applications can run in parallel so long as the simulation domain can be partitioned spatially so that multiple events can be invoked simultaneously. SPPARKS is used to model various kinds of mesoscale materials science scenarios such as grain growth, surface deposition and growth, and reaction kinetics. It can also be used to develop new Monte Carlo models that hook to the existing solver and parallel infrastructure provided by the code.

  5. Solid state pulsed power generator

    DOE Patents [OSTI]

    Tao, Fengfeng; Saddoughi, Seyed Gholamali; Herbon, John Thomas

    2014-02-11

    A power generator includes one or more full bridge inverter modules coupled to a semiconductor opening switch (SOS) through an inductive resonant branch. Each module includes a plurality of switches that are switched in a fashion causing the one or more full bridge inverter modules to drive the semiconductor opening switch SOS through the resonant circuit to generate pulses to a load connected in parallel with the SOS.

  6. Evaluating parallel relational databases for medical data analysis.

    SciTech Connect (OSTI)

    Rintoul, Mark Daniel; Wilson, Andrew T.

    2012-03-01

    Hospitals have always generated and consumed large amounts of data concerning patients, treatment and outcomes. As computers and networks have permeated the hospital environment it has become feasible to collect and organize all of this data. This naturally raises the question of how to deal with the resulting mountain of information. In this report we detail a proof-of-concept test using two commercially available parallel database systems to analyze a set of real, de-identified medical records. We examine database scalability as data sizes increase as well as responsiveness under load from multiple users.

  7. Feature Clustering for Accelerating Parallel Coordinate Descent

    SciTech Connect (OSTI)

    Scherrer, Chad; Tewari, Ambuj; Halappanavar, Mahantesh; Haglin, David J.

    2012-12-06

    We demonstrate an approach for accelerating calculation of the regularization path for L1 sparse logistic regression problems. We show the benefit of feature clustering as a preconditioning step for parallel block-greedy coordinate descent algorithms.

  8. Parallel I/O in Practice

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    art. This tutorial sheds light on the state-of-the-art in parallel IO and provides the knowledge necessary for attendees to best leverage IO resources available to them. We...

  9. Asynchronous parallel pattern search for nonlinear optimization

    SciTech Connect (OSTI)

    P. D. Hough; T. G. Kolda; V. J. Torczon

    2000-01-01

    Parallel pattern search (PPS) can be quite useful for engineering optimization problems characterized by a small number of variables (say 10--50) and by expensive objective function evaluations such as complex simulations that take from minutes to hours to run. However, PPS, which was originally designed for execution on homogeneous and tightly-coupled parallel machines, is not well suited to the more heterogeneous, loosely-coupled, and even fault-prone parallel systems available today. Specifically, PPS is hindered by synchronization penalties and cannot recover in the event of a failure. The authors introduce a new asynchronous and fault tolerant parallel pattern search (APPS) method and demonstrate its effectiveness on both simple test problems as well as some engineering optimization problems.
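
    For readers unfamiliar with pattern search, the following is a minimal serial and synchronous sketch of a coordinate (compass) pattern search. It is not the authors' asynchronous, fault-tolerant method; the function names, the step-halving rule, and the test function are assumptions chosen for illustration. The trial evaluations inside each iteration are independent of one another, which is the part a parallel variant would distribute.

```python
def pattern_search(f, x0, step=1.0, tol=1e-6, max_iter=10_000):
    """Minimise f by polling +/- step along each coordinate axis."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        # Build the pattern of trial points around the current iterate.
        trials = []
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                trials.append(trial)
        # These evaluations do not depend on each other.
        values = [f(trial) for trial in trials]
        best = min(range(len(trials)), key=values.__getitem__)
        if values[best] < fx:
            x, fx = trials[best], values[best]  # accept the improvement
        else:
            step *= 0.5  # no trial improved: contract the pattern
    return x, fx


def shifted_sphere(p):
    """Hypothetical cheap objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


x_best, f_best = pattern_search(shifted_sphere, [0.0, 0.0])
```

    In this synchronous form every iteration waits for all trial evaluations to finish, which is the kind of synchronization penalty the abstract refers to.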

  10. Parallel programming with PCN. Revision 1

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1991-12-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous FTP from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A).

  11. Optimize Parallel Pumping Systems: Industrial Technologies Program...

    Office of Energy Efficiency and Renewable Energy (EERE) Indexed Site

    ... to operate the number of pumps needed to meet variable flow rate requirements efficiently. ... Parallel pumps provide balanced or equal flow rates when the same models are used and their ...

  12. HOPSPACK: Hybrid Optimization Parallel Search Package.

    SciTech Connect (OSTI)

    Gray, Genetha A.; Kolda, Tamara G.; Griffin, Joshua; Taddy, Matt; Martinez-Canales, Monica

    2008-12-01

    In this paper, we describe the technical details of HOPSPACK (Hybrid Optimization Parallel Search Package), a new software platform which facilitates combining multiple optimization routines into a single, tightly-coupled, hybrid algorithm that supports parallel function evaluations. The framework is designed such that existing optimization source code can be easily incorporated with minimal code modification. By maintaining the integrity of each individual solver, the strengths and code sophistication of the original optimization package are retained and exploited.

  13. Light beam frequency comb generator

    DOE Patents [OSTI]

    Priatko, G.J.; Kaskey, J.A.

    1992-11-24

    A light beam frequency comb generator uses an acousto-optic modulator to generate a plurality of light beams with frequencies which are uniformly separated and possess common noise and drift characteristics. A well collimated monochromatic input light beam is passed through this modulator to produce a set of both frequency shifted and unshifted optical beams. An optical system directs one or more frequency shifted beams along a path which is parallel to the path of the input light beam such that the frequency shifted beams are made incident on the modulator proximate to but separated from the point of incidence of the input light beam. After the beam is thus returned to and passed through the modulator repeatedly, a plurality of mutually parallel beams are generated which are frequency-shifted different numbers of times and possess common noise and drift characteristics. 2 figs.

  14. Light beam frequency comb generator

    DOE Patents [OSTI]

    Priatko, Gordon J.; Kaskey, Jeffrey A.

    1992-01-01

    A light beam frequency comb generator uses an acousto-optic modulator to generate a plurality of light beams with frequencies which are uniformly separated and possess common noise and drift characteristics. A well collimated monochromatic input light beam is passed through this modulator to produce a set of both frequency shifted and unshifted optical beams. An optical system directs one or more frequency shifted beams along a path which is parallel to the path of the input light beam such that the frequency shifted beams are made incident on the modulator proximate to but separated from the point of incidence of the input light beam. After the beam is thus returned to and passed through the modulator repeatedly, a plurality of mutually parallel beams are generated which are frequency-shifted different numbers of times and possess common noise and drift characteristics.

  15. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    SciTech Connect (OSTI)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction. 10 refs., 7 figs.

  16. Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

    SciTech Connect (OSTI)

    Jin, Shuangshuang; Chen, Yousu; Wu, Di; Diao, Ruisheng; Huang, Zhenyu

    2015-12-09

    Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor-based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on shared-memory platforms, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performance in running parallel dynamic simulation is compared and demonstrated.

  17. Parallel Integral Curves (Book) | SciTech Connect

    Office of Scientific and Technical Information (OSTI)

    SciTech Connect Search Results Book: Parallel Integral Curves Citation Details In-Document Search Title: Parallel Integral Curves Authors: Pugmire, Dave 1 ; Peterka, Tom 2 ; ...

  18. Parallel Algorithms and Patterns (Technical Report) | SciTech...

    Office of Scientific and Technical Information (OSTI)

    Parallel Algorithms and Patterns Citation Details In-Document Search Title: Parallel Algorithms and Patterns Authors: Robey, Robert W. 1 + Show Author Affiliations Los Alamos ...

  19. Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications...

    Office of Scientific and Technical Information (OSTI)

    Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications Citation Details In-Document Search Title: Linux Kernel Co-Scheduling For Bulk Synchronous Parallel ...

  20. Parallel phase-sensitive three-dimensional imaging camera

    DOE Patents [OSTI]

    Smithpeter, Colin L.; Hoover, Eddie R.; Pain, Bedabrata; Hancock, Bruce R.; Nellums, Robert O.

    2007-09-25

    An apparatus is disclosed for generating a three-dimensional (3-D) image of a scene illuminated by a pulsed light source (e.g. a laser or light-emitting diode). The apparatus, referred to as a phase-sensitive 3-D imaging camera utilizes a two-dimensional (2-D) array of photodetectors to receive light that is reflected or scattered from the scene and processes an electrical output signal from each photodetector in the 2-D array in parallel using multiple modulators, each having inputs of the photodetector output signal and a reference signal, with the reference signal provided to each modulator having a different phase delay. The output from each modulator is provided to a computational unit which can be used to generate intensity and range information for use in generating a 3-D image of the scene. The 3-D camera is capable of generating a 3-D image using a single pulse of light, or alternately can be used to generate subsequent 3-D images with each additional pulse of light.

  1. Environmental support to the clean coal technology program

    SciTech Connect (OSTI)

    Miller, R.L.

    1996-06-01

    Work during this period focused on the preparation for DOE's Morgantown Energy Technology Center (METC) of a final Environmental Assessment (EA) for the Externally Fired Combined Cycle (EFCC) Project in Warren, Pennsylvania. Proposed by the Pennsylvania Electric Company (Penelec) and selected by DOE in the fifth solicitation of the CCT Program, the project would be sited at one of the two units at Penelec's Warren Station. The EFCC Project proposes to replace two existing boilers with a new "power island" consisting of a staged coal combustor, slag screen, heat exchanger, an indirectly fired gas turbine, and a heat recovery steam generator. Subsequently, Unit 2 would operate in combined-cycle mode using the new gas turbine and the existing steam turbine simultaneously. The gas turbine would generate 25 megawatts of electricity so that Unit 2 output would increase from the existing 48 megawatts generated by the steam turbine to a total of 73 megawatts. Operation of a conventional flue gas desulfurization dry scrubber as part of the EFCC technology is expected to decrease SO2 emissions by 90% per kilowatt-hour of electricity generated, and NOx emissions are anticipated to be 60% less per kilowatt-hour of electricity generated because of the staged combustor. Because the EFCC technology would be more efficient, less carbon dioxide (CO2) would be emitted to the atmosphere per kilowatt-hour of electricity produced.

  2. Java Parallel Secure Stream for Grid Computing

    SciTech Connect (OSTI)

    Chen, Jie; Akers, Walter; Chen, Ying; Watson, William

    2001-09-01

    The emergence of high speed wide area networks makes grid computing a reality. However, grid applications that need reliable data transfer still have difficulty achieving optimal TCP performance, because the TCP window size must be tuned to improve bandwidth and reduce latency on a high speed wide area network. This paper presents a pure Java package called JPARSS (Java Parallel Secure Stream) that divides data into partitions that are sent over several parallel Java streams simultaneously and allows Java or Web applications to achieve optimal TCP performance in a grid environment without the necessity of tuning the TCP window size. Several experimental results are provided to show that using parallel streams is more effective than tuning the TCP window size. In addition, an X.509 certificate-based single sign-on mechanism and SSL-based connection establishment are integrated into this package. Finally, a few applications using this package will be discussed.
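
    The partitioning idea in the abstract above can be illustrated with a minimal Python sketch. It is not the JPARSS API: the names `partition`, `send_partition`, and `transfer` are hypothetical, and worker threads returning bytes in memory stand in for the parallel network streams.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data: bytes, n_streams: int) -> list:
    """Split data into at most n_streams contiguous (offset, chunk) pairs."""
    if n_streams < 1:
        raise ValueError("n_streams must be >= 1")
    size = max(1, -(-len(data) // n_streams))  # ceiling division
    return [(off, data[off:off + size]) for off in range(0, len(data), size)]


def send_partition(part):
    """Hypothetical stand-in for writing one partition to its own stream."""
    offset, chunk = part
    return offset, bytes(chunk)  # a real sender would write to a socket here


def transfer(data: bytes, n_streams: int = 4) -> bytes:
    """Send all partitions concurrently, then reassemble them by offset."""
    parts = partition(data, n_streams)
    if not parts:
        return b""
    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        received = list(pool.map(send_partition, parts))
    return b"".join(chunk for _, chunk in sorted(received))
```

    Tagging each partition with its offset lets the receiver reassemble the data regardless of the order in which the streams finish.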

  3. Xyce parallel electronic simulator release notes.

    SciTech Connect (OSTI)

    Keiter, Eric Richard; Hoekstra, Robert John; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Rankin, Eric Lamont; Coffey, Todd Stirling; Pawlowski, Roger Patrick; Santarelli, Keith R.

    2010-05-01

    The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.

  4. Parallel Implementation of Power System Dynamic Simulation

    SciTech Connect (OSTI)

    Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng; Wu, Di; Chen, Yousu

    2013-07-21

    Dynamic simulation of power system transient stability is important for planning, monitoring, operation, and control of electrical power systems. However, modeling the system dynamics and network involves the computationally intensive time-domain solution of numerous differential and algebraic equations (DAE). This results in a transient stability implementation that may not maintain the real-time constraints of an online security assessment. This paper presents a parallel implementation of the dynamic simulation on a high-performance computing (HPC) platform using parallel simulation algorithms and computation architectures. It enables the simulation to run even faster than real time, enabling the look-ahead capability of upcoming stability problems in the power grid.

  5. Berkeley Unified Parallel C (UPC) Compiler

    Energy Science and Technology Software Center (OSTI)

    2003-04-06

    This program is a portable, open-source compiler for the UPC language, which is based on the Open64 framework and has extensive support for optimizations. This compiler operates by translating UPC into ANSI/ISO C for compilation by a native compiler and linking with a UPC Runtime Library. This design eases portability to both shared and distributed memory parallel architectures. For proper operation the "Berkeley Unified Parallel C (UPC) Runtime Library" and its dependencies are required. Compatible replacements which implement "The Berkeley UPC Runtime Specification" are possible.

  6. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-11-12

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call site statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.

  7. Ramona Band of Cahuilla Mission Indians- 2002 Project

    Broader source: Energy.gov [DOE]

    The Ramona Band of Cahuilla Mission Indians ("Ramona Band" or "tribe") will be the first tribe to develop its entire reservation off-grid, using renewable energy as the primary power source. The tribe will purchase and install the primary components for a 65-80 kilowatt-hours per day central wind/PV/propane generator hybrid system that will power the reservation's housing, offices, ecotourism, and training businesses. The electricity is planned to be distributed through an underground mini-grid.

  8. Project Reports for Ramona Band of Cahuilla Mission Indians- 2002 Project

    Broader source: Energy.gov [DOE]

    The Ramona Band of Cahuilla Mission Indians ("Ramona Band" or "tribe") will be the first tribe to develop its entire reservation off-grid, using renewable energy as the primary power source. The tribe will purchase and install the primary components for a 65-80 kilowatt-hours per day central wind/PV/propane generator hybrid system that will power the reservation's housing, offices, ecotourism, and training businesses. The electricity is planned to be distributed through an underground mini-grid.

  9. GRIDS: Grid-Scale Rampable Intermittent Dispatchable Storage

    SciTech Connect (OSTI)

    2010-09-01

    GRIDS Project: The 12 projects that comprise ARPA-E's GRIDS Project, short for Grid-Scale Rampable Intermittent Dispatchable Storage, are developing storage technologies that can store renewable energy for use at any location on the grid at an investment cost less than $100 per kilowatt-hour. Flexible, large-scale storage would create a stronger and more robust electric grid by enabling renewables to contribute to reliable power generation.

  10. Under Secretary Klotz delivers remarks at PREP ribbon-cutting | National

    National Nuclear Security Administration (NNSA)

    Wednesday, June 18, 2014 - 1:23pm. Under Secretary Klotz delivered remarks at the Pantex Renewable Energy Project (PREP) ribbon-cutting this week. PREP establishes the largest federally-owned wind farm in the country and will generate approximately 47 million kilowatt-hours of electricity annually, more than 60 percent of the electricity needed for Pantex. The project will reduce CO2

  11. Parallel Algebraic Multigrids for Structural mechanics

    SciTech Connect (OSTI)

    Brezina, M; Tong, C; Becker, R

    2004-05-11

    This paper presents the results of a comparison of three parallel algebraic multigrid (AMG) preconditioners for structural mechanics applications. In particular, the authors are interested in investigating both the scalability and robustness of the preconditioners. Numerical results are given for a range of structural mechanics problems with various degrees of difficulty.

  12. Parallel programming with PCN. Revision 2

    SciTech Connect (OSTI)

    Foster, I.; Tuecke, S.

    1993-01-01

    PCN is a system for developing and executing parallel programs. It comprises a high-level programming language, tools for developing and debugging programs in this language, and interfaces to Fortran and C that allow the reuse of existing code in multilingual parallel programs. Programs developed using PCN are portable across many different workstations, networks, and parallel computers. This document provides all the information required to develop parallel programs with the PCN programming system. It includes both tutorial and reference material. It also presents the basic concepts that underlie PCN, particularly where these are likely to be unfamiliar to the reader, and provides pointers to other documentation on the PCN language, programming techniques, and tools. PCN is in the public domain. The latest version of both the software and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/pcn at info.mcs.anl.gov (cf. Appendix A). This version of this document describes PCN version 2.0, a major revision of the PCN programming system. It supersedes earlier versions of this report.

  13. Linked-View Parallel Coordinate Plot Renderer

    Energy Science and Technology Software Center (OSTI)

    2011-06-28

    This software allows multiple linked views for interactive querying via map-based data selection, bar chart analytic overlays, and high dynamic range (HDR) line renderings. The major component of the visualization package is a parallel coordinate renderer with binning, curved layouts, shader-based rendering, and other techniques to allow interactive visualization of multidimensional data.

  14. The parallel virtual file system for portals.

    SciTech Connect (OSTI)

    Schutt, James Alan

    2004-04-01

    This report presents the result of an effort to re-implement the Parallel Virtual File System (PVFS) using Portals as the transport. This report provides short overviews of PVFS and Portals, and describes the design and implementation of PVFS over Portals. Finally, the results of performance testing of both stock PVFS and PVFS over Portals are presented.

  15. Message passing with parallel queue traversal

    DOE Patents [OSTI]

    Underwood, Keith D.; Brightwell, Ronald B.; Hemmert, K. Scott

    2012-05-01

    In message passing implementations, associative matching structures are used to permit list entries to be searched in parallel fashion, thereby avoiding the delay of linear list traversal. List management capabilities are provided to support list entry turnover semantics and priority ordering semantics.

  16. Parallel stitching of 2D materials

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Ling, Xi; Wu, Lijun; Lin, Yuxuan; Ma, Qiong; Wang, Ziqiang; Song, Yi; Yu, Lili; Huang, Shengxi; Fang, Wenjing; Zhang, Xu; et al

    2016-01-27

    Diverse parallel stitched 2D heterostructures, including metal–semiconductor, semiconductor–semiconductor, and insulator–semiconductor, are synthesized directly through selective “sowing” of aromatic molecules as the seeds in the chemical vapor deposition (CVD) method. Lastly, the methodology enables the large-scale fabrication of lateral heterostructures, which offers tremendous potential for its application in integrated circuits.

  17. Parallel Performance of a Combustion Chemistry Simulation

    DOE Public Access Gateway for Energy & Science Beta (PAGES Beta)

    Skinner, Gregg; Eigenmann, Rudolf

    1995-01-01

    We used a description of a combustion simulation's mathematical and computational methods to develop a version for parallel execution. The result was a reasonable performance improvement on small numbers of processors. We applied several important programming techniques, which we describe, in optimizing the application. This work has implications for programming languages, compiler design, and software engineering.

  18. Collectively loading an application in a parallel computer

    DOE Patents [OSTI]

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.; Mundy, Michael B.

    2016-01-05

    Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
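
    The four steps listed above (identify a subset, select a leader, retrieve once, broadcast) can be sketched in a few lines of Python. This is an illustrative model only, not the patented implementation: `collective_load` and `read_application` are hypothetical names, the lowest node id is an assumed leader-selection rule, and a dictionary stands in for the broadcast.

```python
def collective_load(job_nodes, read_application):
    """Model of collective loading: one storage read, then a broadcast.

    job_nodes        -- ids of the compute nodes chosen to run the job
    read_application -- callable that fetches the application image
    Returns (leader id, {node id: application image}).
    """
    if not job_nodes:
        raise ValueError("a job needs at least one compute node")
    leader = min(job_nodes)      # assumed rule: lowest node id is job leader
    image = read_application()   # only the leader retrieves the application
    # Stand-in for the broadcast from the leader to every node in the subset.
    return leader, {node: image for node in job_nodes}
```

    The point of the pattern is that storage is read once, however many nodes run the job.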

  19. Multitasking TORT under UNICOS: Parallel performance models and measurements

    SciTech Connect (OSTI)

    Barnett, A.; Azmy, Y.Y.

    1999-09-27

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The parallel performance models were compared to measurements from applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.

  20. Generation Planning (pbl/generation)

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Generation Hydro Power Wind Power Monthly GSP BPA White Book Dry Year Tools Firstgov Generation Planning Thumbnail image of BPA White Book BPA White Book (1998-2014) Draft Dry...

  1. Electrostatic generator/motor configurations

    DOE Patents [OSTI]

    Post, Richard F

    2014-02-04

    Electrostatic generator/motor designs are provided that generally may include a first cylindrical stator centered about a longitudinal axis; a second cylindrical stator centered about the axis; and a first cylindrical rotor centered about the axis and located between the first cylindrical stator and the second cylindrical stator. The first cylindrical stator, the second cylindrical stator and the first cylindrical rotor may be concentrically aligned. A magnetic field having field lines about parallel with the longitudinal axis is provided.

  2. Users manual for the Chameleon parallel programming tools

    SciTech Connect (OSTI)

    Gropp, W.; Smith, B.

    1993-06-01

    Message passing is a common method for writing programs for distributed-memory parallel computers. Unfortunately, the lack of a standard for message passing has hampered the construction of portable and efficient parallel programs. In an attempt to remedy this problem, a number of groups have developed their own message-passing systems, each with its own strengths and weaknesses. Chameleon is a second-generation system of this type. Rather than replacing these existing systems, Chameleon is meant to supplement them by providing a uniform way to access many of these systems. Chameleon's goals are to (a) be very lightweight (low overhead), (b) be highly portable, and (c) help standardize program startup and the use of emerging message-passing operations such as collective operations on subsets of processors. Chameleon also provides a way to port programs written using PICL or Intel NX message passing to other systems, including collections of workstations. Chameleon is tracking the Message-Passing Interface (MPI) draft standard and will provide both an MPI implementation and an MPI transport layer. Chameleon provides support for heterogeneous computing by using p4 and PVM. Chameleon's support for homogeneous computing includes the portable libraries p4, PICL, and PVM and vendor-specific implementation for Intel NX, IBM EUI (SP-1), and Thinking Machines CMMD (CM-5). Support for Ncube and PVM 3.x is also under development.

  3. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution of a compute node, including specification of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications instruction, the instruction characterized by instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  4. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-29

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  5. Parallel pulse processing and data acquisition for high speed, low error flow cytometry

    DOE Patents [OSTI]

    Engh, G.J. van den; Stokdijk, W.

    1992-09-22

    A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate. 17 figs.

  6. Parallel pulse processing and data acquisition for high speed, low error flow cytometry

    DOE Patents [OSTI]

    van den Engh, Gerrit J.; Stokdijk, Willem

    1992-01-01

    A digitally synchronized parallel pulse processing and data acquisition system for a flow cytometer has multiple parallel input channels with independent pulse digitization and FIFO storage buffer. A trigger circuit controls the pulse digitization on all channels. After an event has been stored in each FIFO, a bus controller moves the oldest entry from each FIFO buffer onto a common data bus. The trigger circuit generates an ID number for each FIFO entry, which is checked by an error detection circuit. The system has high speed and low error rate.
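
    The buffering and ID-check scheme described in the two records above can be modeled in software. The Python sketch below is an assumption-laden illustration, not the patented circuit: the class name `ParallelAcquisition` and its methods are hypothetical, and a `deque` per channel stands in for each hardware FIFO.

```python
from collections import deque


class ParallelAcquisition:
    """Software model of per-channel FIFOs with a shared event ID."""

    def __init__(self, n_channels):
        self.fifos = [deque() for _ in range(n_channels)]
        self.next_id = 0

    def trigger(self, pulses):
        """Digitize one event: store (event ID, value) in every FIFO."""
        if len(pulses) != len(self.fifos):
            raise ValueError("one pulse value per channel is required")
        event_id = self.next_id
        self.next_id += 1
        for fifo, value in zip(self.fifos, pulses):
            fifo.append((event_id, value))
        return event_id

    def read_event(self):
        """Move the oldest entry of each FIFO to the 'bus' and check IDs."""
        entries = [fifo.popleft() for fifo in self.fifos]
        ids = {event_id for event_id, _ in entries}
        if len(ids) != 1:  # channels disagree: the error check fails
            raise RuntimeError(f"event ID mismatch: {sorted(ids)}")
        return ids.pop(), [value for _, value in entries]
```

    Because every channel stores the same ID for a given event, a mismatch at read time reveals that the channels have fallen out of step.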

  7. Impact analysis on a massively parallel computer

    SciTech Connect (OSTI)

    Zacharia, T.; Aramayo, G.A.

    1994-06-01

    Advanced mathematical techniques and computer simulation play a major role in evaluating and enhancing the design of beverage cans, industrial, and transportation containers for improved performance. Numerical models are used to evaluate the impact requirements of containers used by the Department of Energy (DOE) for transporting radioactive materials. Many of these models are highly compute-intensive. An analysis may require several hours of computational time on current supercomputers despite the simplicity of the models being studied. As computer simulations and materials databases grow in complexity, massively parallel computers have become important tools. Massively parallel computational research at the Oak Ridge National Laboratory (ORNL) and its application to the impact analysis of shipping containers is briefly described in this paper.

  8. Locating hardware faults in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-04-13

    Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.

  9. Parallel machine architecture for production rule systems

    DOE Patents [OSTI]

    Allen, Jr., John D.; Butler, Philip L.

    1989-01-01

    A parallel processing system for production rule programs utilizes a host processor for storing production rule right hand sides (RHS) and a plurality of rule processors for storing left hand sides (LHS). The rule processors operate in parallel in the Recognize phase of the system Recognize-Act Cycle to match their respective LHS's against a stored list of working memory elements (WME) in order to find a self-consistent set of WME's. The list of WME is dynamically varied during the Act phase of the system in which the host executes or fires rule RHS's for those rules for which a self-consistent set has been found by the rule processors. The host transmits instructions for creating or deleting working memory elements as dictated by the rule firings until the rule processors are unable to find any further self-consistent working memory element sets, at which time the production rule system is halted.

  10. Parallel processor-based raster graphics system architecture

    DOE Patents [OSTI]

    Littlefield, Richard J.

    1990-01-01

    An apparatus for generating raster graphics images from the graphics command stream includes a plurality of graphics processors connected in parallel, each adapted to receive any part of the graphics command stream for processing the command stream part into pixel data. The apparatus also includes a frame buffer for mapping the pixel data to pixel locations and an interconnection network for interconnecting the graphics processors to the frame buffer. Through the interconnection network, each graphics processor may access any part of the frame buffer concurrently with another graphics processor accessing any other part of the frame buffer. The plurality of graphics processors can thereby transmit concurrently pixel data to pixel locations in the frame buffer.

  11. Runtime System Library for Parallel Weather Modules

    Energy Science and Technology Software Center (OSTI)

    1997-07-22

    RSL is a Fortran-callable runtime library for use in implementing regular-grid weather forecast models, with nesting, on scalable distributed memory parallel computers. It provides high-level routines for finite-difference stencil communications and inter-domain exchange of data for nested forcing and feedback. RSL supports a unique point-wise domain-decomposition strategy to facilitate load-balancing.

  12. FORTRAN Extensions for Modular Parallel Processing

    Energy Science and Technology Software Center (OSTI)

    1996-01-12

    FORTRAN M is a small set of extensions to FORTRAN that supports a modular approach to the construction of sequential and parallel programs. FORTRAN M programs use channels to plug together processes which may be written in FORTRAN M or FORTRAN 77. Processes communicate by sending and receiving messages on channels. Channels and processes can be created dynamically, but programs remain deterministic unless specialized nondeterministic constructs are used.

  13. Parallel Integrated Thermal Management - Energy Innovation Portal

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Parallel Integrated Thermal Management. National Renewable Energy Laboratory. Technology Marketing Summary: Many current cooling systems for hybrid electric vehicles (HEVs) with a high power electric drive system utilize a low temperature liquid cooling loop for cooling the power electronics system and electric machines associated with the electric

  14. Parallel Heuristics for Scalable Community Detection

    SciTech Connect (OSTI)

    Lu, Howard; Kalyanaraman, Anantharaman; Halappanavar, Mahantesh; Choudhury, Sutanay

    2014-05-17

    Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability to problems that can be solved on desktops. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose multiple heuristics that are designed to break the sequential barrier. Our heuristics are agnostic to the underlying parallel architecture. For evaluation purposes, we implemented our heuristics on shared memory (OpenMP) and distributed memory (MapReduce-MPI) machines, and tested them over real world graphs derived from multiple application domains (internet, biological, natural language processing). Experimental results demonstrate the ability of our heuristics to converge to high modularity solutions comparable to those output by the serial algorithm in nearly the same number of iterations, while also drastically reducing time to solution.
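
    The quantity that the Louvain method optimizes can be made concrete. The Python sketch below computes the standard modularity score of a given partition, Q = sum over communities c of [L_c/m - (d_c/2m)^2], where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of c. It illustrates the objective only, not the authors' parallel heuristics; the function name `modularity` is hypothetical, and an undirected, unweighted graph without self-loops is assumed.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each undirected edge listed once
    community -- dict mapping every vertex to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    internal = defaultdict(int)  # L_c: edges with both ends in community c
    degree = defaultdict(int)    # d_c: sum of vertex degrees in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

    For two triangles joined by a single edge, placing each triangle in its own community gives Q = 5/14, while placing all six vertices in one community gives Q = 0.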

  15. Parallel Molecular Dynamics Program for Molecules

    Energy Science and Technology Software Center (OSTI)

    1995-03-07

    ParBond is a parallel classical molecular dynamics code that models bonded molecular systems, typically of an organic nature. It uses classical force fields for both non-bonded Coulombic and Van der Waals interactions and for 2-, 3-, and 4-body bonded (bond, angle, dihedral, and improper) interactions. It integrates Newton's equation of motion for the molecular system and evaluates various thermodynamical properties of the system as it progresses.

  16. Parallel log structured file system collective buffering to achieve a compact representation of scientific and/or dimensional data

    DOE Patents [OSTI]

    Grider, Gary A.; Poole, Stephen W.

    2015-09-01

    Collective buffering and data pattern solutions are provided for storage, retrieval, and/or analysis of data in a collective parallel processing environment. For example, a method can be provided for data storage in a collective parallel processing environment. The method comprises receiving data to be written for a plurality of collective processes within a collective parallel processing environment, extracting a data pattern for the data to be written for the plurality of collective processes, generating a representation describing the data pattern, and saving the data and the representation.

  17. Distributed Generation

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Untapped Value of Backup Generation While new guidelines and regulations such as IEEE (Institute of Electrical and Electronics Engineers) 1547 have come a long way in addressing interconnection standards for distributed generation, utilities have largely overlooked the untapped potential of these resources. Under certain conditions, these units (primarily backup generators) represent a significant source of power that can deliver utility services at lower costs than traditional centralized

  18. Parallel Environment for the Creation of Stochastics 1.0

    Energy Science and Technology Software Center (OSTI)

    2011-01-06

    PECOS is a computational library for creating and manipulating realizations of stochastic quantities, including scalar uncertain variables, random fields, and stochastic processes. It offers a unified interface to univariate and multivariate polynomial approximations using either orthogonal or interpolation polynomials; numerical integration drivers for Latin hypercube sampling, quadrature, cubature, and sparse grids; and fast Fourier transforms using third party libraries. The PECOS core also offers statistical utilities and transformations between various representations of stochastic uncertainty. PECOS provides a C++ API through which users can generate and transform realizations of stochastic quantities. It is currently used by Sandia's DAKOTA, Stokhos, and Encore software packages for uncertainty quantification and verification. PECOS generates random sample sets and multi-dimensional integration grids, typically used in forward propagation of scalar uncertainty in computational models (uncertainty quantification (UQ)). PECOS also generates samples of random fields (RFs) and stochastic processes (SPs) from a set of user-defined power spectral densities (PSDs). The RF/SP may be either Gaussian or non-Gaussian and either stationary or nonstationary, and the resulting sample is intended for run-time query by parallel finite element simulation codes. Finally, PECOS supports nonlinear transformations of random variables via the Nataf transformation and extensions.

  19. Xyce parallel electronic simulator : reference guide.

    SciTech Connect (OSTI)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list exhaustively (to the extent possible) device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.

  20. Xyce(™) Parallel Electronic Simulator

    Energy Science and Technology Software Center (OSTI)

    2013-10-03

    The Xyce Parallel Electronic Simulator simulates electronic circuit behavior in DC, AC, HB, MPDE and transient mode using standard analog (DAE) and/or device (PDE) device models, including several age- and radiation-aware devices. It supports a variety of computing platforms, both serial and parallel. Lastly, it uses a variety of modern solution algorithms, dynamic parallel load-balancing, and iterative solvers. Xyce is primarily used to simulate the voltage and current behavior of a circuit network (a network of electronic devices connected via a conductive network). As a tool, it is mainly used for the design and analysis of electronic circuits. Kirchhoff's conservation laws are enforced over a network using modified nodal analysis. This results in a set of differential algebraic equations (DAEs). The resulting nonlinear problem is solved iteratively using a fully coupled Newton method, which in turn results in a linear system that is solved by either a standard sparse-direct solver or iteratively using Trilinos linear solver packages, also developed at Sandia National Laboratories.
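
    The combination of modified nodal analysis and Newton iteration described in this abstract can be sketched for a DC operating point. This is not Xyce code: the circuit (a voltage source, a resistor, and a diode), the component values, the initial guess, and all function names are assumptions made for illustration, and a small dense Gaussian elimination stands in for the sparse-direct or Trilinos solvers. The unknowns are the two node voltages plus the source branch current, which is the extra unknown that modified nodal analysis introduces for a voltage source.

```python
import math

# Assumed component values (illustrative only).
VS, R, IS, VT = 5.0, 1000.0, 1e-14, 0.02585


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def residual(x):
    """KCL at node 1, KCL at node 2, and the voltage-source constraint."""
    v1, v2, i_src = x
    i_diode = IS * (math.exp(v2 / VT) - 1.0)
    return [(v1 - v2) / R + i_src, (v2 - v1) / R + i_diode, v1 - VS]


def jacobian(x):
    """Derivative of the residual with respect to (v1, v2, i_src)."""
    g_diode = IS / VT * math.exp(x[1] / VT)
    return [[1 / R, -1 / R, 1.0],
            [-1 / R, 1 / R + g_diode, 0.0],
            [1.0, 0.0, 0.0]]


def newton(x, tol=1e-12, max_iter=200):
    """Newton iteration: each step solves the linear system J dx = -F."""
    for _ in range(max_iter):
        dx = solve_linear(jacobian(x), [-f for f in residual(x)])
        x = [xi + di for xi, di in zip(x, dx)]
        if max(abs(d) for d in dx) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


solution = newton([VS, 0.6, 0.0])  # assumed initial guess near a diode drop
```

    The sketch uses plain Newton steps with no damping or voltage limiting, so it relies on the assumed starting guess; a transient analysis would also add time-derivative terms that are omitted here.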

  1. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    SciTech Connect (OSTI)

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design is given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  2. Distributed Generation

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    and regulations such as IEEE (Institute of Electrical and Electronics Engineers) 1547 have come a long way in addressing interconnection standards for distributed generation, ...

  3. Storing files in a parallel computing system based on user-specified parser function

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  4. A mirror for lab-based quasi-monochromatic parallel x-rays

    SciTech Connect (OSTI)

    Nguyen, Thanhhai; Lu, Xun; Lee, Chang Jun; Jeon, Insu; Jung, Jin-Ho; Jin, Gye-Hwan; Kim, Sung Youb

    2014-09-15

    A multilayered parabolic mirror with six W/Al bilayers was designed and fabricated to generate monochromatic parallel x-rays using a lab-based x-ray source. Using this mirror, curved bright bands were obtained in x-ray images as reflected x-rays. The parallelism of the reflected x-rays was investigated using the shape of the bands. The intensity and monochromatic characteristics of the reflected x-rays were evaluated through measurements of the x-ray spectra in the band. High intensity, nearly monochromatic, and parallel x-rays, which can be used for high resolution x-ray microscopes and local radiation therapy systems, were obtained.

  5. Magnetohydrodynamic generator electrode

    DOE Patents [OSTI]

    Marchant, David D.; Killpatrick, Don H.; Herman, Harold; Kuczen, Kenneth D.

    1979-01-01

    An improved electrode for use as a current collector in the channel of a magnetohydrodynamic (MHD) generator utilizes an elongated monolithic cap of dense refractory material compliantly mounted to the MHD channel frame for collecting the current. The cap has a central longitudinal channel which contains a first layer of porous refractory ceramic as a high-temperature current leadout from the cap and a second layer of resilient wire mesh in contact with the first layer as a low-temperature current leadout between the first layer and the frame. Also described is a monolithic ceramic insulator compliantly mounted to the frame parallel to the electrode by a plurality of flexible metal strips.

  6. Characterizing the parallelism in rule-based expert systems

    SciTech Connect (OSTI)

    Douglass, R.J.

    1984-01-01

    A brief review of two classes of rule-based expert systems is presented, followed by a detailed analysis of potential sources of parallelism at the production or rule level, the subrule level (including match, select, and act parallelism), and at the search level (including AND, OR, and stream parallelism). The potential amount of parallelism from each source is discussed and characterized in terms of its granularity, inherent serial constraints, efficiency, speedup, dynamic behavior, and communication volume, frequency, and topology. Subrule parallelism will yield, at best, two- to tenfold speedup, and rule level parallelism will yield a modest speedup on the order of 5 to 10 times. Rule level can be combined with OR, AND, and stream parallelism in many instances to yield further parallel speedups.

  7. A Massively Parallel Solver for the Mechanical Harmonic Analysis...

    Office of Scientific and Technical Information (OSTI)

    A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities. ACE3P is a 3D massively parallel simulation suite that...

  8. SimFS: A Large Scale Parallel File System Simulator

    Energy Science and Technology Software Center (OSTI)

    2011-08-30

    The software provides both framework and tools to simulate a large-scale parallel file system such as Lustre.

  9. Parallelizing AT with MatlabMPI

    SciTech Connect (OSTI)

    Li, Evan Y.; /Brown U. /SLAC

    2011-06-22

    The Accelerator Toolbox (AT) is a high-level collection of tools and scripts specifically oriented toward solving problems dealing with computational accelerator physics. It is integrated into the MATLAB environment, which provides an accessible, intuitive interface for accelerator physicists, allowing researchers to focus the majority of their efforts on simulations and calculations, rather than programming and debugging difficulties. Efforts toward parallelization of AT have been put in place to upgrade its performance to modern standards of computing. We utilized the packages MatlabMPI and pMatlab, which were developed by MIT Lincoln Laboratory, to set up a message-passing environment that could be called within MATLAB, which set up the necessary prerequisites for multithread processing capabilities. On local quad-core CPUs, we were able to demonstrate processor efficiencies of roughly 95% and speed increases of nearly 380%. By exploiting the efficacy of modern-day parallel computing, we were able to demonstrate incredibly efficient speed increments per processor in AT's beam-tracking functions. Extrapolating from prediction, we can expect to reduce week-long computation runtimes to less than 15 minutes. This is a huge performance improvement and has enormous implications for the future computing power of the accelerator physics group at SSRL. However, one of the downfalls of parringpass is its current lack of transparency; the pMatlab and MatlabMPI packages must first be well-understood by the user before the system can be configured to run the scripts. In addition, the instantiation of argument parameters requires internal modification of the source code. Thus, parringpass cannot be directly run from the MATLAB command line, which detracts from its flexibility and user-friendliness. Future work in AT's parallelization will focus on development of external functions and scripts that can be called from within MATLAB and configured on multiple nodes, while

  10. Large-eddy simulation of the Rayleigh-Taylor instability on a massively parallel computer

    SciTech Connect (OSTI)

    Amala, P.A.K.

    1995-03-01

    A computational model for the solution of the three-dimensional Navier-Stokes equations is developed. This model includes a turbulence model: a modified Smagorinsky eddy-viscosity with a stochastic backscatter extension. The resultant equations are solved using finite difference techniques: the second-order explicit Lax-Wendroff schemes. This computational model is implemented on a massively parallel computer. Programming models on massively parallel computers are next studied. It is desired to determine the best programming model for the developed computational model. To this end, three different codes are tested on a current massively parallel computer: the CM-5 at Los Alamos. Each code uses a different programming model: one is a data parallel code; the other two are message passing codes. Timing studies are done to determine which method is the fastest. The data parallel approach turns out to be the fastest method on the CM-5 by at least an order of magnitude. The resultant code is then used to study a current problem of interest to the computational fluid dynamics community. This is the Rayleigh-Taylor instability. The Lax-Wendroff methods handle shocks and sharp interfaces poorly. To this end, the Rayleigh-Taylor linear analysis is modified to include a smoothed interface. The linear growth rate problem is then investigated. Finally, the problem of the randomly perturbed interface is examined. Stochastic backscatter breaks the symmetry of the stationary unstable interface and generates a mixing layer growing at the experimentally observed rate. 115 refs., 51 figs., 19 tabs.

  11. A brief parallel I/O tutorial.

    SciTech Connect (OSTI)

    Ward, H. Lee

    2010-03-01

    This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. 
    The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about

  12. Parallel State Estimation Assessment with Practical Data

    SciTech Connect (OSTI)

    Chen, Yousu; Jin, Shuangshuang; Rice, Mark J.; Huang, Zhenyu

    2014-10-31

    This paper presents a full-cycle parallel state estimation (PSE) implementation using a preconditioned conjugate gradient algorithm. The developed code is able to solve large-size power system state estimation within 5 seconds using real-world data, comparable to the Supervisory Control And Data Acquisition (SCADA) rate. This achievement allows the operators to know the system status much faster to help improve grid reliability. Case study results of the Bonneville Power Administration (BPA) system with real measurements are presented. The benefits of fast state estimation are also discussed.
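
    The preconditioned conjugate gradient method named in this abstract can be sketched on a small symmetric positive definite system. This is not the authors' code: the abstract does not say which preconditioner was used, so the Jacobi (diagonal) preconditioner, the 3-by-3 test matrix, and the names `pcg`, `matvec`, and `dot` are assumptions for illustration only.

```python
def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(a, b, tol=1e-10, max_iter=100):
    """Solve a x = b for symmetric positive definite a using conjugate
    gradients with an assumed Jacobi (diagonal) preconditioner."""
    inv_diag = [1.0 / a[i][i] for i in range(len(b))]
    x = [0.0] * len(b)
    r = b[:]                                    # residual b - a x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]  # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b)
```

    A production solver for large power systems would use sparse storage and distribute the matrix-vector products across processors; the dense lists here only keep the sketch self-contained.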

  13. Parallel heater system for subsurface formations

    DOE Patents [OSTI]

    Harris, Christopher Kelvin (Houston, TX); Karanikas, John Michael (Houston, TX); Nguyen, Scott Vinh (Houston, TX)

    2011-10-25

    A heating system for a subsurface formation is disclosed. The system includes a plurality of substantially horizontally oriented or inclined heater sections located in a hydrocarbon containing layer in the formation. At least a portion of two of the heater sections are substantially parallel to each other. The ends of at least two of the heater sections in the layer are electrically coupled to a substantially horizontal, or inclined, electrical conductor oriented substantially perpendicular to the ends of the at least two heater sections.

  14. Requirements for Parallel I/O,

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Requirements for Parallel I/O, Visualization and Analysis. Prabhat (LBL/NERSC), Quincey Koziol (The HDF Group). NERSC ASCR Requirements for 2017, January 15, 2014, LBNL. 1. Project Description: m636 repo; LBL Vis Base Program (Bethel PI) [PM: Nowell]: Conduct fundamental and applied vis/analytics R&D to address exascale challenges; ExaHDF5 Project (Prabhat, Quincey PIs) [PM: Nowell]: Scale Parallel I/O, and data management technologies for current

  15. Carbothermic reduction with parallel heat sources

    DOE Patents [OSTI]

    Troup, Robert L.; Stevenson, David T.

    1984-12-04

    Disclosed are apparatus and method of carbothermic direct reduction for producing an aluminum alloy from a raw material mix including aluminum oxide, silicon oxide, and carbon wherein parallel heat sources are provided by a combustion heat source and by an electrical heat source at essentially the same position in the reactor, e.g., such as at the same horizontal level in the path of a gravity-fed moving bed in a vertical reactor. The present invention includes providing at least 79% of the heat energy required in the process by the electrical heat source.

  16. Digitally programmable signal generator and method

    DOE Patents [OSTI]

    Priatko, G.J.; Kaskey, J.A.

    1989-11-14

    Disclosed is a digitally programmable waveform generator for generating completely arbitrary digital or analog waveforms from very low frequencies to frequencies in the gigasample per second range. A memory array with multiple parallel outputs is addressed; then the parallel output data is latched into buffer storage from which it is serially multiplexed out at a data rate many times faster than the access time of the memory array itself. While data is being multiplexed out serially, the memory array is accessed with the next required address and presents its data to the buffer storage before the serial multiplexing of the last group of data is completed, allowing this new data to then be latched into the buffer storage for smooth continuous serial data output. In a preferred implementation, a plurality of these serial data outputs are paralleled to form the input to a digital to analog converter, providing a programmable analog output. 6 figs.

  17. Digitally programmable signal generator and method

    DOE Patents [OSTI]

    Priatko, Gordon J.; Kaskey, Jeffrey A.

    1989-01-01

    A digitally programmable waveform generator for generating completely arbitrary digital or analog waveforms from very low frequencies to frequencies in the gigasample per second range. A memory array with multiple parallel outputs is addressed; then the parallel output data is latched into buffer storage from which it is serially multiplexed out at a data rate many times faster than the access time of the memory array itself. While data is being multiplexed out serially, the memory array is accessed with the next required address and presents its data to the buffer storage before the serial multiplexing of the last group of data is completed, allowing this new data to then be latched into the buffer storage for smooth continuous serial data output. In a preferred implementation, a plurality of these serial data outputs are paralleled to form the input to a digital to analog converter, providing a programmable analog output.

  18. Processing data communications events by awakening threads in parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2016-03-15

    Processing data communications events in a parallel active messaging interface (`PAMI`) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.

  19. Switch for serial or parallel communication networks

    DOE Patents [OSTI]

    Crosette, D.B.

    1994-07-19

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination. 9 figs.

  20. Switch for serial or parallel communication networks

    DOE Patents [OSTI]

    Crosette, Dario B.

    1994-01-01

    A communication switch apparatus and a method for use in a geographically extensive serial, parallel or hybrid communication network linking a multi-processor or parallel processing system has a very low software processing overhead in order to accommodate random bursts of high density data. Associated with each processor is a communication switch. A data source and a data destination, a sensor suite or robot for example, may also be associated with a switch. The configuration of the switches in the network is coordinated through a master processor node and depends on the operational phase of the multi-processor network: data acquisition, data processing, and data exchange. The master processor node passes information on the state to be assumed by each switch to the processor node associated with the switch. The processor node then operates a series of multi-state switches internal to each communication switch. The communication switch does not parse and interpret communication protocol and message routing information. During a data acquisition phase, the communication switch couples sensors producing data to the processor node associated with the switch, to a downlink destination on the communications network, or to both. It also may couple an uplink data source to its processor node. During the data exchange phase, the switch couples its processor node or an uplink data source to a downlink destination (which may include a processor node or a robot), or couples an uplink source to its processor node and its processor node to a downlink destination.

  1. Parallel tetrahedral mesh refinement with MOAB.

    SciTech Connect (OSTI)

    Thompson, David C.; Pebay, Philippe Pierre

    2008-12-01

    In this report, we present the novel functionality of parallel tetrahedral mesh refinement which we have implemented in MOAB. This report details work done to implement parallel, edge-based, tetrahedral refinement into MOAB. The theoretical basis for this work is contained in [PT04, PT05, TP06] while information on design, performance, and operation specific to MOAB is contained herein. As MOAB is intended mainly for use in pre-processing and simulation (as opposed to the post-processing bent of previous papers), the primary use case is different: rather than refining elements with non-linear basis functions, the goal is to increase the number of degrees of freedom in some region in order to more accurately represent the solution to some system of equations that cannot be solved analytically. Also, MOAB has a unique mesh representation which impacts the algorithm. This introduction contains a brief review of streaming edge-based tetrahedral refinement. The remainder of the report is broken into three sections: design and implementation, performance, and conclusions. Appendix A contains instructions for end users (simulation authors) on how to employ the refiner.

  2. Fuel dissipater for pressurized fuel cell generators

    DOE Patents [OSTI]

    Basel, Richard A.; King, John E.

    2003-11-04

    An apparatus and method are disclosed for eliminating the chemical energy of fuel remaining in a pressurized fuel cell generator (10) when the electrical power output of the fuel cell generator is terminated during transient operation, such as a shutdown; where, two electrically resistive elements (two of 28, 53, 54, 55) at least one of which is connected in parallel, in association with contactors (26, 57, 58, 59), a multi-point settable sensor relay (23) and a circuit breaker (24), are automatically connected across the fuel cell generator terminals (21, 22) at two or more contact points, in order to draw current, thereby depleting the fuel inventory in the generator.

  3. Paradyn a parallel nonlinear, explicit, three-dimensional finite-element code for solid and structural mechanics user manual

    SciTech Connect (OSTI)

    Hoover, C G; DeGroot, A J; Sherwood, R J

    2000-06-01

    ParaDyn is a parallel version of the DYNA3D computer program, a three-dimensional explicit finite-element program for analyzing the dynamic response of solids and structures. The ParaDyn program has been used as a production tool for over three years for analyzing problems which range in size from a few tens of thousands of elements to between one-million and ten-million elements. ParaDyn runs on parallel computers provided by the Department of Energy Accelerated Strategic Computing Initiative (ASCI) and the Department of Defense High Performance Computing and Modernization Program. Preprocessing and post-processing software utilities and tools are designed to facilitate the generation of partitioned domains for processors on a massively parallel computer and the visualization of both resultant data and boundary data generated in a parallel simulation. This manual provides a brief overview of the parallel implementation; describes techniques for running the ParaDyn program, tools and utilities; and provides examples of parallel simulations.

  4. PULSE GENERATOR

    DOE Patents [OSTI]

    Roeschke, C.W.

    1957-09-24

    An improvement in pulse generators is described by which there are produced pulses of a duration from about 1 to 10 microseconds with a truly flat top and extremely rapid rise and fall. The pulses are produced by triggering from a separate input or by modifying the current to operate as a free-running pulse generator. In its broad aspect, the disclosed pulse generator comprises a first tube with an anode capacitor and grid circuit which controls the firing; a second tube series connected in the cathode circuit of the first tube such that discharge of the first tube places a voltage across it as the leading edge of the desired pulse; and an integrator circuit from the plate across the grid of the second tube to control the discharge time of the second tube, determining the pulse length.

  5. Microwave generator

    DOE Patents [OSTI]

    Kwan, T.J.T.; Snell, C.M.

    1987-03-31

    A microwave generator is provided for generating microwaves substantially from virtual cathode oscillation. Electrons are emitted from a cathode and accelerated to an anode which is spaced apart from the cathode. The anode has an annular slit therethrough effective to form the virtual cathode. The anode is at least one range thickness relative to electrons reflecting from the virtual cathode. A magnet is provided to produce an optimum magnetic field having the field strength effective to form an annular beam from the emitted electrons in substantial alignment with the annular anode slit. The magnetic field, however, does permit the reflected electrons to axially diverge from the annular beam. The reflected electrons are absorbed by the anode in returning to the real cathode, such that substantially no reflexing electrons occur. The resulting microwaves are produced with a single dominant mode and are substantially monochromatic relative to conventional virtual cathode microwave generators. 6 figs.

  6. Sub-Second Parallel State Estimation

    SciTech Connect (OSTI)

    Chen, Yousu; Rice, Mark J.; Glaesemann, Kurt R.; Wang, Shaobu; Huang, Zhenyu

    2014-10-31

    This report describes the performance of the Pacific Northwest National Laboratory (PNNL) sub-second parallel state estimation (PSE) tool using the utility data from the Bonneville Power Administration (BPA) and discusses the benefits of the fast computational speed for power system applications. The test data were provided by BPA. They are two days’ worth of hourly snapshots that include power system data and measurement sets in a commercial tool format. These data are extracted from the commercial tool box and fed into the PSE tool. With the help of advanced solvers, the PSE tool is able to solve each BPA hourly state estimation problem within one second, which is more than 10 times faster than today’s commercial tool. This improved computational performance can help increase the reliability value of state estimation in many aspects: (1) the shorter the time required for execution of state estimation, the more time remains for operators to take appropriate actions, and/or to apply automatic or manual corrective control actions. This increases the chances of arresting or mitigating the impact of cascading failures; (2) the SE can be executed multiple times within time allowance. Therefore, the robustness of SE can be enhanced by repeating the execution of the SE with adaptive adjustments, including removing bad data and/or adjusting different initial conditions to compute a better estimate within the same time as a traditional state estimator’s single estimate. There are other benefits with the sub-second SE, such as that the PSE results can potentially be used in local and/or wide-area automatic corrective control actions that are currently dependent on raw measurements to minimize the impact of bad measurements, and provide opportunities to enhance the power grid reliability and efficiency. PSE also can enable other advanced tools that rely on SE outputs and could be used to further improve operators’ actions and automated controls to mitigate effects

  7. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Jones, Terry R.; Koenig, Gregory A.

    2014-04-30

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming application and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.

  8. Parallel detecting, spectroscopic ellipsometers/polarimeters

    DOE Patents [OSTI]

    Furtak, Thomas E.

    2002-01-01

    The parallel detecting spectroscopic ellipsometer/polarimeter sensor has no moving parts and operates in real-time for in-situ monitoring of the thin film surface properties of a sample within a processing chamber. It includes a multi-spectral source of radiation for producing a collimated beam of radiation directed towards the surface of the sample through a polarizer. The thus polarized collimated beam of radiation impacts and is reflected from the surface of the sample, thereby changing its polarization state due to the intrinsic material properties of the sample. The light reflected from the sample is separated into four separate polarized filtered beams, each having individual spectral intensities. Data about said four individual spectral intensities is collected within the processing chamber, and is transmitted into one or more spectrometers. The data of all four individual spectral intensities is then analyzed using transformation algorithms, in real-time.

  9. Internode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Miller, Douglas R.; Parker, Jeffrey J.; Ratterman, Joseph D.; Smith, Brian E.

    2013-09-03

    Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.
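
    The early-arrival buffering described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `MessagingUnit` and `Process`, the use of Python lists as buffers, and the integer process ids are all assumptions made for the example.

```python
class MessagingUnit:
    """Hypothetical stand-in for a node's messaging unit."""

    def __init__(self, process_ids):
        # At boot time: allocate one message buffer per process that
        # will later be initialized on this compute node.
        self.buffers = {pid: [] for pid in process_ids}

    def receive(self, pid, message):
        # A message may arrive before process `pid` exists; it is
        # stored in the buffer reserved for that process.
        self.buffers[pid].append(message)


class Process:
    """Hypothetical process that picks up early messages when it starts."""

    def __init__(self, pid, unit):
        self.pid = pid
        # Upon initialization: establish a buffer in main memory and
        # copy over whatever the messaging unit stored for this process.
        self.main_memory_buffer = list(unit.buffers[pid])


unit = MessagingUnit(process_ids=[0, 1])
unit.receive(1, "early message")   # process 1 is not initialized yet
proc = Process(1, unit)            # initialization copies the message
```

    The point of the design, as far as the abstract states it, is that a sender never has to check whether the receiving process has started.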

  10. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Faraj, Ahmad A

    2013-04-16

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
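
    The forwarding rules in this abstract can be simulated on a small tree. The sketch below is only an illustration under stated assumptions: the tree is given as a hypothetical `parent` dictionary (the physical root maps to `None`), and message delivery is modeled with a simple queue rather than real network hardware.

```python
from collections import deque


def broadcast(parent, logical_root):
    """Return the nodes in the order they obtain the message.

    `parent` maps each node to its parent in the physical tree;
    the physical root maps to None.
    """
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    received = [logical_root]
    queue = deque()

    # The logical root transmits to every directly connected node.
    first_hop = list(children[logical_root])
    if parent[logical_root] is not None:
        first_hop.append(parent[logical_root])
    for target in first_hop:
        queue.append((target, logical_root))

    while queue:
        node, sender = queue.popleft()
        received.append(node)
        if parent[node] is None:
            # Physical root: all children except the sending child.
            targets = [c for c in children[node] if c != sender]
        elif sender == parent[node]:
            # Received from the parent: all children (none for a leaf).
            targets = list(children[node])
        else:
            # Received from a child: the other children, then the parent.
            targets = [c for c in children[node] if c != sender]
            targets.append(parent[node])
        for target in targets:
            queue.append((target, node))
    return received


# Physical tree: 0 is the physical root; 1 and 2 are its children;
# 3 and 4 are children of 1; 5 is a child of 2.
tree = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
order = broadcast(tree, logical_root=3)
```

    Because the message never travels back over the link it arrived on, each node in the tree receives it exactly once.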

  11. Intranode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

    2013-07-23

    Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.

  12. Intranode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Ratterman, Joseph D; Smith, Brian E

    2014-01-07

    Intranode data communications in a parallel computer that includes compute nodes configured to execute processes, where the data communications include: allocating, upon initialization of a first process of a compute node, a region of shared memory; establishing, by the first process, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; sending, to a second process on the same compute node, a data communications message without determining whether the second process has been initialized, including storing the data communications message in the message buffer of the second process; and upon initialization of the second process: retrieving, by the second process, a pointer to the second process's message buffer; and retrieving, by the second process from the second process's message buffer in dependence upon the pointer, the data communications message sent by the first process.

  13. Internode data communications in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Miller, Douglas R; Parker, Jeffrey J; Ratterman, Joseph D; Smith, Brian E

    2014-02-11

    Internode data communications in a parallel computer that includes compute nodes that each include main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time: a messaging unit allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a messaging buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.

  14. Optimized data communications in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A

    2014-10-21

    A parallel computer includes nodes that include a network adapter that couples the node in a point-to-point network and supports communications in opposite directions of each dimension. Optimized communications include: receiving, by a network adapter of a receiving compute node, a packet--from a source direction--that specifies a destination node and deposit hints. Each hint is associated with a direction within which the packet is to be deposited. If a hint indicates the packet to be deposited in the opposite direction: the adapter delivers the packet to an application on the receiving node; forwards the packet to a next node in the opposite direction if the receiving node is not the destination; and forwards the packet to a node in a direction of a subsequent dimension if the hints indicate that the packet is to be deposited in the direction of the subsequent dimension.

  15. Clock Agreement Among Parallel Supercomputer Nodes

    DOE Data Explorer [Office of Scientific and Technical Information (OSTI)]

    Jones, Terry R.; Koenig, Gregory A.

    This dataset presents measurements that quantify the clock synchronization time-agreement characteristics among several high performance computers including the current world's most powerful machine for open science, the U.S. Department of Energy's Titan machine sited at Oak Ridge National Laboratory. These ultra-fast machines derive much of their computational capability from extreme node counts (over 18000 nodes in the case of the Titan machine). Time-agreement is commonly utilized by parallel programming applications and tools, distributed programming applications and tools, and system software. Our time-agreement measurements detail the degree of time variance between nodes and how that variance changes over time. The dataset includes empirical measurements and the accompanying spreadsheets.

  16. LAPACK BLAS Parallel BLAS ScaLAPACK

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    [Figure residue: diagram labeled ScaLAPACK, PBLAS (Parallel BLAS), LAPACK, BLAS, BLACS (Basic Linear Algebra Communication Subprograms), and message passing primitives (e.g., MPI, PVM), with "Local Addressing" and "Global Addressing" annotations; man pages referenced: intro_blas3, intro_blacs, intro_lapack, intro_scalapack; matrix illustration with dimensions M, N and block sizes MB, NB.]

  17. Optimized data communications in a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2014-08-19

    A parallel computer includes nodes that include a network adapter that couples the node in a point-to-point network and supports communications in opposite directions of each dimension. Optimized communications include: receiving, by a network adapter of a receiving compute node, a packet--from a source direction--that specifies a destination node and deposit hints. Each hint is associated with a direction within which the packet is to be deposited. If a hint indicates the packet to be deposited in the opposite direction: the adapter delivers the packet to an application on the receiving node; forwards the packet to a next node in the opposite direction if the receiving node is not the destination; and forwards the packet to a node in a direction of a subsequent dimension if the hints indicate that the packet is to be deposited in the direction of the subsequent dimension.

  18. CS-Studio Scan System Parallelization

    SciTech Connect (OSTI)

    Kasemir, Kay; Pearson, Matthew R

    2015-01-01

    For several years, the Control System Studio (CS-Studio) Scan System has successfully automated the operation of beam lines at the Oak Ridge National Laboratory (ORNL) High Flux Isotope Reactor (HFIR) and Spallation Neutron Source (SNS). As it is applied to additional beam lines, we need to support simultaneous adjustments of temperatures or motor positions. While this can be implemented via virtual motors or similar logic inside the Experimental Physics and Industrial Control System (EPICS) Input/Output Controllers (IOCs), doing so requires a priori knowledge of experimenters' requirements. By adding support for the parallel control of multiple process variables (PVs) to the Scan System, we can better support ad hoc automation of experiments that benefit from such simultaneous PV adjustments.

  19. Parallelism of the SANDstorm hash algorithm.

    SciTech Connect (OSTI)

    Torgerson, Mark Dolan; Draelos, Timothy John; Schroeppel, Richard Crabtree

    2009-09-01

    Mainstream cryptographic hashing algorithms are not parallelizable. This limits their speed, and they cannot take advantage of the current trend toward multi-core platforms. Limited speed in turn limits their usefulness as an authentication mechanism in secure communications. Sandia researchers have created a new cryptographic hashing algorithm, SANDstorm, which was specifically designed to take advantage of multi-core processing and be parallelizable on a wide range of platforms. This report describes a late-start LDRD effort to verify the parallelizability claims of the SANDstorm designers. We have shown, with operating code and bench testing, that the SANDstorm algorithm may be trivially parallelized on a wide range of hardware platforms. Implementations using OpenMP demonstrate a linear speedup with multiple cores. We have also shown significant performance gains with optimized C code and the use of assembly instructions to exploit particular platform capabilities.

  20. Broadcasting a message in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Faraj, Daniel A

    2014-11-18

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.

  1. Link failure detection in a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J.; Blocksome, Michael A.; Megerian, Mark G.; Smith, Brian E.

    2010-11-09

    Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
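
    The grouping step in this abstract amounts to a checkerboard coloring of the mesh, so that every neighbor of a first-group node belongs to the second group. The sketch below simulates only the first test phase on a 2-D mesh; the function names, the parity rule `(x + y) % 2`, and the `broken` set of directed links are assumptions made for illustration, not details taken from the patent.

```python
def assign_group(x, y):
    # Checkerboard assignment: adjacent nodes always differ in parity.
    return (x + y) % 2  # 0 = first group, 1 = second group


def neighbors(x, y, width, height):
    # Yield the mesh neighbors of (x, y) that lie inside the mesh.
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def detect_failed_links(width, height, broken):
    """Return directed links whose test message never arrived.

    `broken` is a hypothetical set of (source, destination) pairs
    that silently drop messages.
    """
    delivered = set()
    # Every first-group node sends a test message to each neighbor.
    for x in range(width):
        for y in range(height):
            if assign_group(x, y) == 0:
                for dst in neighbors(x, y, width, height):
                    if ((x, y), dst) not in broken:
                        delivered.add(((x, y), dst))

    missing = []
    # Every second-group node checks each adjacent first-group node.
    for x in range(width):
        for y in range(height):
            if assign_group(x, y) == 1:
                for src in neighbors(x, y, width, height):
                    if (src, (x, y)) not in delivered:
                        missing.append((src, (x, y)))
    return missing


failed = detect_failed_links(3, 3, broken={((0, 0), (1, 0))})
```

    This phase exercises only the links leading from the first group to the second group; checking the opposite direction would require the roles to be swapped.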

  2. Substantially parallel flux uncluttered rotor machines

    DOE Patents [OSTI]

    Hsu, John S.

    2012-12-11

    A permanent magnet-less and brushless synchronous system includes a stator that generates a magnetic rotating field when sourced by polyphase alternating currents. An uncluttered rotor is positioned within the magnetic rotating field and is spaced apart from the stator. An excitation core is spaced apart from the stator and the uncluttered rotor and magnetically couples the uncluttered rotor. The brushless excitation source generates a magnet torque by inducing magnetic poles near an outer peripheral surface of the uncluttered rotor, and the stator currents also generate a reluctance torque by a reaction of the difference between the direct and quadrature magnetic paths of the uncluttered rotor. The system can be used either as a motor or a generator.

  3. Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale

    SciTech Connect (OSTI)

    Daily, Jeffrey A.

    2015-05-01

    The field of bioinformatics and computational biology is currently experiencing a data revolution. The exciting prospect of making fundamental biological discoveries is fueling the rapid development and deployment of numerous cost-effective, high-throughput next-generation sequencing technologies. The result is that the DNA and protein sequence repositories are being bombarded with new sequence information. Databases are continuing to report a Moore’s law-like growth trajectory in their database sizes, roughly doubling every 18 months. In what seems to be a paradigm-shift, individual projects are now capable of generating billions of raw sequence data that need to be analyzed in the presence of already annotated sequence information. While it is clear that data-driven methods, such as sequence homology detection, are becoming the mainstay in the field of computational life sciences, the algorithmic advancements essential for implementing complex data analytics at scale have mostly lagged behind. Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or “homologous”) on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment at large scale is currently not feasible; instead, heuristic methods are used at the expense of quality. In this dissertation, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale. Results for a collection of 2.56M sequences show parallel efficiencies of ~75-100% on up to 8K cores.

  4. Methods and apparatus for multi-resolution replication of files in a parallel computing system using semantic information

    DOE Patents [OSTI]

    Faibish, Sorin; Bent, John M.; Tzelnic, Percy; Grider, Gary; Torres, Aaron

    2015-10-20

    Techniques are provided for storing files in a parallel computing system using different resolutions. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a sub-file. The method comprises the steps of obtaining semantic information related to the file; generating a plurality of replicas of the file with different resolutions based on the semantic information; and storing the file and the plurality of replicas of the file in one or more storage nodes of the parallel computing system. The different resolutions comprise, for example, a variable number of bits and/or a different sub-set of data elements from the file. A plurality of the sub-files can be merged to reproduce the file.
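
    One way to picture replicas "with different resolutions" is shown below. It is a sketch under loud assumptions: the `semantic_info` dictionary, its `strides` and `formats` keys, and the replica names are hypothetical, and the file is modeled as a plain list of floats. Striding stands in for "a different sub-set of data elements", and packing with fewer bytes per value stands in for "a variable number of bits".

```python
import struct


def replicate(values, semantic_info):
    """Build lower-resolution replicas of a list of floats.

    `semantic_info` is a hypothetical description of the file:
      "strides": keep every n-th element (subset of data elements)
      "formats": struct codes, "f" = 32-bit and "e" = 16-bit floats
    """
    replicas = {}
    for stride in semantic_info["strides"]:
        replicas[f"stride{stride}"] = values[::stride]
    for fmt in semantic_info["formats"]:
        # "<" selects standard sizes, so "e" is always 2 bytes per value.
        packed = struct.pack(f"<{len(values)}{fmt}", *values)
        replicas[f"format_{fmt}"] = packed
    return replicas


values = [float(i) for i in range(8)]
info = {"strides": [2, 4], "formats": ["f", "e"]}
replicas = replicate(values, info)
```

    A reader that needs only a coarse view could then fetch the smallest replica instead of the full file.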

  5. Magnetocumulative generator

    DOE Patents [OSTI]

    Pettibone, J.S.; Wheeler, P.C.

    1981-06-08

    An improved magnetocumulative generator is described that is useful for producing magnetic fields of very high energy content over large spatial volumes. The polar directed pleated magnetocumulative generator has a housing providing a housing chamber with an electrically conducting surface. The chamber forms a coaxial system having a small radius portion and a large radius portion. When a magnetic field is injected into the chamber, from an external source, most of the magnetic flux associated therewith positions itself in the small radius portion. The propagation of an explosive detonation through high-explosive layers disposed adjacent to the housing causes a phased closure of the chamber which sweeps most of the magnetic flux into the large radius portion of the coaxial system. The energy content of the magnetic field is greatly increased by flux stretching as well as by flux compression. The energy enhanced magnetic field is utilized within the housing chamber itself.

  6. PLASMA GENERATOR

    DOE Patents [OSTI]

    Foster, J.S. Jr.

    1958-03-11

    This patent describes apparatus for producing an electrically neutral ionized gas discharge, termed a plasma, substantially free from contamination with neutral gas particles. The plasma generator of the present invention comprises a plasma chamber wherein gas introduced into the chamber is ionized by a radiofrequency source. A magnetic field is used to focus the plasma in line with an exit. This magnetic field cooperates with a differential pressure created across the exit to draw a uniform and uncontaminated plasma from the plasma chamber.

  7. Thermoelectric generator

    DOE Patents [OSTI]

    Pryslak, N.E.

    1974-02-26

    A thermoelectric generator having a rigid coupling or "stack" between the heat source and the hot strap joining the thermoelements is described. The stack includes a member of an insulating material, such as ceramic, for electrically isolating the thermoelements from the heat source, and a pair of members of a ductile material, such as gold, one each on each side of the insulating member, to absorb thermal differential expansion stresses in the stack. (Official Gazette)

  8. Cluster generator

    DOE Patents [OSTI]

    Donchev, Todor I.; Petrov, Ivan G.

    2011-05-31

    Described herein is an apparatus and a method for producing atom clusters based on a gas discharge within a hollow cathode. The hollow cathode includes one or more walls. The one or more walls define a sputtering chamber within the hollow cathode and include a material to be sputtered. A hollow anode is positioned at an end of the sputtering chamber, and atom clusters are formed when a gas discharge is generated between the hollow anode and the hollow cathode.

  9. Photon generator

    DOE Patents [OSTI]

    Srinivasan-Rao, Triveni

    2002-01-01

    A photon generator includes an electron gun for emitting an electron beam, a laser for emitting a laser beam, and an interaction ring wherein the laser beam repetitively collides with the electron beam for emitting a high energy photon beam therefrom in the exemplary form of x-rays. The interaction ring is a closed loop, sized and configured for circulating the electron beam with a period substantially equal to the period of the laser beam pulses for effecting repetitive collisions.

  10. Electric generator

    DOE Patents [OSTI]

    Foster, Jr., John S.; Wilson, James R.; McDonald, Jr., Charles A.

    1983-01-01

    1. In an electrical energy generator, the combination comprising a first elongated annular electrical current conductor having at least one bare surface extending longitudinally and facing radially inwards therein, a second elongated annular electrical current conductor disposed coaxially within said first conductor and having an outer bare surface area extending longitudinally and facing said bare surface of said first conductor, the contiguous coaxial areas of said first and second conductors defining an inductive element, means for applying an electrical current to at least one of said conductors for generating a magnetic field encompassing said inductive element, and explosive charge means disposed concentrically with respect to said conductors including at least the area of said inductive element, said explosive charge means including means disposed to initiate an explosive wave front in said explosive advancing longitudinally along said inductive element, said wave front being effective to progressively deform at least one of said conductors to bring said bare surfaces thereof into electrically conductive contact to progressively reduce the inductance of the inductive element defined by said conductors and transferring explosive energy to said magnetic field effective to generate an electrical potential between undeformed portions of said conductors ahead of said explosive wave front.

  11. Data communications in a parallel active messaging interface of a parallel computer

    SciTech Connect (OSTI)

    Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-09-16

    Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.

  12. Data communications in a parallel active messaging interface of a parallel computer

    SciTech Connect (OSTI)

    Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-09-02

    Eager send data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints that specify a client, a context, and a task, including receiving an eager send data communications instruction with transfer data disposed in a send buffer characterized by a read/write send buffer memory address in a read/write virtual address space of the origin endpoint; determining for the send buffer a read-only send buffer memory address in a read-only virtual address space, the read-only virtual address space shared by both the origin endpoint and the target endpoint, with all frames of physical memory mapped to pages of virtual memory in the read-only virtual address space; and communicating by the origin endpoint to the target endpoint an eager send message header that includes the read-only send buffer memory address.

  13. Data communications for a collective operation in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A

    2013-07-16

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and bit masks; receiving in an origin endpoint of the PAMI a collective instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint; constructing a bit mask for the received collective instruction; selecting, from among the associated algorithms and bit masks, a data communications algorithm in dependence upon the constructed bit mask; and executing the collective instruction, transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
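
    The idea of associating algorithms with bit masks and selecting one from a mask built for the incoming instruction can be sketched as follows. Everything concrete here is an assumption for illustration: the `Feature` flags, the algorithm names, the 1024-byte threshold, and the matching rule (first registry entry whose required bits are all set) are not taken from the patent or from any real PAMI API.

```python
from enum import IntFlag


class Feature(IntFlag):
    # Hypothetical properties of a collective instruction.
    CONTIGUOUS = 1
    SMALL = 2
    POWER_OF_TWO = 4


# Hypothetical association of bit masks with algorithm names,
# ordered from most specific to the catch-all default.
REGISTRY = [
    (Feature.SMALL | Feature.CONTIGUOUS, "eager_tree"),
    (Feature.POWER_OF_TWO, "recursive_doubling"),
    (Feature(0), "generic_ring"),
]


def build_mask(nbytes, contiguous, nranks):
    """Construct the bit mask for one collective instruction."""
    mask = Feature(0)
    if contiguous:
        mask |= Feature.CONTIGUOUS
    if nbytes <= 1024:
        mask |= Feature.SMALL
    if nranks > 0 and (nranks & (nranks - 1)) == 0:
        mask |= Feature.POWER_OF_TWO
    return mask


def select_algorithm(mask):
    """Pick the first algorithm whose required bits are all present."""
    for required, name in REGISTRY:
        if (mask & required) == required:
            return name
    raise LookupError("no algorithm registered for this mask")


choice = select_algorithm(build_mask(nbytes=64, contiguous=True, nranks=6))
```

    Comparing masks keeps the selection step to a few integer operations per registered algorithm.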

  14. Data communications for a collective operation in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Faraj, Daniel A.

    2015-11-19

    Algorithm selection for data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and bit masks; receiving in an origin endpoint of the PAMI a collective instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint; constructing a bit mask for the received collective instruction; selecting, from among the associated algorithms and bit masks, a data communications algorithm in dependence upon the constructed bit mask; and executing the collective instruction, transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.

  15. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2015-02-03

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (`RTS`) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.

  16. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2013-09-03

    Fencing direct memory access (`DMA`) data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.
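
    A FENCE "with no FENCE accounting" can be pictured with an in-order channel: if transfers between two endpoints complete in the order they were issued, the fence can simply travel the same path and needs no counters of outstanding transfers. The sketch below is a toy model under that assumption; the `OrderedChannel` class and its methods are hypothetical and do not correspond to a real PAMI or DMA interface.

```python
from collections import deque


class OrderedChannel:
    """Toy model of deterministic, in-order delivery between two endpoints."""

    def __init__(self):
        self.pending = deque()   # instructions issued but not yet complete
        self.completed = []      # instructions in completion order

    def put(self, name):
        # Issue a data transfer instruction.
        self.pending.append(name)

    def fence(self):
        # The fence is just another instruction in the same ordered stream;
        # no per-transfer bookkeeping is kept for it.
        self.pending.append("FENCE")

    def progress(self):
        # Complete exactly one instruction, oldest first.
        if self.pending:
            self.completed.append(self.pending.popleft())


channel = OrderedChannel()
channel.put("dma0")
channel.put("dma1")
channel.fence()
for _ in range(3):
    channel.progress()
```

    In this model the fence can only complete after every transfer issued before it, which is the guarantee the abstract describes.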

  17. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-06-02

    Fencing data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task; the compute nodes coupled for data communications through the PAMI and through data communications resources including at least one segment of shared random access memory; including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers through a segment of shared memory; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  18. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-06-30

    Fencing data transfers in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI including data communications endpoints, each endpoint comprising a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI and through data communications resources including a deterministic data communications network, including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  19. Fencing data transfers in a parallel active messaging interface of a parallel computer

    SciTech Connect (OSTI)

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-08-11

    Fencing data transfers in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI including data communications endpoints, each endpoint comprising a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI and through data communications resources including a deterministic data communications network, including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  20. Fencing data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A.; Mamidala, Amith R.

    2015-06-09

    Fencing data transfers in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI including data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task; the compute nodes coupled for data communications through the PAMI and through data communications resources including at least one segment of shared random access memory; including initiating execution through the PAMI of an ordered sequence of active SEND instructions for SEND data transfers between two endpoints, effecting deterministic SEND data transfers through a segment of shared memory; and executing through the PAMI, with no FENCE accounting for SEND data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all SEND instructions initiated prior to execution of the FENCE instruction for SEND data transfers between the two endpoints.

  1. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-11-18

    Data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send ('RTS') message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.
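
    The assignment step in this abstract, where the first target endpoint hands separate portions of the transfer data to several target endpoints, might look like the sketch below. The abstract does not say how portions are chosen; a near-even contiguous split is assumed here, and the names RequestToSend and assign_portions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestToSend:
    """Hypothetical RTS message: where the transfer data lives and how big it is."""
    offset: int  # location of the transfer data in the origin's buffer
    size: int    # size of the transfer data in bytes


def assign_portions(rts, n_targets):
    """Split the advertised data into n_targets contiguous (offset, length) portions.

    Assumption: portions are as even as possible, with any remainder bytes
    going to the first few targets.
    """
    if n_targets <= 0:
        raise ValueError("n_targets must be positive")
    base, extra = divmod(rts.size, n_targets)
    portions = []
    start = rts.offset
    for index in range(n_targets):
        length = base + (1 if index < extra else 0)
        portions.append((start, length))
        start += length
    return portions


portions = assign_portions(RequestToSend(offset=100, size=10), n_targets=3)
```

    Each target endpoint would then fetch only its own (offset, length) range, so the portions never overlap and together cover the whole transfer.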

  2. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Davis, Kristan D.; Faraj, Daniel A.

    2014-07-22

    Algorithm selection for data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and ranges of message sizes so that each algorithm is associated with a separate range of message sizes; receiving in an origin endpoint of the PAMI a data communications instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint, the data communications message characterized by a message size; selecting, from among the associated algorithms and ranges, a data communications algorithm in dependence upon the message size; and transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.
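
    Selecting an algorithm by message size, as this abstract describes, can be sketched as a lookup over non-overlapping size ranges. The thresholds and algorithm labels below ("eager", "rendezvous", "bulk") are illustrative assumptions, not values or names taken from the patent or from PAMI.

```python
import bisect


class AlgorithmTable:
    """Hypothetical table that maps separate message-size ranges to algorithms."""

    def __init__(self, upper_bounds, algorithms):
        # upper_bounds[i] is the exclusive upper size limit for algorithms[i];
        # the last algorithm handles every size at or above the final bound.
        if len(algorithms) != len(upper_bounds) + 1:
            raise ValueError("need exactly one more algorithm than bounds")
        if list(upper_bounds) != sorted(set(upper_bounds)):
            raise ValueError("bounds must be strictly increasing")
        self.upper_bounds = list(upper_bounds)
        self.algorithms = list(algorithms)

    def select(self, message_size):
        """Return the algorithm whose size range contains message_size."""
        if message_size < 0:
            raise ValueError("message size cannot be negative")
        index = bisect.bisect_right(self.upper_bounds, message_size)
        return self.algorithms[index]


# Illustrative thresholds in bytes (assumed, not from the source).
table = AlgorithmTable([128, 4096], ["eager", "rendezvous", "bulk"])
chosen = [table.select(size) for size in (64, 128, 4095, 4096)]
```

    Because the ranges are disjoint and sorted, the binary search returns exactly one algorithm for any message size.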

  3. Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Blocksome, Michael A; Mamidala, Amith R

    2014-02-11

    Fencing direct memory access ('DMA') data transfers in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two endpoints.

  4. Data communications in a parallel active messaging interface of a parallel computer

    DOE Patents [OSTI]

    Davis, Kristan D; Faraj, Daniel A

    2013-07-09

    Algorithm selection for data communications in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI, including associating in the PAMI data communications algorithms and ranges of message sizes so that each algorithm is associated with a separate range of message sizes; receiving in an origin endpoint of the PAMI a data communications instruction, the instruction specifying transmission of a data communications message from the origin endpoint to a target endpoint, the data communications message characterized by a message size; selecting, from among the associated algorithms and ranges, a data communications algorithm in dependence upon the message size; and transmitting, according to the selected data communications algorithm from the origin endpoint to the target endpoint, the data communications message.

  5. The Swift Parallel Scripting Language for ALCF Systems | Argonne Leadership

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Swift is an implicitly parallel functional language that makes it easier to script higher-level applications or workflows composed from serial or parallel programs. Recently made available across ALCF systems, it has been used to script application workflows in a broad range of diverse disciplines from protein structure prediction to modeling global

  6. DGDFT: A massively parallel method for large scale density functional theory calculations

    SciTech Connect (OSTI)

    Hu, Wei; Yang, Chao; Lin, Lin

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10⁻⁴ Hartree/atom in terms of the error of energy and 6.2 × 10⁻⁴ Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3,500 to 14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.
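
    To make the "80% parallel efficiency" figure concrete, the sketch below uses the common strong-scaling definition, efficiency = (T_ref × p_ref) / (T_p × p), measured against a smaller reference run. The abstract does not state which definition or baseline the authors used, so the formula choice is an assumption, and the timings in the example are made-up inputs rather than results from the paper.

```python
def parallel_efficiency(t_ref, p_ref, t_p, p):
    """Strong-scaling efficiency of a run on p cores relative to a reference run.

    t_ref, p_ref: wall time and core count of the reference run
    t_p, p:       wall time and core count of the larger run
    A value of 1.0 means ideal scaling.
    """
    if min(t_ref, p_ref, t_p, p) <= 0:
        raise ValueError("times and core counts must be positive")
    return (t_ref * p_ref) / (t_p * p)


# Hypothetical timings chosen only to illustrate the arithmetic.
efficiency = parallel_efficiency(t_ref=100.0, p_ref=1000, t_p=1.0, p=125000)
```

    Under this definition, 80% efficiency on 128,000 cores means the run delivers the throughput of 0.8 × 128,000 = 102,400 ideally scaling cores.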

  7. Massively Parallel LES of Azimuthal Thermo-Acoustic Instabilities...

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Massively Parallel LES of Azimuthal Thermo-Acoustic Instabilities in Annular Gas Turbines Authors: Wolf, P., Staffelbach, G., Roux, A., Gicquel, L., Poinsot, T., Moureau, V. ...

  8. A set of parallel, implicit methods for a reconstructed discontinuous...

    Office of Scientific and Technical Information (OSTI)

    Journal Article: A set of parallel, implicit methods for a reconstructed discontinuous Galerkin method for compressible flows on 3D hybrid grids

  9. A Comprehensive Look at High Performance Parallel I/O

    Broader source: All U.S. Department of Energy (DOE) Office Webpages (Extended Search)

    Book Signing @ SC14! Nov. 18, 5 p.m. in Booth 1939. November 10, 2014. Contact: Linda Vu, +1 510 495 2402, lvu@lbl.gov. In the 1990s, high performance computing (HPC) made a dramatic transition to massively parallel processors. As this model solidified over the next 20 years, supercomputing performance increased from gigaflops (billions of calculations per second) to

  10. Massively Parallel Models of the Human Circulatory System (Conference...

    Office of Scientific and Technical Information (OSTI)

    Massively Parallel Models of the Human Circulatory System ... Sponsoring Org: USDOE; Country of Publication: United States; Language: English; Subject: 59 ...