OSTI.GOV | U.S. Department of Energy
Office of Scientific and Technical Information

Title: An open source platform for multi-scale spatially distributed simulations of microbial ecosystems

Abstract

The goal of this project was to develop a tool for facilitating simulation, validation, and discovery of multiscale dynamical processes in microbial ecosystems. This led to the development of an open-source software platform for Computation Of Microbial Ecosystems in Time and Space (COMETS). COMETS performs spatially distributed, time-dependent, flux-balance-based simulations of microbial metabolism. Our plan involved building the software platform itself, calibrating and testing it through comparison with experimental data, and integrating simulations and experiments to address important open questions on the evolution and dynamics of cross-feeding interactions between microbial species.
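
To make "spatially distributed, time-dependent, flux-balance-based simulation" concrete, below is a minimal sketch of the general dynamic-FBA-on-a-grid loop: each occupied grid cell solves a small flux balance linear program against its local nutrient level, biomass and nutrient pools are updated from the resulting fluxes, and both fields then diffuse. This is not COMETS code; the one-reaction toy stoichiometry, the Monod-style uptake cap, and every parameter value (VMAX, KM, YIELD, D, DT) are illustrative assumptions, with SciPy's linprog standing in for a production FBA solver.

```python
# Toy sketch of dynamic FBA on a 1-D spatial grid (illustrative only).
import numpy as np
from scipy.optimize import linprog

N, DT, D = 20, 0.1, 0.05           # grid cells, time step, diffusion number (toy)
VMAX, KM, YIELD = 10.0, 0.5, 0.1   # Monod uptake cap and biomass yield (toy)

glucose = np.full(N, 5.0)          # extracellular nutrient in each cell
biomass = np.zeros(N)
biomass[N // 2] = 0.01             # inoculate the center cell

def fba_step(s):
    """Toy FBA: maximize growth flux v_g subject to a steady-state balance
    v_u = v_g / YIELD and a Monod-capped uptake bound 0 <= v_u <= cap."""
    cap = VMAX * s / (KM + s)                       # kinetic bound on uptake
    res = linprog(c=[0.0, -1.0],                    # minimize -v_g, i.e. maximize v_g
                  A_eq=[[1.0, -1.0 / YIELD]], b_eq=[0.0],
                  bounds=[(0.0, cap), (0.0, None)])
    v_u, v_g = res.x
    return v_u, v_g

for step in range(100):
    # 1. Biology: one small LP per occupied cell (the "FBA" in dynamic FBA).
    for i in range(N):
        if biomass[i] > 0:
            v_u, v_g = fba_step(glucose[i])
            glucose[i] = max(glucose[i] - v_u * biomass[i] * DT, 0.0)
            biomass[i] += v_g * biomass[i] * DT
    # 2. Physics: explicit diffusion with no-flux (edge-padded) boundaries.
    for field in (glucose, biomass):
        padded = np.pad(field, 1, mode="edge")
        field += D * (padded[:-2] - 2 * field + padded[2:])

print("final biomass profile:", np.round(biomass, 3))
```

The essential structure, an LP step interleaved with a diffusion step at every time step, is the same in the real platform, except that each grid cell there carries a full metabolic network model per species and the lattice, solver, and boundary conditions are configurable.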

Authors:
Segre, Daniel [1]
  1. Boston Univ., MA (United States)
Publication Date:
August 14, 2014
Research Org.:
Boston Univ., MA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
OSTI Identifier:
1228383
Report Number(s):
DE-SC0004962
DOE Contract Number:
SC0004962
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Segre, Daniel. An open source platform for multi-scale spatially distributed simulations of microbial ecosystems. United States: N. p., 2014. Web. doi:10.2172/1228383.
Segre, Daniel. An open source platform for multi-scale spatially distributed simulations of microbial ecosystems. United States. doi:10.2172/1228383.
Segre, Daniel. Thu, Aug 14, 2014. "An open source platform for multi-scale spatially distributed simulations of microbial ecosystems". United States. doi:10.2172/1228383. https://www.osti.gov/servlets/purl/1228383.
@techreport{osti_1228383,
title = {An open source platform for multi-scale spatially distributed simulations of microbial ecosystems},
author = {Segre, Daniel},
abstractNote = {The goal of this project was to develop a tool for facilitating simulation, validation and discovery of multiscale dynamical processes in microbial ecosystems. This led to the development of an open-source software platform for Computation Of Microbial Ecosystems in Time and Space (COMETS). COMETS performs spatially distributed time-dependent flux balance based simulations of microbial metabolism. Our plan involved building the software platform itself, calibrating and testing it through comparison with experimental data, and integrating simulations and experiments to address important open questions on the evolution and dynamics of cross-feeding interactions between microbial species.},
doi = {10.2172/1228383},
place = {United States},
year = {2014},
month = {aug}
}

Technical Report:
https://www.osti.gov/servlets/purl/1228383

Similar Records
  • The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over the Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows, as simple as linear pipelines or as complex as directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored to a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. In particular, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, cataloging, storage, and movement, typically from supercomputers or experimental facilities to a team of geographically distributed users; for service-centric applications, the main focus of the workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best-suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components, including a black-box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100 Gbps Advanced Networking Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high-energy physics, and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration. Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high-energy physics, including the Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects, to mature the system for production use.
  • Conducting experiments on large-scale distributed computing systems is becoming significantly easier with the assistance of emulation. Researchers can now create a model of a distributed computing environment and then generate a virtual, laboratory copy of the entire system composed of potentially thousands of virtual machines, switches, and software. The use of real software, running at clock rate in full virtual machines, allows experiments to produce meaningful results without necessitating a full understanding of all model components. However, the ability to inspect and modify elements within these models is bound by the limitation that such modifications must compete with the model, either running in or alongside it. This inhibits entire classes of analyses from being conducted upon these models. We developed a mechanism to snapshot an entire emulation-based model as it is running. This allows us to "freeze time" and subsequently fork execution, replay execution, modify arbitrary parts of the model, or deeply explore the model. This snapshot includes capturing packets in transit and other input/output state along with the running virtual machines. We were able to build this system in Linux using Open vSwitch and Kernel Virtual Machines on top of Sandia's emulation platform Firewheel. This primitive opens the door to numerous subsequent analyses on models, including state-space exploration, debugging distributed systems, performance optimizations, improved training environments, and improved experiment repeatability.
  • Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that let users take full advantage of the new technology by working at a level abstracted away from platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
  • First-Principles Molecular Dynamics (FPMD) is an accurate, atomistic simulation approach that is routinely applied to a variety of areas including solid-state physics, chemistry, biochemistry, and nanotechnology. FPMD enables one to perform predictive materials simulations, as no empirical or adjustable parameters are used to describe a given system. Instead, a quantum mechanical description of electrons is obtained by solving the Kohn-Sham equations (restated after this list) within a pseudopotential plane-wave formalism. This rigorous first-principles treatment of electronic structure is computationally expensive and limits the size of tractable systems to a few hundred atoms on most currently available parallel computers. Developed specifically for large parallel systems at LLNL's Center for Applied Scientific Computing, the Qbox implementation of the FPMD method shows unprecedented performance and scaling on BlueGene/L.
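
For context on the last item: the Kohn-Sham equations it refers to are the standard single-particle eigenvalue problem of density functional theory, stated here in textbook form (atomic units); nothing below is specific to Qbox.

```latex
\Bigl[-\tfrac{1}{2}\nabla^{2} + v_{\mathrm{ext}}(\mathbf{r})
      + v_{\mathrm{H}}[n](\mathbf{r}) + v_{\mathrm{xc}}[n](\mathbf{r})\Bigr]
\psi_{i}(\mathbf{r}) = \varepsilon_{i}\,\psi_{i}(\mathbf{r}),
\qquad
n(\mathbf{r}) = \sum_{i \in \mathrm{occ}} \lvert \psi_{i}(\mathbf{r}) \rvert^{2}
```

In a pseudopotential plane-wave formalism, v_ext is replaced by pseudopotentials representing the ionic cores and the orbitals are expanded in a plane-wave basis; since the Hartree and exchange-correlation potentials depend on the density n, which in turn depends on the orbitals, the equations must be iterated to self-consistency, which is the dominant computational cost the item alludes to.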