OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: An Experimental Metagenome Data Management and Analysis System

Abstract

The application of shotgun sequencing to environmental samples has revealed a new universe of microbial community genomes (metagenomes) involving previously uncultured organisms. Metagenome analysis, which is expected to provide a comprehensive picture of the gene functions and metabolic capacity of a microbial community, needs to be conducted in the context of a comprehensive data management and analysis system. We present in this paper IMG/M, an experimental metagenome data management and analysis system that is based on the Integrated Microbial Genomes (IMG) system. IMG/M provides tools and viewers for analyzing both metagenomes and isolate genomes individually or in a comparative context.

Authors:
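Markowitz, Victor M.; Korzeniewski, Frank; Palaniappan, Krishna; Szeto, Ernest; Ivanova, Natalia N.; Kyrpides, Nikos C.; Hugenholtz, Philip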
Publication Date:
2006-03-01
Research Org.:
Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Org.:
USDOE Director. Office of Science. Office of Biological and Environmental Research. Life Sciences Division
OSTI Identifier:
889805
Report Number(s):
LBNL-60051
R&D Project: KL64EQ; BnR: YN0100000; TRN: US200619%%855
DOE Contract Number:
DE-AC02-05CH11231
Resource Type:
Conference
Resource Relation:
Conference: ISMB 2006, Fortaleza, Brazil, Aug 6-10, 2006
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; CAPACITY; GENES; MANAGEMENT

Citation Formats

Markowitz, Victor M., Korzeniewski, Frank, Palaniappan, Krishna, Szeto, Ernest, Ivanova, Natalia N., Kyrpides, Nikos C., and Hugenholtz, Philip. An Experimental Metagenome Data Management and Analysis System. United States: N. p., 2006. Web.
Markowitz, Victor M., Korzeniewski, Frank, Palaniappan, Krishna, Szeto, Ernest, Ivanova, Natalia N., Kyrpides, Nikos C., & Hugenholtz, Philip. An Experimental Metagenome Data Management and Analysis System. United States.
Markowitz, Victor M., Korzeniewski, Frank, Palaniappan, Krishna, Szeto, Ernest, Ivanova, Natalia N., Kyrpides, Nikos C., and Hugenholtz, Philip. 2006. "An Experimental Metagenome Data Management and Analysis System". United States. https://www.osti.gov/servlets/purl/889805.
@article{osti_889805,
title = {An Experimental Metagenome Data Management and Analysis System},
author = {Markowitz, Victor M. and Korzeniewski, Frank and Palaniappan, Krishna and Szeto, Ernest and Ivanova, Natalia N. and Kyrpides, Nikos C. and Hugenholtz, Philip},
abstractNote = {The application of shotgun sequencing to environmental samples has revealed a new universe of microbial community genomes (metagenomes) involving previously uncultured organisms. Metagenome analysis, which is expected to provide a comprehensive picture of the gene functions and metabolic capacity of a microbial community, needs to be conducted in the context of a comprehensive data management and analysis system. We present in this paper IMG/M, an experimental metagenome data management and analysis system that is based on the Integrated Microbial Genomes (IMG) system. IMG/M provides tools and viewers for analyzing both metagenomes and isolate genomes individually or in a comparative context.},
place = {United States},
year = {2006},
month = {Mar}
}


  • New high-throughput DNA sequencing technologies have revolutionized how scientists study the organisms around us. In particular, microbiology, the study of the smallest, unseen organisms that pervade our lives, has embraced these new techniques to characterize and analyze the cellular constituents and use this information to develop novel tools, techniques, and therapeutics. So-called next-generation DNA sequencing platforms have resulted in huge increases in the amount of raw data that can be rapidly generated. Argonne National Laboratory developed the premier platform for the analysis of this new data (MG-RAST) that is used by microbiologists worldwide. This paper uses the accounting from the computational analysis of more than 10,000,000,000 bp of DNA sequence data, describes an analysis of the advanced computational requirements, and suggests the level of analysis that will be essential as microbiologists move to understand how these tiny organisms affect our everyday lives. The results from this analysis indicate that data analysis is a linear problem, but that most analyses are held up in queues. With sufficient resources, computations could be completed in a few hours for a typical dataset. These data also suggest execution times that delimit timely completion of computational analyses, and provide bounds for problematic processes.
  • Scientific user facilities (particle accelerators, telescopes, colliders, supercomputers, light sources, sequencing facilities, and more) operated by the U.S. Department of Energy (DOE) Office of Science (SC) generate ever-increasing volumes of data at unprecedented rates from experiments, observations, and simulations. At the same time, there is a growing community of experimentalists who require real-time data analysis feedback to enable them to steer their complex experimental instruments to optimized scientific outcomes and new discoveries. Recent efforts in DOE-SC have focused on articulating the data-centric challenges and opportunities facing these science communities. Key challenges include difficulties coping with data size, rate, and complexity in the context of both real-time and post-experiment data analysis and interpretation. Solutions will require algorithmic and mathematical advances, as well as hardware and software infrastructures that adequately support data-intensive scientific workloads. This paper presents the summary findings of a workshop held by DOE-SC in September 2015, convened to identify the major challenges and the research that is needed to meet those challenges.
  • The Chemical Imaging Initiative at the Pacific Northwest National Laboratory (PNNL) is creating a ‘Rapid Experimental Analysis’ (REXAN) Framework, based on the concept of reusable component libraries. REXAN allows developers to quickly compose and customize high-throughput analysis pipelines for a range of experiments, and it supports the creation of multi-modal analysis pipelines. In addition, PNNL has coupled REXAN with its collaborative data management and analysis environment Velo to create easy-to-use data management and analysis environments for experimental facilities. This paper will discuss the benefits of Velo and REXAN in the context of three examples: (1) PNNL High Resolution Mass Spectrometry: reducing analysis times from hours to seconds, and enabling the analysis of much larger data samples (100KB to 40GB) at the same time; (2) ALS X-ray tomography: reducing analysis times of combined STXM and EM data collected at the ALS from weeks to minutes, decreasing manual work and increasing data volumes that can be analysed in a single step; (3) multi-modal nano-scale analysis of STXM and TEM data: providing a semi-automated process for particle detection. The creation of REXAN has significantly shortened the development time for these analysis pipelines. The integration of Velo and REXAN has significantly increased the scientific productivity of the instruments and their users by creating easy-to-use data management and analysis environments with greatly reduced analysis times and improved analysis capabilities.