Title: Challenges in large scale distributed computing: bioinformatics.

Abstract

The amount of genomic data available for study is increasing at a rate similar to that of Moore's law. This deluge of data is challenging bioinformaticians to develop newer, faster and better algorithms for analysis and examination of this data. The growing availability of large scale computing grids coupled with high-performance networking is challenging computer scientists to develop better, faster methods of exploiting parallelism in these biological computations and deploying them across computing grids. In this paper, we describe two computations that are required to be run frequently and which require large amounts of computing resource to complete in a reasonable time. The data for these computations are very large and the sequential computational time can exceed thousands of hours. We show the importance and relevance of these computations, the nature of the data and parallelism and we show how we are meeting the challenge of efficiently distributing and managing these computations in the SEED project.

Authors:
Disz, T; Kubal, M; Olson, R; Overbeek, R; Stevens, R
Publication Date:
2005
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
937393
Report Number(s):
ANL/MCS/CP-116121
TRN: US200819%%23
DOE Contract Number:  
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: 14th International Symposium on High Performance Distributed Computing (HPDC-14); Jul. 24, 2005 - Jul. 27, 2005; Research Triangle Park, NC
Country of Publication:
United States
Language:
ENGLISH
Subject:
97; 59 BASIC BIOLOGICAL SCIENCES; ALGORITHMS; DATA ANALYSIS; COMPUTER NETWORKS; GENETICS; PARALLEL PROCESSING

Citation Formats

Disz, T, Kubal, M, Olson, R, Overbeek, R, Stevens, R, Mathematics and Computer Science, Univ. of Chicago, and The Fellowship for the Interpretation of Genomes. Challenges in large scale distributed computing: bioinformatics. United States: N. p., 2005. Web. doi:10.1109/CLADE.2005.1520902.
Disz, T, Kubal, M, Olson, R, Overbeek, R, Stevens, R, Mathematics and Computer Science, Univ. of Chicago, & The Fellowship for the Interpretation of Genomes. Challenges in large scale distributed computing: bioinformatics. United States. https://doi.org/10.1109/CLADE.2005.1520902
Disz, T, Kubal, M, Olson, R, Overbeek, R, Stevens, R, Mathematics and Computer Science, Univ. of Chicago, and The Fellowship for the Interpretation of Genomes. 2005. "Challenges in large scale distributed computing: bioinformatics." United States. https://doi.org/10.1109/CLADE.2005.1520902.
@article{osti_937393,
title = {Challenges in large scale distributed computing: bioinformatics.},
author = {Disz, T and Kubal, M and Olson, R and Overbeek, R and Stevens, R and {Mathematics and Computer Science} and {Univ. of Chicago} and {The Fellowship for the Interpretation of Genomes}},
abstractNote = {The amount of genomic data available for study is increasing at a rate similar to that of Moore's law. This deluge of data is challenging bioinformaticians to develop newer, faster and better algorithms for analysis and examination of this data. The growing availability of large scale computing grids coupled with high-performance networking is challenging computer scientists to develop better, faster methods of exploiting parallelism in these biological computations and deploying them across computing grids. In this paper, we describe two computations that are required to be run frequently and which require large amounts of computing resource to complete in a reasonable time. The data for these computations are very large and the sequential computational time can exceed thousands of hours. We show the importance and relevance of these computations, the nature of the data and parallelism and we show how we are meeting the challenge of efficiently distributing and managing these computations in the SEED project.},
doi = {10.1109/CLADE.2005.1520902},
url = {https://www.osti.gov/biblio/937393},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2005},
month = jan
}
