OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: The Gaggle: An open-source software system for integrating bioinformatics software and data sources

Journal Article · BMC Bioinformatics
 [1];  [1];  [2];  [1]
  1. Inst. for Systems Biology, Seattle, WA (United States)
  2. Inst. for Systems Biology, Seattle, WA (United States); New York Univ. (NYU), NY (United States). Dept. of Biology

Background: Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as the analysis of microarrays, metabolic networks, or predicted protein structure. A crucial challenge is to combine the capabilities of these (and other forthcoming) data resources and tools to create a data exploration and analysis environment that does justice to the variety and complexity of systems biology data sets. A solution to this problem should recognize that data types, formats and software in this high-throughput age of biology are constantly changing.

Results: In this paper we describe the Gaggle, a simple, open-source Java software environment that helps to solve the problem of software and database integration. Guided by the classic software engineering strategy of separation of concerns and a policy of semantic flexibility, it integrates existing popular programs and web resources into a user-friendly, easily extended environment. We demonstrate that four simple data types (names, matrices, networks, and associative arrays) are sufficient to bring together diverse databases and software. We highlight some capabilities of the Gaggle with an exploration of Helicobacter pylori pathogenesis genes, in which we identify a putative ricin-like protein, a discovery made possible by simultaneous data exploration using a wide range of publicly available data and a variety of popular bioinformatics software tools.

Conclusion: We have integrated diverse databases (for example, KEGG, BioCyc, STRING) and software (Cytoscape, DataMatrixViewer, the R statistical environment, and the TIGR Microarray Expression Viewer). Through this loose coupling of diverse software and databases the Gaggle enables simultaneous exploration of experimental data (mRNA and protein abundance, protein-protein and protein-DNA interactions), functional associations (operon, chromosomal proximity, phylogenetic pattern), metabolic pathways (KEGG) and PubMed abstracts (STRING web resource), creating an exploratory environment useful to 'web browser and spreadsheet biologists', to statistically savvy computational biologists, and to those in between. The Gaggle uses Java RMI and Java Web Start technologies and can be found at http://gaggle.systemsbiology.net.
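
The abstract names four data types and a loose coupling between tools, but this record does not show any interface. The following is a minimal illustrative sketch of how such a design could look in Java, not the Gaggle's actual API. Every name in it (GaggleSketch, Matrix, Edge, Tool, Hub, NameCollector, the handler methods, and the placeholder identifiers geneA and geneB) is hypothetical. It also uses plain in-process method calls in place of the Java RMI transport mentioned above, so that it can run as a single file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are taken from the Gaggle code base.
public class GaggleSketch {

    // A labelled numeric matrix, e.g. genes (rows) by conditions (columns).
    static final class Matrix {
        final List<String> rowNames;
        final List<String> columnNames;
        final double[][] values;

        Matrix(List<String> rowNames, List<String> columnNames, double[][] values) {
            this.rowNames = rowNames;
            this.columnNames = columnNames;
            this.values = values;
        }
    }

    // One network edge: source node, interaction type, target node.
    static final class Edge {
        final String source;
        final String interaction;
        final String target;

        Edge(String source, String interaction, String target) {
            this.source = source;
            this.interaction = interaction;
            this.target = target;
        }
    }

    // One handler per data type; a tool overrides only what it understands.
    // Names are a list of strings and associative arrays are a string map.
    interface Tool {
        default void handleNames(String from, List<String> names) {}
        default void handleMatrix(String from, Matrix matrix) {}
        default void handleNetwork(String from, List<Edge> edges) {}
        default void handleMap(String from, Map<String, String> attributes) {}
    }

    // Central relay: forwards a message to every registered tool except the
    // sender. It never interprets the names it passes along.
    static final class Hub {
        private final Map<String, Tool> tools = new LinkedHashMap<>();

        void register(String name, Tool tool) {
            tools.put(name, tool);
        }

        void broadcastNames(String from, List<String> names) {
            for (Map.Entry<String, Tool> entry : tools.entrySet()) {
                if (!entry.getKey().equals(from)) {
                    entry.getValue().handleNames(from, names);
                }
            }
        }
    }

    // A tool that only understands name lists and keeps the last one received.
    static final class NameCollector implements Tool {
        final List<String> received = new ArrayList<>();

        @Override
        public void handleNames(String from, List<String> names) {
            received.clear();
            received.addAll(names);
        }
    }

    public static void main(String[] args) {
        Hub hub = new Hub();
        NameCollector matrixViewer = new NameCollector();
        NameCollector networkViewer = new NameCollector();
        hub.register("matrixViewer", matrixViewer);
        hub.register("networkViewer", networkViewer);

        // Placeholder identifiers, not real gene names.
        hub.broadcastNames("matrixViewer", Arrays.asList("geneA", "geneB"));
        System.out.println("networkViewer received " + networkViewer.received);
    }
}
```

Running main prints `networkViewer received [geneA, geneB]`. In this sketch the relay forwards names without interpreting them, which is one way to read the abstract's "semantic flexibility": each receiving tool decides what a name means in its own context. Only the name broadcast is implemented; relaying the other three data types would follow the same pattern.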

Research Organization:
New York Univ. (NYU), NY (United States); Institute for Systems Biology, Seattle, WA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division; National Science Foundation (NSF)
Grant/Contract Number:
DAAD13-03-O-0057; EF-0313754
OSTI ID:
1626316
Journal Information:
BMC Bioinformatics, Vol. 7, Issue 1; ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
