Title: RNAG: a new Gibbs sampler for predicting RNA secondary structure for unaligned sequences

Journal Article · Bioinformatics
 [1];  [2];  [2]
  1. Brown Univ., Providence, RI (United States). Dept. of Mathematics
  2. Brown Univ., Providence, RI (United States). Division of Applied Mathematics. Center for Computational Molecular Biology

Motivation: RNA secondary structure plays an important role in the function of many RNAs, and structural features are often key to their interaction with other cellular components. Thus, there has been considerable interest in the prediction of secondary structures for RNA families. In this article, we present a new global structural alignment algorithm, RNAG, to predict consensus secondary structures for unaligned sequences. It uses a blocked Gibbs sampling algorithm, which has a theoretical advantage in convergence time. This algorithm iteratively samples from the conditional probability distributions P(Structure | Alignment) and P(Alignment | Structure). Not surprisingly, there is considerable uncertainty in the high-dimensional space of this difficult problem, and this uncertainty has so far received limited attention in this field. We show how the samples drawn from this algorithm can be used to more fully characterize the posterior space and to assess the uncertainty of predictions.

Results: Our analysis of three publicly available datasets showed a substantial improvement in RNA structure prediction by RNAG over extant prediction methods. Additionally, our analysis of 17 RNA families showed that the RNAG sampled structures were generally compact around their ensemble centroids, and at least 11 families had at least two well-separated clusters of predicted structures. In general, the distance between a reference structure and our predicted structure was large relative to the variation among structures within an ensemble.

Availability: The Perl implementation of the RNAG algorithm and the data necessary to reproduce the results described in Sections 3.1 and 3.2 are available at http://ccmbweb.ccv.brown.edu/rnag.html
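
The alternating scheme described in the abstract, drawing in turn from P(Structure | Alignment) and P(Alignment | Structure), can be illustrated with a minimal two-block Gibbs sampler. The sketch below is not the RNAG implementation (which is in Perl and samples over full structure and alignment spaces). It assumes a tiny hypothetical joint distribution over three "structure" labels and two "alignment" labels, stored as a lookup table. The names JOINT, sample_structure, sample_alignment, gibbs, and empirical_frequencies are invented for illustration.

```python
import random
from collections import Counter

# Hypothetical unnormalized joint weights over (structure, alignment) labels.
# These numbers are illustrative assumptions, not values from the paper.
JOINT = {
    ("S0", "A0"): 4.0, ("S0", "A1"): 1.0,
    ("S1", "A0"): 1.0, ("S1", "A1"): 3.0,
    ("S2", "A0"): 0.5, ("S2", "A1"): 0.5,
}
STRUCTURES = ["S0", "S1", "S2"]
ALIGNMENTS = ["A0", "A1"]


def sample_structure(alignment, rng):
    """Draw a structure from P(Structure | Alignment) in the toy model."""
    weights = [JOINT[(s, alignment)] for s in STRUCTURES]
    return rng.choices(STRUCTURES, weights=weights, k=1)[0]


def sample_alignment(structure, rng):
    """Draw an alignment from P(Alignment | Structure) in the toy model."""
    weights = [JOINT[(structure, a)] for a in ALIGNMENTS]
    return rng.choices(ALIGNMENTS, weights=weights, k=1)[0]


def gibbs(n_samples, burn_in=500, seed=0):
    """Alternate the two conditional draws and keep post-burn-in pairs."""
    rng = random.Random(seed)
    alignment = ALIGNMENTS[0]  # arbitrary starting alignment
    samples = []
    for i in range(burn_in + n_samples):
        structure = sample_structure(alignment, rng)
        alignment = sample_alignment(structure, rng)
        if i >= burn_in:
            samples.append((structure, alignment))
    return samples


def empirical_frequencies(samples):
    """Fraction of samples falling on each (structure, alignment) pair."""
    counts = Counter(samples)
    return {pair: counts[pair] / len(samples) for pair in JOINT}


if __name__ == "__main__":
    freqs = empirical_frequencies(gibbs(20000))
    total = sum(JOINT.values())
    for pair, weight in JOINT.items():
        print(pair, "sampled:", round(freqs[pair], 3),
              "target:", round(weight / total, 3))
```

Because every pair has positive weight, the chain can reach any state, and the sampled frequencies should approach the normalized joint weights. In a real structural alignment setting each conditional draw would replace the table lookup with a far more expensive sampling step over structures or alignments, and the retained samples are what allow the posterior space and prediction uncertainty to be characterized.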

Research Organization:
Brown Univ., Providence, RI (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
FG02-04ER63942
OSTI ID:
1625276
Journal Information:
Bioinformatics, Vol. 27, Issue 18; ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
