Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae
Abstract
Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of intergenic transcribed regions (ITRs) in four Poaceae species. Among 43,301 ITRs across the four species, 34,460 (80%) are species-specific. ITRs found across species tend to be more divergent in expression and have more recent duplicates compared to annotated genes. To assess if ITRs are functional (under selection), machine learning models were established in Oryza sativa (rice) that could accurately distinguish between phenotype genes and pseudogenes (area under curve-receiver operating characteristic = 0.94). Based on the models, 584 (8%) and 4391 (61%) rice ITRs are classified as likely functional and nonfunctional with high confidence, respectively. ITRs with conserved expression and ancient retained duplicates, features that were not part of the model, are frequently classified as likely functional, suggesting these characteristics could serve as pragmatic rules of thumb for identifying candidate sequences likely to be under selection. This study also provides a framework to identify novel genes using comparative transcriptomic data to improve genome annotation that is fundamental for connecting genotype to phenotype in crop and model systems.
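For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an area under the receiver operating characteristic curve (AUC-ROC) can be computed from classifier scores, and how scores might then be binned into high-confidence calls. It is not the authors' pipeline: the scores are toy values, and the function names and the 0.2/0.8 cutoffs are hypothetical choices made only for illustration. The last line checks the species-specific fraction reported in the abstract.

```python
def auc_roc(scores_pos, scores_neg):
    """AUC-ROC as the probability that a random positive example
    scores higher than a random negative one (ties count as 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def classify(score, low=0.2, high=0.8):
    """Bin a functional-likelihood score into a call.
    The cutoffs are hypothetical, not taken from the paper."""
    if score >= high:
        return "likely functional"
    if score <= low:
        return "likely nonfunctional"
    return "ambiguous"


# Toy scores (made up): positives stand in for phenotype genes,
# negatives stand in for pseudogenes.
phenotype_gene_scores = [0.9, 0.8, 0.7, 0.4]
pseudogene_scores = [0.6, 0.3, 0.2, 0.1]

print(auc_roc(phenotype_gene_scores, pseudogene_scores))
print([classify(s) for s in (0.9, 0.5, 0.1)])

# Arithmetic check of the abstract: 34,460 of 43,301 ITRs are species-specific.
print(round(100 * 34460 / 43301))
```

An AUC-ROC of 1.0 would mean every positive example outranks every negative one, while 0.5 corresponds to random ranking.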
- Authors: Lloyd, John P.; Bowman, Megan J.; Azodi, Christina B.; Sowers, Rosalie P.; Moghe, Gaurav D.; Childs, Kevin L.; Shiu, Shin-Han
- Publication Date: 2019-08-20
- Research Org.: Michigan State Univ., East Lansing, MI (United States). Great Lakes Bioenergy Research Center
- Sponsoring Org.: USDOE Office of Science (SC), Biological and Environmental Research (BER)
- OSTI Identifier: 1619548
- Alternate Identifier(s): OSTI ID: 1579362
- Grant/Contract Number: BER DE-SC0018409; SC0018409; IOS-1546617; DEB-1655386
- Resource Type: Published Article
- Journal Name: Scientific Reports
- Additional Journal Information: Journal Volume: 9; Journal Issue: 1; Journal ID: ISSN 2045-2322
- Publisher: Nature Publishing Group
- Country of Publication: United Kingdom
- Language: English
- Subject: 59 BASIC BIOLOGICAL SCIENCES
Citation Formats
Lloyd, John P., Bowman, Megan J., Azodi, Christina B., Sowers, Rosalie P., Moghe, Gaurav D., Childs, Kevin L., and Shiu, Shin-Han. Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae. United Kingdom: N. p., 2019. Web. doi:10.1038/s41598-019-47797-y.
Lloyd, John P., Bowman, Megan J., Azodi, Christina B., Sowers, Rosalie P., Moghe, Gaurav D., Childs, Kevin L., & Shiu, Shin-Han. Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae. United Kingdom. https://doi.org/10.1038/s41598-019-47797-y
Lloyd, John P., Bowman, Megan J., Azodi, Christina B., Sowers, Rosalie P., Moghe, Gaurav D., Childs, Kevin L., and Shiu, Shin-Han. 2019. "Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae". United Kingdom. https://doi.org/10.1038/s41598-019-47797-y.
@article{osti_1619548,
title = {Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae},
author = {Lloyd, John P. and Bowman, Megan J. and Azodi, Christina B. and Sowers, Rosalie P. and Moghe, Gaurav D. and Childs, Kevin L. and Shiu, Shin-Han},
abstractNote = {Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of intergenic transcribed regions (ITRs) in four Poaceae species. Among 43,301 ITRs across the four species, 34,460 (80%) are species-specific. ITRs found across species tend to be more divergent in expression and have more recent duplicates compared to annotated genes. To assess if ITRs are functional (under selection), machine learning models were established in Oryza sativa (rice) that could accurately distinguish between phenotype genes and pseudogenes (area under curve-receiver operating characteristic = 0.94). Based on the models, 584 (8%) and 4391 (61%) rice ITRs are classified as likely functional and nonfunctional with high confidence, respectively. ITRs with conserved expression and ancient retained duplicates, features that were not part of the model, are frequently classified as likely-functional, suggesting these characteristics could serve as pragmatic rules of thumb for identifying candidate sequences likely to be under selection. This study also provides a framework to identify novel genes using comparative transcriptomic data to improve genome annotation that is fundamental for connecting genotype to phenotype in crop and model systems.},
doi = {10.1038/s41598-019-47797-y},
journal = {Scientific Reports},
number = 1,
volume = 9,
place = {United Kingdom},
year = {2019},
month = {8}
}
https://doi.org/10.1038/s41598-019-47797-y