Title: Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae

Journal Article · Scientific Reports

Abstract: Extensive transcriptional activity in intergenic regions of genomes has raised the question of whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of intergenic transcribed regions (ITRs) in four Poaceae species. Among 43,301 ITRs across the four species, 34,460 (80%) are species-specific. ITRs found across species tend to be more divergent in expression and to have more recent duplicates than annotated genes. To assess whether ITRs are functional (under selection), machine learning models were established in Oryza sativa (rice) that could accurately distinguish between phenotype genes and pseudogenes (area under the receiver operating characteristic curve = 0.94). Based on the models, 584 (8%) and 4,391 (61%) rice ITRs are classified with high confidence as likely functional and likely nonfunctional, respectively. ITRs with conserved expression and ancient retained duplicates, features that were not part of the model, are frequently classified as likely functional, suggesting that these characteristics could serve as pragmatic rules of thumb for identifying candidate sequences likely to be under selection. This study also provides a framework for identifying novel genes using comparative transcriptomic data to improve genome annotation, which is fundamental for connecting genotype to phenotype in crop and model systems.
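The abstract reports model performance as an area under the receiver operating characteristic curve (AUC-ROC) of 0.94. As a rough illustration of what that metric measures, the sketch below computes AUC-ROC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The function name `auc_roc` and all scores are hypothetical toy values chosen for illustration; they are not the authors' code, features, or data.

```python
def auc_roc(scores_pos, scores_neg):
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (assumed values, not from the study):
# higher means "more likely functional".
phenotype_gene_scores = [0.9, 0.8, 0.7, 0.4]  # positives
pseudogene_scores = [0.6, 0.3, 0.2, 0.1]      # negatives

auc = auc_roc(phenotype_gene_scores, pseudogene_scores)
print(auc)  # 15 of 16 pairs are ranked correctly, so this prints 0.9375
```

An AUC-ROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value near 0.94 means the positive class outranks the negative class in most pairwise comparisons.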

Research Organization:
Michigan State Univ., East Lansing, MI (United States). Great Lakes Bioenergy Research Center
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
BER DE-SC0018409; SC0018409; IOS-1546617; DEB-1655386
OSTI ID:
1619548
Alternate ID(s):
OSTI ID: 1579362
Journal Information:
Scientific Reports, Vol. 9, Issue 1; ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United Kingdom
Language:
English
Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science
