OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis

Journal Article · Genome Biology (Online)
[1]; [2]; [3]; [3]; [3]; [1]; [4]; [1]; [5]; [6]; [6]; [7]; [8]; [9]; [10]; [11]; [7]; [12]; [13]; [14]; [15]; [16]; [17]; [18]; [8]; [6]; [16]; [19]; [15]; [20]; [9]; [1]; [21]; [22]; [10]; [12]; [14]; [5]; [3]
  1. James Hutton Institute, Dundee (Scotland)
  2. Mocean Energy, Edinburgh (United Kingdom)
  3. University of Dundee (Scotland). James Hutton Institute
  4. Centre for Genomic Regulation, Barcelona (Spain)
  5. University of Natural Resources and Life Sciences (BOKU), Vienna (Austria)
  6. RIKEN Center for Sustainable Resource Science, Yokohama (Japan)
  7. University of York Wentworth Way, York (United Kingdom)
  8. Fujian Agriculture and Forestry University, Fuzhou (China)
  9. University of Tubingen (Germany)
  10. Spanish National Research Council, Paterna (Spain)
  11. University Paris-Saclay, Gif-sur-Yvette (France)
  12. Colorado State University, Fort Collins, CO (United States)
  13. University of Texas, Austin, TX (United States)
  14. Medical University of Vienna (Austria)
  15. Adam Mickiewicz University, Poznań (Poland)
  16. Bielefeld University (Germany)
  17. Carl von Ossietzky Universität Oldenburg (Germany); Martin Luther University Halle-Wittenberg, Halle (Germany)
  18. Western University of Health Sciences, Pomona, CA (United States); Xiamen University (China)
  19. Oklahoma State University, Stillwater, OK (United States)
  20. Institute of Plant and Microbial Biology, Taipei (Taiwan)
  21. Hong Kong Baptist University, Hong Kong (China)
  22. St. Bonaventure University, NY (United States)

Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and for differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures, including alternative splicing, transcription start sites, and polyadenylation sites. However, accuracy is significantly affected by sequencing errors, mRNA degradation, and incomplete cDNA synthesis. We present a new and comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3). AtRTD3 contains over 169,000 transcripts, twice as many as the best current Arabidopsis transcriptome, and includes over 1500 novel genes. Seventy-eight percent of transcripts are from Iso-seq with accurately defined splice junctions and transcription start and end sites. We develop novel methods to determine splice junctions and transcription start and end sites accurately. Mismatch profiles around splice junctions provide a powerful feature for distinguishing correct splice junctions and removing false ones. Stratified approaches identify high-confidence transcription start and end sites and remove fragmentary transcripts that result from degradation. AtRTD3 is a major improvement over existing transcriptomes, as demonstrated by analysis of an Arabidopsis cold response RNA-seq time-series. AtRTD3 provides higher resolution of transcript expression profiling and identifies cold-induced differential transcription start and polyadenylation site usage. AtRTD3 is currently the most comprehensive Arabidopsis transcriptome. It improves the precision of differential gene and transcript expression, differential alternative splicing, and transcription start/end site usage analysis from RNA-seq data. The novel methods for identifying accurate splice junctions and transcription start/end sites are widely applicable and will improve single-molecule sequencing analysis from any species.
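
The idea of using mismatches around splice junctions as a filter can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the coordinate representation, and the window of 10 bases are assumptions chosen for illustration only. A junction is kept only if the aligned read has no mismatch within the window of either intron boundary.

```python
from typing import Iterable, List, Tuple

# (intron_start, intron_end) in genomic coordinates; hypothetical representation
Junction = Tuple[int, int]


def has_clean_flanks(junction: Junction, mismatches: Iterable[int], window: int = 10) -> bool:
    """Return True if no mismatch lies within `window` bases of either intron boundary."""
    start, end = junction
    return not any(
        abs(pos - start) <= window or abs(pos - end) <= window
        for pos in mismatches
    )


def filter_junctions(junctions: Iterable[Junction], mismatches: Iterable[int],
                     window: int = 10) -> List[Junction]:
    """Keep junctions whose flanking regions are free of alignment mismatches.

    `window` is an assumed, illustrative threshold, not a value taken from the record above.
    """
    mismatch_list = list(mismatches)  # materialise once so it can be scanned per junction
    return [j for j in junctions if has_clean_flanks(j, mismatch_list, window)]


# Toy read with two junctions; one mismatch sits 5 bases from the end of the first intron.
read_junctions = [(1000, 1200), (1500, 1800)]
read_mismatches = [1195, 2500]
kept = filter_junctions(read_junctions, read_mismatches)
print(kept)
```

In this toy input the first junction is discarded because of the nearby mismatch, and the second is retained. In practice the mismatch positions would come from the read alignments themselves.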

Research Organization:
Colorado State Univ., Fort Collins, CO (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); Biotechnology and Biological Sciences Research Council (BBSRC); Scottish Government Rural and Environment Science and Analytical Services division (RESAS); National Science Foundation (NSF); National Institutes of Health (NIH); Austrian Science Fund (FWF); French Agence Nationale de la Recherche; Japan Science and Technology Agency (JST); Core Research for Evolutionary Science and Technology (CREST); German Research Foundation (DFG); Research Grants Council (RGC) of Hong Kong
Grant/Contract Number:
SC0010733; BB/P009751/1; BB/R014582/1; BB/S020160/1; BB/S004610/1; MCB-2014408; GM-114297; P26333; ANR-16-CE12-0032; JPMJCR13B4; DBI1949036; MCB 2014542; STA653/14-1; STA653/15-1; IOS-154173; WA2167/8-1
OSTI ID:
1904476
Journal Information:
Genome Biology (Online), Vol. 23, Issue 1; ISSN 1474-760X
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
