U.S. Department of Energy
Office of Scientific and Technical Information

A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis

Journal Article · Genome Biology (Online)
 [1]; [2]; [3]; [3]; [3]; [4]; [5]; [4]; [6]; [7]; [7]; [8]; [9]; [10]; [11]; [12]; [8]; [13]; [14]; [15]; [16]; [17]; [18]; [19]; [9]; [7]; [17]; [20]; [16]; [21]; [10]; [4]; [22]; [23]; [11]; [13]; [15]; [6]; [3]
  1. James Hutton Institute, Dundee (Scotland); OSTI
  2. Mocean Energy, Edinburgh (United Kingdom)
  3. University of Dundee (Scotland). James Hutton Institute
  4. James Hutton Institute, Dundee (Scotland)
  5. Centre for Genomic Regulation, Barcelona (Spain)
  6. University of Natural Resources and Life Sciences (BOKU), Vienna (Austria)
  7. RIKEN Center for Sustainable Resource Science, Yokohama (Japan)
  8. University of York Wentworth Way, York (United Kingdom)
  9. Fujian Agriculture and Forestry University, Fuzhou (China)
  10. University of Tubingen (Germany)
  11. Spanish National Research Council, Paterna (Spain)
  12. University Paris-Saclay, Gif-sur-Yvette (France)
  13. Colorado State University, Fort Collins, CO (United States)
  14. University of Texas, Austin, TX (United States)
  15. Medical University of Vienna (Austria)
  16. Adam Mickiewicz University, Poznań (Poland)
  17. Bielefeld University (Germany)
  18. Carl von Ossietzky Universität Oldenburg (Germany); Martin Luther University Halle-Wittenberg, Halle (Germany)
  19. Western University of Health Sciences, Pomona, CA (United States); Xiamen University (China)
  20. Oklahoma State University, Stillwater, OK (United States)
  21. Institute of Plant and Microbial Biology, Taipei (Taiwan)
  22. Hong Kong Baptist University, Hong Kong (China)
  23. St. Bonaventure University, NY (United States)

Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures, including alternative splicing and transcription start and polyadenylation sites. However, accuracy is significantly affected by sequencing errors, mRNA degradation, or incomplete cDNA synthesis. We present a new and comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3). AtRTD3 contains over 169,000 transcripts, twice as many as the best current Arabidopsis transcriptome, and includes over 1500 novel genes. Seventy-eight percent of transcripts are from Iso-seq with accurately defined splice junctions and transcription start and end sites. We develop novel methods to determine splice junctions and transcription start and end sites accurately. Mismatch profiles around splice junctions provide a powerful feature to distinguish correct splice junctions and remove false splice junctions. Stratified approaches identify high-confidence transcription start and end sites and remove fragmentary transcripts that result from degradation. AtRTD3 is a major improvement over existing transcriptomes, as demonstrated by analysis of an Arabidopsis cold response RNA-seq time-series. AtRTD3 provides higher resolution of transcript expression profiling and identifies cold-induced differential transcription start and polyadenylation site usage. AtRTD3 is currently the most comprehensive Arabidopsis transcriptome. It improves the precision of differential gene and transcript expression, differential alternative splicing, and transcription start/end site usage analysis from RNA-seq data. The novel methods for identifying accurate splice junctions and transcription start/end sites are widely applicable and will improve single-molecule sequencing analysis from any species.
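
The abstract does not spell out how mismatch profiles are used to separate correct from false splice junctions. As a rough illustration only, the Python sketch below assumes that a junction is kept when at least one supporting read has no mismatch or indel within a fixed window of exonic sequence around it. The `JunctionRead` structure, the 10-nt window, and the `min_clean_reads` threshold are all hypothetical choices made for this example, not the parameters or code of the published pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class JunctionRead:
    """Evidence from one long read at one splice junction (hypothetical structure)."""
    junction: str                 # junction identifier, e.g. "Chr1:1000-1200:+"
    mismatch_offsets: List[int]   # distance (nt) of each mismatch/indel from the
                                  # junction; negative = upstream exon, positive = downstream


def has_clean_flank(read: JunctionRead, window: int = 10) -> bool:
    """True if the read has no mismatch within `window` nt of the junction.

    The window size is an assumption chosen for illustration.
    """
    return not any(abs(offset) <= window for offset in read.mismatch_offsets)


def filter_junctions(reads: Iterable[JunctionRead],
                     window: int = 10,
                     min_clean_reads: int = 1) -> Set[str]:
    """Return junctions supported by at least `min_clean_reads` clean-flank reads."""
    clean_counts = {}
    for read in reads:
        # Register every junction, even if none of its reads turn out to be clean.
        clean_counts.setdefault(read.junction, 0)
        if has_clean_flank(read, window):
            clean_counts[read.junction] += 1
    return {j for j, n in clean_counts.items() if n >= min_clean_reads}


# Toy data: the first junction has one clean read; the second has only a read
# with a mismatch 2 nt from the junction, so it is treated as unreliable.
example_reads = [
    JunctionRead("Chr1:1000-1200:+", [-3]),
    JunctionRead("Chr1:1000-1200:+", [25]),
    JunctionRead("Chr1:5000-5100:-", [2, -40]),
]
kept = filter_junctions(example_reads)
```

In this toy input only `Chr1:1000-1200:+` is retained. A real analysis would derive the mismatch offsets from read alignments and would tune the window and support threshold against known junctions.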

Research Organization:
Colorado State University, Fort Collins, CO (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); Biotechnology and Biological Sciences Research Council (BBSRC); Scottish Government Rural and Environment Science and Analytical Services division (RESAS); National Science Foundation; National Institutes of Health (NIH); Austrian Science Fund (FWF); French Agence Nationale de la Recherche; Japan Science and Technology Agency (JST); Core Research for Evolutionary Science and Technology (CREST); German Research Foundation (DFG); Research Grants Council (RGC) of Hong Kong
Grant/Contract Number:
SC0010733
OSTI ID:
1904476
Journal Information:
Genome Biology (Online), Vol. 23, Issue 1; ISSN 1474-760X
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
