DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum

Abstract

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly¹. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. As a result, the Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
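
The abstract reports contiguity as an N50 length. As a minimal sketch of how that statistic is commonly computed (this is not the authors' pipeline; the function name `n50` and the toy contig lengths are hypothetical and chosen only for illustration), N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly size:

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Toy lengths (arbitrary units), not data from the Oropetium assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300, half is 150, reached at the 70-unit contig
```

Read this way, an N50 of 2.4 megabases means that contigs of at least 2.4 megabases together make up at least half of the assembled sequence.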

Authors:
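VanBuren, Robert [1]; Bryant, Doug [1]; Edger, Patrick P. [2]; Tang, Haibao [3]; Burgess, Diane [4]; Challabathula, Dinakar [5]; Spittle, Kristi [6]; Hall, Richard [6]; Gu, Jenny [6]; Lyons, Eric [7]; Freeling, Michael [4]; Bartels, Dorothea [8]; Ten Hallers, Boudewijn [9]; Hastie, Alex [9]; Michael, Todd P. [10]; Mockler, Todd C. [1]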
  1. Donald Danforth Plant Science Center, St. Louis, MO (United States)
  2. Univ. of California, Berkeley, CA (United States); Michigan State Univ., East Lansing, MI (United States)
  3. Univ. of Arizona, Tucson, AZ (United States); Fujian Agriculture and Forestry Univ., Fuzhou (China)
  4. Univ. of California, Berkeley, CA (United States)
  5. Univ. of Bonn, Bonn (Germany); Central Univ. of Tamil Nadu, Thiruvarur (India)
  6. Pacific Biosciences, Menlo Park, CA (United States)
  7. Univ. of Arizona, Tucson, AZ (United States)
  8. Univ. of Bonn, Bonn (Germany)
  9. BioNano Genomics, San Diego, CA (United States)
  10. Ibis Biosciences, Carlsbad, CA (United States)
Publication Date:
November 2015
Research Org.:
Donald Danforth Plant Science Center, St. Louis, MO (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1436509
Grant/Contract Number:  
SC0012639
Resource Type:
Accepted Manuscript
Journal Name:
Nature (London)
Additional Journal Information:
Journal Name: Nature (London); Journal Volume: 527; Journal Issue: 7579; Journal ID: ISSN 0028-0836
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Comparative genomics; Drought; Genome evolution; Plant evolution

Citation Formats

VanBuren, Robert, Bryant, Doug, Edger, Patrick P., Tang, Haibao, Burgess, Diane, Challabathula, Dinakar, Spittle, Kristi, Hall, Richard, Gu, Jenny, Lyons, Eric, Freeling, Michael, Bartels, Dorothea, Ten Hallers, Boudewijn, Hastie, Alex, Michael, Todd P., and Mockler, Todd C. Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum. United States: N. p., 2015. Web. doi:10.1038/nature15714.
VanBuren, Robert, Bryant, Doug, Edger, Patrick P., Tang, Haibao, Burgess, Diane, Challabathula, Dinakar, Spittle, Kristi, Hall, Richard, Gu, Jenny, Lyons, Eric, Freeling, Michael, Bartels, Dorothea, Ten Hallers, Boudewijn, Hastie, Alex, Michael, Todd P., & Mockler, Todd C. Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum. United States. doi:10.1038/nature15714.
VanBuren, Robert, Bryant, Doug, Edger, Patrick P., Tang, Haibao, Burgess, Diane, Challabathula, Dinakar, Spittle, Kristi, Hall, Richard, Gu, Jenny, Lyons, Eric, Freeling, Michael, Bartels, Dorothea, Ten Hallers, Boudewijn, Hastie, Alex, Michael, Todd P., and Mockler, Todd C. Wed . "Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum". United States. doi:10.1038/nature15714. https://www.osti.gov/servlets/purl/1436509.
@article{osti_1436509,
title = {Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum},
author = {VanBuren, Robert and Bryant, Doug and Edger, Patrick P. and Tang, Haibao and Burgess, Diane and Challabathula, Dinakar and Spittle, Kristi and Hall, Richard and Gu, Jenny and Lyons, Eric and Freeling, Michael and Bartels, Dorothea and Ten Hallers, Boudewijn and Hastie, Alex and Michael, Todd P. and Mockler, Todd C.},
abstractNote = {Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly1. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. As a result, the Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.},
doi = {10.1038/nature15714},
journal = {Nature (London)},
number = 7579,
volume = 527,
place = {United States},
year = {2015},
month = {11}
}

Citation Metrics:
Cited by: 98 works
Citation information provided by
Web of Science

Works referenced in this record:

The Sorghum bicolor genome and the diversification of grasses
journal, January 2009

  • Paterson, Andrew H.; Bowers, John E.; Bruggmann, Rémy
  • Nature, Vol. 457, Issue 7229
  • DOI: 10.1038/nature07723

Topological analysis and interactive visualization of biological networks and protein structures
journal, March 2012

  • Doncheva, Nadezhda T.; Assenov, Yassen; Domingues, Francisco S.
  • Nature Protocols, Vol. 7, Issue 4
  • DOI: 10.1038/nprot.2012.004

Patching gaps in plant genomes results in gene movement and erosion of colinearity
journal, June 2010


InterProScan: protein domains identifier
journal, July 2005

  • Quevillon, E.; Silventoinen, V.; Pillai, S.
  • Nucleic Acids Research, Vol. 33, Issue Web Server
  • DOI: 10.1093/nar/gki442

Fast and accurate short read alignment with Burrows-Wheeler transform
journal, May 2009


Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data
journal, May 2013

  • Chin, Chen-Shan; Alexander, David H.; Marks, Patrick
  • Nature Methods, Vol. 10, Issue 6
  • DOI: 10.1038/nmeth.2474

MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes
journal, November 2007

  • Cantarel, B. L.; Korf, I.; Robb, S. M. C.
  • Genome Research, Vol. 18, Issue 1
  • DOI: 10.1101/gr.6743907

Trimmomatic: a flexible trimmer for Illumina sequence data
journal, April 2014


Rice by the numbers: A good grain
journal, October 2014


The map-based sequence of the rice genome
journal, August 2005


Considering Transposable Element Diversification in De Novo Annotation Approaches
journal, January 2011


The miniature genome of a carnivorous plant Genlisea aurea contains a low number of genes and short non-coding sequences
journal, January 2013

  • Leushkin, Evgeny V.; Sutormin, Roman A.; Nabieva, Elena R.
  • BMC Genomics, Vol. 14, Issue 1
  • DOI: 10.1186/1471-2164-14-476

Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks
journal, March 2012


A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome
journal, July 2014


Comparative Genomic Paleontology across Plant Kingdom Reveals the Dynamics of TE-Driven Genome Evolution
journal, February 2013

  • El Baidouri, Moaine; Panaud, Olivier
  • Genome Biology and Evolution, Vol. 5, Issue 5
  • DOI: 10.1093/gbe/evt025

Full-length transcriptome assembly from RNA-Seq data without a reference genome
journal, May 2011

  • Grabherr, Manfred G.; Haas, Brian J.; Yassour, Moran
  • Nature Biotechnology, Vol. 29, Issue 7
  • DOI: 10.1038/nbt.1883

Plant genome size variation: bloating and purging DNA
journal, March 2014


Architecture and evolution of a minute plant genome
journal, May 2013

  • Ibarra-Laclette, Enrique; Lyons, Eric; Hernández-Guzmán, Gustavo
  • Nature, Vol. 498, Issue 7452
  • DOI: 10.1038/nature12132

Screening synteny blocks in pairwise genome comparisons through integer programming
journal, April 2011


OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes
journal, September 2003


Angiosperm genome comparisons reveal early polyploidy in the monocot lineage
journal, December 2009

  • Tang, H.; Bowers, J. E.; Wang, X.
  • Proceedings of the National Academy of Sciences, Vol. 107, Issue 1
  • DOI: 10.1073/pnas.0908007107

Resolving the complexity of the human genome using single-molecule sequencing
journal, November 2014

  • Chaisson, Mark J. P.; Huddleston, John; Dennis, Megan Y.
  • Nature, Vol. 517, Issue 7536
  • DOI: 10.1038/nature13907

The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus)
journal, April 2008


Adaptive seeds tame genomic sequence comparison
journal, January 2011


The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools
journal, December 2011

  • Lamesch, Philippe; Berardini, Tanya Z.; Li, Donghui
  • Nucleic Acids Research, Vol. 40, Issue D1
  • DOI: 10.1093/nar/gkr1090

Tandem repeats finder: a program to analyze DNA sequences
journal, January 1999


Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution
journal, January 2013


A travel guide to Cytoscape plugins
journal, November 2012

  • Saito, Rintaro; Smoot, Michael E.; Ono, Keiichiro
  • Nature Methods, Vol. 9, Issue 11
  • DOI: 10.1038/nmeth.2212

Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology
journal, December 2014


A Solution to the C-Value Paradox and the Function of Junk DNA: The Genome Balance Hypothesis
journal, June 2015


STRING v9.1: protein-protein interaction networks, with increased coverage and integration
journal, November 2012

  • Franceschini, Andrea; Szklarczyk, Damian; Frankild, Sune
  • Nucleic Acids Research, Vol. 41, Issue D1
  • DOI: 10.1093/nar/gks1094

Genome size is a strong predictor of cell size and stomatal density in angiosperms
journal, September 2008


De novo identification of repeat families in large genomes
journal, June 2005


The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data
journal, July 2010


Genome conflict in the gramineae
journal, November 2004


LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons
journal, January 2008

  • Ellinghaus, David; Kurtz, Stefan; Willhoeft, Ute
  • BMC Bioinformatics, Vol. 9, Issue 1
  • DOI: 10.1186/1471-2105-9-18

Progress, challenges and the future of crop genomes
journal, April 2015


Reference genome sequence of the model plant Setaria
journal, May 2012

  • Bennetzen, Jeffrey L.; Schmutz, Jeremy; Wang, Hao
  • Nature Biotechnology, Vol. 30, Issue 6
  • DOI: 10.1038/nbt.2196

The Universal Protein Resource (UniProt): an expanding universe of protein information
journal, January 2006


Preparation of megabase-size DNA from plant nuclei
journal, January 1995


Defining functional DNA elements in the human genome
journal, April 2014

  • Kellis, M.; Wold, B.; Snyder, M. P.
  • Proceedings of the National Academy of Sciences, Vol. 111, Issue 17
  • DOI: 10.1073/pnas.1318948111

Assembling large genomes with single-molecule sequencing and locality-sensitive hashing
journal, May 2015

  • Berlin, Konstantin; Koren, Sergey; Chin, Chen-Shan
  • Nature Biotechnology, Vol. 33, Issue 6
  • DOI: 10.1038/nbt.3238

Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly
journal, July 2012

  • Lam, Ernest T.; Hastie, Alex; Lin, Chin
  • Nature Biotechnology, Vol. 30, Issue 8
  • DOI: 10.1038/nbt.2303

Pfam: the protein families database
journal, November 2013

  • Finn, Robert D.; Bateman, Alex; Clements, Jody
  • Nucleic Acids Research, Vol. 42, Issue D1
  • DOI: 10.1093/nar/gkt1223

The B73 Maize Genome: Complexity, Diversity, and Dynamics
journal, November 2009

  • Schnable, P. S.; Ware, D.; Fulton, R. S.
  • Science, Vol. 326, Issue 5956, p. 1112-1115
  • DOI: 10.1126/science.1178534

CD-HIT Suite: a web server for clustering and comparing biological sequences
journal, January 2010


Characterization of the human ESC transcriptome by hybrid sequencing
journal, November 2013

  • Au, K. F.; Sebastiano, V.; Afshar, P. T.
  • Proceedings of the National Academy of Sciences, Vol. 110, Issue 50
  • DOI: 10.1073/pnas.1320101110

Repbase Update, a database of eukaryotic repetitive elements
journal, January 2005

  • Jurka, J.; Kapitonov, V. V.; Pavlicek, A.
  • Cytogenetic and Genome Research, Vol. 110, Issue 1-4
  • DOI: 10.1159/000084979

Do Plants Have a One-Way Ticket to Genomic Obesity?
journal, September 1997

  • Bennetzen, Jeffrey L.; Kellogg, Elizabeth A.
  • The Plant Cell, Vol. 9, Issue 9
  • DOI: 10.2307/3870439

The Spirodela polyrhiza genome reveals insights into its neotenous reduction fast growth and aquatic lifestyle
journal, February 2014

  • Wang, W.; Haberer, G.; Gundlach, H.
  • Nature Communications, Vol. 5, Issue 1
  • DOI: 10.1038/ncomms4311

Works referencing / citing this record:

Computational aspects underlying genome to phenome analysis in plants
journal, January 2019

  • Bolger, Anthony M.; Poorter, Hendrik; Dumschott, Kathryn
  • The Plant Journal, Vol. 97, Issue 1
  • DOI: 10.1111/tpj.14179

Haplotype-phased genome and evolution of phytonutrient pathways of tetraploid blueberry
journal, January 2019


Origin and evolution of the octoploid strawberry genome
journal, February 2019


Haplotype-resolved genomes of geminivirus-resistant and geminivirus-susceptible African cassava cultivars
journal, September 2019


The Small Nuclear Genomes of Selaginella Are Associated with a Low Rate of Genome Size Evolution
journal, April 2016

  • Baniaga, Anthony E.; Arrigo, Nils; Barker, Michael S.
  • Genome Biology and Evolution, Vol. 8, Issue 5
  • DOI: 10.1093/gbe/evw091

The Rosa genome provides new insights into the domestication of modern roses
journal, April 2018


Single-molecule sequencing resolves the detailed structure of complex satellite DNA loci in Drosophila melanogaster
journal, April 2017

  • Khost, Daniel E.; Eickbush, Danna G.; Larracuente, Amanda M.
  • Genome Research, Vol. 27, Issue 5
  • DOI: 10.1101/gr.213512.116

Improved maize reference genome with single-molecule technologies
journal, June 2017

  • Jiao, Yinping; Peluso, Paul; Shi, Jinghua
  • Nature, Vol. 546, Issue 7659
  • DOI: 10.1038/nature22971

Developing naturally stress-resistant crops for a sustainable agriculture
journal, November 2018


Telling plant species apart with DNA: from barcodes to genomes
journal, September 2016

  • Hollingsworth, Peter M.; Li, De-Zhu; van der Bank, Michelle
  • Philosophical Transactions of the Royal Society B: Biological Sciences, Vol. 371, Issue 1702
  • DOI: 10.1098/rstb.2015.0338

The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization
journal, December 2017

  • McCormick, Ryan F.; Truong, Sandra K.; Sreedasyam, Avinash
  • The Plant Journal, Vol. 93, Issue 2
  • DOI: 10.1111/tpj.13781

PacBio single-molecule long-read sequencing shed new light on the complexity of the Carex breviculmis transcriptome
journal, October 2019