DOE PAGES: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Long-read, whole-genome shotgun sequence data for five model organisms

Abstract

Single molecule, real-time (SMRT) sequencing from Pacific Biosciences is increasingly used in many areas of biological research including de novo genome assembly, structural-variant identification, haplotype phasing, mRNA isoform discovery, and base-modification analyses. High-quality, public datasets of SMRT sequences can spur development of analytic tools that can accommodate unique characteristics of SMRT data (long read lengths, lack of GC or amplification bias, and a random error profile leading to high consensus accuracy). In this paper, we describe eight high-coverage SMRT sequence datasets from five organisms (Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, Arabidopsis thaliana, and Drosophila melanogaster) that have been publicly released to the general scientific community (NCBI Sequence Read Archive ID SRP040522). Data were generated using two sequencing chemistries (P4C2 and P5C3) on the PacBio RS II instrument. The datasets reported here can be used without restriction by the research community to generate whole-genome assemblies, test new algorithms, investigate genome structure and evolution, and identify base modifications in some of the most widely-studied model systems in biological research.
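
The abstract's point that a random error profile leads to high consensus accuracy can be illustrated with a small calculation. The sketch below is not from the paper: it assumes a simplified model in which each read covering a base is independently right or wrong, the consensus is a simple majority vote, and ties count as errors. The 15% per-read error rate and the function name `consensus_error` are illustrative assumptions only.

```python
from math import comb


def consensus_error(n_reads: int, per_read_error: float) -> float:
    """Probability that a majority vote over n_reads independent reads is wrong.

    Simplified two-outcome model (an assumption, not the paper's method):
    each read is either correct or incorrect at a position, errors are
    independent, and a tie is counted as a consensus error.
    """
    threshold = n_reads / 2
    return sum(
        comb(n_reads, k)
        * per_read_error**k
        * (1 - per_read_error) ** (n_reads - k)
        for k in range(n_reads + 1)
        if k >= threshold  # at least half of the reads are wrong
    )


if __name__ == "__main__":
    # Hypothetical per-read error rate, chosen only for illustration.
    for depth in (1, 5, 15, 30):
        print(depth, consensus_error(depth, 0.15))
```

Under these assumptions the consensus error falls quickly as coverage grows, which is why unbiased errors matter more for consensus quality than the raw per-read error rate. Systematic (non-random) errors would not average out this way.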

Authors:
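 Kim, Kristi E. [1];  Peluso, Paul [1];  Babayan, Primo [1];  Yeadon, P. Jane [2];  Yu, Charles [3];  Fisher, William W. [3];  Chin, Chen-Shan [1];  Rapicavoli, Nicole A. [1];  Rank, David R. [1];  Li, Joachim [4];  Catcheside, David E. A. [2];  Celniker, Susan E. [3];  Phillippy, Adam M. [5];  Bergman, Casey M. [6];  Landolin, Jane M. [1]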
  1. Pacific Biosciences of California Inc., Menlo Park, CA (United States)
  2. Flinders Univ., Adelaide, SA (Australia). School of Biological Sciences
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Dept. of Genome Dynamics
  4. Univ. of California, San Francisco, CA (United States). Dept. of Microbiology and Immunology
  5. National Biodefense Analysis and Countermeasures Center, Frederick, MD (United States)
  6. Univ. of Manchester (United Kingdom). Faculty of Life Sciences
Publication Date:
2014-11-25
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
OSTI Identifier:
1624538
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Scientific Data
Additional Journal Information:
Journal Volume: 1; Journal Issue: 1; Journal ID: ISSN 2052-4463
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Science & Technology - Other Topics

Citation Formats

Kim, Kristi E., Peluso, Paul, Babayan, Primo, Yeadon, P. Jane, Yu, Charles, Fisher, William W., Chin, Chen-Shan, Rapicavoli, Nicole A., Rank, David R., Li, Joachim, Catcheside, David E. A., Celniker, Susan E., Phillippy, Adam M., Bergman, Casey M., and Landolin, Jane M. Long-read, whole-genome shotgun sequence data for five model organisms. United States: N. p., 2014. Web. doi:10.1038/sdata.2014.45.
Kim, Kristi E., Peluso, Paul, Babayan, Primo, Yeadon, P. Jane, Yu, Charles, Fisher, William W., Chin, Chen-Shan, Rapicavoli, Nicole A., Rank, David R., Li, Joachim, Catcheside, David E. A., Celniker, Susan E., Phillippy, Adam M., Bergman, Casey M., & Landolin, Jane M. Long-read, whole-genome shotgun sequence data for five model organisms. United States. https://doi.org/10.1038/sdata.2014.45
Kim, Kristi E., Peluso, Paul, Babayan, Primo, Yeadon, P. Jane, Yu, Charles, Fisher, William W., Chin, Chen-Shan, Rapicavoli, Nicole A., Rank, David R., Li, Joachim, Catcheside, David E. A., Celniker, Susan E., Phillippy, Adam M., Bergman, Casey M., and Landolin, Jane M. 2014. "Long-read, whole-genome shotgun sequence data for five model organisms". United States. https://doi.org/10.1038/sdata.2014.45. https://www.osti.gov/servlets/purl/1624538.
@article{osti_1624538,
title = {Long-read, whole-genome shotgun sequence data for five model organisms},
author = {Kim, Kristi E. and Peluso, Paul and Babayan, Primo and Yeadon, P. Jane and Yu, Charles and Fisher, William W. and Chin, Chen-Shan and Rapicavoli, Nicole A. and Rank, David R. and Li, Joachim and Catcheside, David E. A. and Celniker, Susan E. and Phillippy, Adam M. and Bergman, Casey M. and Landolin, Jane M.},
abstractNote = {Single molecule, real-time (SMRT) sequencing from Pacific Biosciences is increasingly used in many areas of biological research including de novo genome assembly, structural-variant identification, haplotype phasing, mRNA isoform discovery, and base-modification analyses. High-quality, public datasets of SMRT sequences can spur development of analytic tools that can accommodate unique characteristics of SMRT data (long read lengths, lack of GC or amplification bias, and a random error profile leading to high consensus accuracy). In this paper, we describe eight high-coverage SMRT sequence datasets from five organisms (Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, Arabidopsis thaliana, and Drosophila melanogaster) that have been publicly released to the general scientific community (NCBI Sequence Read Archive ID SRP040522). Data were generated using two sequencing chemistries (P4C2 and P5C3) on the PacBio RS II instrument. The datasets reported here can be used without restriction by the research community to generate whole-genome assemblies, test new algorithms, investigate genome structure and evolution, and identify base modifications in some of the most widely-studied model systems in biological research.},
doi = {10.1038/sdata.2014.45},
journal = {Scientific Data},
number = 1,
volume = 1,
place = {United States},
year = {2014},
month = {11}
}

