Title: tRNA functional signatures classify plastids as late-branching cyanobacteria

Abstract

Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plastids by endosymbiosis is broadly supported, recent phylogenomic studies are contradictory on whether plastids branch early or late within Cyanobacteria. One underlying cause may be poor fit of evolutionary models to complex phylogenomic data. Using Posterior Predictive Analysis, we show that recently applied evolutionary models poorly fit three phylogenomic datasets curated from cyanobacteria and plastid genomes because of heterogeneities in both substitution processes across sites and of compositions across lineages. To circumvent these sources of bias, we developed CYANO-MLP, a machine learning algorithm that consistently and accurately phylogenetically classifies (“phyloclassifies”) cyanobacterial genomes to their clade of origin based on bioinformatically predicted function-informative features in tRNA gene complements. Classification of cyanobacterial genomes with CYANO-MLP is accurate and robust to deletion of clades, unbalanced sampling, and compositional heterogeneity in input tRNA data. CYANO-MLP consistently classifies plastid genomes into a late-branching cyanobacterial sub-clade containing single-cell, starch-producing, nitrogen-fixing ecotypes, consistent with metabolic and gene transfer data. Phylogenomic data of cyanobacteria and plastids exhibit both site-process heterogeneities and compositional heterogeneities across lineages. These aspects of the data require careful modeling to avoid bias in phylogenomic estimation. Furthermore, we show that amino acid recoding strategies may be insufficient to mitigate bias from compositional heterogeneities. However, the combination of our novel tRNA-specific strategy with machine learning in CYANO-MLP appears robust to these sources of bias with high accuracy in phyloclassification of cyanobacterial genomes. CYANO-MLP consistently classifies plastids as late-branching Cyanobacteria, consistent with independent evidence from signature-based approaches and some previous phylogenetic studies.

Authors:
Lawrence, Travis J. [1]; Amrine, Katherine C. H. [2]; Swingley, Wesley D. [3]; Ardell, David H. [4]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of California, Merced, CA (United States)
  2. Univ. of California, Merced, CA (United States); Insight Data Science, San Francisco, CA (United States)
  3. Northern Illinois Univ., DeKalb, IL (United States)
  4. Univ. of California, Merced, CA (United States)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE; National Science Foundation (NSF); NIH/NIAID
OSTI Identifier:
1606714
Grant/Contract Number:  
AC05-00OR22725; INSPIRE-1344279; 1R21AI127582-0; ACI-1429783
Resource Type:
Accepted Manuscript
Journal Name:
BMC Evolutionary Biology (Online)
Additional Journal Information:
Journal Name: BMC Evolutionary Biology (Online); Journal Volume: 19; Journal Issue: 1; Journal ID: ISSN 1471-2148
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; plastids; tRNAs; cyanobacteria; primary endosymbiosis; machine learning

Citation Formats

Lawrence, Travis J., Amrine, Katherine C. H., Swingley, Wesley D., and Ardell, David H. tRNA functional signatures classify plastids as late-branching cyanobacteria. United States: N. p., 2019. Web. https://doi.org/10.1186/s12862-019-1552-7.
Lawrence, Travis J., Amrine, Katherine C. H., Swingley, Wesley D., & Ardell, David H. tRNA functional signatures classify plastids as late-branching cyanobacteria. United States. https://doi.org/10.1186/s12862-019-1552-7
Lawrence, Travis J., Amrine, Katherine C. H., Swingley, Wesley D., and Ardell, David H. 2019. "tRNA functional signatures classify plastids as late-branching cyanobacteria". United States. https://doi.org/10.1186/s12862-019-1552-7. https://www.osti.gov/servlets/purl/1606714.
@article{osti_1606714,
title = {tRNA functional signatures classify plastids as late-branching cyanobacteria},
author = {Lawrence, Travis J. and Amrine, Katherine C. H. and Swingley, Wesley D. and Ardell, David H.},
abstractNote = {Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plastids by endosymbiosis is broadly supported, recent phylogenomic studies are contradictory on whether plastids branch early or late within Cyanobacteria. One underlying cause may be poor fit of evolutionary models to complex phylogenomic data. Using Posterior Predictive Analysis, we show that recently applied evolutionary models poorly fit three phylogenomic datasets curated from cyanobacteria and plastid genomes because of heterogeneities in both substitution processes across sites and of compositions across lineages. To circumvent these sources of bias, we developed CYANO-MLP, a machine learning algorithm that consistently and accurately phylogenetically classifies (“phyloclassifies”) cyanobacterial genomes to their clade of origin based on bioinformatically predicted function-informative features in tRNA gene complements. Classification of cyanobacterial genomes with CYANO-MLP is accurate and robust to deletion of clades, unbalanced sampling, and compositional heterogeneity in input tRNA data. CYANO-MLP consistently classifies plastid genomes into a late-branching cyanobacterial sub-clade containing single-cell, starch-producing, nitrogen-fixing ecotypes, consistent with metabolic and gene transfer data. Phylogenomic data of cyanobacteria and plastids exhibit both site-process heterogeneities and compositional heterogeneities across lineages. These aspects of the data require careful modeling to avoid bias in phylogenomic estimation. Furthermore, we show that amino acid recoding strategies may be insufficient to mitigate bias from compositional heterogeneities. 
However, the combination of our novel tRNA-specific strategy with machine learning in CYANO-MLP appears robust to these sources of bias with high accuracy in phyloclassification of cyanobacterial genomes. CYANO-MLP consistently classifies plastids as late-branching Cyanobacteria, consistent with independent evidence from signature-based approaches and some previous phylogenetic studies.},
doi = {10.1186/s12862-019-1552-7},
journal = {BMC Evolutionary Biology (Online)},
number = 1,
volume = 19,
place = {United States},
year = {2019},
month = {12}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Works referenced in this record:

A Bayesian Mixture Model for Across-Site Heterogeneities in the Amino-Acid Replacement Process
journal, June 2004

  • Lartillot, Nicolas; Philippe, Hervé
  • Molecular Biology and Evolution, Vol. 21, Issue 6
  • DOI: 10.1093/molbev/msh112

A Site- and Time-Heterogeneous Model of Amino Acid Replacement
journal, January 2008

  • Blanquart, Samuel; Lartillot, Nicolas
  • Molecular Biology and Evolution, Vol. 25, Issue 5
  • DOI: 10.1093/molbev/msn018

Phylogeny and Self-Splicing Ability of the Plastid tRNA-Leu Group I Intron
journal, December 2003


An Early-Branching Freshwater Cyanobacterium at the Origin of Plastids
journal, February 2017

  • Ponce-Toledo, Rafael I.; Deschamps, Philippe; López-García, Purificación
  • Current Biology, Vol. 27, Issue 3
  • DOI: 10.1016/j.cub.2016.11.056

The plastid ancestor originated among one of the major cyanobacterial lineages
journal, September 2014

  • Ochoa de Alda, Jesús A. G.; Esteban, Rocío; Diago, María Luz
  • Nature Communications, Vol. 5, Issue 1
  • DOI: 10.1038/ncomms5937

ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences
journal, January 2004

  • Laslett, Dean; Canback, Bjorn
  • Nucleic Acids Research, Vol. 32, Issue 1, p. 11-16
  • DOI: 10.1093/nar/gkh152

Investigating Deep Phylogenetic Relationships among Cyanobacteria and Plastids by Small Subunit rRNA Sequence Analysis
journal, July 1999


Compilation of tRNA sequences and sequences of tRNA genes
journal, January 1998

  • Sprinzl, Mathias; Horn, Carsten; Brown, Melissa
  • Nucleic Acids Research, Vol. 26, Issue 1, p. 148-153
  • DOI: 10.1093/nar/26.1.148

Estimating the timing of early eukaryotic diversification with multigene molecular clocks
journal, August 2011

  • Parfrey, L. W.; Lahr, D. J. G.; Knoll, A. H.
  • Proceedings of the National Academy of Sciences, Vol. 108, Issue 33
  • DOI: 10.1073/pnas.1110633108

An Improved General Amino Acid Replacement Matrix
journal, April 2008


Genes of Cyanobacterial Origin in Plant Nuclear Genomes Point to a Heterocyst-Forming Plastid Ancestor
journal, February 2008

  • Deusch, O.; Landan, G.; Roettger, M.
  • Molecular Biology and Evolution, Vol. 25, Issue 4
  • DOI: 10.1093/molbev/msn022

Large-Scale Phylogenomic Analyses Indicate a Deep Origin of Primary Plastids within Cyanobacteria
journal, June 2011

  • Criscuolo, Alexis; Gribaldo, Simonetta
  • Molecular Biology and Evolution, Vol. 28, Issue 11
  • DOI: 10.1093/molbev/msr108

RNA sequence analysis using covariance models
journal, January 1994


Initiator tRNA genes template the 3′ CCA end at high frequencies in bacteria
journal, December 2016


Evolution: Red Algal Genome Affirms a Common Origin of All Plastids
journal, July 2004


Annotated English translation of Mereschkowsky's 1905 paper ‘Über Natur und Ursprung der Chromatophoren im Pflanzenreiche’
journal, August 1999


The evolution of glycogen and starch metabolism in eukaryotes gives molecular clues to understand the establishment of plastid endosymbiosis
journal, January 2011

  • Ball, Steven; Colleoni, Christophe; Cenci, Ugo
  • Journal of Experimental Botany, Vol. 62, Issue 6
  • DOI: 10.1093/jxb/erq411

The origin and early evolution of plants on land
journal, September 1997

  • Kenrick, Paul; Crane, Peter R.
  • Nature, Vol. 389, Issue 6646
  • DOI: 10.1038/37918

tRNA Signatures Reveal a Polyphyletic Origin of SAR11 Strains among Alphaproteobacteria
journal, February 2014

  • Amrine, Katherine C. H.; Swingley, Wesley D.; Ardell, David H.
  • PLoS Computational Biology, Vol. 10, Issue 2
  • DOI: 10.1371/journal.pcbi.1003454

tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence
journal, March 1997


Computational analysis of tRNA identity
journal, November 2009


Reductive genome evolution at both ends of the bacterial population size spectrum
journal, September 2014

  • Batut, Bérénice; Knibbe, Carole; Marais, Gabriel
  • Nature Reviews Microbiology, Vol. 12, Issue 12
  • DOI: 10.1038/nrmicro3331

Protein signatures (molecular synapomorphies) that are distinctive characteristics of the major cyanobacterial clades
journal, July 2009

  • Gupta, R. S.
  • International Journal of Systematic and Evolutionary Microbiology, Vol. 59, Issue 10
  • DOI: 10.1099/ijs.0.005678-0

The Evolutionary Origin of a Terrestrial Flora
journal, October 2015


Intraphylum Diversity and Complex Evolution of Cyanobacterial Aminoacyl-tRNA Synthetases
journal, August 2008

  • Luque, I.; Riera-Alberola, M. L.; Andujar, A.
  • Molecular Biology and Evolution, Vol. 25, Issue 11
  • DOI: 10.1093/molbev/msn197

On Reduced Amino Acid Alphabets for Phylogenetic Inference
journal, May 2007


Compositional Biases among Synonymous Substitutions Cause Conflict between Gene and Protein Trees for Plastid Origins
journal, May 2014

  • Li, Blaise; Lopes, João S.; Foster, Peter G.
  • Molecular Biology and Evolution, Vol. 31, Issue 7
  • DOI: 10.1093/molbev/msu105

Evolutionary constraints on the plastid tRNA set decoding methionine and isoleucine
journal, May 2012

  • Alkatib, Sibah; Fleischmann, Tobias T.; Scharff, Lars B.
  • Nucleic Acids Research, Vol. 40, Issue 14
  • DOI: 10.1093/nar/gks350

A new criterion and method for amino acid classification
journal, May 2004

  • Kosiol, Carolin; Goldman, Nick; Buttimore, Nigel H.
  • Journal of Theoretical Biology, Vol. 228, Issue 1
  • DOI: 10.1016/j.jtbi.2003.12.010

Improving the coverage of the cyanobacterial phylum using diversity-driven genome sequencing
journal, December 2012

  • Shih, P. M.; Wu, D.; Latifi, A.
  • Proceedings of the National Academy of Sciences, Vol. 110, Issue 3, p. 1053-1058
  • DOI: 10.1073/pnas.1217107110

SeaView Version 4: A Multiplatform Graphical User Interface for Sequence Alignment and Phylogenetic Tree Building
journal, October 2009

  • Gouy, M.; Guindon, S.; Gascuel, O.
  • Molecular Biology and Evolution, Vol. 27, Issue 2
  • DOI: 10.1093/molbev/msp259

Plastid establishment did not require a chlamydial partner
journal, March 2015

  • Domman, Daryl; Horn, Matthias; Embley, T. Martin
  • Nature Communications, Vol. 6, Issue 1
  • DOI: 10.1038/ncomms7421

A brief review of molecular information theory
journal, September 2010


PhyloBayes MPI: Phylogenetic Reconstruction with Infinite Mixtures of Profiles in a Parallel Environment
journal, April 2013

  • Lartillot, Nicolas; Rodrigue, Nicolas; Stubbs, Daniel
  • Systematic Biology, Vol. 62, Issue 4
  • DOI: 10.1093/sysbio/syt022

Computing Bayes Factors Using Thermodynamic Integration
journal, April 2006


Opportunities and obstacles for deep learning in biology and medicine
journal, April 2018

  • Ching, Travers; Himmelstein, Daniel S.; Beaulieu-Jones, Brett K.
  • Journal of The Royal Society Interface, Vol. 15, Issue 141
  • DOI: 10.1098/rsif.2017.0387

Modeling Compositional Heterogeneity
journal, June 2004


Primary endosymbiosis events date to the later Proterozoic with cross-calibrated phylogenetic dating of duplicated ATPase proteins
journal, June 2013

  • Shih, P. M.; Matzke, N. J.
  • Proceedings of the National Academy of Sciences, Vol. 110, Issue 30
  • DOI: 10.1073/pnas.1305813110

The gain of two chloroplast tRNA introns marks the green algal ancestors of land plants
journal, May 1990

  • Manhart, J. R.; Palmer, J. D.
  • Nature, Vol. 345, Issue 6272
  • DOI: 10.1038/345268a0

The Phylogeny of Plastids: a Review Based on Comparisons of Small-Subunit Ribosomal RNA Coding Regions
journal, August 1995


Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes
journal, February 2004

  • Timmis, Jeremy N.; Ayliffe, Michael A.; Huang, Chun Y.
  • Nature Reviews Genetics, Vol. 5, Issue 2
  • DOI: 10.1038/nrg1271

Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model
journal, January 2007

  • Lartillot, Nicolas; Brinkmann, Henner; Philippe, Hervé
  • BMC Evolutionary Biology, Vol. 7, Issue Suppl 1
  • DOI: 10.1186/1471-2148-7-S1-S4

Compilation and comparison of transfer RNA genes from tobacco chloroplasts
journal, January 1989

  • Sugiura, Masahiro; Wakasugi, Tatsuya; Kung, Shain‐dow
  • Critical Reviews in Plant Sciences, Vol. 8, Issue 2
  • DOI: 10.1080/07352688909382271

The Revised Classification of Eukaryotes
journal, September 2012


TFAM detects co-evolution of tRNA identity rules with lateral transfer of histidyl-tRNA synthetase
journal, February 2006


Displaying the information contents of structural RNA alignments: the structure logos
journal, January 1997


Genome sequence of the cyanobacterium Prochlorococcus marinus SS120, a nearly minimal oxyphototrophic genome
journal, August 2003

  • Dufresne, A.; Salanoubat, M.; Partensky, F.
  • Proceedings of the National Academy of Sciences, Vol. 100, Issue 17
  • DOI: 10.1073/pnas.1733211100

Early photosynthetic eukaryotes inhabited low-salinity habitats
journal, August 2017

  • Sánchez-Baracaldo, Patricia; Raven, John A.; Pisani, Davide
  • Proceedings of the National Academy of Sciences, Vol. 114, Issue 37
  • DOI: 10.1073/pnas.1620089114

Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche differentiation
journal, August 2003

  • Rocap, Gabrielle; Larimer, Frank W.; Lamerdin, Jane
  • Nature, Vol. 424, Issue 6952
  • DOI: 10.1038/nature01947

FAST: FAST Analysis of Sequences Toolbox
journal, May 2015

  • Lawrence, Travis J.; Kauffman, Kyle T.; Amrine, Katherine C. H.
  • Frontiers in Genetics, Vol. 6
  • DOI: 10.3389/fgene.2015.00172

Genomes of Stigonematalean Cyanobacteria (Subsection V) and the Evolution of Oxygenic Photosynthesis from Prokaryotes to Plastids
journal, December 2012

  • Dagan, Tal; Roettger, Mayo; Stucken, Karina
  • Genome Biology and Evolution, Vol. 5, Issue 1
  • DOI: 10.1093/gbe/evs117

Difficult phylogenetic questions: more data, maybe; better methods, certainly
journal, December 2011


Dating the cyanobacterial ancestor of the chloroplast
journal, March 2010

  • Falcón, Luisa I.; Magallón, Susana; Castillo, Amanda
  • The ISME Journal, Vol. 4, Issue 6
  • DOI: 10.1038/ismej.2010.2

Metabolic Symbiosis and the Birth of the Plant Kingdom
journal, January 2008

  • Deschamps, P.; Colleoni, C.; Nakamura, Y.
  • Molecular Biology and Evolution, Vol. 25, Issue 3
  • DOI: 10.1093/molbev/msm280
