Title: Inference of Transmission Network Structure from HIV Phylogenetic Trees

Abstract

Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions, we added a within-host evolutionary model in epidemiological simulations and examined if the resulting phylogeny could recover different types of contact networks. To further improve realism, we also introduced patient-specific differences in infectivity across disease stages, and on the epidemic level we considered incomplete sampling and the age of the epidemic. Second, we implemented an inference method based on approximate Bayesian computation (ABC) to discriminate among three well-studied network models and jointly estimate both network parameters and key epidemiological quantities such as the infection rate. Our ABC framework used both topological and distance-based tree statistics for comparison between simulated and observed trees. Overall, our simulations showed that a virus time-scaled phylogeny (genealogy) may be substantially different from the between-host transmission tree. This has important implications for the interpretation of what a phylogeny reveals about the underlying epidemic contact network. In particular, we found that while the within-host evolutionary process obscures the transmission tree, the diversification process and infectivity dynamics also add discriminatory power to differentiate between different types of contact networks. We also found that the possibility to differentiate contact networks depends on how far an epidemic has progressed, where distance-based tree statistics have more power early in an epidemic. Finally, we applied our ABC inference on two different outbreaks from the Swedish HIV-1 epidemic.
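
To illustrate the general idea of ABC-based model discrimination mentioned in the abstract, the following is a minimal sketch of a generic ABC rejection sampler for model choice. It is not the authors' implementation. The function name `abc_model_choice`, the toy simulators `sim_narrow` and `sim_wide`, the uniform prior, and the Euclidean distance are all assumptions made for illustration; a real analysis would replace the toy simulators with epidemic and within-host simulations that return tree statistics.

```python
import random
from collections import Counter


def abc_model_choice(observed, simulators, priors, distance, eps, n_draws, rng):
    """Generic ABC rejection sampler for model choice (illustrative sketch).

    simulators: dict mapping model name -> function(theta, rng) -> list of statistics
    priors:     dict mapping model name -> function(rng) -> parameter draw
    Returns (approximate posterior model probabilities, accepted (model, theta) pairs).
    """
    names = sorted(simulators)
    accepted = []
    for _ in range(n_draws):
        model = rng.choice(names)              # uniform prior over candidate models
        theta = priors[model](rng)             # parameter drawn from that model's prior
        stats = simulators[model](theta, rng)  # simulate summary statistics
        if distance(stats, observed) <= eps:   # keep draws close to the observed statistics
            accepted.append((model, theta))
    counts = Counter(m for m, _ in accepted)
    total = len(accepted)
    probs = {m: (counts[m] / total if total else 0.0) for m in names}
    return probs, accepted


# Hypothetical toy simulators: placeholders only, NOT network or tree simulators.
def sim_narrow(theta, rng):
    return [rng.gauss(theta, 0.1)]


def sim_wide(theta, rng):
    return [rng.gauss(theta, 2.0)]


def uniform_prior(rng):
    return rng.uniform(0.0, 1.0)


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(1)
    sims = {"narrow": sim_narrow, "wide": sim_wide}
    pri = {"narrow": uniform_prior, "wide": uniform_prior}
    probs, accepted = abc_model_choice([0.5], sims, pri, euclidean, 0.2, 5000, rng)
    print(probs, len(accepted))
```

The fraction of accepted draws that came from each model serves as an approximate posterior model probability, and the accepted parameter values approximate the posterior for that model's parameters. The tolerance `eps` trades accuracy against acceptance rate.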

Authors:
Giardina, Federica [1]; Romero-Severson, Ethan Obie [2]; Albert, Jan [3]; Britton, Tom [4]; Leitner, Thomas [2]
  1. Stockholm Univ. (Sweden). Dept. of Mathematics; Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  3. Karolinska Inst., Stockholm (Sweden). Dept. of Microbiology, Tumor and Cell Biology; Karolinska Univ. Hospital, Stockholm (Sweden)
  4. Stockholm Univ. (Sweden). Dept. of Mathematics
Publication Date:
2017-01-13
Research Org.:
Stockholm Univ. (Sweden); Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE; National Inst. of Health (NIH) (United States); Swedish Research Council (SRC)
Contributing Org.:
Karolinska Inst., Stockholm (Sweden); Karolinska Univ. Hospital, Stockholm (Sweden)
OSTI Identifier:
1360708
Report Number(s):
LA-UR-16-27500
Journal ID: ISSN 1553-7358
Grant/Contract Number:
AC52-06NA25396; R01AI087520; 340-2013-5003
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
PLoS Computational Biology (Online)
Additional Journal Information:
Journal Name: PLoS Computational Biology (Online); Journal Volume: 13; Journal Issue: 1; Journal ID: ISSN 1553-7358
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; HIV-1; Cherries; Network analysis; Viral evolution; Infectious disease epidemiology; Phylogenetics; Phylogenetic analysis; Epidemiological statistics

Citation Formats

Giardina, Federica, Romero-Severson, Ethan Obie, Albert, Jan, Britton, Tom, and Leitner, Thomas. Inference of Transmission Network Structure from HIV Phylogenetic Trees. United States: N. p., 2017. Web. doi:10.1371/journal.pcbi.1005316.
Giardina, Federica, Romero-Severson, Ethan Obie, Albert, Jan, Britton, Tom, & Leitner, Thomas. Inference of Transmission Network Structure from HIV Phylogenetic Trees. United States. doi:10.1371/journal.pcbi.1005316.
Giardina, Federica, Romero-Severson, Ethan Obie, Albert, Jan, Britton, Tom, and Leitner, Thomas. 2017. "Inference of Transmission Network Structure from HIV Phylogenetic Trees". United States. doi:10.1371/journal.pcbi.1005316. https://www.osti.gov/servlets/purl/1360708.
@article{osti_1360708,
title = {Inference of Transmission Network Structure from HIV Phylogenetic Trees},
author = {Giardina, Federica and Romero-Severson, Ethan Obie and Albert, Jan and Britton, Tom and Leitner, Thomas},
abstractNote = {Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions, we added a within-host evolutionary model in epidemiological simulations and examined if the resulting phylogeny could recover different types of contact networks. To further improve realism, we also introduced patient-specific differences in infectivity across disease stages, and on the epidemic level we considered incomplete sampling and the age of the epidemic. Second, we implemented an inference method based on approximate Bayesian computation (ABC) to discriminate among three well-studied network models and jointly estimate both network parameters and key epidemiological quantities such as the infection rate. Our ABC framework used both topological and distance-based tree statistics for comparison between simulated and observed trees. Overall, our simulations showed that a virus time-scaled phylogeny (genealogy) may be substantially different from the between-host transmission tree. This has important implications for the interpretation of what a phylogeny reveals about the underlying epidemic contact network. In particular, we found that while the within-host evolutionary process obscures the transmission tree, the diversification process and infectivity dynamics also add discriminatory power to differentiate between different types of contact networks. We also found that the possibility to differentiate contact networks depends on how far an epidemic has progressed, where distance-based tree statistics have more power early in an epidemic. Finally, we applied our ABC inference on two different outbreaks from the Swedish HIV-1 epidemic.},
doi = {10.1371/journal.pcbi.1005316},
journal = {PLoS Computational Biology (Online)},
number = 1,
volume = 13,
place = {United States},
year = {2017},
month = jan
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

  • Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze this data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements.
  • Glycoside hydrolase family 7 (GH7) cellobiohydrolases (CBHs) are enzymes often employed in plant cell wall degradation across eukaryotic kingdoms of life, as they provide significant hydrolytic potential in cellulose turnover. To date, many fungal GH7 CBHs have been examined, yet many questions regarding structure-activity relationships in these important natural and commercial enzymes remain. Here, we present the crystal structures and a biochemical analysis of two GH7 CBHs from social amoeba: Dictyostelium discoideum Cel7A (DdiCel7A) and Dictyostelium purpureum Cel7A (DpuCel7A). DdiCel7A and DpuCel7A natively consist of a catalytic domain and do not exhibit a carbohydrate-binding module (CBM). The structures of DdiCel7A and DpuCel7A, resolved to 2.1 Å and 2.7 Å, respectively, are homologous to those of other GH7 CBHs with an enclosed active-site tunnel. Two primary differences between the Dictyostelium CBHs and the archetypal model GH7 CBH, Trichoderma reesei Cel7A (TreCel7A), occur near the hydrolytic active site and the product-binding sites. To compare the activities of these enzymes with the activity of TreCel7A, the family 1 TreCel7A CBM and linker were added to the C terminus of each of the Dictyostelium enzymes, creating DdiCel7A CBM and DpuCel7A CBM, which were recombinantly expressed in T. reesei. DdiCel7A CBM and DpuCel7A CBM hydrolyzed Avicel, pretreated corn stover, and phosphoric acid-swollen cellulose as efficiently as TreCel7A when hydrolysis was compared at their temperature optima. The K_i of cellobiose was significantly higher for DdiCel7A CBM and DpuCel7A CBM than for TreCel7A: 205, 130, and 29 μM, respectively. Taken together, the present study highlights the remarkable degree of conservation of the activity of these key natural and industrial enzymes across quite distant phylogenetic trees of life.
  • A probabilistic model of evolution in a character is presented. It involves two character states, 0 and 1. The population may have a third state, 01, in which there is polymorphism for both character states. There are three evolutionary events in the model: origination of state 1, reversion from state 1, and loss of polymorphism, plus an event corresponding to total misinterpretation of the character by the taxonomist. The maximum likelihood method of estimating the phylogeny is described. When the probabilities of the four events are taken to be extreme, then depending on their relative sizes under different circumstances four different phylogenetic inference methods emerge as maximum likelihood methods. Three are known: the Camin-Sokal parsimony method, Farris's Dollo parsimony method, and the Estabrook-Johnson-McMorris compatibility method. A new method, the polymorphism parsimony method, also emerges. It explains parallelism and convergence by persistence of character-state polymorphism after a unique origin of the derived character state, and attempts to find that evolutionary tree which requires the least extent of polymorphism. Details of implementation of the polymorphism parsimony method are given. Some variants of the evolutionary model are discussed, involving unrooted character state trees. The use of the model to resolve a paradox which arises when we attempt to apply the Dollo parsimony method to multiple-state characters is briefly considered.
  • The crystal structure of the unbound form of HIV-1 subtype A protease (PR) has been determined to 1.7 Å resolution and refined as a homodimer in the hexagonal space group P6_1 to an R_cryst of 20.5%. The structure is similar in overall shape and fold to the previously determined subtype B, C and F PRs. The major differences lie in the conformation of the flap region. The flaps in the crystal structures of the unbound subtype B and C PRs, which were crystallized in tetragonal space groups, are either semi-open or wide open. In the present structure of subtype A PR the flaps are found in the closed position, a conformation that would be more anticipated in the structure of HIV protease complexed with an inhibitor. The amino-acid differences between the subtypes and their respective crystal space groups are discussed in terms of the differences in the flap conformations.
  • Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.