Multiple Hypothesis Testing to Detect Lineages under Positive Selection that Affects Only a Few Sites

Summary:
Maria Anisimova and Ziheng Yang
Department of Biology and Centre for Mathematics and Physics in the Life Sciences and Experimental Biology, University College
London, London, United Kingdom
Detection of positive Darwinian selection has become ever more important with the rapid growth of genomic data sets.
Recent branch-site models of codon substitution account for variation of selective pressure over branches on the tree and
across sites in the sequence and provide a means to detect short episodes of molecular adaptation affecting just a few
sites. In likelihood ratio tests based on such models, the branches to be tested for positive selection have to be specified
a priori. In the absence of a biological hypothesis to designate so-called foreground branches, one may test many
branches, but a correction for multiple testing becomes necessary. In this paper, we employ computer simulation to
evaluate the performance of 6 multiple test correction procedures when the branch-site models are used to test every
branch on the phylogeny for positive selection. Four of the methods control the familywise error rates (FWERs), whereas
the other 2 control the false discovery rate (FDR). We found that all correction procedures achieved acceptable FWER
except for extremely divergent sequences and serious model violations, when the test may become unreliable. The power
of the test to detect positive selection is influenced by the strength of selection and the sequence divergence, with the
highest power observed at intermediate divergences. The 4 correction procedures that control the FWER had similar
power. We recommend Rom's procedure for its slightly higher power, but the simple Bonferroni correction is useable as
well. The 2 correction procedures that control the FDR had slightly more power and also higher FWER. We demonstrate
the multiple test procedures by analyzing gene sequences from the extracellular domain of the cluster of differentiation 2
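
To illustrate the difference between controlling the FWER and controlling the FDR when every branch is tested, here is a minimal sketch rather than the authors' implementation. It applies the Bonferroni correction (named in the summary) and the Benjamini-Hochberg step-up procedure, which is used here only as a familiar example of an FDR-controlling method; the summary does not say which two FDR procedures were evaluated. The function names and the p-values in `branch_pvalues` are hypothetical and are not taken from the paper.

```python
def bonferroni(pvalues, alpha=0.05):
    """FWER control: reject a test only if p <= alpha / m, with m tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """FDR control (Benjamini-Hochberg step-up procedure).

    Sort the p-values, find the largest rank k such that
    p_(k) <= k * alpha / m, and reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


# Hypothetical p-values, one per branch tested as the foreground branch.
branch_pvalues = [0.001, 0.012, 0.03, 0.04, 0.2, 0.5, 0.7, 0.9]

fwer_hits = bonferroni(branch_pvalues)
fdr_hits = benjamini_hochberg(branch_pvalues)
print("Bonferroni rejections:", sum(fwer_hits))
print("Benjamini-Hochberg rejections:", sum(fdr_hits))
```

On these made-up values the FDR procedure rejects more hypotheses than Bonferroni, which matches the general pattern reported in the summary: slightly more power, at the cost of a higher FWER.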


Source: Anisimova, Maria - Institute of Scientific Computing, Eidgenössische Technische Hochschule Zürich (ETHZ)
Yang, Ziheng - Department of Genetics, Evolution and Environment, University College London


Collections: Biology and Medicine; Environmental Sciences and Ecology