Performance, Accuracy, and Web Server for Evolutionary Placement of Short Sequence Reads under Maximum Likelihood

May 2011
Systematic Biology;May2011, Vol. 60 Issue 3, p291
Academic Journal
We present an evolutionary placement algorithm (EPA) and a Web server for the rapid assignment of sequence fragments (short reads) to edges of a given phylogenetic tree under the maximum-likelihood model. The accuracy of the algorithm is evaluated on several real-world data sets and compared with placement by pair-wise sequence comparison, using edit distances and BLAST. We introduce a slow and accurate as well as a fast and less accurate placement algorithm. For the slow algorithm, we develop additional heuristic techniques that yield almost the same run times as the fast version with only a small loss of accuracy. When those additional heuristics are employed, the run time of the more accurate algorithm is comparable with that of a simple BLAST search for data sets with a high number of short query sequences. Moreover, the accuracy of the EPA is significantly higher, in particular when the sample of taxa in the reference topology is sparse or inadequate. Our algorithm, which has been integrated into RAxML, therefore provides an equally fast but more accurate alternative to BLAST for tree-based inference of the evolutionary origin and composition of short sequence reads. We are also actively developing a Web server that offers a freely available service for computing read placements on trees using the EPA.
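The pairwise edit-distance baseline that the abstract compares against can be illustrated with a minimal sketch. This is not the EPA itself, and it is not taken from the article: the function names, the toy sequences, and the choice of a "fitting" (semi-global) edit distance, in which a short read may match anywhere inside a longer reference at no cost for the unmatched flanks, are all assumptions made for illustration. The sketch assigns a read to the closest reference taxon (a leaf), whereas the EPA assigns reads to edges of the tree under maximum likelihood.

```python
from typing import Dict


def fitting_edit_distance(read: str, ref: str) -> int:
    """Smallest edit distance between `read` and any substring of `ref`.

    Hypothetical helper: leading and trailing parts of `ref` that the
    read does not cover are free, so a short read is not penalized for
    being shorter than a full-length reference sequence.
    """
    # Row 0 is all zeros: the read may start at any reference position.
    prev = [0] * (len(ref) + 1)
    for i, read_char in enumerate(read, start=1):
        # Column 0: the first i read characters aligned against nothing.
        cur = [i] + [0] * len(ref)
        for j, ref_char in enumerate(ref, start=1):
            cur[j] = min(
                prev[j] + 1,                            # extra base in the read
                cur[j - 1] + 1,                         # extra base in the reference
                prev[j - 1] + (read_char != ref_char),  # match or substitution
            )
        prev = cur
    # The read may end at any reference position.
    return min(prev)


def place_by_edit_distance(read: str, references: Dict[str, str]) -> str:
    """Return the name of the reference closest to `read`.

    Ties are broken alphabetically by name (an arbitrary choice).
    """
    return min(
        references,
        key=lambda name: (fitting_edit_distance(read, references[name]), name),
    )


# Toy data, invented for this example.
references = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "TTTTGGGGCC",
    "taxonC": "ACGTTTGGAC",
}
best = place_by_edit_distance("GGGGC", references)
```

In this toy data set the read occurs verbatim inside the `taxonB` sequence, so its distance to that reference is zero. A nearest-sequence rule of this kind uses no tree information, which is consistent with the abstract's observation that the tree-based placement is more accurate when the reference taxon sample is sparse.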


Related Articles

  • Increasing the Efficiency of Searches for the Maximum Likelihood Tree in a Phylogenetic Analysis of up to 150 Nucleotide Sequences. Morrison, David A. // Systematic Biology;Dec2007, Vol. 56 Issue 6, p988 

    Even when the maximum likelihood (ML) tree is a better estimate of the true phylogenetic tree than those produced by other methods, the result of a poor ML search may be no better than that of a more thorough search under some faster criterion. The ability to find the globally optimal ML tree is...

  • Parsimony and Model-Based Analyses of Indels in Avian Nuclear Genes Reveal Congruent and Incongruent Phylogenetic Signals. Yuri, Tamaki; Kimball, Rebecca T.; Harshman, John; Bowie, Rauri C. K.; Braun, Michael J.; Chojnowski, Jena L.; Kin-Lan Han; Hackett, Shannon J.; Huddleston, Christopher J.; Moore, William S.; Reddy, Sushma; Sheldon, Frederick H.; Steadman, David W.; Witt, Christopher C.; Braun, Edward L. // Biology (2079-7737);Mar2013, Vol. 2 Issue 1, p419 

    Insertion/deletion (indel) mutations, which are represented by gaps in multiple sequence alignments, have been used to examine phylogenetic hypotheses for some time. However, most analyses combine gap data with the nucleotide sequences in which they are embedded, probably because most...

  • RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation. Liu, Kevin; Linder, C. Randal; Warnow, Tandy // PLoS ONE;2011, Vol. 6 Issue 11, p1 

    Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular...

  • The maximum likelihood degree of Fermat hypersurfaces. Agostini, Daniele; Alberelli, Davide; Grande, Francesco; Lella, Paolo // Journal of Algebraic Statistics;2015, Vol. 6 Issue 2, p108 

    We study the critical points of the likelihood function over the Fermat hypersurface. This problem is related to one of the main problems in statistical optimization: maximum likelihood estimation. The number of critical points over a projective variety is a topological invariant of the variety...

  • Long Branch Effects Distort Maximum Likelihood Phylogenies in Simulations Despite Selection of the Correct Model. Kück, Patrick; Mayer, Christoph; Wägele, Johann-Wolfgang; Misof, Bernhard // PLoS ONE;May2012, Vol. 7 Issue 5, p1 

    The aim of our study was to test the robustness and efficiency of maximum likelihood with respect to different long branch effects on multiple-taxon trees. We simulated data of different alignment lengths under two different 11-taxon trees and a broad range of different branch length conditions....

  • Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection. Brown, Gavin; Pocock, Adam; Ming-Jie Zhao; Luján, Mikel // Journal of Machine Learning Research;Jan2012, Vol. 13 Issue 1, p27 

    We present a unifying framework for information theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical interpretation. This is in response to the question: "what are the implicit statistical assumptions of feature selection...

  • phangorn: phylogenetic analysis in R. Schliep, Klaus Peter // Bioinformatics;Feb2011, Vol. 27 Issue 4, p592 

    Summary: phangorn is a package for phylogenetic reconstruction and analysis in the R language. Previously it was only possible to estimate phylogenetic trees with distance methods in R. phangorn now offers the possibility of reconstructing phylogenies with distance-based methods, maximum...

  • Extending the BEAGLE library to a multi-FPGA platform. Jin, Zheming; Bakos, Jason D. // BMC Bioinformatics;2013, Vol. 14 Issue 1, p1 

    Background: Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST,...

  • Image de-nosing based on Contourlet transform and improved NeighShink. ZHANG Lei; KANG Bao-sheng; LI Hong-an // Application Research of Computers / Jisuanji Yingyong Yanjiu;Apr2014, Vol. 31 Issue 4, p1267 

    In order to eliminate the noise in the image effectively and to protect the image detail better, this paper proposed a new method for image denoising based on Contourlet transform and improved NeighShink. It used Stein's unbiased risk estimate in the directional subband of each scale for...

