Open Access Research

Towards a practical O(n log n) phylogeny algorithm

Jakub Truszkowski*, Yanqi Hao and Daniel G Brown

Author Affiliations

David R. Cheriton School of Computer Science, University of Waterloo, Waterloo ON N2L 3G1 Canada

Algorithms for Molecular Biology 2012, 7:32  doi:10.1186/1748-7188-7-32

Published: 26 November 2012

Abstract

Recently, we have identified a randomized quartet phylogeny algorithm that has O(n log n) runtime with high probability, which is asymptotically optimal. Our algorithm has a high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O(log log n) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error-prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of Neighbour Joining, which requires Θ(n^3) runtime in existing implementations. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost. An early prototype implementation of our software is available at http://www.cs.uwaterloo.ca/jmtruszk/qtree.tar.gz.

Keywords:
Phylogeny; Random walk; Quartet