Chi-Tim Ng, Chun Li and Xiaodan Fan Pages 329 - 348 ( 20 )
Background: There is an increasing need to routinely and quickly compare multiple sequences of, for example, bird flu virus genomes to infer their evolutionary relationship. This entails a fast simultaneous inference of both sequence alignment and phylogeny. Current methods cannot meet the speed requirement though a high phylogeny accuracy is maintained in such scenarios.
Objective: We propose a Fast Algorithm for constructing Multiple sequence Alignment and Phylogeny (FAMAP) from closely related DNA sequences.
Method: FAMAP is essentially a sequentially-inputting algorithm and can be implemented in a progressive fashion, i.e., adding a new sequence into an existing tree or multiple sequence alignment. Its time complexity is O[NP(L)] + O(NG) and its space complexity is O(N) + O(G) + O[Q(L)] , where N is the number of sequences, N is the number of mutations on the phylogeny, L is the maximum length of the sequences, and P(L) and Q(L) are the time and space complexity of aligning a pair of sequences of length L, depending on the pairwise alignment algorithm employed.
Results: Intensive simulation studies shows that our method is superior in terms of speed over other popular methods and has comparable accuracy of both multiple sequence alignment and the phylogeny.
Conclusion: Our new algorithm might be one of the best choices when the user wants to quickly obtain a reliable phylogeny estimation from dozens of closely related long sequences
Bioinformatics software, fast algorithm, multiple sequence alignment, phylogenetic tree, simultaneous reconstruction, tree topology
Department of Statistics, Faculty of Natural Science, Chonnam National University, Gwangju, Department of Statistics, Faculty of Science, The Chinese University of Hong Kong, Hong Kong SAR, Department of Statistics, Faculty of Science, The Chinese University of Hong Kong, Hong Kong SAR