This article is part of the series Selected papers from WABI 09, edited by Tandy Warnow and Steven Salzberg.

Open Access Research

Back-translation for discovering distant protein homologies in the presence of frameshift mutations

Marta Gîrdea (1,2), Laurent Noé (1,2) and Gregory Kucherov (1,2,3)

(1) Laboratoire d'Informatique Fondamentale de Lille (Centre National de la Recherche Scientifique, Université Lille 1), Lille, France

(2) Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France

(3) French-Russian J-V Poncelet Laboratory, Moscow, Russia

Algorithms for Molecular Biology 2010, 5:6. doi:10.1186/1748-7188-5-6

Published: 4 January 2010

Abstract

Background

Frameshift mutations in protein-coding DNA sequences drastically change the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when the divergence also involves a large number of substitutions, homology detection becomes difficult even at the DNA level.
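
To make the effect of a frameshift concrete, the following minimal Python sketch (not part of the original article) translates a short coding sequence before and after a single-nucleotide deletion. The example sequence, the helper name `translate`, and the choice of the standard genetic code are illustrative assumptions.

```python
# Standard genetic code, built from the usual TCAG ordering of the codon table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 0, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical coding sequence chosen only for illustration.
original = "ATGGCTAAAGGTCTGTTC"
# Delete the nucleotide at index 4: every downstream codon is now read shifted.
shifted = original[:4] + original[5:]

print(translate(original))  # MAKGLF
print(translate(shifted))   # MVKVC
```

The two DNA strings differ by one nucleotide, yet the translated proteins agree only on the first residue, which is why a protein-level alignment alone would miss the relationship.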

Results

We developed a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We designed a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein. Its goal is to determine the two putative DNA sequences that have the best-scoring alignment under a scoring system designed to reflect the most probable evolutionary process. Our implementation is freely available at http://bioinfo.lifl.fr/path/.
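
As a rough illustration of what "the complete set of putative DNA sequences" of a protein means, the sketch below back-translates a protein by listing, for each residue, every codon that encodes it under the standard genetic code. It is only a simplified stand-in: the per-residue codon lists, the function names, and the example peptide are assumptions for illustration, and neither the article's graph representation nor its scoring system is reproduced here.

```python
from itertools import product
from math import prod

# Standard genetic code, built from the usual TCAG ordering of the codon table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Invert the table: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def codon_choices(protein):
    """Compact back-translation: one list of candidate codons per residue."""
    return [CODONS_FOR[aa] for aa in protein]


def count_putative_dna(protein):
    """Number of distinct DNA sequences that translate to this protein."""
    return prod(len(choices) for choices in codon_choices(protein))


def enumerate_putative_dna(protein):
    """Yield every putative DNA sequence (feasible only for short proteins)."""
    for combo in product(*codon_choices(protein)):
        yield "".join(combo)


peptide = "MAKGLF"  # hypothetical example peptide
print(count_putative_dna(peptide))                     # 384
print(sum(len(c) for c in codon_choices(peptide)))     # 19
```

For this six-residue peptide there are 384 putative DNA sequences but only 19 codons to store, and the gap grows exponentially with protein length. That is the motivation for aligning over a compact representation rather than over an explicit enumeration.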

Conclusions

Our approach uncovers evolutionary information that traditional alignment methods do not capture, as confirmed by biologically significant examples.