 ResearchMapping sequences by partsGilles Didier1 and Carito Guziolowski2  1Institut de Mathématiques de Luminy, 163 avenue de Luminy, Case 907, 13288 Marseille Cedex 9, France. 2Projet Symbiose, IRISA – campus de Beaulieu, 35042 Rennes Cedex, France. author email corresponding author email
Algorithms for Molecular Biology 2007,
2:11doi:10.1186/1748-7188-2-11
|
|
| Published: |
19 September 2007 |
Abstract
Background:
We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores.
Results:
We introduce an algorithm computing an optimal N-map with time complexity O (|s| × |t| × N) using O (|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events.
Practical Application:
The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events. |