<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>1748-7188-5-12</ui><ji>1748-7188</ji><fm>
<dochead>Research</dochead>
<bibl>
<title>
<p>FlexSnap: Flexible Non-sequential Protein Structure Alignment</p>
</title>
<aug>
<au ca="yes" id="A1"><snm>Salem</snm><fnm>Saeed</fnm><insr iid="I1"/><email>saeed.salem@ndsu.edu</email></au>
<au id="A2"><snm>Zaki</snm><mi>J</mi><fnm>Mohammed</fnm><insr iid="I2"/><email>zaki@cs.rpi.edu</email></au>
<au id="A3"><snm>Bystroff</snm><fnm>Chris</fnm><insr iid="I3"/><email>bystrc@rpi.edu</email></au>
</aug>
<insg>
<ins id="I1"><p>Department of Computer Science, North Dakota State University, Fargo, ND 58108, USA</p></ins>
<ins id="I2"><p>Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St, Troy, NY 12180, USA</p></ins>
<ins id="I3"><p>Department of Biology, Rensselaer Polytechnic Institute, 110 8th St, Troy, NY 12180, USA</p></ins>
</insg>
<source>Algorithms for Molecular Biology</source>
<issn>1748-7188</issn>
<pubdate>2010</pubdate>
<volume>5</volume>
<issue>1</issue>
<fpage>12</fpage>
<url>http://www.almob.org/content/5/1/12</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/1748-7188-5-12</pubid><pubid idtype="pmpid">20047669</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>19</day><month>8</month><year>2009</year></date></rec><acc><date><day>4</day><month>1</month><year>2010</year></date></acc><pub><date><day>4</day><month>1</month><year>2010</year></date></pub></history>
<cpyrt><year>2010</year><collab>Salem et al; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>Proteins have evolved subject to energetic selection pressure for stability and flexibility. Structural similarity between proteins that have gone through conformational changes can be captured effectively if flexibility is considered. Topologically unrelated proteins that preserve secondary structure packing interactions can be detected if both flexibility and sequential permutations are considered. We propose the FlexSnap algorithm for flexible non-topological protein structural alignment.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>The effectiveness of FlexSnap is demonstrated by measuring the agreement of its alignments with manually curated non-sequential structural alignments. FlexSnap showed competitive results against state-of-the-art algorithms such as DALI, SARF2, MultiProt, FlexProt, and FATCAT. Moreover, on the DynDom dataset, FlexSnap reported longer alignments with smaller <it>rmsd</it>.</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>We have introduced FlexSnap, a greedy chaining algorithm that reports both sequential and non-sequential alignments and allows twists (hinges). We assessed the quality of the FlexSnap alignments by measuring their agreement with manually curated non-sequential alignments. On the FlexProt dataset, FlexSnap was competitive with state-of-the-art flexible alignment methods. Moreover, we demonstrated the benefits of introducing hinges by showing significant improvements in the alignments reported by FlexSnap for the structure pairs for which rigid alignment methods reported alignments with either low coverage or large <it>rmsd</it>.</p>
</sec>
<sec>
<st>
<p>Availability</p>
</st>
<p>An implementation of the FlexSnap algorithm will be made available online at <url>http://www.cs.rpi.edu/~zaki/software/flexsnap</url>.</p>
</sec>
</sec>
</abs>
</fm><meta>
<classifications>
<classification id="wabi" subtype="theme_series_title" type="BMC">Selected papers from WABI 09</classification>
<classification id="wabi" subtype="theme_series_editor" type="BMC">Tandy Warnow and Steven Salzberg</classification>
</classifications>
</meta><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>The wide spectrum of functions performed by proteins is enabled by their intrinsic flexibility <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. It is known that proteins go through conformational changes to perform their functions. Homologous proteins have evolved to adopt conformational changes in their structure. Therefore, the similarity between two structurally similar proteins, one of which has undergone a conformational change, will not be captured unless flexibility is considered.</p>
<p>The problem of flexible protein structural alignment has not received much attention. Even though there is a plethora of methods for protein structure comparison <abbrgrp>
<abbr bid="B2">2</abbr>
<abbr bid="B3">3</abbr>
<abbr bid="B4">4</abbr>
<abbr bid="B5">5</abbr>
<abbr bid="B6">6</abbr>
<abbr bid="B7">7</abbr>
<abbr bid="B8">8</abbr>
</abbrgrp>, the majority of the existing methods report only sequential alignments and thus cannot capture non-sequential alignments. Non-sequential similarity can occur naturally due to circular permutations <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp> or convergent evolution <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp>. The case is even harder for flexible alignment since only two methods, FlexProt <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp> and FATCAT <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>, report flexible alignments. Nevertheless, both methods are inherently limited to <it>sequential </it>flexible structural alignment because they employ sequential chaining techniques. The complexity of protein structural alignment depends on how the similarity is assessed. Kolodny and Linial <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp> showed that the problem is NP-hard if the similarity score is distance matrix based. Therefore, over the years, a number of heuristic approaches have been proposed, which can be classified into two main categories: dynamic programming and clustering.</p>
<p>Dynamic Programming (DP) is a general paradigm to solve problems that exhibit the optimal substructure property <abbrgrp>
<abbr bid="B14">14</abbr>
</abbrgrp>. DP-based methods, Structal <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp> and SSAP <abbrgrp>
<abbr bid="B16">16</abbr>
</abbrgrp>, construct a scoring matrix <it>S</it>, where each entry, <it>S</it>
<sub>
<it>ij</it>
</sub>, corresponds to the score of matching the <it>i</it>-th residue in protein <it>A </it>and the <it>j</it>-th residue in protein <it>B</it>. Given a scoring scheme between residues in the two proteins, dynamic programming finds the global alignment that maximizes the score. DP-based methods suffer from two main limitations: first, the alignment is sequential and thus non-topological similarity cannot be detected, and second, it is difficult to design a scoring function that is globally optimal <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp>. In fact, structure alignment does not have the optimal substructure property; therefore, DP-based methods can find only a suboptimal solution <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. The other category of alignment methods, the clustering-based methods, DALI <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>, SARF2 <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>, CE <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>, SCALI <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>, and FATCAT <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>, seek to assemble the alignment out of smaller compatible (similar) element pairs such that the score of the alignment is as high as possible <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp>. Two compatible element pairs are consistent (can be assembled together) if the substructures obtained by elements of the pairs are similar. The clustering problem is NP-hard <abbrgrp>
<abbr bid="B19">19</abbr>
</abbrgrp>, thus several heuristics have been proposed. The approaches differ in how the set of compatible element pairs is constructed and how the consistency is measured. Both SARF2 and SCALI produce non-sequential alignments.</p>
<p>The two main flexible alignment methods, FlexProt <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp> and FATCAT <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>, work by clustering (chaining) aligned fragment pairs (AFPs) and allowing flexibility while chaining, by introducing hinges (twists). FlexProt searches for the longest set of AFPs that allow different numbers of hinges. It then reports different alignments with different numbers of hinges. The FATCAT method works by chaining AFPs using dynamic programming. The score of an alignment ending with a given AFP is computed as the maximum score of connecting the AFP with any of the alignments that end before the AFP. A penalty is applied to the score to compensate for gaps, root mean squared deviation (<it>rmsd</it>), and hinges. A third method, which can handle flexible alignments, is the HingeProt <abbrgrp>
<abbr bid="B20">20</abbr>
</abbrgrp> method. HingeProt first partitions one of the two proteins into rigid parts using a Gaussian-Network-Model-based (GNM) approach and then aligns each rigid region with the other protein using the MultiProt <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp> method. HingeProt uses the MultiProt algorithm in the sequential mode and thus does not report flexible non-sequential alignments. Moreover, the accuracy of the HingeProt approach depends on the accuracy of identifying the rigid domains, which is a hard problem, as the best known method, HingeMaster <abbrgrp>
<abbr bid="B21">21</abbr>
</abbrgrp>, has a sensitivity of only 50%.</p>
<p>To address the limitations of existing algorithms, we propose FlexSnap, a greedy algorithm for flexible sequential and non-sequential protein structural alignment (the name of the algorithm is a non-sequential permutation of the bold letters in <b>Flex</b>ible <b>n</b>on-<b>S</b>equential <b>p</b>rotein <b>a</b>lignment). The algorithm assembles the alignment from the set of AFPs and allows non-sequential alignments and hinges. We demonstrate the effectiveness of FlexSnap by evaluating its alignments' agreement with manually curated non-sequential alignments. Moreover, FlexSnap shows competitive results on the FlexProt dataset when compared to the main flexible alignment methods, FlexProt and FATCAT.</p>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<p>The main idea of the FlexSnap approach is to assemble the alignment from short well-aligned fragment pairs, which are called AFPs. We assemble the alignment by adding AFPs one at a time, introducing hinges when necessary. Figure <figr fid="F1">1</figr> shows how the alignment is constructed from smaller aligned fragment pairs. When chaining a fragment pair to the alignment, we choose the fragment that has the highest score when joined with the last rigid region in the alignment. The score rewards longer alignments with small <it>rmsd </it>and penalizes large <it>rmsd</it>, gaps, and the introduction of hinges. In the next subsections, we provide a detailed discussion of the FlexSnap algorithm.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Flexible Structural Alignment</p></caption><text>
   <p><b>Flexible Structural Alignment</b>. The Figure shows proteins <it>A </it>and <it>B </it>which have 3 similar structure fragments. A rigid alignment (top right) is not able to align the blue fragment, but a flexible alignment (bottom right) can do this easily by introducing a hinge between the rigid block (the black and green fragments) and the blue fragment. As we assemble the alignment from well-aligned pairs, we introduce hinges to get a longer alignment and smaller <it>rmsd</it>.</p>
</text><graphic file="1748-7188-5-12-1"/></fig>
<sec>
<st>
<p>AFPs Extraction</p>
</st>
<p>Let <it>A </it>= {<it>A</it>
<sub>1</sub>, <it>A</it>
<sub>2</sub>,..., <it>A</it>
<sub>
<it>n</it>
</sub>} and <it>B </it>= {<it>B</it>
<sub>1</sub>, <it>B</it>
<sub>2</sub>,..., <it>B</it>
<sub>
<it>m</it>
</sub>} be two proteins with <it>n </it>and <it>m </it>residues respectively, where <it>A</it>
<sub>
<it>i </it>
</sub>&#8712; &#8476;<sup>3 &#215; 1 </sup>(similarly <it>B</it>
<sub>
<it>i</it>
</sub>) represents the 3D coordinates of the <it>C<sub>&#945; </sub>
</it>atom of the <it>i</it>-th residue in protein <it>A</it>. The first step in FlexSnap is to generate a list of aligned fragment pairs (AFPs):</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i1.gif"/>
</display-formula>
</p>
<p>Each AFP, (<it>i</it>, <it>j</it>, <it>l</it>), is a fragment that starts at the <it>i</it>-th residue in <it>A </it>and <it>j</it>-th residue in <it>B </it>and it has a length of <it>l </it>residues. An AFP is formally represented as a set of <it>l </it>equivalenced pairs between the two proteins, and given as:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i2.gif"/>
</display-formula>
</p>
<p>where (<it>A</it>
<sub>
<it>i</it>
</sub>, <it>B</it>
<sub>
<it>j</it>
</sub>) indicates that the <it>i</it>
<sup>
<it>th </it>
</sup>residue of protein <it>A </it>is paired with the <it>j</it>
<sup>
<it>th </it>
</sup>residue of protein <it>B</it>, and <it>l </it>is the AFP's length. Each AFP must satisfy a user-defined similarity constraint. In FlexSnap, we employ the root mean square deviation as the similarity measure, i.e., <it>rmsd</it>(<it>i</it>, <it>j</it>, <it>l</it>) &#8804; &#949;. Moreover, we require that the length of the AFP be at least <it>L</it>, i.e., 3 &#8804; <it>L </it>&#8804; <it>l</it>. Furthermore, we define <inline-formula>
<graphic file="1748-7188-5-12-i3.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1748-7188-5-12-i4.gif"/>
</inline-formula> to be the beginning and end of the AFP<sub>
<it>k </it>
</sub>along the backbone of protein <it>A</it>. For example, for a triplet AFP<sub>
<it>k </it>
</sub>= (<it>i</it>, <it>j</it>, <it>l</it>) and protein <it>A</it>, <inline-formula>
<graphic file="1748-7188-5-12-i5.gif"/>
</inline-formula> = <it>i </it>and <inline-formula>
<graphic file="1748-7188-5-12-i6.gif"/>
</inline-formula> = <it>i </it>+ <it>l </it>- 1.</p>
<p>The number of possible AFPs can be as large as <it>O</it>(<it>n</it>
<sup>3</sup>). The set of all AFPs can be obtained by iterating over all the triplets (<it>i</it>, <it>j</it>, <it>l</it>),</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i7.gif"/>
</display-formula>
</p>
<p>and for each triplet checking if the <it>rmsd</it>(<it>i, j, l</it>) &#8804; &#949;. The <it>rmsd </it>of a fragment of length <it>l </it>can be obtained in <it>O</it>(<it>l</it>) <abbrgrp>
<abbr bid="B22">22</abbr>
</abbrgrp>. A naive implementation that iterates over all the triplets (<it>i, j, l</it>) to obtain the set of all the AFPs would have an <it>O</it>(<it>n</it>
<sup>4</sup>) time complexity. However, by observing that the <it>rmsd </it>of the AFP (<it>i, j, l + 1</it>) can be computed incrementally from the <it>rmsd </it>of AFP (<it>i, j, l</it>) in constant time, the set of aligned fragment pairs (AFPs) can be obtained in <it>O</it>(<it>n</it>
<sup>3</sup>) time complexity <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>.</p>
<p>The main idea to incrementally compute the <it>rmsd </it>is to simplify the <it>rmsd </it>formula. Given two sets, <it>A </it>and <it>B</it>, of <it>N </it>points each, the root mean square deviation (<it>rmsd</it>) is calculated as <abbrgrp>
<abbr bid="B23">23</abbr>
</abbrgrp>:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i8.gif"/>
</display-formula>
</p>
<p>where <it>A' </it>and <it>B' </it>denote the points after recentering, i.e., <inline-formula>
<graphic file="1748-7188-5-12-i9.gif"/>
</inline-formula>, and the <it>d</it>
<sub>
<it>i</it>
</sub>'s are the singular values of <it>C </it>= <it>A'B'</it>
<sup>
<it>T</it>
</sup>, which is a 3 &#215; 3 covariance matrix given as:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i10.gif"/>
</display-formula>
</p>
<p>In rare cases when the determinant of <it>C </it>is negative, then <it>d</it>
<sub>3 </sub>= -1 * <it>d</it>
<sub>3</sub>. Equation (1) can be simplified as:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i11.gif"/>
</display-formula>
</p>
<p>It is clear that all the terms used in equation (3) can be updated in constant time, and thus computing the <it>rmsd </it>for <it>N </it>+ 1 points requires constant time if we have all the terms evaluated for the first <it>N </it>points. Therefore, computing the <it>rmsd </it>for AFP(<it>i, j, l</it>) for all values of <it>l </it>(for a given <it>i </it>and <it>j</it>) requires only <it>O</it>(<it>n</it>) time. Thus, the total time complexity for the AFP extraction step is <it>O</it>(<it>n</it>
<sup>3</sup>) ...</p>
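<p>The constant-time update described above can be sketched as follows. This is a minimal illustration, not the FlexSnap implementation: the names <it>RunningTerms</it> and <it>centered_terms_direct</it> are hypothetical, and the sketch assumes that the terms to be maintained are the coordinate sums, the sum of squared norms, and the sum of outer products of the paired points. From these, the squared norms of the recentered points and the 3 &#215; 3 covariance matrix are recovered without revisiting earlier residues. Extracting the singular values of the covariance matrix, which is a constant-size computation, is omitted.</p>

```python
class RunningTerms:
    """Running sums for one diagonal (fixed i and j) as the AFP length grows."""

    def __init__(self):
        self.n = 0
        self.sum_a = [0.0] * 3                        # sum of A coordinates
        self.sum_b = [0.0] * 3                        # sum of B coordinates
        self.sq = 0.0                                 # sum of |a|^2 + |b|^2
        self.outer = [[0.0] * 3 for _ in range(3)]    # sum of a * b^T

    def add_pair(self, a, b):
        """Extend the fragment by one residue pair; cost is independent of n."""
        self.n += 1
        for r in range(3):
            self.sum_a[r] += a[r]
            self.sum_b[r] += b[r]
            self.sq += a[r] * a[r] + b[r] * b[r]
            for c in range(3):
                self.outer[r][c] += a[r] * b[c]

    def centered(self):
        """Return (E0, C): squared norms and covariance of the recentered points."""
        n = self.n
        norm_a = sum(x * x for x in self.sum_a)
        norm_b = sum(x * x for x in self.sum_b)
        e0 = self.sq - (norm_a + norm_b) / n
        cov = [[self.outer[r][c] - self.sum_a[r] * self.sum_b[c] / n
                for c in range(3)] for r in range(3)]
        return e0, cov


def centered_terms_direct(a_pts, b_pts):
    """Reference computation from scratch, linear in the fragment length."""
    n = len(a_pts)
    ca = [sum(p[k] for p in a_pts) / n for k in range(3)]
    cb = [sum(p[k] for p in b_pts) / n for k in range(3)]
    e0 = 0.0
    cov = [[0.0] * 3 for _ in range(3)]
    for a, b in zip(a_pts, b_pts):
        da = [a[k] - ca[k] for k in range(3)]
        db = [b[k] - cb[k] for k in range(3)]
        e0 += sum(x * x for x in da) + sum(x * x for x in db)
        for r in range(3):
            for c in range(3):
                cov[r][c] += da[r] * db[c]
    return e0, cov
```

<p>In the standard superposition formula, the squared <it>rmsd </it>is then (<it>E</it><sub>0 </sub>- 2(<it>d</it><sub>1 </sub>+ <it>d</it><sub>2 </sub>+ <it>d</it><sub>3</sub>))/<it>N</it>, where <it>E</it><sub>0 </sub>and the covariance matrix are the two values returned by <it>centered</it>, and the <it>d</it><sub><it>i</it></sub>'s are the singular values of that matrix with the sign correction for <it>d</it><sub>3 </sub>noted earlier.</p>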
</sec>
<sec>
<st>
<p>Flexible Chaining</p>
</st>
<p>The second step in FlexSnap is to construct the alignment by selecting a subset of the AFPs. Given a set of AFPs, <it>P</it>, obtained in the AFPs extraction step, we are interested in finding a subset of AFPs, <it>R </it>&#8838; <it>P</it>, such that all the AFPs in <it>R </it>are mutually non-overlapping and the score of the selected AFPs in <it>R </it>is as large as possible. On the one hand, we want to get as large an alignment as possible, while on the other hand, we want to minimize the number of hinges and gaps. Therefore, our goal is to optimize a score that rewards long alignments with small <it>rmsd</it>, and penalizes the introduction of hinges and gaps.</p>
<p>The set of AFPs can be thought of as runs in an <it>n &#215; m </it>matrix <it>S</it>, where <it>n </it>and <it>m </it>are the sizes of proteins <it>A </it>and <it>B</it>, respectively (see Figure <figr fid="F2">2</figr>). We define a precedence relation, &#8826;, between two AFPs such that <it>P</it>
<sub>
<it>i </it>
</sub>&#8826; <it>P</it>
<sub>
<it>j </it>
</sub>if <it>P</it>
<sub>
<it>i </it>
</sub>appears either in the upper or lower left quadrant of <it>P</it>
<sub>
<it>j</it>
</sub>, i.e. <inline-formula>
<graphic file="1748-7188-5-12-i12.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1748-7188-5-12-i13.gif"/>
</inline-formula>, or <inline-formula>
<graphic file="1748-7188-5-12-i14.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1748-7188-5-12-i15.gif"/>
</inline-formula> (recall that <inline-formula>
<graphic file="1748-7188-5-12-i16.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1748-7188-5-12-i17.gif"/>
</inline-formula> denote the beginning and end, respectively, of AFP <it>P</it>
<sub>
<it>i </it>
</sub>in protein <it>A</it>). Generally speaking, we say that two AFPs, <it>P</it>
<sub>
<it>i </it>
</sub>and <it>P</it>
<sub>
<it>j</it>
</sub>, can be chained if they do not overlap, i.e., <it>P</it>
<sub>
<it>i </it>
</sub>&#8826; <it>P</it>
<sub>
<it>j </it>
</sub>or <it>P</it>
<sub>
<it>j </it>
</sub>&#8826; <it>P</it>
<sub>
<it>i </it>
</sub>. As depicted in Figure <figr fid="F2">2</figr>, <it>P</it>
<sub>7 </sub>and <it>P</it>
<sub>8 </sub>can be chained to <it>P</it>
<sub>1</sub>.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Flexible Structural Alignment by AFPs chaining</p></caption><text>
   <p><b>Flexible Structural Alignment by AFPs chaining</b>. When extending the alignment <it>R </it>= {<it>P</it><sub>1</sub>, <it>P</it><sub>2</sub>, <it>P</it><sub>3</sub>}, the score of extending <it>R </it>with each AFP is computed and we extend the alignment with the AFP that gives the best score. The score <it>S</it>(<it>P</it><sub>4</sub>, (<it>P</it><sub>2</sub>, <it>P</it><sub>3</sub>)) indicates the score of adding <it>P</it><sub>4 </sub>to the region composed of <it>P</it><sub>2 </sub>and <it>P</it><sub>3</sub>.</p>
</text><graphic file="1748-7188-5-12-2"/></fig>
<p>For sequential chaining, we define a sequential precedence relation, &#8826;<sub>
<it>s</it>
</sub>, such that <it>P</it>
<sub>
<it>i </it>
</sub>precedes <it>P</it>
<sub>
<it>j </it>
</sub>(written as <it>P</it>
<sub>
<it>i </it>
</sub>&#8826;<sub>
<it>s </it>
</sub>
<it>P</it>
<sub>
<it>j</it>
</sub>) if <it>P</it>
<sub>
<it>i </it>
</sub>appears strictly in the upper left quadrant with respect to <it>P</it>
<sub>
<it>j</it>
</sub>, i.e. <inline-formula>
<graphic file="1748-7188-5-12-i12.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1748-7188-5-12-i13.gif"/>
</inline-formula>. Two AFPs <it>P</it>
<sub>
<it>i </it>
</sub>and <it>P</it>
<sub>
<it>j </it>
</sub>can be sequentially chained together if <it>P</it>
<sub>
<it>i </it>
</sub>&#8826;<sub>
<it>s </it>
</sub>
<it>P</it>
<sub>
<it>j </it>
</sub>or <it>P</it>
<sub>
<it>j </it>
</sub>&#8826;<sub>
<it>s </it>
</sub>
<it>P</it>
<sub>
<it>i</it>
</sub>. In Figure <figr fid="F2">2</figr>, <it>P</it>
<sub>7 </sub>and <it>P</it>
<sub>2 </sub>can be sequentially chained to <it>P</it>
<sub>1</sub>. An AFP, <it>P</it>
<sub>
<it>i</it>
</sub>, can be chained to an alignment <it>R</it>, denoted as (<it>R </it>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>), if it does not overlap with any AFP in <it>R</it>. In Figure <figr fid="F2">2</figr>, <it>P</it>
<sub>7</sub>, <it>P</it>
<sub>4</sub>, and <it>P</it>
<sub>5 </sub>can be sequentially chained to <it>R </it>which consists of AFPs {<it>P</it>
<sub>1</sub>, <it>P</it>
<sub>2</sub>, <it>P</it>
<sub>3</sub>}; and both <it>P</it>
<sub>6 </sub>and <it>P</it>
<sub>8 </sub>can be non-sequentially chained to <it>R</it>. Next, we shall introduce our solution for the general flexible chaining problem.</p>
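<p>The two precedence relations can be sketched as simple interval tests on the AFP triplets. The sketch below is illustrative only and rests on an assumed reading of the quadrants: "left" is taken to mean that one AFP ends in protein <it>A </it>before the other begins, and "upper" or "lower" that it lies entirely before or entirely after the other in protein <it>B</it>. The names <it>AFP</it>, <it>precedes</it>, <it>precedes_seq</it>, <it>can_chain </it>and <it>can_chain_to_alignment </it>are hypothetical.</p>

```python
from typing import NamedTuple


class AFP(NamedTuple):
    """An aligned fragment pair (i, j, l): start in A, start in B, length."""
    i: int
    j: int
    l: int

    @property
    def end_a(self):
        return self.i + self.l - 1

    @property
    def end_b(self):
        return self.j + self.l - 1


def precedes_seq(p, q):
    """Sequential precedence: p ends before q begins in both proteins."""
    return q.i > p.end_a and q.j > p.end_b


def precedes(p, q):
    """General precedence: p is left of q in A and disjoint from q in B."""
    left = q.i > p.end_a
    above = q.j > p.end_b     # p lies entirely before q in B
    below = p.j > q.end_b     # p lies entirely after q in B
    return left and (above or below)


def can_chain(p, q, sequential=False):
    """Two AFPs can be chained if one precedes the other."""
    rel = precedes_seq if sequential else precedes
    return rel(p, q) or rel(q, p)


def can_chain_to_alignment(alignment, p, sequential=False):
    """p can be chained to an alignment if it can be chained to every AFP in it."""
    return all(can_chain(r, p, sequential) for r in alignment)
```

<p>Under these assumptions, any pair that passes the sequential test also passes the general test, while a pair whose order is reversed in protein <it>B </it>passes only the general test.</p>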
</sec>
<sec>
<st>
<p>The FlexSnap Approach</p>
</st>
<p>The goal of chaining is to find the highest scoring subset of AFPs, i.e., <it>R </it>&#8838; <it>P</it>, such that all the AFPs in <it>R </it>are mutually consistent and non-overlapping. The problem of finding the highest scoring subset of AFPs is essentially the same as finding the maximum weighted clique in a graph <it>G </it>= (<it>V, E, w</it>) where the set of vertices <it>V </it>represents the set of AFPs, and each vertex <it>v</it>
<sub>
<it>i </it>
</sub>has a weight equal to the score of the AFP, <it>w</it>(<it>v</it>
<sub>
<it>i</it>
</sub>) = <it>S</it>(<it>P</it>
<sub>
<it>i</it>
</sub>), where the score of an AFP <it>P</it>
<sub>
<it>i</it>
</sub>, <it>S</it>(<it>P</it>
<sub>
<it>i</it>
</sub>), could be its length or some other combination of length and <it>rmsd</it>. There is an edge (<it>v</it>
<sub>
<it>i</it>
</sub>, <it>v</it>
<sub>
<it>j</it>
</sub>) &#8712; <it>E </it>if the AFPs <it>P</it>
<sub>
<it>i </it>
</sub>and <it>P</it>
<sub>
<it>j </it>
</sub>do not overlap and are consistent (can be joined with small <it>rmsd </it>or have similar rotation matrices).</p>
<p>The problem of finding the maximum weighted clique in a graph is computationally expensive; it is NP-hard <abbrgrp>
<abbr bid="B19">19</abbr>
</abbrgrp>. Thus, we propose a greedy algorithm to find an approximate solution for the chaining problem. The main idea is to start building the alignment from an initial AFP and to add AFPs to the alignment. We start the alignment by selecting the longest AFP, and then iteratively add new AFPs to the alignment as long as the newly added AFP improves the score of the alignment. Given an alignment, <it>R</it>, we add to it the AFP that contributes the most. We keep growing the alignment until no more AFPs can be added. The contribution of an AFP to the alignment is scored by how consistent the AFP is with the alignment and how good the AFP is. When adding an AFP to an alignment, we reward longer AFPs with smaller <it>rmsd</it>, and we penalize for gaps, inconsistency, and hinges. The penalty takes into consideration: 1) the number of gaps introduced; 2) the increase in <it>rmsd </it>when combining two or more AFPs; 3) the introduction of new hinges.</p>
<p>As depicted in Figure <figr fid="F2">2</figr>, the scores of extending the alignment, <it>R</it>, with <it>P</it>
<sub>4</sub>, <it>P</it>
<sub>5</sub>, <it>P</it>
<sub>6</sub>, <it>P</it>
<sub>7</sub>, or <it>P</it>
<sub>8 </sub>are computed and the AFP with the best score is added to the alignment. When measuring the score of adding an AFP to the alignment, we actually measure the score of adding the AFP to the last rigid region, and not just to the last fragment, in the alignment. In Figure <figr fid="F2">2</figr>, the score of adding <it>P</it>
<sub>4 </sub>to <it>R </it>is the score of adding <it>P</it>
<sub>4 </sub>to the region composed of <it>P</it>
<sub>2 </sub>and <it>P</it>
<sub>3</sub>, since <it>P</it>
<sub>2 </sub>and <it>P</it>
<sub>3 </sub>together form a rigid sub-alignment (there is no hinge between them). When adding <it>P</it>
<sub>7 </sub>to <it>R</it>, the score of adding <it>P</it>
<sub>7 </sub>to the region composed only of <it>P</it>
<sub>1 </sub>is computed.</p>
<p>Figure <figr fid="F3">3</figr> shows the pseudo-code for the greedy chaining algorithm used in FlexSnap. Since the chaining is a greedy algorithm, we run the algorithm <it>K </it>times starting from the <it>K </it>highest scoring non-overlapping AFPs and we report the alignment with the best score.</p>
<fig id="F3"><title><p>Figure 3</p></title><caption><p>A greedy AFP chaining algorithm</p></caption><text>
   <p><b>A greedy AFP chaining algorithm</b>. A greedy algorithm for AFP chaining. The algorithm iteratively chooses an AFP to add to <it>R </it>(lines 4-7) until no more AFPs can be added, or the best score of adding an AFP to <it>R </it>is negative.</p>
</text><graphic file="1748-7188-5-12-3"/></fig>
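<p>The greedy loop described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the pseudo-code of Figure 3: AFPs are plain (<it>i</it>, <it>j</it>, <it>l</it>) tuples, the extension score is passed in as a function, and <it>toy_score </it>is a hypothetical stand-in that uses only the AFP length and a gap penalty. The actual FlexSnap score also accounts for <it>rmsd</it>, hinges, and the last rigid region, none of which is modelled here.</p>

```python
def overlaps(p, q):
    """True if AFPs p and q share residues in protein A or in protein B."""
    pi, pj, pl = p
    qi, qj, ql = q
    in_a = not (qi > pi + pl - 1 or pi > qi + ql - 1)
    in_b = not (qj > pj + pl - 1 or pj > qj + ql - 1)
    return in_a or in_b


def greedy_chain(afps, seed, extension_score):
    """Grow an alignment from seed, adding the best-scoring AFP each round."""
    alignment = [seed]
    candidates = [p for p in afps if p != seed]
    while True:
        # Keep only AFPs that do not overlap anything already chosen.
        candidates = [p for p in candidates
                      if not any(overlaps(p, r) for r in alignment)]
        if not candidates:
            break
        best = max(candidates, key=lambda p: extension_score(alignment, p))
        # Stop when even the best extension has a negative score.
        if 0 > extension_score(alignment, best):
            break
        alignment.append(best)
    return alignment


def toy_score(alignment, p, gap_penalty=0.5):
    """Hypothetical score: AFP length minus a penalty on the gap in protein A."""
    gaps = []
    for r in alignment:
        if p[0] > r[0]:
            gaps.append(p[0] - (r[0] + r[2]))
        else:
            gaps.append(r[0] - (p[0] + p[2]))
    return p[2] - gap_penalty * min(gaps)
```

<p>To mirror the restart strategy described in the text, <it>greedy_chain </it>would be called once per seed for the <it>K </it>chosen seeds, keeping the alignment with the best total score.</p>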
<sec>
<st>
<p>Alignment Extension Score</p>
</st>
<p>Next, we will discuss how we extend a partial alignment with the next best AFP. More specifically, given an alignment <it>R</it>, the next AFP to chain to the alignment is the one that maximizes the following scoring function:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i18.gif"/>
</display-formula>
</p>
<p>where <it>R </it>&#8594; <it>P</it>
<sub>
<it>i </it>
</sub>indicates that <it>P</it>
<sub>
<it>i </it>
</sub>does not overlap with <it>R</it>, and <it>S</it>(<it>R, P</it>
<sub>
<it>i</it>
</sub>) is the score of chaining <it>P</it>
<sub>
<it>i </it>
</sub>to <it>R</it>. The score, <it>S</it>(<it>R, P</it>
<sub>
<it>i</it>
</sub>), is a combination of the weight of the AFP, <it>W</it>(<it>P</it>
<sub>
<it>i</it>
</sub>), and the penalty of extending <it>R </it>with <it>P</it>
<sub>
<it>i</it>
</sub>, <it>C</it>(<it>R </it>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>). The score is defined as follows:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i19.gif"/>
</display-formula>
</p>
<p>where <it>C</it>(<it>R </it>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>) is the penalty incurred when connecting <it>P</it>
<sub>
<it>i </it>
</sub>to <it>R</it>, and <it>W</it>(<it>P</it>
<sub>
<it>i</it>
</sub>) is the score of the AFP itself. The scoring function rewards longer AFPs with small <it>rmsd </it>and penalizes gaps and hinges. If the addition of an AFP <it>P</it>
<sub>
<it>i </it>
</sub>to the alignment results in a large <it>rmsd</it>, then we introduce a hinge only if <it>W</it>(<it>P</it>
<sub>
<it>i</it>
</sub>) is large enough to compensate for the penalty incurred. A similar approach for penalizing gaps and hinges was used in the FATCAT method <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>. However, their score and cost functions are different, and they do not consider rigid regions as we do in FlexSnap when connecting an AFP to the alignment. The penalty of connecting <it>P</it>
<sub>
<it>i </it>
</sub>to <it>R </it>is defined as follows:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i20.gif"/>
</display-formula>
</p>
<p>where <it>M</it>
<sub>
<it>g </it>
</sub>is the penalty for a gap, <it>M</it>
<sub>
<it>r </it>
</sub>is the maximum penalty for a hinge, and <inline-formula>
<graphic file="1748-7188-5-12-i21.gif"/>
</inline-formula> is the <it>rmsd </it>of connecting <it>P</it>
<sub>
<it>i </it>
</sub>to the last rigid region in <it>R</it>. If <inline-formula>
<graphic file="1748-7188-5-12-i21.gif"/>
</inline-formula> increases above a user-defined threshold, <it>D</it>
<sub>
<it>c</it>
</sub>, we introduce a hinge and the penalty is maximum; if not, the penalty is proportional to how far the <it>rmsd </it>value is from <it>&#949; </it>(maximum <it>rmsd </it>for an AFP). Moreover, we allow at most <it>H </it>hinges. The score for an AFP is a function of its length and <it>rmsd</it>. The score is the length of the AFP, <it>L</it>(<it>P</it>
<sub>
<it>i</it>
</sub>), plus a contribution of the <it>rmsd </it>of the AFP, <it>rmsd</it>(<it>P</it>
<sub>
<it>i</it>
</sub>), to the score, and is given as:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i22.gif"/>
</display-formula>
</p>
<p>The complexity of the chaining algorithm depends on the number of AFPs, <it>M</it>, that the two structures have. In the worst case, <it>M </it>could be close to <it>n</it>
<sup>3</sup>, but in practice it is much less, i.e., <it>M </it>&#8804; <it>n</it>
<sup>2</sup>. The complexity of the algorithm is <it>O</it>(<it>M </it>log <it>M </it>+ <it>k </it>* <it>M </it>* <it>n</it>), where <it>k </it>is the number of AFPs in the final solution and <it>n </it>is the size of the larger protein.</p>
</sec>
<sec>
<st>
<p>Sequential Flexible Chaining</p>
</st>
<p>The above general chaining algorithm reports both sequential and non-sequential alignments. In the results section, we demonstrate that the quality of its non-sequential alignments is competitive with state-of-the-art non-sequential alignment methods. However, for sequential flexible alignment, there are more efficient chaining algorithms, namely the approach used by the FATCAT algorithm. The FATCAT algorithm follows a dynamic programming approach for chaining the AFPs. In FATCAT, the score of an alignment ending with an AFP, <it>P</it>
<sub>
<it>i</it>
</sub>, is defined in terms of the score of <it>P</it>
<sub>
<it>j</it>
</sub>'s and the connection cost of <it>P</it>
<sub>
<it>i </it>
</sub>with these <it>P</it>
<sub>
<it>j</it>
</sub>'s such that <it>P</it>
<sub>
<it>j </it>
</sub>precedes <it>P</it>
<sub>
<it>i </it>
</sub>(<it>P</it>
<sub>
<it>j </it>
</sub>&#8826;<sub>
<it>s </it>
</sub>
<it>P</it>
<sub>
<it>i</it>
</sub>). More specifically, FATCAT defines the score of the alignment that ends with <it>P</it>
<sub>
<it>i </it>
</sub>as follows:</p>
<p>
<display-formula>
<graphic file="1748-7188-5-12-i23.gif"/>
</display-formula>
</p>
<p>where <it>C</it>(<it>P</it>
<sub>
<it>j </it>
</sub>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>) is the penalty incurred when connecting <it>P</it>
<sub>
<it>i </it>
</sub>to the alignment that ends with <it>P</it>
<sub>
<it>j </it>
</sub>and it is similar to the penalty function used in the general chaining and <it>W</it>(<it>P</it>
<sub>
<it>i</it>
</sub>) is the score of the AFP itself. We propose an approach that is similar in spirit to FATCAT; however, it differs in two important aspects. The first aspect is the optimality of the alignment reported by FATCAT. The main issue here is that the scoring function has an <it>rmsd </it>term since <it>W</it>(<it>P</it>
<sub>
<it>i</it>
</sub>) is a function of the length of <it>P</it>
<sub>
<it>i </it>
</sub>and its <it>rmsd</it>. Therefore, <it>S</it>(<it>P</it>
<sub>
<it>i</it>
</sub>) cannot be optimal because we do not know of a scoring function that involves the <it>rmsd </it>value that is additive and optimal (<it>rmsd </it>score is not a metric since it does not satisfy the triangle inequality property). Therefore, the optimality of FATCAT alignments is not guaranteed since the optimal substructure property required by dynamic programming does not hold if the score incorporates an <it>rmsd </it>term. In Figure <figr fid="F4">4</figr>, let the optimal alignment be {<it>P</it>
<sub>1</sub>, <it>P</it>
<sub>4</sub>, <it>P</it>
<sub>5</sub>, <it>P</it>
<sub>6</sub>}; the optimal substructure property requires that {<it>P</it>
<sub>1</sub>, <it>P</it>
<sub>4</sub>, <it>P</it>
<sub>5</sub>} is also optimal, and it is the best alignment that ends with <it>P</it>
<sub>5</sub>. This is not necessarily true in structural alignment, because it could happen that the alignment {<it>P</it>
<sub>1</sub>, <it>P</it>
<sub>2</sub>, <it>P</it>
<sub>3</sub>, <it>P</it>
<sub>5</sub>} is better than {<it>P</it>
<sub>1</sub>, <it>P</it>
<sub>4</sub>, <it>P</it>
<sub>5</sub>}. In general, the flexible structural alignment does not exhibit the optimal substructure that would justify the use of dynamic programming.</p>
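<p>The chaining recurrence discussed above can be illustrated with a minimal Python sketch. It assumes the common form <it>S</it>(<it>P</it><sub><it>i</it></sub>) = <it>W</it>(<it>P</it><sub><it>i</it></sub>) + max(0, max over preceding <it>P</it><sub><it>j</it></sub> of <it>S</it>(<it>P</it><sub><it>j</it></sub>) + <it>C</it>(<it>P</it><sub><it>j </it></sub>&#8594; <it>P</it><sub><it>i</it></sub>)). The tuple layout of an AFP, the length-based weight, and the gap-mismatch penalty are hypothetical stand-ins, not FATCAT's actual functions.</p>
<p><![CDATA[
```python
def precedes(p, q):
    """True if AFP p ends before AFP q starts in both proteins."""
    return p[0] + p[2] <= q[0] and p[1] + p[2] <= q[1]


def gap_penalty(p, q):
    """Hypothetical connection cost C(p -> q): minus the mismatch of the two gap lengths."""
    gap_a = q[0] - (p[0] + p[2])
    gap_b = q[1] - (p[1] + p[2])
    return -float(abs(gap_a - gap_b))


def chain_scores(afps, weight, connect):
    """Best score of a chain ending at each AFP, assuming purely additive scores.

    Each AFP is an illustrative tuple (start_a, start_b, length) with length > 0.
    """
    order = sorted(range(len(afps)), key=lambda k: afps[k][0])
    scores = [0.0] * len(afps)
    for i in order:
        best = 0.0  # option of starting a new chain at P_i
        for j in order:
            if j == i:
                break  # every possible predecessor starts earlier in protein A
            if precedes(afps[j], afps[i]):
                best = max(best, scores[j] + connect(afps[j], afps[i]))
        scores[i] = weight(afps[i]) + best
    return scores


afps = [(0, 0, 8), (10, 10, 8), (20, 25, 8)]
print(chain_scores(afps, lambda p: float(p[2]), gap_penalty))  # [8.0, 16.0, 19.0]
```
]]></p>
<p>The recurrence is exact only because both terms in this toy example are additive. Once the weight depends on the <it>rmsd </it>of the superposed chain, the best chain ending at a given AFP need not be a prefix of the best overall chain, which is the limitation described above.</p>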
<fig id="F4"><title><p>Figure 4</p></title><caption><p>A greedy sequential AFP chaining algorithm</p></caption><text>
   <p><b>A greedy sequential AFP chaining algorithm</b>. A greedy sequential algorithm for AFP chaining. When encountering the beginning of an AFP, the algorithm computes the scores of adding the AFP to the alignments in the upper left corner and the AFP is chained to the alignment with which it gives the highest score.</p>
</text><graphic file="1748-7188-5-12-4"/></fig>
<p>In FlexSnap, we follow an approach similar to the one presented in <abbrgrp>
<abbr bid="B24">24</abbr>
</abbrgrp> for chaining substrings. In the original algorithm, once we reach the end of a substring (segment), <it>P</it>
<sub>
<it>i</it>
</sub>, we delete all the solutions that end with <it>P</it>
<sub>
<it>j</it>
</sub>'s whose ends are lower and to the left of the endpoint of <it>P</it>
<sub>
<it>i </it>
</sub>and <it>S</it>(<it>P</it>
<sub>
<it>j</it>
</sub>) &lt;<it>S</it>(<it>P</it>
<sub>
<it>i</it>
</sub>). For the segments shown in Figure <figr fid="F4">4</figr>, suppose <it>S</it>(<it>P</it>
<sub>3</sub>) &gt; <it>S</it>(<it>P</it>
<sub>4</sub>). Once we encounter the end of <it>P</it>
<sub>3</sub>, we delete the solution that ends with <it>P</it>
<sub>4</sub>. When we encounter <it>P</it>
<sub>5</sub>, we know that the best solution it can be chained to ends with <it>P</it>
<sub>3</sub>. This approach works well for regular chaining problems (such as strings). However, for the structural alignment problem, it does not yield the optimal alignment since the problem does not exhibit the optimal substructure property. Therefore, in FlexSnap, once we reach the end of an AFP, <it>P</it>
<sub>
<it>i</it>
</sub>, we do not delete all solutions that end with <it>P</it>
<sub>
<it>j</it>
</sub>'s which are to the left and below <it>P</it>
<sub>
<it>i</it>
</sub>; instead we only delete the ones that have very low scores as compared to <it>S</it>(<it>P</it>
<sub>
<it>i</it>
</sub>). Though not optimal, this approach gave better results for sequential chaining than the pure greedy approach presented in the previous section.</p>
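<p>The relaxed pruning rule can be sketched as follows. This is a minimal illustration under stated assumptions: the text only says that solutions with very low scores are deleted, so the tolerance <it>keep_ratio</it>, the dictionary layout, and the assumption of positive scores are hypothetical choices made for the example.</p>
<p><![CDATA[
```python
def prune_dominated(active, finished_end, finished_score, keep_ratio=0.5):
    """Drop solutions ending below-left of the finished AFP only if they score far lower.

    active: dict mapping a solution label to (end_a, end_b, score).
    keep_ratio: hypothetical tolerance; keep_ratio = 1.0 reproduces the classical
    rule that deletes every dominated solution with a lower score.
    """
    end_a, end_b = finished_end
    kept = {}
    for label, (a, b, score) in active.items():
        dominated = a <= end_a and b <= end_b
        if dominated and score < keep_ratio * finished_score:
            continue  # far worse than the finished solution: discard it
        kept[label] = (a, b, score)
    return kept


active = {"P4": (30, 28, 40.0), "P2": (25, 20, 10.0), "P7": (60, 10, 5.0)}
print(sorted(prune_dominated(active, (35, 30), 50.0)))                  # ['P4', 'P7']
print(sorted(prune_dominated(active, (35, 30), 50.0, keep_ratio=1.0)))  # ['P7']
```
]]></p>
<p>With the relaxed tolerance, the solution labelled P4 survives even though it scores lower than the finished one, so a later AFP can still be chained to it. The price is a larger set of active solutions.</p>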
<p>The second aspect where FlexSnap is different from FATCAT is that in FATCAT <it>C</it>(<it>P</it>
<sub>
<it>j </it>
</sub>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>) is the connection cost of <it>P</it>
<sub>
<it>i </it>
</sub>and <it>P</it>
<sub>
<it>j </it>
</sub>while in FlexSnap it is the connection cost of <it>P</it>
<sub>
<it>i </it>
</sub>to the rigid region that contains <it>P</it>
<sub>
<it>j</it>
</sub>. In FATCAT, if <it>P</it>
<sub>
<it>j </it>
</sub>belongs to a rigid region and the connection cost of <it>P</it>
<sub>
<it>i </it>
</sub>with <it>P</it>
<sub>
<it>j </it>
</sub>is small, <it>P</it>
<sub>
<it>i </it>
</sub>will be added to the same rigid region as <it>P</it>
<sub>
<it>j </it>
</sub>even though <it>P</it>
<sub>
<it>i </it>
</sub>might not be consistent with other AFPs in the same region. In Figure <figr fid="F4">4</figr>, if we were connecting <it>P</it>
<sub>5 </sub>to <it>R</it>
<sub>2 </sub>that ends with <it>P</it>
<sub>4</sub>, FATCAT would compute the connection cost <it>C</it>(<it>P</it>
<sub>4 </sub>&#8594; <it>P</it>
<sub>5</sub>), but FlexSnap would compute <it>C</it>((<it>P</it>
<sub>1</sub>, <it>P</it>
<sub>4</sub>) &#8594; <it>P</it>
<sub>5</sub>) since <it>P</it>
<sub>4 </sub>belongs to the rigid region that contains <it>P</it>
<sub>1</sub>. In FATCAT, when connecting <it>P</it>
<sub>5 </sub>to <it>R</it>
<sub>2</sub>, we might conclude that there is no need to introduce a hinge and thus that <it>P</it>
<sub>5 </sub>belongs to the same rigid region as <it>P</it>
<sub>1 </sub>and <it>P</it>
<sub>4</sub>. This may lead to a large <it>rmsd </it>when we report the alignment since we did not check if <it>P</it>
<sub>5 </sub>is consistent with <it>P</it>
<sub>1</sub>. However, when FlexSnap adds <it>P</it>
<sub>5 </sub>to the same rigid region as <it>P</it>
<sub>4 </sub>and <it>P</it>
<sub>1</sub>, it will not harm the final <it>rmsd </it>when we report the alignment as FlexSnap ensures that all the segments in the same rigid region are consistent. In the results section, we investigate how computing the connection cost with the whole rigid region as opposed to the last segment in the rigid region affects the quality of the alignment. For some structure pairs, considering the whole rigid region in computing the connection cost resulted in significant improvements.</p>
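<p>A toy two-dimensional example illustrates why checking a new AFP against the whole rigid region can differ from checking it against the last AFP only. The sketch below uses the <it>rmsd </it>after optimal superposition as a stand-in for the connection cost. The straight and gently bent chains, the three-residue fragments standing in for <it>P</it><sub>1</sub>, <it>P</it><sub>4</sub>, and <it>P</it><sub>5</sub>, and the threshold used in the comment are all assumptions made for illustration.</p>
<p><![CDATA[
```python
import math


def rmsd_2d(a, b):
    """RMSD of two equally long 2D point lists after optimal translation and rotation."""
    n = len(a)
    ax = sum(p[0] for p in a) / n
    ay = sum(p[1] for p in a) / n
    bx = sum(p[0] for p in b) / n
    by = sum(p[1] for p in b) / n
    dot = cross = norm = 0.0
    for (x, y), (u, v) in zip(a, b):
        x, y, u, v = x - ax, y - ay, u - bx, v - by
        dot += x * u + y * v
        cross += x * v - y * u
        norm += x * x + y * y + u * u + v * v
    # Closed form for the best 2D rotation: residual = norm - 2 * hypot(dot, cross)
    return math.sqrt(max(0.0, norm - 2.0 * math.hypot(dot, cross)) / n)


# Chain A is a straight line; chain B is the same chain bent into a gentle arc.
radius = 10.0
chain_a = [(float(i), 0.0) for i in range(9)]
chain_b = [(radius * math.sin(i / radius), radius * (1.0 - math.cos(i / radius)))
           for i in range(9)]

# Three toy AFPs of three residues each, standing in for P1, P4, and P5.
p1, p4, p5 = slice(0, 3), slice(3, 6), slice(6, 9)


def cost(*parts):
    """Superposition rmsd of the residues covered by the given fragments."""
    a = [pt for s in parts for pt in chain_a[s]]
    b = [pt for s in parts for pt in chain_b[s]]
    return rmsd_2d(a, b)


last_only = cost(p4, p5)         # P5 checked against P4 only
whole_region = cost(p1, p4, p5)  # P5 checked against the region {P1, P4}
print(round(last_only, 3), round(whole_region, 3))
# With a hypothetical threshold of 0.2, P5 passes the first check but fails the second.
```
]]></p>
<p>In this toy setting the second value is more than twice the first: the fragment looks consistent with its immediate neighbour, yet adding it to the full region would inflate the reported <it>rmsd</it>. This mirrors the situation described above, although the actual connection cost used by FlexSnap is defined elsewhere in the paper.</p>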
</sec>
</sec>
</sec>
<sec>
<st>
<p>Results and Discussion</p>
</st>
<p>To assess the quality of FlexSnap alignment compared to other structural alignment methods, we evaluated the agreement of the methods' alignments with reference manually-curated alignments. We compared FlexSnap against sequential methods (DALI <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp> and CE <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>), non-sequential methods (SARF2 <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>, MultiProt <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>, and SCALI <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>), and flexible sequential alignment methods (FlexProt <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp> and FATCAT <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>). Finally, we analyzed the flexibility on the DynDom dataset <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp>, which is a comprehensive and non-redundant dataset of protein domain movements.</p>
<p>All the experiments were run on a 1.66 GHz Intel Core Duo machine with 1 GB of main memory running Ubuntu Linux. The chaining algorithm is efficient and its running time varies from 1 second to a minute depending on the size of the proteins. We used the corresponding web server for most of the other alignment methods. The optimal values for the different parameters were found empirically such that they give the best agreement with manually curated alignments; we used <it>L </it>= 8, <it>&#949; </it>= 2 &#197;, <it>D</it>
<sub>
<it>c </it>
</sub>= 3 &#197;, <it>&#945; </it>= 0.5, <it>M</it>
<sub>
<it>r </it>
</sub>= -10, <it>M</it>
<sub>
<it>g </it>
</sub>= -1, and <it>H </it>= 3 (see Figure <figr fid="F3">3</figr>).</p>
<sec>
<st>
<p>Non-Sequential Alignments</p>
</st>
<p>We used the reference alignments for the structure pairs which have circular permutation in the RIPC dataset <abbrgrp>
<abbr bid="B26">26</abbr>
</abbrgrp>. The RIPC set contains 40 structurally related protein pairs which are challenging to align because they have indels, repetitions, circular permutations, and show conformational flexibility <abbrgrp>
<abbr bid="B26">26</abbr>
</abbrgrp>. There are 10 pairs in the RIPC dataset that have circular permutation. Since the structure pairs have non-sequential alignments, to be fair, we only compare with algorithms that can handle non-sequentiality. However, we report the average agreement for some sequential methods as well. The agreement of a given alignment, <it>S</it>, with the reference alignment, <it>R</it>, is defined as the percentage of the residue pairs in the alignment which are identically aligned as in the reference alignment (<it>I</it>
<sub>
<it>S</it>
</sub>) relative to the reference alignment's length (<it>L</it>
<sub>
<it>R</it>
</sub>), i.e., <it>A</it>(<it>S</it>, <it>R</it>) = (<it>I</it>
<sub>
<it>S</it>
</sub>/<it>L</it>
<sub>
<it>R</it>
</sub>) &#215; 100. Table <tblr tid="T1">1</tblr> shows the agreements of four different methods with the reference alignments in the RIPC dataset. The results show that FlexSnap is competitive with state-of-the-art methods for non-sequential alignment. In fact, it has the highest average agreement (79%) among the methods shown. The average agreement of most of the sequential alignment methods we compared with was drastically lower: DALI <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp> (40%), CE <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp> (36%), FATCAT <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp> (28%), and LGA <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp> (38%).</p>
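<p>The agreement measure <it>A</it>(<it>S</it>, <it>R</it>) defined above can be computed directly from two lists of aligned residue pairs. The following Python sketch assumes each alignment is given as a collection of (residue index in protein 1, residue index in protein 2) tuples; the example pairs are made up for illustration.</p>
<p><![CDATA[
```python
def agreement(alignment, reference):
    """Percentage of reference residue pairs that the alignment reproduces exactly."""
    ref_pairs = set(reference)
    if not ref_pairs:
        raise ValueError("reference alignment is empty")
    identical = len(set(alignment).intersection(ref_pairs))  # I_S
    return 100.0 * identical / len(ref_pairs)                # (I_S / L_R) x 100


reference = [(1, 11), (2, 12), (3, 13), (4, 14)]
predicted = [(1, 11), (2, 12), (3, 14), (5, 15)]
print(agreement(predicted, reference))  # 50.0
```
]]></p>
<p>Because the denominator is the length of the reference alignment, aligning extra residue pairs that are absent from the reference does not lower the score; only missing or shifted reference pairs do.</p>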
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Comparison of SARF, MultiProt, SCALI, and FlexSnap on the RIPC dataset. </p></caption><tblbdy cols="14">
      <r>
         <c cspan="2" ca="center">
            <p>
               <b>SCOPID</b>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>SARF</b>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>MultiProt</b>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>SCALI</b>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>FlexSnap</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="14">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>
               <b>Pro1</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Pro2</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>size</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>rmsd</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>A</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>size</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>rmsd</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>A</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>size</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>rmsd</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>A</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>size</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>rmsd</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>A</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="14">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1nkl__</p>
         </c>
         <c ca="center">
            <p>d1qdma1</p>
         </c>
         <c ca="center">
            <p>67</p>
         </c>
         <c ca="center">
            <p>2.21</p>
         </c>
         <c ca="center">
            <p>92</p>
         </c>
         <c ca="center">
            <p>67</p>
         </c>
         <c ca="center">
            <p>1.82</p>
         </c>
         <c ca="center">
            <p>68</p>
         </c>
         <c ca="center">
            <p>62</p>
         </c>
         <c ca="center">
            <p>1.94</p>
         </c>
         <c ca="center">
            <p>69</p>
         </c>
         <c ca="center">
            <p>73</p>
         </c>
         <c ca="center">
            <p>2.39</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1nls__</p>
         </c>
         <c ca="center">
            <p>d2bqpa_</p>
         </c>
         <c ca="center">
            <p>212</p>
         </c>
         <c ca="center">
            <p>1.50</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
         <c ca="center">
            <p>213</p>
         </c>
         <c ca="center">
            <p>1.03</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
         <c ca="center">
            <p>195</p>
         </c>
         <c ca="center">
            <p>1.62</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
         <c ca="center">
            <p>210</p>
         </c>
         <c ca="center">
            <p>2.81</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1qasa2</p>
         </c>
         <c ca="center">
            <p>d1rsy__</p>
         </c>
         <c ca="center">
            <p>109</p>
         </c>
         <c ca="center">
            <p>2.27</p>
         </c>
         <c ca="center">
            <p>65</p>
         </c>
         <c ca="center">
            <p>107</p>
         </c>
         <c ca="center">
            <p>1.24</p>
         </c>
         <c ca="center">
            <p>93</p>
         </c>
         <c ca="center">
            <p>98</p>
         </c>
         <c ca="center">
            <p>1.92</p>
         </c>
         <c ca="center">
            <p>82</p>
         </c>
         <c ca="center">
            <p>111</p>
         </c>
         <c ca="center">
            <p>1.73</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1b5ta_</p>
         </c>
         <c ca="center">
            <p>d1k87a2</p>
         </c>
         <c ca="center">
            <p>171</p>
         </c>
         <c ca="center">
            <p>2.63</p>
         </c>
         <c ca="center">
            <p>63</p>
         </c>
         <c ca="center">
            <p>144</p>
         </c>
         <c ca="center">
            <p>2.04</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>159</p>
         </c>
         <c ca="center">
            <p>3.38</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>177</p>
         </c>
         <c ca="center">
            <p>2.99</p>
         </c>
         <c ca="center">
            <p>50</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1jwyb_</p>
         </c>
         <c ca="center">
            <p>d1puja_</p>
         </c>
         <c ca="center">
            <p>115</p>
         </c>
         <c ca="center">
            <p>2.43</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
         <c ca="center">
            <p>108</p>
         </c>
         <c ca="center">
            <p>1.81</p>
         </c>
         <c ca="center">
            <p>92</p>
         </c>
         <c ca="center">
            <p>110</p>
         </c>
         <c ca="center">
            <p>4.60</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
         <c ca="center">
            <p>116</p>
         </c>
         <c ca="center">
            <p>2.61</p>
         </c>
         <c ca="center">
            <p>92</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1jwyb_</p>
         </c>
         <c ca="center">
            <p>d1u0la2</p>
         </c>
         <c ca="center">
            <p>97</p>
         </c>
         <c ca="center">
            <p>2.02</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
         <c ca="center">
            <p>103</p>
         </c>
         <c ca="center">
            <p>1.86</p>
         </c>
         <c ca="center">
            <p>91</p>
         </c>
         <c ca="center">
            <p>91</p>
         </c>
         <c ca="center">
            <p>4.52</p>
         </c>
         <c ca="center">
            <p>90</p>
         </c>
         <c ca="center">
            <p>96</p>
         </c>
         <c ca="center">
            <p>2.82</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1nw5a_</p>
         </c>
         <c ca="center">
            <p>d2adma_</p>
         </c>
         <c ca="center">
            <p>129</p>
         </c>
         <c ca="center">
            <p>2.52</p>
         </c>
         <c ca="center">
            <p>85</p>
         </c>
         <c ca="center">
            <p>130</p>
         </c>
         <c ca="center">
            <p>2.11</p>
         </c>
         <c ca="center">
            <p>92</p>
         </c>
         <c ca="center">
            <p>132</p>
         </c>
         <c ca="center">
            <p>3.73</p>
         </c>
         <c ca="center">
            <p>84</p>
         </c>
         <c ca="center">
            <p>128</p>
         </c>
         <c ca="center">
            <p>2.91</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1gsa_1</p>
         </c>
         <c ca="center">
            <p>d2hgsa1</p>
         </c>
         <c ca="center">
            <p>73</p>
         </c>
         <c ca="center">
            <p>2.59</p>
         </c>
         <c ca="center">
            <p>20</p>
         </c>
         <c ca="center">
            <p>74</p>
         </c>
         <c ca="center">
            <p>1.56</p>
         </c>
         <c ca="center">
            <p>40</p>
         </c>
         <c ca="center">
            <p>69</p>
         </c>
         <c ca="center">
            <p>3.23</p>
         </c>
         <c ca="center">
            <p>40</p>
         </c>
         <c ca="center">
            <p>73</p>
         </c>
         <c ca="center">
            <p>2.81</p>
         </c>
         <c ca="center">
            <p>20</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1qq5a_</p>
         </c>
         <c ca="center">
            <p>d3chy__</p>
         </c>
         <c ca="center">
            <p>88</p>
         </c>
         <c ca="center">
            <p>2.39</p>
         </c>
         <c ca="center">
            <p>67</p>
         </c>
         <c ca="center">
            <p>82</p>
         </c>
         <c ca="center">
            <p>1.97</p>
         </c>
         <c ca="center">
            <p>67</p>
         </c>
         <c ca="center">
            <p>52</p>
         </c>
         <c ca="center">
            <p>2.08</p>
         </c>
         <c ca="center">
            <p>66</p>
         </c>
         <c ca="center">
            <p>93</p>
         </c>
         <c ca="center">
            <p>2.94</p>
         </c>
         <c ca="center">
            <p>67</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>d1kiaa_</p>
         </c>
         <c ca="center">
            <p>d1nw5a_</p>
         </c>
         <c ca="center">
            <p>146</p>
         </c>
         <c ca="center">
            <p>2.48</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
         <c ca="center">
            <p>153</p>
         </c>
         <c ca="center">
            <p>1.85</p>
         </c>
         <c ca="center">
            <p>75</p>
         </c>
         <c ca="center">
            <p>138</p>
         </c>
         <c ca="center">
            <p>3.99</p>
         </c>
         <c ca="center">
            <p>75</p>
         </c>
         <c ca="center">
            <p>141</p>
         </c>
         <c ca="center">
            <p>2.69</p>
         </c>
         <c ca="center">
            <p>75</p>
         </c>
      </r>
      <r>
         <c cspan="14">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Avg.</p>
         </c>
         <c ca="center">
            <p>Agreement</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>74</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>72</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>67</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>79</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Three values are reported for each alignment: its length, its <it>rmsd</it>, and A which is its agreement with the reference alignment in the RIPC dataset</p>
   </tblfn></tbl>
<p>FlexSnap alignments have 100 percent agreement on four structure pairs. One such pair is the alignment of NK-lysin (<ext-link ext-link-id="1nkl" ext-link-type="pdb">1nkl</ext-link>, 78 residues) with prophytepsin (<ext-link ext-link-id="1qdm" ext-link-type="pdb">1qdm</ext-link>, chain A, 77 residues). On this pair, all the sequential alignment methods (CE, DALI, FATCAT, and LGA) returned zero agreement. Among the non-sequential methods, SARF returned 92%, MultiProt 68%, and SCALI 69%. The reference alignment had 72 aligned pairs. As shown in Figure <figr fid="F5">5</figr>, the sequential alignment methods (only DALI and FATCAT shown) have their alignment paths along the diagonal and do not agree with the reference alignment (shown as circles).</p>
<fig id="F5"><title><p>Figure 5</p></title><caption><p>Comparison of the agreements of the alignments with one structure pair from the RIPC dataset</p></caption><text>
   <p><b>Comparison of the agreements of the alignments with one structure pair from the RIPC dataset</b>. Comparison of the agreement between the reference alignment and 6 other alignment methods on the structure pair of prophytepsin (d1qdma1) and NK-lysin (d1nkl__). Residue positions of d1qdma1 and d1nkl__ are plotted on the x-axis and y-axis, respectively. Note: the reference alignment pairs are shown in circles. The SARF, MultiProt, SCALI, and FlexSnap plots overlap with the reference alignment. FlexSnap has 100 percent coverage of the reference alignment; there is a triangle in every circle.</p>
</text><graphic file="1748-7188-5-12-5"/></fig>
</sec>
<sec>
<st>
<p>Sequential Flexible Alignments</p>
</st>
<p>Table <tblr tid="T2">2</tblr> shows the alignments of different methods on the FlexProt dataset <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>, which is obtained from the database of macromolecular motions <abbrgrp>
<abbr bid="B28">28</abbr>
</abbrgrp>. We have implemented two versions of FlexSnap, namely FlexSnap<sup>
<it>F</it>
</sup> and <inline-formula>
<graphic file="1748-7188-5-12-i24.gif"/>
</inline-formula>. In FlexSnap<sup>
<it>F</it>
</sup>, <it>C</it>(<it>P</it>
<sub>
<it>j </it>
</sub>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>) is the cost of connecting <it>P</it>
<sub>
<it>i </it>
</sub>with the rigid region to which <it>P</it>
<sub>
<it>j </it>
</sub>belongs. In the second version, <inline-formula>
<graphic file="1748-7188-5-12-i24.gif"/>
</inline-formula>, <it>C</it>(<it>P</it>
<sub>
<it>j </it>
</sub>&#8594; <it>P</it>
<sub>
<it>i</it>
</sub>) is the connection cost of <it>P</it>
<sub>
<it>i </it>
</sub>with only <it>P</it>
<sub>
<it>j</it>
</sub>. When the entire rigid region is considered, as in FlexSnap<sup>
<it>F</it>
</sup>, we get much better alignments, i.e., they have lower <it>rmsd </it>and fewer hinges. Moreover, FlexSnap<sup>
<it>F </it>
</sup>gives results comparable to the FATCAT method. In a few cases, it produced slightly shorter alignments with much better <it>rmsd</it>, as in the case of the third and fourth structure pairs.</p>
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Comparison of FlexProt, FATCAT, FlexSnap<sup><it>F</it></sup>, and <inline-formula><graphic file="1748-7188-5-12-i24.gif"/></inline-formula>.</p></caption><tblbdy cols="14">
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>FlexProt</b>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>FATCAT</b>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>FlexSnap</b>
               <sup>
                  <it>F</it>
               </sup>
            </p>
         </c>
         <c cspan="3" ca="center">
            <p>
               <b>
                  <inline-formula>
                     <graphic file="1748-7188-5-12-i24.gif"/>
                  </inline-formula>
               </b>
            </p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>
               <b>Pro1</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Pro2</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>l</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>r</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>l</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>r</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>l</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>r</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>l</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>r</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
               </b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="14">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1wdnA(223)</p>
         </c>
         <c ca="center">
            <p>1gggA(220)</p>
         </c>
         <c ca="center">
            <p>218</p>
         </c>
         <c ca="center">
            <p>0.94</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>220</p>
         </c>
         <c ca="center">
            <p>1.01</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>220</p>
         </c>
         <c ca="center">
            <p>0.96</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>220</p>
         </c>
         <c ca="center">
            <p>0.96</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1hpbP(238)</p>
         </c>
         <c ca="center">
            <p>1gggA(220)</p>
         </c>
         <c ca="center">
            <p>220</p>
         </c>
         <c ca="center">
            <p>2.34</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>213</p>
         </c>
         <c ca="center">
            <p>1.59</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>211</p>
         </c>
         <c ca="center">
            <p>1.67</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>210</p>
         </c>
         <c ca="center">
            <p>3.88</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>2bbmA(148)</p>
         </c>
         <c ca="center">
            <p>1cll_(144)</p>
         </c>
         <c ca="center">
            <p>139</p>
         </c>
         <c ca="center">
            <p>2.22</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>144</p>
         </c>
         <c ca="center">
            <p>2.28</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>138</p>
         </c>
         <c ca="center">
            <p>1.80</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>138</p>
         </c>
         <c ca="center">
            <p>1.80</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>2bbmA(148)</p>
         </c>
         <c ca="center">
            <p>1top_(162)</p>
         </c>
         <c ca="center">
            <p>147</p>
         </c>
         <c ca="center">
            <p>2.40</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>145</p>
         </c>
         <c ca="center">
            <p>2.28</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>137</p>
         </c>
         <c ca="center">
            <p>1.78</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>137</p>
         </c>
         <c ca="center">
            <p>1.78</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1akeA(214)</p>
         </c>
         <c ca="center">
            <p>2ak3A(226)</p>
         </c>
         <c ca="center">
            <p>200</p>
         </c>
         <c ca="center">
            <p>2.44</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>202</p>
         </c>
         <c ca="center">
            <p>1.54</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>207</p>
         </c>
         <c ca="center">
            <p>2.05</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>206</p>
         </c>
         <c ca="center">
            <p>6.72</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>2ak3A(226)</p>
         </c>
         <c ca="center">
            <p>1uke_(193)</p>
         </c>
         <c ca="center">
            <p>182</p>
         </c>
         <c ca="center">
            <p>2.90</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>188</p>
         </c>
         <c ca="center">
            <p>2.97</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>184</p>
         </c>
         <c ca="center">
            <p>2.36</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>184</p>
         </c>
         <c ca="center">
            <p>3.08</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1mcpL(220)</p>
         </c>
         <c ca="center">
            <p>4fabL(219)</p>
         </c>
         <c ca="center">
            <p>218</p>
         </c>
         <c ca="center">
            <p>1.93</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>217</p>
         </c>
         <c ca="center">
            <p>1.40</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>217</p>
         </c>
         <c ca="center">
            <p>1.49</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>217</p>
         </c>
         <c ca="center">
            <p>1.49</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1mcpL(220)</p>
         </c>
         <c ca="center">
            <p>1tcrB(237)</p>
         </c>
         <c ca="center">
            <p>212</p>
         </c>
         <c ca="center">
            <p>2.33</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>213</p>
         </c>
         <c ca="center">
            <p>2.20</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>202</p>
         </c>
         <c ca="center">
            <p>2.3</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>200</p>
         </c>
         <c ca="center">
            <p>2.38</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1lfh (691)</p>
         </c>
         <c ca="center">
            <p>1lfg_(691)</p>
         </c>
         <c ca="center">
            <p>691</p>
         </c>
         <c ca="center">
            <p>1.41</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>686</p>
         </c>
         <c ca="center">
            <p>0.89</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>688</p>
         </c>
         <c ca="center">
            <p>0.99</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>688</p>
         </c>
         <c ca="center">
            <p>0.99</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1tfd (294)</p>
         </c>
         <c ca="center">
            <p>1lfh_(691)</p>
         </c>
         <c ca="center">
            <p>291</p>
         </c>
         <c ca="center">
            <p>1.98</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>290</p>
         </c>
         <c ca="center">
            <p>1.37</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>287</p>
         </c>
         <c ca="center">
            <p>1.89</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>283</p>
         </c>
         <c ca="center">
            <p>1.41</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1b9wA(91)</p>
         </c>
         <c ca="center">
            <p>1danL(142)</p>
         </c>
         <c ca="center">
            <p>75</p>
         </c>
         <c ca="center">
            <p>2.78</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>80</p>
         </c>
         <c ca="center">
            <p>2.39</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>82</p>
         </c>
         <c ca="center">
            <p>2.25</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>83</p>
         </c>
         <c ca="center">
            <p>2.7</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1qf6A(641)</p>
         </c>
         <c ca="center">
            <p>1adjA(420)</p>
         </c>
         <c ca="center">
            <p>323</p>
         </c>
         <c ca="center">
            <p>4.43</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>351</p>
         </c>
         <c ca="center">
            <p>2.68</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>326</p>
         </c>
         <c ca="center">
            <p>2.45</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>320</p>
         </c>
         <c ca="center">
            <p>2.47</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>2clrA(275)</p>
         </c>
         <c ca="center">
            <p>3fruA(269)</p>
         </c>
         <c ca="center">
            <p>253</p>
         </c>
         <c ca="center">
            <p>2.71</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>245</p>
         </c>
         <c ca="center">
            <p>3.06</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>254</p>
         </c>
         <c ca="center">
            <p>2.57</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>252</p>
         </c>
         <c ca="center">
            <p>4.31</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1fmk (438)</p>
         </c>
         <c ca="center">
            <p>1qcfA(450)</p>
         </c>
         <c ca="center">
            <p>424</p>
         </c>
         <c ca="center">
            <p>1.25</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>433</p>
         </c>
         <c ca="center">
            <p>2.27</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>413</p>
         </c>
         <c ca="center">
            <p>2.71</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>413</p>
         </c>
         <c ca="center">
            <p>2.44</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1fmk (438)</p>
         </c>
         <c ca="center">
            <p>1tkiA(321)</p>
         </c>
         <c ca="center">
            <p>231</p>
         </c>
         <c ca="center">
            <p>3.28</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>238</p>
         </c>
         <c ca="center">
            <p>3.07</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>241</p>
         </c>
         <c ca="center">
            <p>2.58</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>242</p>
         </c>
         <c ca="center">
            <p>3.14</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>1a21A(194)</p>
         </c>
         <c ca="center">
            <p>1hwgC(191)</p>
         </c>
         <c ca="center">
            <p>163</p>
         </c>
         <c ca="center">
            <p>2.75</p>
         </c>
         <c ca="center">
            <p>4</p>
         </c>
         <c ca="center">
            <p>153</p>
         </c>
         <c ca="center">
            <p>3.16</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>156</p>
         </c>
         <c ca="center">
            <p>2.35</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>155</p>
         </c>
         <c ca="center">
            <p>3.79</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Comparison of FlexProt, FATCAT, FlexSnap<sup><it>F</it></sup>, and <inline-formula><graphic file="1748-7188-5-12-i24.gif"/></inline-formula>. Each alignment is reported in the following format: its length, <it>l</it>, its <it>rmsd</it>, <it>r</it>, and the number of hinges introduced, <it>T</it>.</p>
   </tblfn></tbl>
</sec>
<sec>
<st>
<p>Flexibility in the DynDom Dataset</p>
</st>
<p>The DynDom dataset <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp> is a comprehensive and non-redundant dataset of protein domain movements; it has been compiled by an exhaustive analysis of protein domain movements on all available protein structures using the DynDom program <abbrgrp>
<abbr bid="B29">29</abbr>
</abbrgrp>. The protein conformations are first grouped into families based on sequence similarity, resulting in 1825 families with an average of 11.5 members each. A clustering procedure is then applied to the members of each family to remove dynamic redundancy (the same motion), and finally the DynDom program is run to analyze domain movements in each family. There are currently 2035 representative pairs belonging to 1578 families in the DynDom dataset. Because these representative pairs involve domain movements, rigid alignment methods cannot align them effectively, whereas flexible alignment methods can introduce hinges and align them more effectively. We define the coverage of an alignment as the number of aligned residues expressed as a percentage of the length of the smaller protein. More formally, the coverage of an alignment of length <it>N</it>
<sub>
<it>mat </it>
</sub>is defined as <inline-formula>
<graphic file="1748-7188-5-12-i25.gif"/>
</inline-formula>, where |<it>A</it>| is the length of protein <it>A</it>, similarly for |<it>B</it>|.</p>
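<p>As a minimal sketch of the coverage definition above (not the authors' C++ implementation), the following Python function computes the coverage from the alignment length and the two chain lengths. The function name coverage and its argument names are hypothetical. The example values come from the 2bbmA(148)/1cll_(144) row of the preceding table, where one of the reported alignments has 138 residues.</p>
<p>
```python
def coverage(n_mat, len_a, len_b):
    """Coverage in percent: aligned residues relative to the smaller protein."""
    smaller = min(len_a, len_b)
    if not smaller > 0:
        raise ValueError("protein lengths must be positive")
    return 100.0 * n_mat / smaller


# 2bbmA has 148 residues, 1cll_ has 144; an alignment of 138 residues
# covers 138/144 of the smaller chain.
print(round(coverage(138, 148, 144), 1))  # 95.8
```
</p>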
<p>Table <tblr tid="T3">3</tblr> shows the average coverage, <it>rmsd</it>, and hinges reported by different methods on the DynDom dataset. For the same structure pair, FlexProt reports several solutions whose number of hinges ranges from 0 to 5. For a fair comparison, we choose the FlexProt alignment with the same number of hinges as the solution reported by FlexSnap. Moreover, we also run FlexSnap in rigid mode (FlexSnap<sup>
<it>R</it>
</sup>) with the number of allowed hinges set to 0 to investigate how it compares to rigid alignment methods. DALI has the highest coverage followed by FlexSnap. However, the average <it>rmsd </it>of FlexSnap alignments is much smaller than the average <it>rmsd </it>for DALI alignments. On average, FlexSnap introduced 0.59 hinges in the alignments. By introducing flexibility in the alignments, FlexSnap reported alignments with significantly smaller <it>rmsd </it>while maintaining high alignment coverage. Also when run in the rigid mode, FlexSnap<sup>
<it>R </it>
</sup>is competitive with state-of-the-art methods like DALI, Structal, and MultiProt.</p>
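<p>The selection rule used for the comparison, taking the FlexProt solution whose hinge count matches the FlexSnap solution, can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys, the function name pick_matching_solution, and all numbers are hypothetical and are not taken from the paper or from FlexProt's output format.</p>
<p>
```python
def pick_matching_solution(flexprot_solutions, flexsnap_hinges):
    """Return the first FlexProt solution with the given hinge count, or None."""
    for solution in flexprot_solutions:
        if solution["hinges"] == flexsnap_hinges:
            return solution
    return None


# Illustrative values only (not from the paper): FlexProt solutions for one
# structure pair, one per hinge count.
flexprot_solutions = [
    {"hinges": 0, "length": 120, "rmsd": 2.9},
    {"hinges": 1, "length": 141, "rmsd": 2.1},
    {"hinges": 2, "length": 146, "rmsd": 1.8},
]

# Suppose FlexSnap introduced one hinge for this pair.
chosen = pick_matching_solution(flexprot_solutions, flexsnap_hinges=1)
print(chosen)
```
</p>
<p>Returning None when no solution has the requested hinge count leaves it to the caller to decide how such pairs are handled; the paper does not state what is done in that case.</p>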
<tbl id="T3"><title><p>Table 3</p></title><caption><p>Comparison of several alignment methods on the DynDom dataset</p></caption><tblbdy cols="6">
      <r>
         <c ca="center">
            <p>
               <b>DALI</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Structal</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>MultiProt</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>FlexSnap</b>
               <sup>
                  <it>R</it>
               </sup>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>FlexSnap</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>FlexProt</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="6">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p><it>aC</it>/<it>aR</it></p>
         </c>
         <c ca="center">
            <p><it>aC</it>/<it>aR</it></p>
         </c>
         <c ca="center">
            <p><it>aC</it>/<it>aR</it></p>
         </c>
         <c ca="center">
            <p><it>aC</it>/<it>aR</it></p>
         </c>
         <c ca="center">
            <p><it>aC</it>/<it>aR</it>/<it>aH</it></p>
         </c>
         <c ca="center">
            <p><it>aC</it>/<it>aR</it>/<it>aH</it></p>
         </c>
      </r>
      <r>
         <c cspan="6">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>97/2.31</p>
         </c>
         <c ca="center">
            <p>87/1.27</p>
         </c>
         <c ca="center">
            <p>87/1.15</p>
         </c>
         <c ca="center">
            <p>85/1.60</p>
         </c>
         <c ca="center">
            <p>96/1.46/0.45</p>
         </c>
         <c ca="center">
            <p>88/2.14/0.45</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Two values are reported for the alignments of each method: average coverage for the method (<it>aC </it>in %), and average <it>rmsd </it>(<it>aR </it>in &#197;). For FlexSnap and FlexProt, we also report the average number of hinges introduced (<it>aH</it>). FlexSnap<sup><it>R </it></sup>is FlexSnap in rigid mode with the maximum number of allowed hinges set to zero.</p>
   </tblfn></tbl>
<sec>
<st>
<p>DynDom Pairs with Low Coverage</p>
</st>
<p>Rigid alignment methods try to optimize a score that usually depends on the length and <it>rmsd </it>of the alignment. Therefore, they might prefer a shorter alignment with a small <it>rmsd </it>over a longer alignment with a significantly larger <it>rmsd</it>. In some cases, such as when there is a movement in one of the proteins, they have no choice but to report a shorter alignment with an acceptable <it>rmsd </it>value. We analyze the alignments of structure pairs for which rigid alignment methods returned alignments that are short relative to the length of the smaller protein. We ran three rigid alignment methods, DALI, Structal, and MultiProt, and for each method collected the pairs for which it returned a coverage less than or equal to 60%. The resulting lists contain 30 pairs for DALI, 282 for Structal, and 164 for MultiProt. An example of a rigid alignment with low coverage is shown in Figure <figr fid="F6">6</figr>. For this DynDom pair, Structal reported an alignment of 52 residues with <it>rmsd </it>0.40 &#197;; MultiProt's alignment had 54 residues with <it>rmsd </it>0.52 &#197;.</p>
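<p>The per-method selection of low-coverage pairs and the averaging over the selected pairs can be sketched as below. The record layout, the function names low_coverage_pairs and average, and the pair names and numbers are hypothetical and serve only to illustrate the 60% threshold described above.</p>
<p>
```python
def low_coverage_pairs(results, threshold=60.0):
    """Keep the pairs whose rigid-alignment coverage is at most `threshold` percent."""
    return [r for r in results if threshold >= r["coverage"]]


def average(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Illustrative per-pair results for one rigid method (hypothetical values).
rigid_results = [
    {"pair": "pairA", "coverage": 52.0, "rmsd": 0.5},
    {"pair": "pairB", "coverage": 97.0, "rmsd": 2.5},
    {"pair": "pairC", "coverage": 60.0, "rmsd": 1.5},
]

hard = low_coverage_pairs(rigid_results)
print([r["pair"] for r in hard])             # ['pairA', 'pairC']
print(average(r["coverage"] for r in hard))  # 56.0
print(average(r["rmsd"] for r in hard))      # 1.0
```
</p>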
<fig id="F6"><title><p>Figure 6</p></title><caption><p>An example of a rigid alignment with low coverage</p></caption><text>
   <p><b>An example of a rigid alignment with low coverage</b>. A DynDom pair with low alignment coverage: Rigid vs. Flexible alignment.</p>
</text><graphic file="1748-7188-5-12-6"/></fig>
<p>Table <tblr tid="T4">4</tblr> shows the average coverage, <it>rmsd</it>, and hinges reported by FlexSnap on these structure pairs. For a fair comparison, we choose the FlexProt alignment with the same number of hinges as the FlexSnap solution. FlexSnap significantly improves the coverage of the alignments of these hard pairs. Moreover, it does so while maintaining good <it>rmsd </it>values and introducing about 1.5 hinges on average. In FlexSnap's scoring function, hinges are penalized, and a hinge is introduced only if it yields a significant increase in the alignment score; this explains why the number of hinges introduced is not large. DALI optimizes a score that incorporates the length and <it>rmsd </it>of the alignment. For these 30 pairs, longer alignments score too low, so DALI reports shorter alignments with good <it>rmsd </it>and, consequently, low coverage. The Structal method reported low-coverage alignments on many more structure pairs than DALI. This is because Structal depends on its initial alignments for its initial transformations, and it might miss the true alignment if the initial alignments are not good starting points.</p>
<tbl id="T4"><title><p>Table 4</p></title><caption><p>Comparison of FlexSnap and FlexProt on the DynDom pairs for which rigid alignment methods returned <it>coverage </it>&#8804; 60%</p></caption><tblbdy cols="5">
      <r>
         <c cspan="3" ca="center">
            <p>
               <b>Rigid Alignment</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>FlexSnap</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>FlexProt</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>
               <b>Method</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>#Pairs</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>aC</it>(%)/<it>aR</it>(&#197;)</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>aC</it>(%)/<it>aR</it>(&#197;)/<it>aH</it></b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>aC</it>(%)/<it>aR</it>(&#197;)/<it>aH</it></b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>DALI</p>
         </c>
         <c ca="center">
            <p>30</p>
         </c>
         <c ca="center">
            <p>31/2.3</p>
         </c>
         <c ca="center">
            <p>89/1.75/1.37</p>
         </c>
         <c ca="center">
            <p>79/2.36/1.37</p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Structal</p>
         </c>
         <c ca="center">
            <p>282</p>
         </c>
         <c ca="center">
            <p>52/0.77</p>
         </c>
         <c ca="center">
            <p>94/1.72/1.34</p>
         </c>
         <c ca="center">
            <p>93/2.08/1.34</p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>MultiProt</p>
         </c>
         <c ca="center">
            <p>164</p>
         </c>
         <c ca="center">
            <p>53/1.12</p>
         </c>
         <c ca="center">
            <p>92/1.59/1.56</p>
         </c>
         <c ca="center">
            <p>93/2.0/1.56</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Two values are reported for the alignments of each method: average coverage for the method (<it>aC </it>in %), and average <it>rmsd </it>(<it>aR </it>in &#197;). For FlexSnap and FlexProt, we also report the average number of hinges introduced (<it>aH</it>).</p>
   </tblfn></tbl>
</sec>
<sec>
<st>
<p>DynDom Pairs with Large rmsd</p>
</st>
<p>In some cases, rigid alignment methods optimize a score that favors longer alignments with acceptable <it>rmsd </it>values; they may therefore achieve good coverage on some pairs, but with <it>rmsd </it>values that are too large. Flexible alignments can be employed in these cases to obtain similar alignments with much better <it>rmsd </it>values. For each of our test methods, namely DALI, Structal, and MultiProt, we compiled a list of the structure pairs for which the method reported an alignment with <it>rmsd </it>&#8805; 4.0 &#197;, and we ran FlexSnap on these pairs. An example of a rigid alignment with large <it>rmsd </it>is shown in Figure <figr fid="F7">7</figr>. FlexSnap reported an alignment with 100% coverage and an <it>rmsd </it>of 0.71 &#197; by introducing only one hinge in the alignment. Table <tblr tid="T5">5</tblr> shows the average coverage and <it>rmsd </it>as reported by the native rigid method and by FlexSnap. Under this criterion, DALI reported alignments with <it>rmsd </it>&#8805; 4.0 &#197; on 295 pairs, many more than the other methods. MultiProt did not report any alignment with large <it>rmsd</it>. In fact, all of the MultiProt alignments had <it>rmsd </it>&#8804; 2.3 &#197;; this can be explained by noting that MultiProt includes in the alignment only residue pairs that are closely aligned, and thus the overall <it>rmsd </it>cannot be large.</p>
<tbl id="T5"><title><p>Table 5</p></title><caption><p>Comparison of FlexSnap and FlexProt on the DynDom pairs for which rigid alignment methods returned alignments with <it>rmsd </it>&#8805; 4.0 &#197;</p></caption><tblbdy cols="5">
      <r>
         <c cspan="2" ca="center">
            <p>
               <b>Rigid Alignment</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>FlexSnap</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>FlexProt</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>
               <b>Method</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>#Pairs</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>aC</it>(%)/<it>aR</it>(&#197;)</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>aC</it>(%)/<it>aR</it>(&#197;)/<it>aH</it></b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>aC</it>(%)/<it>aR</it>(&#197;)/<it>aH</it></b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>DALI</p>
         </c>
         <c ca="center">
            <p>295</p>
         </c>
         <c ca="center">
            <p>94/5.89</p>
         </c>
         <c ca="center">
            <p>94/1.61/1.54</p>
         </c>
         <c ca="center">
            <p>93/2.02/1.54</p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Structal</p>
         </c>
         <c ca="center">
            <p>16</p>
         </c>
         <c ca="center">
            <p>88/5.03</p>
         </c>
         <c ca="center">
            <p>82/1.97/2.19</p>
         </c>
         <c ca="center">
            <p>78/2.75/2.19</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Two values are reported for the alignments of each method: average coverage (<it>aC </it>in %), and average <it>rmsd </it>(<it>aR </it>in &#197;). For FlexSnap and FlexProt, we also report the average number of hinges introduced (<it>aH</it>).</p>
   </tblfn></tbl>
<fig id="F7"><title><p>Figure 7</p></title><caption><p>An example of a rigid alignment with large <it>rmsd</it></p></caption><text>
   <p><b>An example of a rigid alignment with large <it>rmsd</it></b>. A DynDom pair with large alignment <it>rmsd</it>: Rigid vs. Flexible alignment.</p>
</text><graphic file="1748-7188-5-12-7"/></fig>
<p>FlexSnap significantly improved the average <it>rmsd </it>of the alignments of these pairs. For the 295 pairs for which DALI reported an average <it>rmsd </it>of 5.89 &#197;, FlexSnap reported an average <it>rmsd </it>of 1.61 &#197;. For the 16 pairs reported by Structal, the average FlexSnap <it>rmsd </it>is 1.97 &#197;, as opposed to the 5.03 &#197; reported by Structal.</p>
</sec>
</sec>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>We have introduced FlexSnap, a greedy chaining algorithm that reports both sequential and non-sequential alignments and allows twists (hinges). We assessed the quality of the FlexSnap alignments by measuring their agreement with manually curated non-sequential alignments (on the RIPC dataset). On the FlexProt dataset, FlexSnap was competitive with state-of-the-art flexible alignment methods. Moreover, we demonstrated the benefits of introducing hinges by showing the significant improvement in the alignments reported by FlexSnap for the structure pairs for which rigid alignment methods reported alignments with either low coverage or large <it>rmsd </it>(on the DynDom dataset).</p>
</sec>
<sec>
<st>
<p>Competing interests</p>
</st>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>SS and MJZ designed the FlexSnap algorithm with the proposed scoring scheme. SS coded the algorithm in C++ and wrote the paper. CB proposed the experiments and helped analyze the alignments. All authors read and approved the final manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>We would like to thank the anonymous reviewers for their comments and suggestions. Also, we thank the authors of the different alignment algorithms for making their programs available. This work was supported in part by NSF Grants EMT-0829835 and EIA-0103708, NIH Grant 1R01EB0080161-01A1, and NIH grant number P20 RR016741 from the INBRE program of the National Center for Research Resources.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>Protein domain movements: detection of rigid domains and visualization of hinges in comparisons of atomic coordinates</p></title><aug><au><snm>Wriggers</snm><fnm>W</fnm></au><au><snm>Schulten</snm><fnm>K</fnm></au></aug><source>Proteins: Structure, Function, and Genetics</source><pubdate>1997</pubdate><volume>29</volume><fpage>1</fpage><lpage>14</lpage><xrefbib><pubid idtype="doi">10.1002/(SICI)1097-0134(199709)29:1&lt;1::AID-PROT1&gt;3.0.CO;2-J</pubid></xrefbib></bibl><bibl id="B2"><title><p>Protein structure comparison by alignment of distance matrices</p></title><aug><au><snm>Holm</snm><fnm>L</fnm></au><au><snm>Sander</snm><fnm>C</fnm></au></aug><source>J Mol Biol</source><pubdate>1993</pubdate><volume>233</volume><fpage>123</fpage><lpage>138</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1006/jmbi.1993.1489</pubid><pubid idtype="pmpid" link="fulltext">8377180</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core</p></title><aug><au><snm>Subbiah</snm><fnm>S</fnm></au><au><snm>Laurents</snm><fnm>D</fnm></au><au><snm>Levitt</snm><fnm>M</fnm></au></aug><source>curr biol</source><pubdate>1993</pubdate><volume>3</volume><fpage>141</fpage><lpage>148</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0960-9822(93)90255-M</pubid><pubid idtype="pmpid" link="fulltext">15335781</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>SARFing the PDB</p></title><aug><au><snm>Alexandrov</snm><fnm>N</fnm></au></aug><source>Protein Engineering</source><pubdate>1996</pubdate><volume>50</volume><issue>9</issue><fpage>727</fpage><lpage>732</lpage></bibl><bibl id="B5"><title><p>Protein structure alignment by incremental combinatorial extension (CE) of the optimal path</p></title><aug><au><snm>Shindyalov</snm><fnm>I</fnm></au><au><snm>Bourn</snm><fnm>P</fnm></au></aug><source>Protein 
Eng</source><pubdate>1998</pubdate><volume>11</volume><fpage>739</fpage><lpage>747</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/protein/11.9.739</pubid><pubid idtype="pmpid" link="fulltext">9796821</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>A method for simultaneous alignment of multiple protein structures</p></title><aug><au><snm>Shatsky</snm><fnm>M</fnm></au><au><snm>Nussinov</snm><fnm>R</fnm></au><au><snm>Wolfson</snm><fnm>H</fnm></au></aug><source>Proteins: Structure, Function, and Bioinformatics</source><pubdate>2004</pubdate><volume>56</volume><fpage>143</fpage><lpage>156</lpage><xrefbib><pubid idtype="doi">10.1002/prot.10628</pubid></xrefbib></bibl><bibl id="B7"><title><p>Non-sequential Structure-based Alignments Reveal Topology-independent Core Packing Arrangements in Proteins</p></title><aug><au><snm>Yuan</snm><fnm>X</fnm></au><au><snm>Bystroff</snm><fnm>C</fnm></au></aug><source>Bioinformatics</source><pubdate>2003</pubdate><volume>21</volume><issue>7</issue><fpage>1010</fpage><lpage>1019</lpage><xrefbib><pubid idtype="doi">10.1093/bioinformatics/bti128</pubid></xrefbib></bibl><bibl id="B8"><title><p>FAST: A Novel Protein Structure Alignment Algorithm</p></title><aug><au><snm>Zhu</snm><fnm>J</fnm></au><au><snm>Weng</snm><fnm>Z</fnm></au></aug><source>Proteins: Structure, Function and Bioinformatics</source><pubdate>2005</pubdate><volume>14</volume><fpage>417</fpage><lpage>423</lpage></bibl><bibl id="B9"><title><p>Circular permutations of natural protein sequences: structural evidence</p></title><aug><au><snm>Lindqvist</snm><fnm>Y</fnm></au><au><snm>Schneider</snm><fnm>G</fnm></au></aug><source>Curr Opin Struct Biol</source><pubdate>1997</pubdate><volume>7</volume><issue>3</issue><fpage>422</fpage><lpage>427</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0959-440X(97)80061-9</pubid><pubid idtype="pmpid" link="fulltext">9204286</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Common Structural Cliques: a tool 
for protein structure and function analysis</p></title><aug><au><snm>Milik</snm><fnm>M</fnm></au><au><snm>Szalma</snm><fnm>S</fnm></au><au><snm>Olszewski</snm><fnm>K</fnm></au></aug><source>Protein Engineering</source><pubdate>2003</pubdate><volume>16</volume><issue>8</issue><fpage>543</fpage><lpage>552</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/protein/gzg080</pubid><pubid idtype="pmpid" link="fulltext">12968072</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Flexible protein alignment and hinge detection</p></title><aug><au><snm>Shatsky</snm><fnm>M</fnm></au><au><snm>Nussinov</snm><fnm>R</fnm></au><au><snm>Wolfson</snm><fnm>H</fnm></au></aug><source>Proteins: Structure, Function, and Bioinformatics</source><pubdate>2002</pubdate><volume>48</volume><fpage>242</fpage><lpage>256</lpage><xrefbib><pubid idtype="doi">10.1002/prot.10100</pubid></xrefbib></bibl><bibl id="B12"><title><p>Flexible structure alignment by chaining aligned fragment pairs allowing twists</p></title><aug><au><snm>Ye</snm><fnm>Y</fnm></au><au><snm>Godzik</snm><fnm>A</fnm></au></aug><source>Bioinformatics</source><pubdate>2003</pubdate><volume>19</volume><fpage>II246</fpage><lpage>II255</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">14534198</pubid></xrefbib></bibl><bibl id="B13"><title><p>Approximate protein structural alignment in polynomial time</p></title><aug><au><snm>Kolodny</snm><fnm>R</fnm></au><au><snm>Linial</snm><fnm>N</fnm></au></aug><source>PNAS</source><pubdate>2004</pubdate><volume>101</volume><fpage>12201</fpage><lpage>12206</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0404383101</pubid><pubid idtype="pmcid">514457</pubid><pubid idtype="pmpid" link="fulltext">15304646</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>A general method applicable to the search for similarities in the amino acid sequence of two 
proteins</p></title><aug><au><snm>Needleman</snm><fnm>S</fnm></au><au><snm>Wunsch</snm><fnm>C</fnm></au></aug><source>J Mol Biol</source><pubdate>1970</pubdate><volume>48</volume><fpage>443</fpage><lpage>453</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0022-2836(70)90057-4</pubid><pubid idtype="pmpid" link="fulltext">5420325</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Using Iterative Dynamic Programming to Obtain Accurate Pairwise and Multiple Alignments of Protein Structures</p></title><aug><au><snm>Gerstein</snm><fnm>M</fnm></au><au><snm>Levitt</snm><fnm>M</fnm></au></aug><source>Proc Int Conf Intell Syst Mol Biol</source><pubdate>1996</pubdate><volume>4</volume><fpage>59</fpage><lpage>67</lpage><xrefbib><pubid idtype="pmpid">8877505</pubid></xrefbib></bibl><bibl id="B16"><title><p>SSAP: sequential structure alignment program for protein structure comparison</p></title><aug><au><snm>Orengo</snm><fnm>C</fnm></au><au><snm>Taylor</snm><fnm>W</fnm></au></aug><source>Methods Enzymol</source><pubdate>1996</pubdate><volume>266</volume><fpage>617</fpage><lpage>35</lpage><xrefbib><pubid idtype="pmpid">8743709</pubid></xrefbib></bibl><bibl id="B17"><aug><au><snm>Eidhammer</snm><fnm>I</fnm></au><au><snm>Jonassen</snm><fnm>I</fnm></au><au><snm>Taylor</snm><fnm>WR</fnm></au></aug><source>Protein Bioinformatics: An Algorithmic Approach to Sequence and Structure Analysis</source><publisher>UK: John Wiley &amp; Sons Ltd</publisher><pubdate>2004</pubdate></bibl><bibl id="B18"><title><p>Structure comparison and structure patterns</p></title><aug><au><snm>Eidhammer</snm><fnm>I</fnm></au><au><snm>Jonassen</snm><fnm>I</fnm></au><au><snm>Taylor</snm><fnm>W</fnm></au></aug><source>J Comput Biol</source><pubdate>2000</pubdate><volume>7</volume><issue>5</issue><fpage>685</fpage><lpage>716</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/106652701446152</pubid><pubid idtype="pmpid" 
link="fulltext">11153094</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><aug><au><snm>Garey</snm><fnm>M</fnm></au><au><snm>Johnson</snm><fnm>D</fnm></au></aug><source>Computers and Intractability: A Guide to the Theory of NP-Completeness</source><publisher>San Francisco, CA: W.H. Freeman</publisher><pubdate>1979</pubdate></bibl><bibl id="B20"><title><p>HingeProt: Automated Prediction of Hinges in Protein Structures</p></title><aug><au><snm>Emekli</snm><fnm>U</fnm></au><au><snm>Schneidman-Duhovny</snm><fnm>D</fnm></au><au><snm>Wolfson</snm><fnm>H</fnm></au><au><snm>Nussinov</snm><fnm>R</fnm></au><au><snm>Haliloglu</snm><fnm>T</fnm></au></aug><source>Proteins</source><pubdate>2008</pubdate><volume>70</volume><issue>4</issue><fpage>1219</fpage><lpage>1227</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.21613</pubid><pubid idtype="pmpid" link="fulltext">17847101</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>HingeMaster: normal mode hinge prediction approach and integration of complementary predictors</p></title><aug><au><snm>Flores</snm><fnm>S</fnm></au><au><snm>Keating</snm><fnm>K</fnm></au><au><snm>Painter</snm><fnm>J</fnm></au><au><snm>Morcos</snm><fnm>F</fnm></au><au><snm>Nguyen</snm><fnm>K</fnm></au><au><snm>Merritt</snm><fnm>E</fnm></au><au><snm>Kuhn</snm><fnm>L</fnm></au><au><snm>Gerstein</snm><fnm>M</fnm></au></aug><source>Proteins</source><pubdate>2008</pubdate><volume>73</volume><fpage>299</fpage><lpage>319</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.22060</pubid><pubid idtype="pmpid" link="fulltext">18433058</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>A solution for the best rotation to relate two sets of vectors</p></title><aug><au><snm>Kabsch</snm><fnm>W</fnm></au></aug><source>Acta Crystallogr</source><pubdate>1976</pubdate><volume>A32</volume><fpage>922</fpage><lpage>923</lpage></bibl><bibl id="B23"><title><p>Identification of partially obscured objects in two dimensions by matching of noisy 
characteristic curves</p></title><aug><au><snm>Schwartz</snm><fnm>J</fnm></au><au><snm>Sharir</snm><fnm>M</fnm></au></aug><source>Int J Robotics Res</source><pubdate>1987</pubdate><volume>6</volume><fpage>29</fpage><lpage>44</lpage><xrefbib><pubid idtype="doi">10.1177/027836498700600203</pubid></xrefbib></bibl><bibl id="B24"><aug><au><snm>Gusfield</snm><fnm>D</fnm></au></aug><source>Algorithms on strings, trees, and sequences: Computer science and computational biology</source><publisher>New York: Cambridge University Press</publisher><pubdate>1999</pubdate></bibl><bibl id="B25"><title><p>A comprehensive and non-redundant database of protein domain movements</p></title><aug><au><snm>Qi</snm><fnm>G</fnm></au><au><snm>Lee</snm><fnm>R</fnm></au><au><snm>Hayward</snm><fnm>S</fnm></au></aug><source>Bioinformatics</source><pubdate>2005</pubdate><volume>21</volume><issue>12</issue><fpage>2832</fpage><lpage>2838</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/bti420</pubid><pubid idtype="pmpid" link="fulltext">15802286</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>Comparative Analysis of Protein Structure Alignments</p></title><aug><au><snm>Mayr</snm><fnm>G</fnm></au><au><snm>Domingues</snm><fnm>F</fnm></au><au><snm>Lackner</snm><fnm>P</fnm></au></aug><source>BMC Structural Biol</source><pubdate>2007</pubdate><volume>7</volume><issue>50</issue><fpage>564</fpage><lpage>77</lpage></bibl><bibl id="B27"><title><p>LGA - a Method for Finding 3D Similarities in Protein Structures</p></title><aug><au><snm>Zemla</snm><fnm>A</fnm></au></aug><source>Nucleic Acids Research</source><pubdate>2003</pubdate><volume>31</volume><issue>13</issue><fpage>3370</fpage><lpage>3374</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg571</pubid><pubid idtype="pmcid">168977</pubid><pubid idtype="pmpid" link="fulltext">12824330</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>A database of macromolecular 
motions</p></title><aug><au><snm>Gerstein</snm><fnm>M</fnm></au><au><snm>Krebs</snm><fnm>W</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>1998</pubdate><volume>26</volume><issue>18</issue><fpage>4280</fpage><lpage>4290</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/26.18.4280</pubid><pubid idtype="pmcid">147832</pubid><pubid idtype="pmpid" link="fulltext">9722650</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>Systematic Analysis of Domain Motions in Proteins from Conformational Change; New Results on Citrate Synthase and T4 Lysozyme</p></title><aug><au><snm>Hayward</snm><fnm>S</fnm></au><au><snm>Berendsen</snm><fnm>H</fnm></au></aug><source>Proteins, Structure, Function and Genetics</source><pubdate>1998</pubdate><volume>30</volume><fpage>144</fpage><lpage>154</lpage><xrefbib><pubid idtype="doi">10.1002/(SICI)1097-0134(19980201)30:2&lt;144::AID-PROT4&gt;3.0.CO;2-N</pubid></xrefbib></bibl></refgrp>
</bm></art>