3.3 Computing the reversal distance without hurdles and fortresses in O(n) time
Observation: two components are either disjoint, nested, or chained.
Definition 3.8 (component tree): Given a permutation \pi and its components,
the /component tree/ T_\pi is constructed as follows:
1. Each component is represented by a round node. It is colored black if the
component is oriented and white otherwise.
2. Each maximal chain is represented by a square node whose children are the
round nodes of the components in the chain.
3. A square node is the child of the smallest component that contains the chain.
(tree of \pi^3)
[ ]
/ | \
o o o
(0..4)(4..7)(7..8)
Definition 3.9 (component tree cover): A /cover/ C of T_\pi is a collection of
paths joining all the unoriented components of \pi, such that each terminal node
of a path belongs to a unique path. A path is /short/ if it contains only one
component, otherwise it is /long/.
The cost t(C) of a cover C is the sum of the costs of its paths, where a short
path has cost 1 and a long path has cost 2.
-> \pi^3 has one cover with cost 1
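The cost function can be sketched in a few lines. The representation below (a
cover as a list of paths, each path as the list of components it contains) and
the name cover_cost are assumptions made for this illustration only.

```python
def cover_cost(cover):
    """Cost t(C) of a cover: 1 per short path, 2 per long path.

    `cover` is a list of paths; each path is the list of components
    (round nodes) it contains. This encoding is an assumption of the sketch.
    """
    cost = 0
    for path in cover:
        if not path:
            raise ValueError("a path must contain at least one component")
        # A short path contains exactly one component, a long path more.
        cost += 1 if len(path) == 1 else 2
    return cost


# Cover of the tree of pi^3: a single short path through the only
# unoriented component, (4..7).
pi3_cover = [["(4..7)"]]
pi3_cost = cover_cost(pi3_cover)  # 1
```

A cover with one long path and one short path, e.g. [["A", "B"], ["C"]] with
hypothetical component labels, has cost 2 + 1 = 3.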
We then have
Theorem 3.2: The reversal distance of a signed permutation \pi of n symbols is
srd(\pi) = n + 1 - c + t,
where c is the number of cycles of BG(\pi) and t is the cost of an optimal cover
of the component tree T_\pi.
-> for \pi^3: n = 7, c = 4, t = 1, hence srd(\pi^3) = 5
Optimal covers can be computed in O(n) time for a permutation \pi with n
symbols. Similarly, the cycles can be found within O(n) time. Hence, srd(\pi)
can be computed in O(n) time.
-------------------------------------------------------------------------------
4. Sorting by (signed) reversals
Literature:
- Eric Tannier, Anne Bergeron, Marie-France Sagot: Advances on sorting by
reversals (2007)
- Anne Bergeron, Julia Mixtacki, Jens Stoye: The Inversion Distance Problem.
In: O. Gascuel (ed.): Mathematics of Evolution and Phylogeny. Chapter 10,
pp. 262-290. Oxford University Press, 2005.
Recall: The reversal distance d = srd(\pi) of a permutation \pi of n symbols
can be computed in O(n) time.
Problem 4.1 (Sorting by (signed) Reversals): Given two signed permutations \pi
and \sigma, find a sequence of reversals \rho_1, ..., \rho_d such that \pi \circ
\rho_1 \circ ... \circ \rho_d = \sigma and d = srd(\pi, \sigma).
It turns out that finding an actual sorting scenario is more complicated than
computing the reversal distance.
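To make the notion of a scenario concrete, the following minimal sketch applies
a list of reversals to a signed permutation stored as a Python list. The helper
names reverse and apply_scenario, and the encoding of a reversal as a pair
(i, j) of 0-based inclusive positions, are assumptions of this example.

```python
def reverse(perm, i, j):
    """Reverse the segment perm[i..j] (0-based, inclusive) and flip its signs."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def apply_scenario(perm, scenario):
    """Apply the reversals (i, j) of a scenario one after the other."""
    for i, j in scenario:
        perm = reverse(perm, i, j)
    return perm


pi = [-2, -3, 1]
scenario = [(1, 2), (0, 1)]  # (-2 -3 1) -> (-2 -1 3) -> (1 2 3)
result = apply_scenario(pi, scenario)
```

Here \sigma is the identity permutation, and the two reversals transform
(-2 -3 1) into (1 2 3).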
Let us study the effect of reversals on the cycles of the breakpoint graph in
more detail, using the example of
\pi^3 = (-2 -3 1 4 6 5 7)
(BG(\pi^3) shown in Java Program)
What happens when a reversal acts on two reality edges of the same cycle?
Type I: divergent edges -> breaks the cycle, \Delta c = +1
Type II: convergent edges -> \Delta c = 0, but may change cycle orientation
What if the two edges belong to different cycles?
Type III: merges the two cycles -> \Delta c = -1
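The three cases can be observed on \pi^3 with a short sketch. It assumes the
usual encoding of the breakpoint graph, which may differ in detail from the one
used earlier in these notes: every element x becomes the two points 2|x|-1 and
2|x| (swapped if x is negative), the sequence is framed by 0 and 2n+1, reality
edges join the points at positions 2k and 2k+1, and desire edges join the
values 2k and 2k+1. The names reverse and count_cycles are hypothetical.

```python
def reverse(perm, i, j):
    """Reverse the segment perm[i..j] (0-based, inclusive) and flip its signs."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def count_cycles(perm):
    """Number of alternating cycles of the breakpoint graph, in O(n) time."""
    n = len(perm)
    points = [0]
    for x in perm:
        a = abs(x)
        points += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    points.append(2 * n + 1)
    pos = [0] * len(points)            # pos[value] = position of that point
    for i, p in enumerate(points):
        pos[p] = i
    seen = [False] * len(points)
    cycles = 0
    for start in range(0, len(points), 2):
        if seen[start]:
            continue
        cycles += 1
        i = start
        while not seen[i]:
            j = i ^ 1                  # follow the reality edge (positions)
            seen[i] = seen[j] = True
            i = pos[points[j] ^ 1]     # follow the desire edge (values)
    return cycles


pi3 = [-2, -3, 1, 4, 6, 5, 7]
c = count_cycles(pi3)                  # 4 cycles
deltas = {
    "I": count_cycles(reverse(pi3, 1, 2)) - c,    # reverse (-3 1): split
    "II": count_cycles(reverse(pi3, 4, 5)) - c,   # reverse (6 5): unchanged
    "III": count_cycles(reverse(pi3, 3, 3)) - c,  # reverse (4): merge
}
```

Under these assumptions the three reversals change the number of cycles by +1,
0 and -1, respectively.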
Algorithms for computing a sorting scenario work in two steps:
1. Use reversals of type II and III to transform all unoriented into
oriented components.
2. Apply reversals of type I to break cycles into trivial cycles
(adjacencies).
Step 1 can be performed in O(n) time, see Bader, Moret, and Yan (2001), whereas
step 2 requires O(n^(3/2)) time with the best known algorithms by Tannier and
Sagot (2005) and Han (2006).
We now focus on step 2 and study a simpler algorithm that achieves this task in
quadratic time by identifying so-called 'safe reversals'.
A reversal always /acts/ on two (reality) edges, but it can affect other edges.
To study these effects, we need a more appropriate data structure: Meet the
/overlap graph/:
Definition 4.1: The /overlap graph/ OV(\pi) of a permutation \pi is
the graph whose vertices are the n+1 desire edges (arcs) of BG(\pi) and whose
edges correspond to crossings between them. Vertices corresponding to
oriented desire edges are colored black, all others white.
(graph OV(\pi^3) drawn on board)
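For readers without the board drawing, the overlap graph can be computed with
the following sketch. It assumes the usual point encoding of the breakpoint
graph (element x becomes 2|x|-1 and 2|x|, swapped if x is negative, framed by
0 and 2n+1), numbers the vertex for the desire edge joining 2k and 2k+1 as k,
and treats a desire edge as oriented if its two end positions have the same
parity, i.e. if k and k+1 carry opposite signs. The function name
overlap_graph is hypothetical, and the pairwise comparison takes O(n^2) time.

```python
def overlap_graph(perm):
    """Return (adj, black) for OV(perm); vertex k joins values 2k and 2k+1."""
    n = len(perm)
    points = [0]
    for x in perm:
        a = abs(x)
        points += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    points.append(2 * n + 1)
    pos = {p: i for i, p in enumerate(points)}
    # Each desire edge spans an interval of positions.
    arcs = [tuple(sorted((pos[2 * k], pos[2 * k + 1]))) for k in range(n + 1)]
    # Oriented (black) iff both ends sit at positions of equal parity.
    black = [(a + b) % 2 == 0 for a, b in arcs]
    adj = {k: set() for k in range(n + 1)}
    for u in range(n + 1):
        for v in range(u + 1, n + 1):
            (a, b), (c, d) = arcs[u], arcs[v]
            if a < c < b < d or c < a < d < b:   # the two arcs cross
                adj[u].add(v)
                adj[v].add(u)
    return adj, black


adj, black = overlap_graph([-2, -3, 1, 4, 6, 5, 7])   # pi^3
```

For \pi^3 this yields the connected components {0, 1, 2, 3}, {4, 5, 6} and the
isolated vertex 7; only the vertices 1 and 3 are black.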
Observation 4.1: (i) Isolated vertices correspond to adjacencies of the
permutation. (ii) A component of \pi is a connected component of OV(\pi).
What happens to OV(\pi) when we apply a reversal?
Recall the definition of a vertex-induced subgraph:
Definition 4.2 ((vertex-) induced subgraph): Given a set of vertices S \subseteq
V of a graph G = (V, E), the S-induced subgraph of G is the graph G' = (S, E'),
where E' = { (u, v) \in E | u, v \in S }.
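Definition 4.2 translates directly into code. In this sketch a graph is given
by a set of edges (u, v); the function name induced_subgraph and the edge-set
representation are assumptions of the example.

```python
def induced_subgraph(edges, subset):
    """Return (S, E') with E' = {(u, v) in E | u in S and v in S}."""
    s = set(subset)
    return s, {(u, v) for (u, v) in edges if u in s and v in s}


E = {(0, 1), (0, 3), (1, 2), (1, 3), (2, 3)}
S, E_sub = induced_subgraph(E, {0, 1, 3})
```

Only the edges with both endpoints in S survive, here (0, 1), (0, 3) and (1, 3).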
Definition 4.3 (local complementation): Let G_v be the subgraph induced by a
vertex v and its adjacent vertices (the neighborhood of v). /Local
complementation/ of v, denoted G/v, is the operation that complements (i) all
edges and (ii) the colors of all vertices of the subgraph G_v.
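A minimal sketch of local complementation, assuming the graph is stored as a
dictionary of adjacency sets and the colors as a list of booleans (True for
black). The name local_complement is hypothetical; the function returns a new
graph and leaves its arguments unchanged.

```python
from itertools import combinations


def local_complement(adj, black, v):
    """Return (adj', black') describing G/v."""
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    new_black = list(black)
    closed = sorted(adj[v] | {v})          # v together with its neighbors
    for a, b in combinations(closed, 2):   # (i) complement the edges of G_v
        new_adj[a] ^= {b}
        new_adj[b] ^= {a}
    for u in closed:                       # (ii) complement the colors of G_v
        new_black[u] = not new_black[u]
    return new_adj, new_black


# A 4-vertex example in which the vertices 1 and 3 are black.
adj = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
black = [False, True, False, True]
adj2, black2 = local_complement(adj, black, 1)
```

In the example the only edge left after complementing at vertex 1 is {0, 2},
the vertices 0 and 2 turn black, and 1 and 3 turn white. Since v is adjacent
to every other vertex of G_v, it always ends up isolated.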
Lemma 4.1: For a permutation \pi and an oriented vertex v of the overlap graph,
OV(\pi \circ \rho(v)) = OV(\pi)/v.
Theorem 4.1: If OV(\pi) has no unoriented component, then it has an oriented
vertex v such that OV(\pi)/v has no unoriented component.
The proof (see Bergeron et al.) makes use of the following definition and lemma:
Definition 4.4: The score of a reversal is the number of black (oriented)
vertices in the overlap graph of the resulting permutation.
The score of a reversal can be computed directly in the overlap graph: Let v be
an oriented vertex of OV(\pi). Local complementation turns v white and flips the
colors of all vertices adjacent to v, so the score of \rho(v) is given by
score(\rho(v)) = T + U(v) - O(v) - 1, where
T is the total number of oriented (black) vertices of OV(\pi) and
U(v), O(v) are the numbers of unoriented (white) and oriented (black) vertices
adjacent to v, respectively.
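The formula can be evaluated without performing the reversal. The sketch below
takes T to be the number of black vertices in the whole graph, which is what
makes the result agree with Definition 4.4. It assumes the graph is stored as a
dictionary of adjacency sets with a list of booleans for the colors (True for
black); the function name score is hypothetical.

```python
def score(adj, black, v):
    """Score T + U(v) - O(v) - 1 of the reversal rho(v) of a black vertex v."""
    total_black = sum(black)                          # T
    white_nbrs = sum(1 for u in adj[v] if not black[u])   # U(v)
    black_nbrs = len(adj[v]) - white_nbrs                 # O(v)
    return total_black + white_nbrs - black_nbrs - 1


# A 4-vertex example in which the vertices 1 and 3 are black.
adj = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
black = [False, True, False, True]
s1 = score(adj, black, 1)   # T = 2, U = 2, O = 1
s3 = score(adj, black, 3)   # T = 2, U = 2, O = 1
```

Both black vertices have score 2 + 2 - 1 - 1 = 2: after either reversal the two
formerly white neighbors are the only black vertices left.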
Lemma 4.2: A type I reversal of maximal score does not create new unoriented
components.
A reversal that does not create new unoriented components is called /safe/.
Theorem 4.2: If v is a safe vertex of the overlap graph of a permutation \pi,
then srd(\pi \circ \rho(v)) = srd(\pi) - 1.
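Lemma 4.1, Lemma 4.2 and Theorem 4.2 together suggest a greedy procedure for
step 2: repeatedly pick an oriented vertex of maximal score and locally
complement it. The sketch below works on the overlap graph only and makes no
attempt to reach the quadratic bound. It assumes a graph without unoriented
components, stored as a dictionary of adjacency sets with a list of booleans
for the colors (True for black); all function names are hypothetical.

```python
from itertools import combinations


def score(adj, black, v):
    """Number of black vertices after local complementation of v."""
    white_nbrs = sum(1 for u in adj[v] if not black[u])
    return sum(black) + white_nbrs - (len(adj[v]) - white_nbrs) - 1


def complement_in_place(adj, black, v):
    """Replace the graph by G/v: toggle edges and colors within G_v."""
    closed = sorted(adj[v] | {v})
    for a, b in combinations(closed, 2):
        adj[a] ^= {b}
        adj[b] ^= {a}
    for u in closed:
        black[u] = not black[u]


def greedy_order(adj, black):
    """Return the vertices chosen, in order, until no black vertex is left."""
    adj = {u: set(nbrs) for u, nbrs in adj.items()}   # work on copies
    black = list(black)
    order = []
    while any(black):
        candidates = [v for v in sorted(adj) if black[v]]
        v = max(candidates, key=lambda u: score(adj, black, u))
        complement_in_place(adj, black, v)            # v becomes isolated
        order.append(v)
    return order, adj, black


# Overlap graph of (-2 -3 1) under the usual point encoding (an assumption):
adj = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
black = [False, True, False, True]
order, final_adj, final_black = greedy_order(adj, black)
```

In the example two vertices are chosen (first 1, then 0), after which all
vertices are isolated and white, i.e. all desire edges are adjacencies. Each
chosen vertex stays isolated afterwards, so the loop runs at most n+1 times.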