>>> RECALL
2.2 A 2-approximation algorithm for the reversal distance (see Kececioglu and Sankoff, 1992)

Definition 2.2: For a permutation π = (π_1 π_2 ··· π_n), two consecutive elements π_i and π_{i+1} form a /breakpoint/ (BP) if |π_i − π_{i+1}| > 1, and an /adjacency/ (ADJ) otherwise.

Ex.: in π^2, (1 5) and (5 3) are breakpoints ... but so are (2) and (4)!
  -> add 0 and n+1 to the beginning and end of the permutation.

Observation 2.1: There are at most n+1 breakpoints, and the only permutation without breakpoints is the identity.
<<<

Definition 2.2 (cont'd): The number of breakpoints in a permutation \pi is denoted by b(\pi).

Idea: Apply a reversal that reduces b(.) in every step.

Ex.:
π^4  = (2 3 1 4 6 5)
π^4' = (0|2 3|1|4|6 5|7)   # b(.) = 5
                 ^---^
       (0|2 3|1|4 5 6 7)   # b(.) = 3
         ^-----^
       (0 1|3 2|4 5 6 7)   # b(.) = 2
           ^---^
       (0 1 2 3 4 5 6 7)   # b(.) = 0

Observation 2.2: An algorithm that reduces b(.) by at least one in every step is a 2-approximation.

Proof: Any reversal can eliminate at most 2 breakpoints (one at the left end and one at the right end), therefore OPT(\pi) >= b(\pi)/2. Thus, r = A(\pi)/OPT(\pi) <= 2 if A(\pi) <= b(\pi).

However, it is not always possible to reduce b(.):

π^5 = (0|4 5 6|1 2 3|7)

Any permutation can be partitioned into increasing strips (overlined, written ^...^ here) and decreasing strips (underlined, written _..._ here). A single-element strip is increasing if its element is 0 or n+1, and decreasing otherwise.

π^6 = (^0^ _2 1_ ^3 4 5^ ^7 8^ _6_ ^9^)

Lemma 2.2: If there is at least one decreasing strip, there is a reversal that reduces the number of breakpoints.

Proof: Consider the smallest element k in all decreasing strips. The element k − 1 must be in an increasing strip, and it is the last element of that strip. Both k − 1 and k are therefore followed by a breakpoint, and reversing the segment between these two breakpoints makes k − 1 and k adjacent.

All the strips in permutation π^5 are increasing. What can we do to guarantee that we can decrease the number of breakpoints in such a case?

A reversal of an increasing strip (b(.) does not change) produces a decreasing strip.

(^0^ _6 5 4_ ^1 2 3^ ^7^)

Theorem 2.1: Let \pi be a permutation with a decreasing strip.
If every reversal that reduces b(\pi) leaves a permutation with no decreasing strips, then \pi admits a reversal that reduces b(\pi) by two. (Exercise: Prove!)

Algorithm 2.2 (greedy):
  Input: \pi
  Output: number of reversals d (an upper bound on rd(\pi)), sorting scenario (\rho_1, \rho_2, ..., \rho_d)

  d <- 0
  while \pi contains a breakpoint do
    d <- d + 1
    Let \rho_d be a reversal that removes the most breakpoints of \pi,
      resolving ties among those that remove one breakpoint in favor of
      reversals that leave a decreasing strip
    \pi <- \pi \circ \rho_d
  end
  return d, (\rho_1, \rho_2, ..., \rho_d)

Lemma 2.3: Algorithm 2.2 sorts every permutation \pi in at most b(\pi) reversals.

Best known approximation ratio: 11/8 (Berman, Hannenhalli, and Karpinski, 2002)

3. The signed reversal distance

Definition 3.1: A signed permutation is a permutation on the set {1, ..., n} in which every element has an orientation, indicated by a sign "+" or "-". To simplify, the "+" is usually omitted.

Example: \pi^1 = (-2 -1 4 3 5 -8 6 7 9)

Application in genome rearrangement studies: A chromosome is a DNA molecule composed of two antiparallel strands, each of which is read in its own direction. A /gene/ is associated with an interval on a DNA strand and has a /reading direction/ (5'-to-3' or left-to-right, by convention).

(draw genome above in arrow notation)

Definition 3.1 (cont'd): By convention, a permutation of size n representing a chromosomal sequence with n genes is bordered by 0 and n+1.

Definition 3.2: In a signed permutation, a pair of consecutive elements of the form i·(i+1) or -(i+1)·-i is called an /adjacency/ (ADJ); any other pair is a /breakpoint/ (BP).

Definition 3.3: A /reversal/ of an interval in a signed permutation reverses the order and flips the sign of all elements of the interval.

Let's sort this signed permutation:

  \pi^2                 = (0 -3 -4  1 -5 -2  6)
  \pi^2 \circ \rho(5,6) = (0 -3 -4  1  2  5  6)
     .. \circ \rho(3,5) = (0 -3 -2 -1  4  5  6)
     .. \circ \rho(2,4) = (0  1  2  3  4  5  6)

Problem 3.1 ("Reversal Distance"): Given two signed permutations \pi and \sigma, find srd(\pi, \sigma), the minimum number of reversals needed to transform \pi into \sigma (again, we assume that \sigma is always the identity and use the abbreviated notation srd(\pi) := srd(\pi, id)).

First linear-time algorithm solving Problem 3.1: Bader, Moret, and Yan (2001)

-> srd(\pi^2) = 3 (optimal, thus a solution to Problem 3.1)

3.1 A tight lower bound for srd(\pi)

Definition 3.4: The /breakpoint graph/ of a signed permutation \pi is the graph BG(\pi) = (V, E), whose vertex set V contains, for 1 \leq g \leq n, two vertices g^t and g^h called the /tail/ and the /head/ of gene g, plus two vertices 0^h and (n+1)^t. The edge set E is the union of two perfect matchings R and D of V:
- "reality edges" R: for 0 \leq i \leq n, R contains an edge from |\pi_i|^h if \pi_i is non-negative, and from |\pi_i|^t otherwise, to |\pi_{i+1}|^t if \pi_{i+1} is non-negative, and to |\pi_{i+1}|^h otherwise (with \pi_0 = 0 and \pi_{n+1} = n+1).
- "desire edges" D := { {g^h, (g+1)^t} | 0 \leq g \leq n } (adjacencies of the identity)

--> Question: what does BG(id) look like?

(BG(\pi^2) drawn, using two different colors for the two matchings R and D)
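
Definitions 3.3 and 3.4 can be checked on \pi^2 with a small Python sketch. It is only an illustration and not part of the lecture material: the function names signed_reversal, reality_edges and desire_edges and the (gene, 'h'/'t') vertex encoding are assumptions made here. Positions are counted 1-based in the extended permutation (so the leading 0 is position 1), which is the convention that reproduces the \rho(5,6), \rho(3,5), \rho(2,4) scenario.

```python
def signed_reversal(perm, i, j):
    """Reverse the order and flip the signs of positions i..j (1-based, inclusive)."""
    return perm[:i - 1] + [-x for x in reversed(perm[i - 1:j])] + perm[j:]


def reality_edges(perm):
    """Matching R of an extended signed permutation (starts with 0, ends with n+1).

    Vertices are encoded as (gene, 'h') or (gene, 't'); this encoding is an
    assumption of this sketch.
    """
    edges = set()
    for a, b in zip(perm, perm[1:]):
        # Leave a non-negative element at its head, a negative one at its tail.
        u = (abs(a), 'h' if a >= 0 else 't')
        # Enter a non-negative element at its tail, a negative one at its head.
        v = (abs(b), 't' if b >= 0 else 'h')
        edges.add(frozenset((u, v)))
    return edges


def desire_edges(n):
    """Matching D: the adjacencies of the identity on 0..n+1."""
    return {frozenset(((g, 'h'), (g + 1, 't'))) for g in range(n + 1)}


pi2 = [0, -3, -4, 1, -5, -2, 6]
n = len(pi2) - 2

# Replay the sorting scenario from the example.
step = pi2
for i, j in [(5, 6), (3, 5), (2, 4)]:
    step = signed_reversal(step, i, j)
    print(step)

# Reality edges that are not desire edges correspond to breakpoints.
print(len(reality_edges(pi2) - desire_edges(n)))
```

A pair of consecutive elements is an adjacency in the sense of Definition 3.2 exactly when its reality edge is also a desire edge, so the size of R \ D counts the breakpoints; for \pi^2 all six reality edges are breakpoints, and after the three reversals R and D coincide.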