Bielefeld University - AG Genome Informatics

Haplotyping Oct. 22nd, 2007, Paul Medvedev

Given a set of haplotypes from a population that has evolved from one original individual, how many recombination events must have occurred to produce those haplotypes? The Myers and Griffiths 2003 paper is often cited as giving the first non-trivial lower bound on this number. I recommend reading the first 7 pages, up until "Example." This is the work of statisticians/geneticists doing combinatorics, so even though the results take up some space, they are really quite simple (once you understand them, of course :) ). They give two new bounds, the history bound and the haplotype bound. The history bound is never weaker than the haplotype bound, but it takes much longer to compute.
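To give a feel for the haplotype bound before the talk, here is a minimal Python sketch of the counting argument as I understand it. It assumes the infinite-sites model (each site mutates at most once) and binary haplotypes given as equal-length strings. With K distinct haplotypes on S distinct polymorphic sites, each mutation and each recombination can create at most one new haplotype, so at least K - S - 1 recombinations are needed. The names haplotype_bound and best_haplotype_bound, and the symbols K and S, are my own and not taken from the papers. The brute-force search over site subsets is only for illustration; it is exponential and is not the faster algorithm of Song, Wu, and Gusfield.

```python
from itertools import combinations

def haplotype_bound(rows, sites=None):
    """Return max(0, K - S - 1) for the haplotypes restricted to `sites`.

    rows  : list of equal-length strings over {'0', '1'} (assumed input format)
    sites : iterable of column indices; defaults to all columns
    """
    if sites is None:
        sites = range(len(rows[0]))
    # Collapse identical columns and drop columns where everyone agrees.
    columns = {tuple(r[j] for r in rows) for j in sites}
    columns = sorted(c for c in columns if len(set(c)) > 1)
    if not columns:
        return 0
    distinct_rows = set(zip(*columns))   # K: distinct haplotypes on these sites
    k, s = len(distinct_rows), len(columns)
    return max(0, k - s - 1)

def best_haplotype_bound(rows):
    """Brute force: best bound over all non-empty subsets of sites (exponential)."""
    m = len(rows[0])
    return max(
        haplotype_bound(rows, subset)
        for size in range(1, m + 1)
        for subset in combinations(range(m), size)
    )

if __name__ == "__main__":
    four_gametes = ["00", "01", "10", "11"]
    print(haplotype_bound(four_gametes))       # all four gametes present

    data = ["000", "011", "101", "110"]
    print(haplotype_bound(data))               # bound using all three sites
    print(best_haplotype_bound(data))          # bound after searching subsets
```

The second data set shows why the choice of sites matters: on all three sites the count K - S - 1 gives nothing, while restricting to two of the sites exposes all four gametes and yields a bound of one. Searching over subsets is the expensive part.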

A paper by Song, Wu, and Gusfield at ISMB 2005 improves on some of the results in the above work. Their main contribution (in my opinion) is a much faster algorithm to compute the haplotype bound. This is contained in section 3.0 (less than one page). I recommend that you at least skim that section and read the introduction (sec 1.0 and 1.1). They do a good job of summarizing a lot of the results from Myers and Griffiths in the combinatorial style that we are used to as computer scientists. If time permits, you can read the whole paper. On Monday, I will go through the combinatorial details of the algorithms presented in these two papers.