Variation calling challenge

Content of the project

You are given a reference genome A and simulated short-read data D from genome B. Genome B is a variant of A with simulated SNPs and both short and large indels. The task is to reconstruct a sequence B' from A and D such that B' is as close to B as possible. Closeness is measured by the alignment score of B' and B. That alignment is obtained by projecting the alignment of B' to A, which you provide, onto an alignment of B' to B through the original alignment of A to B that was generated together with B. You may use any available tools to produce the sequence B' and its alignment to A.
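
As an illustration of the two deliverables (the sequence B' and its alignment to A), the following is a minimal sketch in Python. It assumes that variants have already been called from D with some tool and are available as hypothetical (position, reference allele, alternative allele) tuples with 0-based positions. The function name, the tuple layout, and the gapped two-row output are assumptions made for this example, not a required submission format.

```python
def apply_variants(reference, variants):
    """Apply called variants to reference A.

    variants: iterable of (pos, ref_allele, alt_allele) tuples with 0-based
    positions (a hypothetical layout chosen for this sketch); variants must
    not overlap.
    Returns (b_prime, row_a, row_b), where row_a and row_b are the gapped
    rows of a pairwise alignment of A to B'.
    """
    row_a, row_b = [], []
    cursor = 0  # next unprocessed position in the reference
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous variant")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at {pos}")
        # Copy the unchanged stretch before the variant into both rows.
        row_a.append(reference[cursor:pos])
        row_b.append(reference[cursor:pos])
        # Pad the shorter allele with gaps so both rows stay the same length.
        width = max(len(ref), len(alt))
        row_a.append(ref.ljust(width, "-"))
        row_b.append(alt.ljust(width, "-"))
        cursor = pos + len(ref)
    # Copy the remainder of the reference after the last variant.
    row_a.append(reference[cursor:])
    row_b.append(reference[cursor:])
    row_a, row_b = "".join(row_a), "".join(row_b)
    return row_b.replace("-", ""), row_a, row_b


# Toy example: one SNP, one 1-bp deletion, one 2-bp insertion.
b_prime, row_a, row_b = apply_variants(
    "ACGTACGT",
    [(1, "C", "T"), (3, "TA", "T"), (6, "G", "GCC")],
)
print(row_a)
print(row_b)
print(b_prime)
```

In this sketch a gap in the B' row marks a deletion relative to A, and a gap in the A row marks an insertion. Large indels may need a different representation in practice, depending on the tool used to call them.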


The dataset can be downloaded here.
The project lasts 4 weeks and it can be done in small teams.
Guidance is offered 2 hours per week.

The first meeting is on Monday 26.11. from 15 to 17 in U10-155.

