Algorithmen in der Bioinformatik (S)

392219 Wittler Summer 2022 Wednesday 10:15-11:45 U10-146

(Depending on the wishes and needs of the students, this seminar will be held in German or English.)

Based on original research papers, the participants will give oral presentations (20-45 min) and write short summaries (5-10 pages) about algorithmic problems in bioinformatics and their solutions. Talks and essays can be given and written in German or English. The first session gives an overview of possible topics, which will then be distributed among the students. Aspects of scientific writing and presenting will be covered as well.

The overarching topic of this semester is k-mers (a.k.a. q-grams), i.e., substrings of a fixed length k. This simple concept forms the basis of many algorithmic solutions in bioinformatics, such as assembly, alignment, genome comparison, and pangenomics.
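As a small illustration of the k-mer concept, the following Python sketch counts all overlapping substrings of length k in a sequence. The function name `count_kmers` and the example sequence are chosen here for illustration only; the sketch is not taken from any of the papers listed below, and the dedicated k-mer counters discussed in the seminar use far more memory-efficient techniques.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping length-k substring (k-mer) of `sequence`."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence of length n contains n - k + 1 overlapping k-mers;
    # if n < k the range is empty and the counter stays empty.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    for kmer, count in sorted(count_kmers("ACGTACGA", 3).items()):
        print(kmer, count)
```

A plain dictionary-based counter like this one keeps every distinct k-mer in memory, which is acceptable for toy inputs but not for whole genomes.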

Possible concrete methods/publications to be presented/discussed in the seminar are:


De Bruijn Graphs

  • Holley, Guillaume, and Páll Melsted. “Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs.” Genome biology 21.1 (2020): 1-20.
  • Luhmann, Nina, Guillaume Holley, and Mark Achtman. “BlastFrost: fast querying of 100,000s of bacterial genomes in Bifrost graphs.” Genome biology 22.1 (2021): 1-15.
  • Ekim, Barış, Bonnie Berger, and Rayan Chikhi. “Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer.” Cell systems 12.10 (2021): 958-968.


Counting k-mers


06.04. Organization, topic selection
13.04. Scientific reading and writing, introduction to LaTeX. Materials: Slides, HowToRead, Notes on scientific writing and citing, LaTeX template, BibTeX example (remove “.pdf” from filenames), Bibliography styles
27.04. Introductions, conclusions, math. Materials: Notes 1, Notes 2, extended LaTeX template (remove “.pdf” from filename)
11.05. Tables, figures, algorithms. Materials: extended LaTeX template (remove “.pdf” from filename)
18.05. Checklist on writing, intro to presentations (in LaTeX Beamer). Materials: checklist, LaTeX Beamer template (remove “.pdf” from filename)
01.06. E.S.: “Velvet: algorithms for de novo short read assembly using de Bruijn graphs.” Zerbino, Daniel R., and Ewan Birney. Genome research 18.5 (2008): 821-829.
I.C.: “Space-efficient and exact de Bruijn graph representation based on a Bloom filter. (Minia)” Chikhi, Rayan, and Guillaume Rizk. Algorithms for Molecular Biology 8.1 (2013): 1-9.
08.06. V.K.: “Efficient q-gram filters for finding all ε-matches over a given length. (SWIFT)” Rasmussen, Kim R., Jens Stoye, and Eugene W. Myers. Journal of Computational Biology 13.2 (2006): 296-308.
P.H.: “Mash: fast genome and metagenome distance estimation using MinHash.” Ondov, Brian D., et al. Genome biology 17.1 (2016): 1-14.
22.06. K.B.: “A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. (Jellyfish)” Marçais, Guillaume, and Carl Kingsford. Bioinformatics 27.6 (2011): 764-770.
J.G.: “KMC 3: counting and manipulating k-mer statistics.” (and previous versions) Kokot, Marek, Maciej Długosz, and Sebastian Deorowicz. Bioinformatics 33.17 (2017): 2759-2761.
29.06. J.S.: “Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.” Li, Heng. Bioinformatics 32.14 (2016): 2103-2110.
L.B.: “Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer.” Ekim, Barış, Bonnie Berger, and Rayan Chikhi. Cell systems 12.10 (2021): 958-968.