Bachelor and Master Thesis Topics within the AG Genome Informatics

The following is a (probably incomplete) list of thesis topics offered within the Genome Informatics group. If you are interested in working on one of these (or anything else in Genome Informatics), please contact the responsible group members or see Prof. Jens Stoye.

Improvement of Sequence-to-Graph Alignment (Bachelor)

Tizian Schulz (Please also refer to the project page: PLAST)

PLAST is a new heuristic method for finding maximum-scoring local alignments of a DNA query sequence to a pangenome represented as a compacted colored de Bruijn graph. The first version of the method has been published here, but there are various ideas for improving it, some of which are well suited for a Bachelor thesis. Contact Tizian for details.

(Runtime) Heuristic for the Fast Comparison of Genomes (Bachelor/Master)

Leonard Bohnenkämper (Please also refer to the GitLab pages: 1 / 2)

DING (publication) is an exact ILP-based solution to an NP-hard problem: comparing arbitrary genomes at a high level under the DCJ-Indel model. It is already very fast for small to medium-sized genomes. However, for some large or very complex genomes that occur in practice, DING is unable to compute a solution. There are some ideas for circumventing this problem using approximate or heuristic methods, which could be developed as a Bachelor thesis, a Master thesis, or a project module. Contact Leonard for details.

DCJ-Indels of Natural Genes (Bachelor)

Jens Stoye

We have developed the tool DING (DCJ-Indels of Natural Genomes). The same approach could also be applied to protein sequences, resulting in a new tool, DING (DCJ-Indels of Natural Genes).

Basic knowledge of algorithms and sequence analysis is required; knowledge of algorithms in comparative genomics is ideal.

Visualizing Phylogenetic Splits (Bachelor)

Roland Wittler (Please also refer to the project page: SANS)

SANS is an efficient method for alignment-free, whole-genome-based phylogeny estimation that follows a pangenomic approach to calculate a set of splits in a phylogenetic tree or network. SplitsTree is a tool for visualizing such split networks. In this project, the visualization of the SANS output should be simplified by replacing the individual textual genome labels with colored bullets that indicate phylogenetic subgroups or other properties (see example).

Horizontal Gene Transfer Detection (Master)

Tizian Schulz, Roland Wittler, or Leonard Bohnenkämper

Horizontal Gene Transfers (HGTs) are events that transfer genetic material from one lineage to another. HGTs are especially common in bacteria and particularly relevant for the spread of (antibiotic) resistance factors among microbes. SANS is an efficient method for the construction of phylogenies and, as a byproduct, allows finding candidate sequences that might have been part of an HGT (see also Section "Drosophila"). There are some ideas for automating the process of finding and verifying such HGT candidates, which can be developed into a Master thesis. You are also welcome to bring your own ideas!