

Bachelor and Master Thesis Topics within the AG Genome Informatics


The following is a (probably incomplete) list of thesis topics offered within the Genome Informatics group. If you are interested in working on one of these (or anything else in Genome Informatics), please contact the responsible group members or see Prof. Jens Stoye.

Medical Bioinformatics Projects in Australia (Master)

Jens Stoye, Lutz Krause (University of Queensland Diamantina Institute, Brisbane, Australia)

In collaboration with the Computational Clinical Genomics Group at the University of Queensland Diamantina Institute, Brisbane, Australia, we develop and apply bioinformatics and data-mining methods in the context of biomedical research. Student projects are available in the following areas: biomarker discovery for cancer, genome-wide epigenetic association studies, analysis of next-generation sequencing data, mining and visualizing the human microbiota, evolutionary genomics of human parasites, calling structural variants from next-generation sequencing data, and genome-wide association studies (GWAS).

Alignment-free Phylogenomics with Copy Number Variations (Bachelor)

Andreas Rempel, Roland Wittler (Please also refer to the project page: SANS serif)

SANS serif is a software tool that reconstructs a phylogenetic network from a set of input genomes. For this purpose, the current implementation evaluates the presence or absence of k-mers in the DNA sequences. However, this method cannot detect copy number variations; doing so would require counting and comparing the number of occurrences of each k-mer in each input sequence. The aim of this project is to implement this extended method efficiently and to evaluate whether it improves the quality of the resulting phylogenetic network.
Good knowledge of C++ is highly recommended.
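
To illustrate the difference between presence/absence and occurrence counts, the following is a minimal sketch and not the SANS serif implementation. The function names count_kmers and count_profiles are hypothetical, k-mers are stored as plain strings for readability, and reverse complements are ignored. An efficient solution, as asked for in this project, would need a more compact representation.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: count how often each k-mer occurs in one sequence.
std::map<std::string, int> count_kmers(const std::string& seq, std::size_t k) {
    std::map<std::string, int> counts;
    if (k == 0 || seq.size() < k) return counts;
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        ++counts[seq.substr(i, k)];
    }
    return counts;
}

// Hypothetical helper: for every k-mer, build a vector holding its number of
// occurrences in each input genome (0 if the k-mer is absent from a genome).
std::map<std::string, std::vector<int>> count_profiles(
        const std::vector<std::string>& genomes, std::size_t k) {
    std::map<std::string, std::vector<int>> profiles;
    for (std::size_t g = 0; g < genomes.size(); ++g) {
        for (const auto& entry : count_kmers(genomes[g], k)) {
            std::vector<int>& profile = profiles[entry.first];
            profile.resize(genomes.size(), 0);
            profile[g] = entry.second;
        }
    }
    return profiles;
}
```

A presence/absence approach would treat a k-mer with counts (2, 1) the same as one with counts (1, 1); the count vectors keep this difference, which is the signal a copy-number-aware method could exploit.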

Alignment-free Phylogenomics using Amino Acid Sequences (Bachelor)

Andreas Rempel, Roland Wittler (Please also refer to the project page: SANS serif)

SANS serif is a software tool that reconstructs a phylogenetic network from a set of input genomes. For this purpose, the current implementation evaluates the presence or absence of k-mers in the DNA sequences. However, for annotated genomes, or to explore the evolutionary relationship of individual genes, it might be beneficial to perform the study at the level of amino acid sequences. The aim of this project is to implement this extended method efficiently and to evaluate whether it improves the quality of the resulting phylogenetic network.
Good knowledge of C++ is highly recommended.
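
As a rough illustration of working at the amino acid level, the sketch below translates a coding DNA sequence and collects the k-mers of the resulting protein. It is not taken from SANS serif: the function names base_index, translate and amino_kmers are hypothetical, and the code assumes the standard genetic code, upper-case input, the forward strand, and a reading frame starting at position 0. Codons containing other characters are translated to 'X'.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Map a nucleotide to its index in the order T, C, A, G; -1 for anything else.
int base_index(char b) {
    switch (b) {
        case 'T': return 0;
        case 'C': return 1;
        case 'A': return 2;
        case 'G': return 3;
        default:  return -1;
    }
}

// Translate a coding DNA sequence with the standard genetic code.
// The table lists the amino acids for all 64 codons in TCAG order;
// '*' marks a stop codon. Trailing bases that do not fill a codon are dropped.
std::string translate(const std::string& dna) {
    static const std::string code =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
    std::string protein;
    for (std::size_t i = 0; i + 3 <= dna.size(); i += 3) {
        const int a = base_index(dna[i]);
        const int b = base_index(dna[i + 1]);
        const int c = base_index(dna[i + 2]);
        if (a < 0 || b < 0 || c < 0) {
            protein += 'X';  // unknown or ambiguous codon
        } else {
            protein += code[16 * a + 4 * b + c];
        }
    }
    return protein;
}

// Collect the set of distinct k-mers of an amino acid sequence.
std::set<std::string> amino_kmers(const std::string& protein, std::size_t k) {
    std::set<std::string> kmers;
    if (k == 0) return kmers;
    for (std::size_t i = 0; i + k <= protein.size(); ++i) {
        kmers.insert(protein.substr(i, k));
    }
    return kmers;
}
```

Because synonymous codons such as GCT and GCC translate to the same amino acid, two sequences that differ only by silent substitutions share all amino acid k-mers, which is one reason the protein level may be more robust for distantly related genes.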