392217 | Wittler | Winter 2017 | Tuesday 14-16 in U10-146 | ekvv |
This semester, everybody is invited to give a tutorial on a topic of their choice, e.g., a method they are working with.
Date | Name | Topic | Details/Materials |
---|---|---|---|
10.10. | Administrative matters | ||
17.10. | |||
24.10. | |||
31.10. | – exceptional national holiday – | ||
07.11. | Omar | Introduction to Systems Biology: Dynamic mathematical modelling (with ODEs) | Requirements: Python with Scipy represilator.py.zip |
14.11. | Georges | Introduction to color and color usage | Short presentation about color, terminology, tips, and useful links. slides |
21.11. | |||
28.11. | Linda | Group Games - For Youth Groups or Your Next Flat Party | just be open to play some funny/silly games |
05.12. | Lukas | Data Science in Python - Crash Course | Python with Jupyter notebook, pandas, scikit-learn and more; the Anaconda distribution of Python is highly recommended. GitHub |
12.12. | Roland | Basics in pictorial design with application in polarization photography | results in /vol/didy/Pictures/20171212_Polarization |
19.12. | Robert | Short introduction to analytic combinatorics (Christmas edition) | Mathematica or Wolfram Programming Lab (for hands-on experience, fully optional). slides, notebooks |
09.01. | Michel T. | TikZ | You'll find the latest version of the manual here. |
16.01. | Markus | Merits and Pitfalls of Clustering | |
23.01. | Guillaume | Bifrost: Highly Parallel and Memory Efficient Compacted de Bruijn Graph Construction | (abstract below) |
30.01. | Karsten | Brewing 101 | Basic introduction into brewing of beer. Tasting included if some beer is ready. |
De Bruijn graphs are the core data structure of a large number of whole-genome and transcriptome assemblers that process high-throughput sequencing datasets. However, the memory consumption of such assemblers is often prohibitive because of the large number of vertices and edges in the graph, to the point of hindering their use on large and complex genomes. Most short-read assemblers based on the de Bruijn graph paradigm reduce assembly complexity and memory usage by first compacting all maximal non-branching paths of the graph into single vertices. Yet such a compaction is challenging, as it requires the uncompacted de Bruijn graph to be available in memory. We present a new parallel and memory-efficient algorithm that constructs the compacted de Bruijn graph directly, without producing the intermediate uncompacted de Bruijn graph. Our method relies on a space- and time-efficient data structure, the Bloom filter, enhanced with minimizer hashing to increase cache performance. Despite making extensive use of a probabilistic data structure, our algorithm guarantees that the produced compacted de Bruijn graph is deterministic. Furthermore, the algorithm features de Bruijn graph simplification steps used by assemblers, such as tip clipping and isolated unitig removal. In addition, because the performance of disk-based software is significantly affected by differences in speed between disk storage technologies, our method uses only main memory storage. Experimental results show that our algorithm is competitive with state-of-the-art de Bruijn graph compaction methods.
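
To make the notion of compaction in the abstract concrete, here is a minimal Python sketch that merges every maximal non-branching path of a small de Bruijn graph into one unitig. It is not the Bifrost algorithm: it keeps the full uncompacted k-mer set in an ordinary Python set (exactly the memory cost the abstract wants to avoid), uses no Bloom filter or minimizers, and ignores reverse complements. The function names `kmers_of` and `compact_unitigs` are made up for this illustration.

```python
ALPHABET = "ACGT"


def kmers_of(reads, k):
    """Collect the set of all k-mers occurring in the reads (forward strand only)."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def successors(kmer, kmers):
    """k-mers in the set that overlap the (k-1)-suffix of `kmer`."""
    return [kmer[1:] + c for c in ALPHABET if kmer[1:] + c in kmers]


def predecessors(kmer, kmers):
    """k-mers in the set that overlap the (k-1)-prefix of `kmer`."""
    return [c + kmer[:-1] for c in ALPHABET if c + kmer[:-1] in kmers]


def extend(start, kmers, visited):
    """Walk forward from `start` while the path stays non-branching."""
    path, cur = start, start
    visited.add(start)
    while True:
        succs = successors(cur, kmers)
        if len(succs) != 1:
            break  # dead end or branch going out
        nxt = succs[0]
        if nxt in visited or len(predecessors(nxt, kmers)) != 1:
            break  # closed a cycle, or branch coming in
        path += nxt[-1]
        visited.add(nxt)
        cur = nxt
    return path


def compact_unitigs(kmers):
    """Return the sorted unitigs (maximal non-branching paths) of the k-mer set."""
    visited, unitigs = set(), []

    def is_start(kmer):
        preds = predecessors(kmer, kmers)
        # A unitig starts where it cannot be extended backwards unambiguously.
        return len(preds) != 1 or len(successors(preds[0], kmers)) != 1

    for kmer in sorted(kmers):
        if kmer not in visited and is_start(kmer):
            unitigs.append(extend(kmer, kmers, visited))
    # Whatever is left lies on isolated cycles, which have no start vertex.
    for kmer in sorted(kmers):
        if kmer not in visited:
            unitigs.append(extend(kmer, kmers, visited))
    return sorted(unitigs)


if __name__ == "__main__":
    # Two toy reads that share the prefix ACGT and then diverge.
    print(compact_unitigs(kmers_of(["ACGTA", "ACGTT"], 3)))
    # → ['ACGT', 'GTA', 'GTT']
```

The shared prefix collapses into the single unitig `ACGT`, while the two branches after `CGT` stay separate vertices. Because the sketch has to hold every k-mer before compacting, it also shows why constructing the compacted graph directly is attractive for large genomes.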