Code Reading Club

“Nobody would try to become an author without being an active reader. Why should becoming a programmer be different?”

– freely adapted from raichoo.

Why attend this course?

“Code is read much more often than it is written.” “The ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code […] Making it easy to read makes it easier to write.” These are just two quotes to give you an idea of how important reading code and code readability are, especially in industry jobs (see our GitHub for sources). Also, any programmer who has worked on a project bigger than a few throwaway scripts knows the frustration of spending inordinate amounts of time trying to understand code that they or someone else wrote months ago, preferably to fix a bug right before an important deadline. We believe that, like any skill, reading code proficiently (and efficiently) can be trained through practice. A more than welcome side effect of reading other people's code is that you learn how they think and organize their software, and become a better programmer along the way. There are many open source repositories by great coders out there to learn from!

What we will do

In this course, we adapt the concept of a Journal Club to reading code. Instead of reading a new paper, we will find our way around a new code base every week. As a bioinformatics group, our focus is on bioinformatics tools and the languages C, C++ and Python. However, you are welcome to bring projects in other languages or from other fields if you want.

We will start the course by learning about how to get into an unfamiliar code base, after which each of you can choose a repository to read in depth and present in one of the following weeks. You can either bring your own or choose from a list we will provide.

Every session after that will work just like a “normal” journal club: in preparation for the session, everyone reads (parts of) the repository and tries to understand it, while the “presenter” reads it in more depth and tries to become more of an expert on it than the others. In the actual session, we will develop an understanding of the code base together and re-read those parts of the code that we did not quite understand or that are simply critical/interesting/cool.

Who this is for

If you're thinking of attending this course, you most likely have everything you need. If you have at least some knowledge of C(++) and Python and have done at least one bigger project in any language (experience from “Grundlagen des Softwareengineering” or equivalent is probably enough here), you should be fine :)

This course is for motivated beginners as well as more experienced programmers who just want to start reading more code instead of telling themselves that they will get to it eventually. Incidentally, this is also how the idea for this course emerged.

Organizational Stuff

GitHub: https://github.com/lucaparmigiani/Coding-Reading-Club

Schedule:

Date	Topic
04.04.	Organizational Matters
~~11.04.~~	(Leonard on conference)
~~18.04.~~	(Leonard on conference)
~~25.04.~~	(Leonard sick)
02.05.	Topic Selection/Dabbling in edlib
09.05.	Myers 1986/edlib
16.05.	Shahriari et al. 2015
23.05.	Bayesian Optimization
30.05.	Solomon & Kingsford 2016 / (Sequence Bloom Tree)
06.06.	Sequence Bloom Tree teaser
13.06.	Sequence Bloom Tree
20.06.	Rempel & Wittler 2021
27.06.	SANS
04.07.	KNN by chingisooinar and sklearn - KNN
11.07.	No session planned

Topics

Tool	Repository	Paper(s)	Presenting
edlib	https://github.com/Martinsos/edlib	Myers 1999	Luca & Leonard
sans-kc	https://gitlab.ub.uni-bielefeld.de/gi/sans/tree/kc	Rempel & Wittler 2021	Talgat
prodigal	https://github.com/hyattpd/Prodigal
columba	https://github.com/biointec/columba
gsa-is	https://github.com/felipelouza/gsa-is.git	TBA	Presian
wg-sim	https://github.com/lh3/wgsim
Bayesian Optimization	https://github.com/fmfn/BayesianOptimization	Shahriari et al. 2015	Adrian
Sequence Bloom Tree	https://github.com/Kingsford-Group/bloomtree	Solomon & Kingsford 2016	Andreas
sklearn - KNN	https://github.com/scikit-learn/scikit-learn/blob/364c77e04/sklearn/neighbors/_classification.py#L24	https://github.com/chingisooinar/KNN-python-implementation/tree/main	Ben

Genome Informatics