Differences

This shows you the differences between two versions of the page.

teaching:alggrliterature [2020/11/12 21:48]
jstoye
teaching:alggrliterature [2022/11/21 09:57]
jstoye [Genome assembly IIb: Hybrid/long read assembly]
Line 35: Line 35:
  
 ==== Genome assembly Ib: Re-sequencing,​ comparative (reference-based) assembly ==== ==== Genome assembly Ib: Re-sequencing,​ comparative (reference-based) assembly ====
-A good introduction to comparative genome assembly is [1]. The main algorithmic challenge is to map millions of (mostly very short) sequence reads onto one or more reference genome(s). Suitable mapping algorithms for this task are [[http://bibiserv.cebitec.uni-bielefeld.de/swift/|SWIFT]] [2], [[http://bowtie-bio.sourceforge.net/index.shtml|Bowtie]] [6], ELAND (Cox, unpublished), [[http://maq.sourceforge.net/|MAQ]] [3], [[http://rulai.cshl.edu/rmap/|RMAP]], [[http://soap.genomics.org.cn/|SOAP]] [4], [[http://compbio.cs.toronto.edu/shrimp/|SHRiMP]], SeqMap [5], TAGGER [7], ZOOM [8], [[http://bio-bwa.sourceforge.net/bwa.shtml|BWA]] [9], GSNAP [10], SARUMAN [11], SSAHA2 [12] etc. Methods especially suited for mapping SOLiD reads are presented in [13,14]. +A good introduction to comparative genome assembly is [1]. The main algorithmic challenge is to map millions of (mostly very short) sequence reads onto one or more reference genome(s). Suitable mapping algorithms for this task are [[http://bibiserv.cebitec.uni-bielefeld.de/swift/|SWIFT]] [2], [[http://bowtie-bio.sourceforge.net/index.shtml|Bowtie]] [6], ELAND (Cox, unpublished), [[http://maq.sourceforge.net/|MAQ]] [3], [[http://rulai.cshl.edu/rmap/|RMAP]], [[http://soap.genomics.org.cn/|SOAP]] [4], [[http://compbio.cs.toronto.edu/shrimp/|SHRiMP]], SeqMap [5], TAGGER [7], ZOOM [8], [[http://bio-bwa.sourceforge.net/bwa.shtml|BWA]] [9], GSNAP [10], SARUMAN [11], SSAHA2 [12], NextGenMap [13], etc.
  
   - M. Pop, A. Phillippy, A. L. Delcher, and S. L. Salzberg. [[https://​doi.org/​10.1093/​bib/​5.3.237|Comparative genome assembly]]. //Briefings in Bioinformatics//​ **5**(3):​237-248,​ 2004.    - M. Pop, A. Phillippy, A. L. Delcher, and S. L. Salzberg. [[https://​doi.org/​10.1093/​bib/​5.3.237|Comparative genome assembly]]. //Briefings in Bioinformatics//​ **5**(3):​237-248,​ 2004. 
Line 49: Line 49:
   - J. Blom, T. Jakobi, D. Doppmeier, S. Jaenicke, J. Kalinowski, J. Stoye, A. Goesmann. [[https://​doi.org/​10.1093/​bioinformatics/​btr151|Exact and complete short read alignment to microbial genomes using GPU programming]]. //​Bioinformatics//​ **27**(10): 1351-1358, 2011.    - J. Blom, T. Jakobi, D. Doppmeier, S. Jaenicke, J. Kalinowski, J. Stoye, A. Goesmann. [[https://​doi.org/​10.1093/​bioinformatics/​btr151|Exact and complete short read alignment to microbial genomes using GPU programming]]. //​Bioinformatics//​ **27**(10): 1351-1358, 2011. 
   - Z. Ning, A.J. Cox. [[https://​doi.org/​10.1101/​gr.194201|SSAHA:​ A Fast Search Method for Large DNA Databases]]. //Genome Res.// **11**(10): 1725-1729, 2001.    - Z. Ning, A.J. Cox. [[https://​doi.org/​10.1101/​gr.194201|SSAHA:​ A Fast Search Method for Large DNA Databases]]. //Genome Res.// **11**(10): 1725-1729, 2001. 
-  - L. Noé, M. Gîrdea, G. Kucherov. [[https://doi.org/10.1007/978-3-642-12683-3_25|Seed Design Framework for Mapping SOLiD Reads]]. Proceedings of RECOMB 2010, LNBI 6044, 384-396, 2010. +  - F. J. Sedlazeck, P. Rescheneder, A. von Haeseler. [[https://doi.org/10.1093/bioinformatics/btt468|NextGenMap: fast and accurate read mapping in highly polymorphic genomes]]. //Bioinformatics// **29**(21): 2790-2791, 2013.
-  - M. Csűrös, Sz. Juhos, A. Bérces. [[https://doi.org/10.1007/978-3-642-15294-8_15|Fast Mapping and Precise Alignment of AB SOLiD Color Reads to Reference DNA]]. Proceedings of WABI 2010, LNBI 6293, 176-188, 2010. +
   - L. Oesper, A. Ritz, S. J. Aerni, R. Drebin, B. J. Raphael. [[https://​doi.org/​10.1186/​1471-2105-13-S6-S10|Reconstructing cancer genomes from paired-end sequencing data]]. //BMC Bioinformatics//​ **13**(Suppl. 6):S10, 2012.    - L. Oesper, A. Ritz, S. J. Aerni, R. Drebin, B. J. Raphael. [[https://​doi.org/​10.1186/​1471-2105-13-S6-S10|Reconstructing cancer genomes from paired-end sequencing data]]. //BMC Bioinformatics//​ **13**(Suppl. 6):S10, 2012. 
  
Line 74: Line 73:
   - C.-S. Chin, D. H. Alexander, P. Marks, A. A. Klammer, J. Drake, C. Heiner, A. Clum, A. Copeland, J. Huddleston, E. E. Eichler, S. W. Turner, J. Korlach. [[https://​doi.org/​10.1038/​nmeth.2474|Nonhybrid,​ finished microbial genome assemblies from long-read SMRT sequencing data]]. //Nature Methods// **10**:​563-569,​ 2013.   - C.-S. Chin, D. H. Alexander, P. Marks, A. A. Klammer, J. Drake, C. Heiner, A. Clum, A. Copeland, J. Huddleston, E. E. Eichler, S. W. Turner, J. Korlach. [[https://​doi.org/​10.1038/​nmeth.2474|Nonhybrid,​ finished microbial genome assemblies from long-read SMRT sequencing data]]. //Nature Methods// **10**:​563-569,​ 2013.
   - G. Myers. [[https://​doi.org/​10.1007/​978-3-662-44753-6_5|Efficient Local Alignment Discovery amongst Noisy Long Reads]]. //​Proceedings of WABI 2014//, LNBI 8701, 52-67, 2014.   - G. Myers. [[https://​doi.org/​10.1007/​978-3-662-44753-6_5|Efficient Local Alignment Discovery amongst Noisy Long Reads]]. //​Proceedings of WABI 2014//, LNBI 8701, 52-67, 2014.
 +  - F. J. Sedlazeck, P. Rescheneder,​ M. Smolka, H. Fang, M. Nattestad, A. von Haeseler, M. C. Schatz. [[https://​doi.org/​10.1038/​s41592-018-0001-7|Accurate detection of complex structural variations using single molecule sequencing]]. //Nat. Methods// **15**(6): 461–468, 2018.
   - E. Haghshenas, H. Asghari, J. Stoye, C. Chauve, F. Hach. [[https://​doi.org/​10.1016/​j.isci.2020.101389|HASLR:​ Fast Hybrid Assembly of Long Reads]]. //​iScience//​ **23**(8): 101389, 2020.   - E. Haghshenas, H. Asghari, J. Stoye, C. Chauve, F. Hach. [[https://​doi.org/​10.1016/​j.isci.2020.101389|HASLR:​ Fast Hybrid Assembly of Long Reads]]. //​iScience//​ **23**(8): 101389, 2020.
  
Line 205: Line 205:
  
   - E. Klipp, R. Herwig, A. Kowald, C. Wierling, H. Lehrach. [[https://​doi.org/​10.1002/​3527603603|Systems Biology in Practice - Concepts, Implementation and Application]]. Wiley-VCH, 2005.    - E. Klipp, R. Herwig, A. Kowald, C. Wierling, H. Lehrach. [[https://​doi.org/​10.1002/​3527603603|Systems Biology in Practice - Concepts, Implementation and Application]]. Wiley-VCH, 2005. 
 +
 +==== Computational pangenomics ====
 +The gene-based approach is described, for example, in:
 +
 +  - H. Tettelin et al. [[https://doi.org/10.1073/pnas.0506758102|Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial "pan-genome"]]. //Proc. Natl. Acad. Sci. USA// **102**(39): 13950-13955, 2005.
 +  - J. Blom, S. P. Albaum, D. Doppmeier, A. Pühler, F.-J. Vorhölter, M. Zakrzewski, and A. Goesmann. [[https://doi.org/10.1186/1471-2105-10-154|EDGAR: A software framework for the comparative analysis of prokaryotic genomes]]. //BMC Bioinformatics// **10**:154, 2009.
 +  - J. Blom, J. Kreis, S. Spänig, T. Juhre, C. Bertelli, C. Ernst, and A. Goesmann. [[https://doi.org/10.1093/nar/gkw255|EDGAR 2.0: an enhanced software platform for comparative gene content analyses]]. //Nucleic Acids Res.// **44**(W1):W22–W28, 2016.
 +  - J. Blom, S. P. Glaeser, T. Juhre, J. Kreis, P. H. G. Hanel, J. G. Schrader, P. Kämpfer, and A. Goesmann. [[https://​doi.org/​10.1002/​9781118960608.bm00038|EDGAR:​ A Versatile Tool for Phylogenomics]]. In: W. B. Whitman (ed.). Bergey'​s Manual of Systematics of Archaea and Bacteria, Wiley, 2019.
 +
 +A good overview of genome-based computational pangenomics is given in the following review paper:
 +
 +  - The Computational Pan-Genomics Consortium. [[https://​doi.org/​10.1093/​bib/​bbw089|Computational pan-genomics:​ status, promises and challenges]]. //Brief. Bioinf.// **19**(1), 118–135, 2018.
 +
 +Some more specialized papers are the following.
 +
 +(A) Data structures
 +  - B. Paten, D. Earl, N. Nguyen, M. Diekhans, D. Zerbino, D. Haussler. [[https://doi.org/10.1101/gr.123356.111|Cactus: Algorithms for genome multiple sequence alignment]]. //Genome Research// **21**, 1512–1528, 2011.
 +  - C. Ernst, S. Rahmann. [[https://​drops.dagstuhl.de/​opus/​volltexte/​2013/​4231/​pdf/​p035-ernst.pdf|PanCake:​ A Data Structure for Pangenomes]]. Proc. of //GCB 2013//, 35-45, 2013.
 +  - G. Holley, R. Wittler, and J. Stoye. [[https://doi.org/10.1186/s13015-016-0066-8|Bloom Filter Trie: an alignment-free and reference-free data structure for pan-genome storage]]. //Algorithms Mol. Biol.// **11**: 3, 2016.
 +  - E. Garrison, J. Sirén, A. M. Novak, G. Hickey, J. M. Eizenga, E. T. Dawson, W. Jones, S. Garg, C. Markello, M. F. Lin, B. Paten, and R. Durbin. [[https://doi.org/10.1038/nbt.4227|Variation graph toolkit improves read mapping by representing genetic variation in the reference]]. //Nat. Biotechnol.// **36**, 875–879, 2018.
 +  - G. Holley and P. Melsted. [[https://​doi.org/​10.1186/​s13059-020-02135-8|Bifrost:​ highly parallel construction and indexing of colored and compacted de Bruijn graphs]]. //Genome Biol.// **21**: 249, 2020.
 +
 +(B) Sequence-to-graph mapping/​alignment
 +  - M. Rautiainen, V. Mäkinen, and T. Marschall. [[https://​doi.org/​10.1093/​bioinformatics/​btz162|Bit-parallel sequence-to-graph alignment]]. //​Bioinformatics//​ **35**(19), 3599-3607, 2019. 
 +  - R. Martiniano, E. Garrison, E. R. Jones, A. Manica, and R. Durbin. [[https://doi.org/10.1186/s13059-020-02160-7|Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph]]. //Genome Biol.// **21**: 250, 2020.
 +  - M. Rautiainen and T. Marschall. [[https://​doi.org/​10.1186/​s13059-020-02157-2|GraphAligner:​ rapid and versatile sequence-to-graph alignment]]. //Genome Biol.// **21**: 253, 2020.
 +  - A. Kuhnle, T. Mun, C. Boucher, T. Gagie, B. Langmead, and G. Manzini. [[https://​doi.org/​10.1089/​cmb.2019.0309|Efficient Construction of a Complete Index for Pan-Genomics Read Alignment]]. //J. Comp. Biol.// **27**(4), 500-513, 2020.
 +  - N. Luhmann, G. Holley, and M. Achtman. [[https://doi.org/10.1101/2020.01.21.914168|BlastFrost: Fast querying of 100,000s of bacterial genomes in Bifrost graphs]]. //BioRxiv//, 2020.
 +  - T. Schulz, R. Wittler, S. Rahmann, F. Hach, and J. Stoye. [[https://doi.org/10.1093/bioinformatics/btab077|Detecting High Scoring Local Alignments in Pangenome Graphs]]. //Bioinformatics// **37**(16), 2266–2274, 2021.
 +
 +(C) Phylogenomics:​
 +
 +  - R. Wittler. [[https://​doi.org/​10.1186/​s13015-020-00164-3|Alignment- and reference-free phylogenomics with colored de Bruijn graphs]]. //​Algorithms Mol. Biol.// **15**: 4, 2020.
 +  - A. Rempel, R. Wittler. [[https://​doi.org/​10.1093/​bioinformatics/​btab444|SANS serif: alignment-free,​ whole-genome-based phylogenetic reconstruction]]. //​Bioinformatics//​ **37**(24), 4868-4870, 2021.
 +
 +(D) Haplotype inference:
 +
 +See [[#​haplotype_inference|below]].
  
 ==== Comparative genomics I: Genome alignment, repeat analysis ==== ==== Comparative genomics I: Genome alignment, repeat analysis ====
Line 231: Line 269:
   - E. Tannier, C. Zheng, D. Sankoff. [[https://​doi.org/​10.1186/​1471-2105-10-120|Multichromosomal median and halving problems under different genomic distances]]. //BMC Bioinformatics//​ **10**:120, 2009.   - E. Tannier, C. Zheng, D. Sankoff. [[https://​doi.org/​10.1186/​1471-2105-10-120|Multichromosomal median and halving problems under different genomic distances]]. //BMC Bioinformatics//​ **10**:120, 2009.
  
-==== Comparative genomics III: Synteny Hierarchies and Gene clusters ==== +==== Comparative genomics III: Gene clusters ====
 The following are the algorithmic papers in this area. Apart from that, many papers on applications of gene clusters and statistical properties exist, but are not listed here.  The following are the algorithmic papers in this area. Apart from that, many papers on applications of gene clusters and statistical properties exist, but are not listed here. 
  
 +(a.) Common intervals of permutations:​
   - T. Uno and M. Yagiura. [[https://​doi.org/​10.1007/​s004539910014|Fast algorithms to enumerate all common intervals of two permutations]]. //​Algorithmica//​ **26**(2):​290-309,​ 2000.    - T. Uno and M. Yagiura. [[https://​doi.org/​10.1007/​s004539910014|Fast algorithms to enumerate all common intervals of two permutations]]. //​Algorithmica//​ **26**(2):​290-309,​ 2000. 
   - S. Heber and J. Stoye. [[https://​doi.org/​10.1007/​3-540-48194-X_19|Finding all common intervals of k permutations]]. In A. Amir and G. Landau, editors, Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching, //CPM 2001//, volume 2089 of LNCS, pages 207-218, Berlin, 2001. Springer Verlag. ​   - S. Heber and J. Stoye. [[https://​doi.org/​10.1007/​3-540-48194-X_19|Finding all common intervals of k permutations]]. In A. Amir and G. Landau, editors, Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching, //CPM 2001//, volume 2089 of LNCS, pages 207-218, Berlin, 2001. Springer Verlag. ​
-  - S. Heber and J. Stoye. [[https://doi.org/10.1007/3-540-44696-6_20|Algorithms for finding gene clusters]]. In O. Gascuel and B. Moret, editors, Proceedings of the First International Workshop on Algorithms in Bioinformatics, //WABI 2001//, volume 2149 of LNCS, pages 252-263, Berlin, 2001. Springer Verlag. +  - S. Heber and J. Stoye. [[https://doi.org/10.1007/3-540-44696-6_20|Algorithms for finding gene clusters]]. In O. Gascuel and B. Moret, editors, Proceedings of the First International Workshop on Algorithms in Bioinformatics, //WABI 2001//, volume 2149 of LNCS, pages 252-263, Berlin, 2001. Springer Verlag.
   - A. Bergeron, S. Corteel, and M. Raffinot. [[https://doi.org/10.1007/3-540-45784-4_36|The algorithmic of gene teams]]. In R. Guigó and D. Gusfield, editors, Proceedings of the Second International Workshop on Algorithms in Bioinformatics, //WABI 2002//, volume 2452 of LNCS, pages 464-476, Berlin, 2002. Springer Verlag.    - A. Bergeron, S. Corteel, and M. Raffinot. [[https://doi.org/10.1007/3-540-45784-4_36|The algorithmic of gene teams]]. In R. Guigó and D. Gusfield, editors, Proceedings of the Second International Workshop on Algorithms in Bioinformatics, //WABI 2002//, volume 2452 of LNCS, pages 464-476, Berlin, 2002. Springer Verlag. 
   - N. Luc, J.-L. Risler, A. Bergeron, and M. Raffinot. [[https://​doi.org/​10.1016/​S1476-9271(02)00097-X|Gene teams: a new formalization of gene clusters for comparative genomics]]. //Comp. Biol. Chem.// **27**:​59-67,​ 2003.    - N. Luc, J.-L. Risler, A. Bergeron, and M. Raffinot. [[https://​doi.org/​10.1016/​S1476-9271(02)00097-X|Gene teams: a new formalization of gene clusters for comparative genomics]]. //Comp. Biol. Chem.// **27**:​59-67,​ 2003. 
 +  - G. M. Landau, L. Parida, and O. Weimann. [[https://​doi.org/​10.1089/​cmb.2005.12.1289|Gene proximity analysis across whole genomes via PQ tree]]. //J. Comp. Biol.// **12**(10):​1289–1306,​ 2005.
 +  - A. Bergeron, C. Chauve, F. de Montgolfier,​ and M. Raffinot. [[https://​doi.org/​10.1137/​060651331|Computing common intervals of K permutations,​ with applications to modular decomposition of graphs]]. //SIAM J. Discrete Mathematics//​ **22**(3):​1022–1039,​ 2008.
 +  - S. Heber, R. Mayr, J. Stoye. [[https://​doi.org/​10.1007/​s00453-009-9332-1|Common Intervals of Multiple Permutations]]. //​Algorithmica//​ **60**(2):​175-206,​ 2011.
 +
 +(b.) Common intervals of sequences:
   - T. Schmidt and J. Stoye. [[https://​doi.org/​10.1007/​978-3-540-27801-6_26|Quadratic time algorithms for finding common intervals in two and more sequences]]. In S. C. Sahinalp, S. Muthukrishnan,​ and U. Dogrusoz, editors, Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching, //CPM 2004//, volume 3109 of LNCS, pages 347-358, Berlin, 2004. Springer Verlag. ​   - T. Schmidt and J. Stoye. [[https://​doi.org/​10.1007/​978-3-540-27801-6_26|Quadratic time algorithms for finding common intervals in two and more sequences]]. In S. C. Sahinalp, S. Muthukrishnan,​ and U. Dogrusoz, editors, Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching, //CPM 2004//, volume 3109 of LNCS, pages 347-358, Berlin, 2004. Springer Verlag. ​
 +  - G. Didier, T. Schmidt, J. Stoye, D. Tsur. [[https://​doi.org/​10.1016/​j.jda.2006.03.021|Character Sets of Strings]]. //J. Discr. Alg.// **5**(2):​330-340,​ 2007.
   - X. He and M. H. Goldwasser. [[https://​doi.org/​10.1089/​cmb.2005.12.638|Identifying conserved gene clusters in the presence of homology families]]. //J. Comp. Biol.// **12**(6):​638-656,​ 2005.   - X. He and M. H. Goldwasser. [[https://​doi.org/​10.1089/​cmb.2005.12.638|Identifying conserved gene clusters in the presence of homology families]]. //J. Comp. Biol.// **12**(6):​638-656,​ 2005.
-  - G. M. Landau, L. Parida, and O. Weimann. [[https://doi.org/10.1089/cmb.2005.12.1289|Gene proximity analysis across whole genomes via PQ tree]]. //J. Comp. Biol.// **12**(10):1289–1306, 2005. + 
-  - A. Bergeron, C. Chauve, F. de Montgolfier, and M. Raffinot. [[https://doi.org/10.1137/060651331|Computing common intervals of K permutations, with applications to modular decomposition of graphs.]] //SIAM J. Discrete Mathematics// **22**(3):1022–1039, 2008. +(c.) Approximate common intervals of sequences:
   - S. Böcker, K. Jahn, J. Mixtacki, J. Stoye. [[https://​doi.org/​10.1089/​cmb.2009.0098|Computation of Median Gene Clusters]]. //J. Comp. Biol.// **16**(8):​1085-1099,​ 2009.   - S. Böcker, K. Jahn, J. Mixtacki, J. Stoye. [[https://​doi.org/​10.1089/​cmb.2009.0098|Computation of Median Gene Clusters]]. //J. Comp. Biol.// **16**(8):​1085-1099,​ 2009.
   - F. Hufsky, L. Kuchenbecker,​ K. Jahn, J. Stoye, S. Böcker. [[https://​doi.org/​10.1186/​1471-2105-12-106|Swiftly Computing Center Strings]]. //BMC Bioinformatics//​ **12**:106, 2011.   - F. Hufsky, L. Kuchenbecker,​ K. Jahn, J. Stoye, S. Böcker. [[https://​doi.org/​10.1186/​1471-2105-12-106|Swiftly Computing Center Strings]]. //BMC Bioinformatics//​ **12**:106, 2011.
 +
 +(d.) Common intervals of indeterminate strings:
   - D. Doerr, J. Stoye, S. Böcker, K. Jahn. [[https://​doi.org/​10.1186/​1471-2164-15-S6-S2|Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings]]. //BMC Genomics// **15**(Suppl. 6): S2, 2014.   - D. Doerr, J. Stoye, S. Böcker, K. Jahn. [[https://​doi.org/​10.1186/​1471-2164-15-S6-S2|Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings]]. //BMC Genomics// **15**(Suppl. 6): S2, 2014.
  
Line 264: Line 311:
 A great overview of the combinatorial problems and algorithms is given in the following book chapter:  A great overview of the combinatorial problems and algorithms is given in the following book chapter: 
  
-  - D. Gusfield, S. Hecht Orzack. [[https://​doi.org/​10.1201/​9781420036275.ch18|Haplotype Inference]]. In: Handbook of Computational Molecular Biology (Chapter 18), edited by S. Aluru, Chapman & Hall/CRC Computer and Information Science Series, 2006.+  - D. Gusfield, S. Hecht Orzack. [[https://​doi.org/​10.1201/​9781420036275|Haplotype Inference]]. In: Handbook of Computational Molecular Biology (Chapter 18), edited by S. Aluru, Chapman & Hall/CRC Computer and Information Science Series, 2006.
  
-A more recent paper on the topic is: +More recent works on the topic, focusing on molecular haplotyping:
  
   - G. W. Klau, T. Marschall. [[https://​doi.org/​10.1007/​978-3-319-58741-7_6|A guided tour to computational haplotyping]]. In: Proc. of CiE 2017, LNCS 10307, Springer Verlag, 2017.   - G. W. Klau, T. Marschall. [[https://​doi.org/​10.1007/​978-3-319-58741-7_6|A guided tour to computational haplotyping]]. In: Proc. of CiE 2017, LNCS 10307, Springer Verlag, 2017.
 +  - M. Patterson, T. Marschall, N. Pisanti, L. van Iersel, L. Stougie, G. W. Klau, A. Schönhuth. [[https://doi.org/10.1089/cmb.2014.0157|WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads]]. //Journal of Computational Biology// **22**(6), 498-509, 2015.
 +
 +The ILP discussed in class is from the following textbook, Section 20.2:
 +
 +  - Dan Gusfield. [[https://​doi.org/​10.1017/​9781108377737|Integer Linear Programming in Computational and Systems Biology]]. Cambridge University Press, 2019.
  
 ==== SNP-disease associations ==== ==== SNP-disease associations ====