|
|
|
Computational
Biology Methods (COMP-462) Computational
Biology Methods and Research (COMP-561) Mon/Wed
10:05am-11:25am, |
|
|
3 credits
(462) and 4 credits (561) |
|
|
Jérôme Waldispühl and David Becerra |
|
|
Trottier 3106 and 3140 |
|
|
McGill Centre for Bioinformatics |
|
|
McGill University |
|
|
Montreal, Quebec, Canada |
|
|
|
|
|
<first name>.<last name>@mcgill.ca http://www.cs.mcgill.ca/~jeromew/comp462 http://www.cs.mcgill.ca/~jeromew/comp561 |
|
|
Telephone: 514-398-5018 |
|
|
|
|
|
Course Abstract: |
|
|
|
|
Computational biology is the sub-discipline of Bioinformatics that is closest in spirit to pure computer science. The main efforts in this field are two-fold. Firstly, we are concerned with creating models for problems from the biosciences (biology, biochemistry, medicine) that are both biologically and mathematically sound. Secondly, we are interested in the design and analysis of efficient, and accurate algorithms that solve these problems in practice and strategies for validation of results. |
|
|
|
|
|
This course is designed to introduce upper-year undergraduate students and graduate students to this area by examining several classic problems from the field. The intention of the course is to act as a gateway whereby, upon completion of the course, students will have the necessary biology, mathematics and computer science background to attend graduate level courses in bioinformatics geared towards specific topics (phylogenetics, genomic evolution, functional genomics, proteomics). The course is designed in such a manner that no previous formal training in biology is required of the students. |
|
|
|
|
|
The necessary mathematical background consists of the lower level discrete structures and probablity courses, since topics such as maximum likelihood estimation, hidden Markov models, and dynamic programming will be used repeatedly throughout the material. (Both maximum likelihood and hidden Markov models will be introduced at a basic level however.) Students will be required to have already taken the lower level algorithms/data-structures, numerical computing and theoretical computer science courses. |
Prequisities: |
|
|
|
|
308 - 251 Algorithms and data structures |
|
|
189 - 323 Probability Theory |
|
|
|
|
|
|
Important Note for
undergraduate students:
If a student does not have the prerequisities for this course, the Faculty of
Science will delete this course from their record. |
Office Hours: |
|
|
|
|
JW: Mon/Wed 11:30-12:30; DB: Tue: Tue/Thu 11:00-13:00. |
Book and Material: |
|
|
|
|
Bernhard Haubold and Thomas Wiehe. Introduction
to Computational Biology: An Evolutionary Approach , Burkhauser Basel, 2007 |
|
|
Not required. Probably one of the best introductory book
out there. Its level is ideal for the course, but it does not go much beyond this. |
|
|
|
|
|
Peter Clote and Rolf Backofen. Computational Molecular Biology: An introduction , Wiley, 2000 |
|
|
Not required. A good introductory book for anyone interested in the
mathematical fondations of computational molecular biology. |
|
|
|
|
|
Durbin, Eddy, Krogh, Michinson, Biological
Sequence Analysis,
Cambridge, 1998. |
|
|
|
|
|
Also not required, this book is particularly good
for learning some of the basics of statistical inference/machine learning. |
|
|
|
|
|
Jones, N.C. and Pevzner, P. An Introduction to Bioinformatics Algorithms, MIT press, 2004 |
|
|
You are not required to buy this book, however it is a good book for understanding some of the classic problems in computational biology and the algorithms used to solve these problems. |
|
|
|
|
|
Campbell, A. M., and Heyer, L.J. Discovering genomics, proteomics, and bioinformatics, Benjamin Cummings, 2002 |
|
|
Also not required, this is a good primary for computer scientists that covers the basics of genomics, genetics, and proteomics. |
|
|
|
|
|
Alberts, Johnson, Lewis, Raff, Roberts, Walter Molecular Biology of the Cell, Garland, 2002 |
|
|
This is a widely used and comprehensive book covering the biology of the cell. It is a good place to start when you want to explore a new topic. |
Evaluation for COMP 462: |
|
|
|
4 assignments |
40% total (10% each) |
|
Class participation |
5% |
|
Midterm |
20% |
|
Final Exam |
35% |
Evaluation for COMP 561: |
|
|
|
3 assignments (the first three) |
30% total (10% each) |
|
Class participation |
5% |
|
Project |
25% |
|
Midtermth |
15% |
|
Final Examrd |
25% |
Computer Science/mathematics topics: |
|
|
|
|
Basic probability and statistics (ubiquitous) |
|
|
Dynamic programming (sequence alignment) |
|
|
Approximation algorithms (string alignment) |
|
|
Advanced data structures (suffix trees) |
|
|
Numerical techniques (least squares fits) |
|
|
Experimental design |
|
|
Programming |
Concepts from biology and biotechnologies |
|
|
|
|
Models of evolution |
|
|
Sequence comparison |
|
|
Phylogenetics |
|
|
Gene expression and regulation |
|
|
Peptide identification |
|
|
RNA secondary structure |
|
|
Protein structure |
|
|
DNA sequencing |
|
|
Population genetics |
|
|
System biology |
Course
outline |
Lecture 1,2: Introduction to molecular biology and genomics. |
|
|
|
|
Topics: Basic Questions, Basic Strategies, Introduction to molecular biology and genomics. |
|
|
Background Reading: Chapter 1 of Artificial Intelligence and Molecular Biology , by L. Hunter |
|
|
On-line Resources: Lecture notes by Dudoit and Gentleman |
Lecture 3-6: Sequence evolution and sequence
alignment. |
|
|
|
|
Topics: Introduction to sequence evolution. Global and local alignment;
Gapping; Multiple Alignments. |
|
|
Background Reading: Chapter 6 of Jones, Pevzner; Chapter 6.2 of Ewans,
Grant. |
|
|
Math/Algorithms: Dynamic Programming |
|
|
Applications: Gene finding. |
|
|
On-line Resources: |
|
|
Additional material for COMP 561:
Chapter 6 of Durbin and Eddy. |
Lecture 7-8: Fast pairwise alignment methods and their statistics. |
|
|
|
|
Topics: The Blast algorithm and its variations . |
|
|
Background Reading: TBD |
|
|
Math/Algorithms: Prob. theory; Combinatorics; |
|
|
Applications: Genomic sequence alignment |
|
|
On-line Resources: |
|
|
Additional material for COMP 561: Original Blast paper |
Lecture 9-12: Evolutionary models and phylogenetic
Tree construction. |
|
|
|
|
Topics: Discrete and continuous nucleotide and amino acid substitution models Distance-based
methods; Parsimony; Maximum Likelihood. |
|
|
Background Reading: Chapters 7-8 of Durbin et al. |
|
|
Math/Algorithms: Discrete algorithm design; Maximum likelihood. |
|
|
On-line Resources: |
Midterm exam, October 30th. |
|
|
Lecture 14-16: Profile Hidden Markov Models. |
|
|
|
|
Topics: Forward, backward, Viterbi, Baum-Welch algorithms. |
|
|
Background Reading: Chapters 3 and 5 of Durbin et al. |
|
|
Math/Algorithms: Markov processes; Dynamic programming; Parameter estimation. |
|
|
Application: Gene finding |
|
|
On-line Resources: |
Lecture 17-18: Motif discovery. |
|
|
|
|
Topics: Modelling and searching for signals in DNA.. |
|
|
Background Reading: Chapter 5 of Ewans, Grant; Chapter 4 of Jones, Pevzner |
|
|
Math/Algorithms: Probability theory; Markov processes; exhaustive search; Gibb's sampling (intro) |
|
|
Applications: Searching for repeats. Identifying transcription factor binding sites. |
|
|
On-line Resources: |
Lecture 19-20: Gene Expression Analysis. |
|
|
|
|
Topics: Class distinction; Class prediction; Class discovery. |
|
|
Background Reading: Chapter 10 of Jones, Pevzner. |
|
|
Math/Algorithms: Differential expression; Principal Component Analysis; Clustering; Graph theory. |
|
|
On-line Resources: |
Lecture 21-22: Genetic algorithms. |
|
|
|
|
Topics: Evolution-inspired optimization algorithms and their applications to computational biology methods. |
|
|
Background Reading: |
|
|
Math/Algorithms: |
|
|
On-line Resources: |
Lecture 23: RNA secondary structure prediction |
|
|
|
|
Topics: Nussinov algorithm and Zuker algorithm |
|
|
Background Reading: Durbin and Eddy, Chapter 8 |
|
|
Math/Algorithms: Dynamic programming algorithms |
|
|
On-line Resources: |
|
|
Additional material for COMP 561: Chapter 10 of Durbin and Eddy. |
Lecture 24: Introduction to population genetics |
|
|
|
|
Topics: Polymorphisms, haplotypes |
|
|
Background Reading: TBD |
|
|
Math/Algorithms: |
Final exam (oral). |
|
|
McGill University values academic integrity. Therefore all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Disciplinary Procedures (see www.mcgill.ca/integrity for more information). Use of French in assignments and exams In accord with McGill University’s Charter of Students’ Rights, students in this course have the right to submit in English or in French any written work that is to be graded.
|
2011-09-05
|