COMP 598 Fall 2019: Statistical Genetics and Machine Learning
Genetics is instrumental in understanding complex human phenotypes ranging from height to common diseases and cancers. Large-scale molecular and phenotypic profiling technologies provide exciting opportunities for conducting genetic research in a data-driven way, thereby linking common diseases to novel phenotypes and novel mutations through the lens of regulatory genomics. Meanwhile, there are tremendous opportunities for methodological innovations that use statistical and machine learning approaches to address some of the most important problems in genetics, problems that could not be tackled until recently.
In this topics course, we will gain a broad perspective on current areas of computational biology, with a primary focus on scalable, data-driven approaches for genome-wide data and on model interpretability. In particular, we will explore in depth some of the recently developed and crucial computational methods used in large-scale statistical genetic analysis, multi-omics analysis, and electronic health records data mining.
The format of the class is as follows. In each lecture, one pair of students will present a paper for 30 minutes, followed by another pair of students presenting another paper on the same topic for 30 minutes. Then, the instructor will present a 30-minute overview of a new research topic so that, in the next lecture, two student pairs will be prepared to present two research papers on that topic. In total, each student will present at least twice.
There will be five small hands-on assignments for the course. In each assignment, students will implement the key components of some of the algorithms discussed in class and use them to analyze real datasets.
Students will work on a course project in teams of two. Since this is a research-oriented course, each team will need to come up with a suitable project based on the research topics discussed in class. The last quarter of the class will mainly consist of student presentations.
- Biology: BIOL 202 Basic Genetics
- Statistics: MATH 324 Statistics or MATH 423 Regression and Analysis of Variance; MATH 680 Computation Intensive Statistics; MATH 783 Advanced Topics in Statistics: Machine Learning
- Machine learning: COMP 551 Applied Machine Learning; COMP 652 Machine Learning
Instructor
Yue Li <yueli[at]cs[dot]mcgill[dot]ca>
Office: Trottier 3105
Lecture Schedule
Lectures: MW 11:30 AM-1:00 PM
Location: ENGMC 103
- Student paper presentations* (20%)
- Assignments (20%)
- Final project proposal (10%)
- Final project presentation (10%)
- Final project report (40%)
- Pattern Recognition and Machine Learning by Christopher Bishop
- Machine Learning: A Probabilistic Perspective by Kevin Murphy
- No need to purchase. Relevant contents will be available on the course website.