COMP 598 Fall 2019: Statistical Genetics and Machine Learning

Course Overview

Genetics is instrumental in understanding complex human phenotypes ranging from height to common diseases and cancers. Large-scale molecular and phenotypic profiling technologies provide exciting opportunities for conducting genetic research in a data-driven way, thereby linking common diseases to novel phenotypes and novel mutations through the lens of regulatory genomics. Meanwhile, there are tremendous opportunities for methodological innovations using statistical and machine learning approaches to address some of the most important problems in genetics that were intractable until recently.

In this topics course, we will gain a broad perspective on current areas of computational biology, with a primary focus on data-driven, scalable approaches for genome-wide data and on model interpretability. In particular, we will explore in depth some of the recently developed and crucial computational methods used in large-scale statistical genetic analysis, multi-omics analysis, and electronic health record data mining.

Class Format

Each lecture has three parts. First, one pair of students presents a paper for 30 minutes. Next, another pair of students presents a second paper on the same topic for 30 minutes. Finally, the instructor gives a 30-minute overview of a new research topic so that the two student pairs presenting in the next lecture can prepare their papers on that topic. In total, each student will present at least twice.

There will be five small hands-on assignments for the course. In each of these assignments, students will implement the key components of some of the algorithms discussed in class and use them to analyze real datasets.

Students will work on a course project in teams of two. Given that this is a research-oriented course, each team will need to come up with a suitable project based on the research topics discussed in class. The last quarter of the class will mainly consist of student presentations.

Recommended (not required) courses (taken or concurrently taken):


Yue Li <yueli[at]cs[dot]mcgill[dot]ca>
Office: Trottier 3105

Lecture Schedule

Lectures: MW 11:30 AM-1:00 PM
Location: ENGMC 103


*Sign-up sheet on specific topics is here

Relevant Textbooks

Course Syllabus: