COMP 565 Fall 2021: Machine Learning in Genomics and Healthcare (4 credits)

Course Overview

Genetics is instrumental in understanding complex human phenotypes ranging from height to common diseases and cancers. Large-scale molecular and phenotypic profiling technologies provide exciting opportunities for conducting genetic research in a data-driven way, thereby linking common diseases to novel phenotypes and novel mutations through the lens of regulatory genomics. Meanwhile, there are tremendous opportunities for methodological innovation using statistical and machine learning approaches to address some of the most important problems in genetics and healthcare, problems that could not be tackled until recently.

In this topics course, we will gain a broad perspective on the current field of computational biology, with a primary focus on data-driven, scalable approaches for genome-wide data and on model interpretability. In particular, we will explore in depth some of the recently developed and crucial computational methods used in large-scale statistical genetic analysis, multi-omics analysis, and electronic health record data mining.

Class location and time:

Class Format

A participation mark is taken at each lecture, based on questions and discussion.

Each student needs to write a 2-page review of five research papers chosen by the instructor based on the topics discussed in class.

There are five assignments. In each assignment, students will derive and/or implement the key components of some of the algorithms discussed in class and use them to analyze real or simulated datasets.

Each student will work on a course project individually. Because this is a research-oriented course, each student will need to propose a suitable project based on the research topics discussed in class, subject to the instructor's approval. The last few lectures of the class will mainly consist of students' project presentations.

Prerequisite courses:


Instructor

Yue Li <yue[dot]yl[dot]li[at]mcgill[dot]ca>

Teaching Assistant

Wenmin Zhang <wenmin[dot]zhang[at]mail[dot]mcgill[dot]ca>


Relevant Textbooks

Course Syllabus (Tentative):

  1. Statistical genetic approaches (15 hrs, 5 weeks)
  2. Multi-omic learning (3 hrs, 1 week)
  3. Computational approaches in single-cell analysis (6 hrs, 2 weeks)
  4. Mining big data in healthcare (9 hrs, 3 weeks)