(underlined names are trainees or mentees from Li lab; †equal contribution; *co-corresponding authors)
2024
Preprints:
Vishvak Raghavan, Yue Li*, Jun Ding*. Harnessing Agent-Based Modeling in CellAgentChat to Unravel Cell-Cell Interactions from Single-Cell Data. doi: https://doi.org/10.1101/2023.08.23.554489
Ziyang Song, Qincheng Lu, He Zhu and Yue Li*. Bidirectional Generative Pre-training for Improving Time Series Representation Learning. arXiv preprint arXiv:2402.09558
Ziyang Song, Qincheng Lu, Hao Xu and Yue Li*. Extrapolatable transformer pre-training for ultra long time-series forecasting. arXiv preprint arXiv:2312.00817
Pei Liu, Ying Liu, Jiawei Luo*, Yue Li*. MiRGraph: A transformer-based feature learning approach to identify miRNA-target interactions. Preprint at bioRxiv 2023.11.04.565620; https://doi.org/10.1101/2023.11.04.565620 (Under review at ISMB 2024)
Anjali Chawla, Doruk Cakmakci, Wenmin Zhang, Malosree Maitra, Reza Rahimian, Haruka Mitsuhashi, MA Davoli, Jenny Yang, Gary Gang Chen, Ryan Denniston, Deborah Mash, Naguib Mechawar, Matthew Suderman, Yue Li*, Corina Nagy*, Gustavo Turecki*. Differential Chromatin Architecture and Risk Variants in Deep Layer Excitatory Neurons and Grey Matter Microglia Contribute to Major Depressive Disorder. bioRxiv 2023.10.02.560567; doi: https://doi.org/10.1101/2023.10.02.560567 (revision at Nature Genetics)
Wenmin Zhang*, Tianyuan Lu, Robert Sladek, Yue Li, Hamed S. Najafabadi, and Josee Dupuis*. SharePro: an accurate and efficient genetic colocalization method accounting for multiple causal signals. bioRxiv (2023): 2023-07. (revision at Oxford Bioinformatics)
Wenmin Zhang*, Robert Sladek, Yue Li, Hamed S. Najafabadi, and Josee Dupuis*. Accounting for genetic effect heterogeneity in fine-mapping and improving power to detect gene-environment interactions with SharePro. bioRxiv (2023): 2023-07. (under review at Nature Communications)
Peer-reviewed:
Yixuan Li, Archer Y. Yang*, Ariane Marelli* and Yue Li*. (2024) MixEHR-SurG: a joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records. Journal of Biomedical Informatics. 153 (104638). doi.org/10.1016/j.jbi.2024.104638
Yimin Fan, Yu Li, Jun Ding*, Yue Li*. (2024) GFETM: Genome Foundation-based Embedded Topic Model for scATAC-seq Modeling. Preprint at bioRxiv (accepted at RECOMB 2024) (acceptance rate 16%) (invited for submission to Genome Research special issue)
Harry Moroz, Yue Li*, Ariane Marelli*. (2024) hART: Deep Learning-Informed Lifespan Heart Failure Risk Trajectories. International Journal of Medical Informatics (In press) https://doi.org/10.1016/j.ijmedinf.2024.105384
Ariane J. Marelli*, Chao Li, Aihua Liu, Hanh Nguyen, Harry Moroz, James M Brophy, Liming Guo, David L Buckeridge, Jian Tang, Joelle Pineau, Yi Yang, Yue Li. (2024) Machine learning informed diagnosis for Congenital Heart Disease in Large Claims Data Source. JACC: Advances 3(2), p.100801.
2023
Shadi Zabad, Simon Gravel*, Yue Li*. (2023) Fast and Scalable Polygenic Risk Modeling with Variational Inference. American Journal of Human Genetics doi:10.1016/j.ajhg.2023.03.009.
Shin-Chieh Fuh, Laura M. Fiori, Gustavo Turecki, Corina Nagy*, Yue Li*. (2023) Multi-omic modeling of antidepressant response implicates dynamic immune and inflammatory changes in individuals who respond to treatment. PLoS ONE 18(5), p.e0285123. https://doi.org/10.1371/journal.pone.0285123
Yuening Wang, Audrey Grant, and Yue Li*. (2023) Graph-embedded topic modelling of population-level electronic health records by leveraging biomedical knowledge graph. STAR Protocols 4, 101966
Yue Li†, Gregory Fonseca†, Jun Ding†,*. (2023) Machine Learning Methods for Multi-Omics Data Integration. In Machine Learning Methods for Multi-Omics Data Integration, pp. 39-74. Cham: Springer International Publishing, 2023. doi:10.1007/978-3-031-36502-7_4.
2022
Yuesong Zou, Ahmad Pesaranghader, Aman Verma, David Buckeridge, and Yue Li*. (2022) Modeling electronic health record data using a knowledge-graph-embedded topic model. Scientific Reports 12, 17868. doi.org/10.1038/s41598-022-22956-w
Yuri Ahuja*, Yuesong Zou, Aman Verma, David Buckeridge*, Yue Li*. (2022) MixEHR-Guided: A guided multi-modal topic modeling approach for large-scale automatic phenotyping using the electronic health record. Journal of Biomedical Informatics 104190 doi:10.1016/j.jbi.2022.104190
Ziyang Song, Yuanyi Hu, Aman Verma, David Buckeridge, and Yue Li*. (2022) Automatic phenotyping by a seed-guided topic model. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’22), August 14–18, 2022, Washington, DC, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3534678.3542675 (KDD HealthDay2022 Best Paper award)
Zhi Wen†, Jingfu Zhang†, Guido Powell, Imane Chafi, David Buckeridge*, Yue Li*. (2022) EpiTopics: A dynamic machine learning model to predict and inform non-pharmacological public health interventions from global news reports. STAR Protocols 3, 101463
Xiangfei Meng*, Michelle Wang, Kieran O'Donnell, Jean Caron, MJ Meaney, Yue Li*. (2022) Integrative PheWAS analysis in risk categorization of major depressive disorder and identifying their associations with genetic variants using a latent topic model approach. Translational Psychiatry 12, 240. https://doi.org/10.1038/s41398-022-02015-8
Yuening Wang, Rodrigo Benavides, Luda Diatchenko, Audrey Grant*, Yue Li*. (2022) A graph-embedded topic model enables characterization of diverse pain phenotypes among UK Biobank individuals. iScience 25, 104390 doi: https://doi.org/10.1016/j.isci.2022.104390
Zhi Wen†, Guido Powell†, Imane Chafi, David Buckeridge*, Yue Li*. (2022) Inferring global-scale temporal latent topics from news reports to predict public health interventions for COVID-19. Patterns 100435. doi:10.1016/j.patter.2022.100435.
2021
Ziyang Song, Xavier Sumba Toral, Yixin Xu, Aihua Liu, Liming Guo, Guido Powell, Aman Verma, David Buckeridge*, Ariane Marelli*, and Yue Li*. (2021) Supervised Multi-Specialist Topic Model With Applications on Large-Scale Electronic Health Record Data. In 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB) August 1–4, 2021, Virtual due to COVID-19. ACM, New York, NY, USA, 26 pages. https://doi.org/10.1145/1122445.11224561
Shadi Zabad, Aaron P. Ragsdale, Rosie Sun, Yue Li*, Simon Gravel*. (2021) Assumptions about frequency-dependent architectures of complex traits bias measures of functional enrichment. bioRxiv 2020.10.23.352427 (2021)
Gerhard-Paul Diller, Alexandra Arvanitaki, Alexander R. Opotowsky, Kathy Jenkins, Philip Moons, Alexander Kempny, Animesh Tandon, Andrew Redington, Paul Khairy, Seema Mital, Michael A. Gatzoulis, Yue Li, Ariane Marelli. (2021) Lifespan Perspective on Congenital Heart Disease Research: JACC State-of-the-Art Review. J Am Coll Cardiol 77, 2219–2235.
Zhi Wen†, Pratheeksha Nair†, Chih-Ying Deng, Xing Han Lu, Edward Moseley, Naomi George, Charlotta Lindvall*, Yue Li* (2021) Mining heterogeneous clinical notes by multi-modal latent topic model. PLoS ONE (†: equal contribution; *co-corresponding author)
Xing Han Lu†, Aihua Liu†, Shih-Chieh Fuh, Yi Lian, Liming Guo, Yi Yang, Ariane Marelli*, Yue Li* (2021). Recurrent disease progression networks for modelling risk trajectory of heart failure. PLoS ONE, 16(1), e0245177–15. http://doi.org/10.1371/journal.pone.0245177
2020
Spreng, R. N., Dimas, E., Mwilambwe-Tshilobo, L., Dagher, A., Koellinger, P., Nave, G., et al.* (2020). The default network of the human brain is associated with perceived social isolation. Nature Communications, 1–11. http://doi.org/10.1038/s41467-020-20039-w (*contributing author)
Bahrami, M., Maitra, M., Nagy, C., Turecki, G., Rabiee, H. R., and Li, Y.* (2020). Deep feature extraction of single-cell transcriptomes by generative adversarial network. Bioinformatics (Oxford, England). http://doi.org/10.1093/bioinformatics/btaa976
Zhang, W., Li, S. Y., Liu, T., and Li, Y.* (2020). Partitioning gene-based variance of complex traits by gene score regression. PLoS ONE, 15(8), e0237657–15. http://doi.org/10.1371/journal.pone.0237657
Marelli, A., Li, C., Liu, A., Nguyen, H., Brophy, J., Guo, L., Buckeridge, D., Tang, J., Pineau, J., Yang, Y., and Li, Y. (2020) Machine Learning to Automate Clinician Designed Empirical Manual for Congenital Heart Disease Identification in Large Claims Database. Machine Learning for Health Care (MLHC) 2020
Li, Y.*, Nair, P., Wen, Z., Chafi, I., Okhmatovskaia, A., Powell, G., Buckeridge, D.* (2020). Global Surveillance of COVID-19 by Mining News Media Using a Multi-Source Dynamic Embedded Topic Model. Presented at the Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, New York, NY, USA: Association for Computing Machinery. http://doi.org/10.1145/3388440.3412418 (*co-corresponding)
Layne, E., Dort, E., Hamelin, R., Li, Y., Blanchette, M. (2020) Supervised learning on phylogenetically distributed data. Bioinformatics Oxford Proceedings of the 19th European Conference on Computational Biology (ECCB) (in press)
ENCODE Project Consortium* (2020). Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583(7818), 699–710. http://doi.org/10.1038/s41586-020-2493-4. (*contributing author)
Li, Y.*, Nair, P., Lu, XH., Wen, Z., Wang, Y., Dehaghi, AAK., Miao, Y., Liu, W., Ordog, T., Biernacka, J., Ryu, E., Olson, J., Frye, M. A., Liu, A., Guo, L., Marelli, A., Ahuja, Y., Davila-Velderrain, J., and Kellis, M.* (2020). Inferring multimodal latent topics from electronic health records. Nature Communications 1–17. http://doi.org/10.1038/s41467-020-16378-3 (*co-corresponding author)
2019
Mudge, J. M., Jungreis, I., Hunt, T., Gonzalez, J. M., Wright, J. C., Kay, M., et al*. (2019). Discovery of high-confidence human protein-coding genes and exons by whole-genome PhyloCSF helps elucidate 118 GWAS loci. Genome Research, 29(12), 2073–2087. http://doi.org/10.1101/gr.246462.118 (*contributing author)
Liu, M.*, Jiang, Y.*, Wedow, R.*, Li, Y.*, ..., Liu, D., Vrieze, S. (2019). Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nature Genetics 51, 237-244 (*equal contribution)
2017
Li, Y.*, Davila-Velderrain, J., & Kellis, M.* (2017). A probabilistic framework to dissect functional cell-type-specific regulatory elements and risk loci underlying the genetics of complex traits. BioRxiv. https://doi.org/10.1101/059345
Li, Y.*, Shi, A. H., Tewhey, R., Sabeti, P. C., Ernst, J., & Kellis, M.* (2017). Genome-wide regulatory model from MPRA data predicts functional regions, eQTLs, and GWAS hits. BioRxiv, 110171. https://doi.org/10.1101/110171
Kreimer, A., Zeng, H., Edwards, M. D., Guo, Y., Tian, K., Shin, S., Welch, R., Wainberg, M., Mohan, R., Sinnott-Armstrong, N. A., Li, Y., Eraslan, G., Amin, T. B., Goke, J., Mueller, N. S., Kellis, M., Kundaje, A., Beer, M. A., Keles, S., Gifford, D. K. and Yosef, N. (2017). Predicting gene expression in massively parallel reporter assays: a comparative study. Human Mutation. doi:10.1002/humu.23197
2016
Li, Y. & Kellis, M. (2016). Joint Bayesian inference of risk variants and tissue-specific epigenomic enrichments across multiple complex human diseases. Nucleic Acids Research. http://doi.org/10.1093/nar/gkw627
Olsen, J. B., Wong, L., Deimling, S., Miles, A., Guo, H., Li, Y., Zhang, Z., Greenblatt, J., Emili, A., Tropepe, V. (2016). G9a and ZNF644 Physically Associate to Suppress Progenitor Gene Expression during Neurogenesis. Stem Cell Reports, 7(3), 454–470. http://doi.org/10.1016/j.stemcr.2016.06.012
Paul, J.*, Toosi, B.*, Vizeacoumar, F.*, Bhanumathy, K., Li, Y., Gerger, C., Zawily, A., Freywald, T., Anderson, D., Mousseau, D., Kanthan, R., Zhang, Z., Vizeacoumar, F., & Freywald, A. (2016). Targeting synthetic lethality between the SRC kinase and the EPHB6 receptor may benefit cancer treatment. Oncotarget.
Zhao, D.Y., Gish, G.*, Braunschweig, U.*, Li, Y., Ni, Z., Schmitges, F.W., Zhong, G., Liu, K., Li, W., Moffat, J., Vedadi, M., Min, J., Pawson, T., Blencowe, B., and Greenblatt, J. (2016). SMN and symmetric arginine dimethylation of RNA polymerase II C-terminal domain control termination. Nature 529(7584), pp.48-53.
Wong, K. C., Li, Y., & Peng, C. A Comparison Study for DNA Motif Modeling on Protein Binding Microarray. IEEE/ACM Transactions on Computational Biology and Bioinformatics. doi:10.1109/TCBB.2015.2443782. (2016)
Wong, K. C., Peng, C., & Li, Y. Probabilistic Inference on Multiple Normalized Signal Profiles from Next Generation Sequencing: Transcription Factor Binding Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics. doi:10.1109/TCBB.2015.2424421. (2016)
2015
Wong, K. C., Li, Y., & Peng, C. (2015). Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells. Bioinformatics, btv555 (Advance Online).
Wong, K. C., Li Y., Peng, C., Moses, A. M., & Zhang, Z. (2015). Computational learning on specificity-determining residue-nucleotide interactions. Nucleic Acids Research, gkv1134 (Advance Online).
Li Y†, Wang Y†, Zhang Z, Zamudio AV, Zhao JC. Genome-wide detection of high abundance N6-methyladenosine sites by microarray. RNA. 2015. (Advance Online)
Li Y and Zhang Z. Computational Biology in microRNA. WIREs RNA. 2015. doi: 10.1002/wrna.1286
Liang C†, Li Y†, Luo J, Zhang Z. A novel motif-discovery algorithm to identify co-regulatory motifs in large transcription factor and microRNA co-regulatory networks in human. Bioinformatics. 2015. (Advance Online)
2014
Li Y and Zhang Z. Potential microRNA-mediated oncogenic intercellular communication revealed by pan-cancer analysis. Scientific Reports 2014;4:7097.
Li Y, Liang M, Zhang Z. Regression Analysis of Combined Gene Expression Regulation in Acute Myeloid Leukemia. PLoS Computational Biology. 2014;10(10):e1003908.
Li Y†, Liang C†, Easterbrook S, Luo J, Zhang Z. Investigating functional implication of reinforcing feedback loops in transcriptional regulatory network. Molecular BioSystems. 2014;10(12):3238-3248.
Wong K-C, Peng C, Li Y, Chan T-M. Herd Clustering: A synergistic data clustering approach using collective intelligence. Applied Soft Computing. 2014;23:61-75.
Wong, KC, Li, Y., Peng, C., Zhang, Z. SignalSpider: Probabilistic Pattern Discovery on Multiple Normalized ChIP-Seq Signal Profiles. Bioinformatics 2014;31(1):17-24.
Li Y†, Liang C†, Wong K-C, Luo J, Zhang Z. Mirsynergy: detecting synergistic miRNA regulatory modules by overlapping neighbourhood expansion. Bioinformatics. 2014;30(18):2627–2635.
Li Y, Liang C, Wong K-C, Jin K, Zhang Z. Inferring probabilistic miRNA-mRNA interaction in cancers: a role-switch approach. Nucleic Acids Research. 2014;42(9):e76.
Li Y, Goldenberg A, Wong K-C, Zhang Z. A probabilistic approach to explore human miRNA targetome by integrating miRNA-overexpression data and sequence information. Bioinformatics. 2014;30(5):621–628.
Wang Y, Li Y, Toth JI, Petroski MD, Zhang Z, Zhao JC. N(6)-methyladenosine modification destabilizes developmental regulators in embryonic stem cells. Nature Cell Biology. 2014;16(2):1-10.
2013
Wong K-C, Chan T-M, Peng C, Li Y, Zhang Z. DNA motif elucidation using belief propagation. Nucleic Acids Research. 2013;41(16):e153.
Li Y, Zhao DY, Greenblatt JF, Zhang Z. RIPSeeker: a statistical package for identifying protein-associated transcripts from RIP-seq experiments. Nucleic Acids Research. 2013;41(8):e94–e94.
2012
Arsenault RJ, Li Y, Potter A, Griebel PJ, Kusalik A, Napper S. Induction of ligand-specific PrP (C) signaling in human neuronal cells. Prion. 2012;6(5):477-488.
Arsenault RJ, Li Y, Bell, K., Doig, K., Potter, A., Griebel, P. J., Kusalik, A., and Napper, S. Mycobacterium avium subsp. paratuberculosis Inhibits Interferon Gamma-Induced Signaling in Bovine Monocytes. Insights into the Cellular Mechanisms of Johne's Disease. Infect Immunity 2012.
Arsenault RJ, Li Y, Maattanen, P., Scruten, E., Doig, K., Potter, A., Griebel, P., Kusalik, A., and Napper, S. Altered Toll-like receptor 9 signaling in Mycobacterium avium subsp. paratuberculosis-infected bovine monocytes reveals potential therapeutic targets. Infect Immunity. 2012;81(1):226-237.
Li Y, Arsenault RJ, Trost, B., Slind, J., Griebel, P. J., Napper, S., Kusalik, A. A Systematic Approach for Analysis of Peptide Array Kinome Data. Science Signaling. 2012;5(220):pl2-pl2.
(*=equal contribution)
Book chapters:
Wong, K. C., Li, Y., & Zhang, Z. (2015). Unsupervised Learning in Genome Informatics. arXiv preprint arXiv:1508.00459.
Zhao, D, Li Y, Greenblatt, J, Zhang, Z (2014). ncRNA–Protein Interactions in Development and Disease from the Perspective of High-Throughput Studies. In A. Emili, J. Greenblatt, & S. Wodak (Eds.), Systems Analysis of Chromatin-Related Protein Complexes in Cancer (pp. 87-115). Springer New York.