
Joelle Pineau



Email: jpineau AT cs DOT mcgill DOT ca
Home Page: http://www.cs.mcgill.ca/~jpineau/
Office: MC106N
Phone: +1-514-398-5432
Fax: +1-514-398-3883

Research Description

Reinforcement Learning and POMDPs

  • Point-based value iteration in POMDPs: (with Stephane Ross)
    We propose a class of anytime approximate algorithms for solving discrete POMDPs.
  • Active learning in POMDPs: (with Amin Atrash and Doina Precup)
    Most approximate POMDP solutions assume a known model, but in many domains it is hard to obtain an accurate one. We look at ways of formulating POMDP solutions that take model uncertainty into account and improve model accuracy through careful selection of queries.
  • Fast function approximation for POMDPs: (with Keith Bush and Robert Kaplow)
    Function approximation has been used to accelerate reinforcement learning in fully observable domains. We are now looking at ways to do this in the POMDP framework to solve problems with large state spaces.

Robotics

  • Multi-modal wheelchair control: (with Amin Atrash, Robert Kaplow, Julien Villemure and Robert West)
    This project looks at customizing a robotic wheelchair so that it can be operated by a person with severe mobility impairments. The goal is to optimize a flexible multi-modal interface that allows safe and effective high-level control of the wheelchair.
  • Large-scale dialogue management: (with Amin Atrash and Robert West)
    The role of a dialogue manager is to pick appropriate actions and responses when interacting with a person. In this project, we aim to build a large-scale dialogue manager for a robot interface, and to apply machine learning techniques (e.g. reinforcement learning, parameter estimation) to optimize its performance.
  • Robot-assisted care of elderly patients:
    The goal of the Nursebot project is to look into how robotic technology can be used to promote independence and quality of life in elderly people with physical and cognitive disabilities.

Adaptive treatment design

  • Adaptive treatment for the STAR*D trials: (with Mahdi Milani Fard, Susan Shortreed and Susan Murphy)
    The STAR*D trials consist of large multi-stage randomized treatment sequences for people with clinical depression. The goal for computer scientists is to automatically learn optimal treatment sequences. This can be phrased as a reinforcement learning problem, where we use the data collected in the STAR*D trials to learn an (approximately) optimal policy.
  • Computational modelling and adaptive treatment of epilepsy: (with Massimo Avoli, Keith Bush, Arthur Guez and Robert Vincent)
    This project implements a mathematical model to synthesize time-series data that exhibits the multi-cell synchronous patterns that are characteristic of epilepsy. We also investigate the use of reinforcement learning to optimize treatment strategies for epilepsy.

Selected Publications

[1] Fard, M. M., and Pineau, J. MDPs with non-deterministic policies. In Neural Information Processing Systems (NIPS), 2009. Publication in refereed conference proceedings (8 pages).
[2] Atrash, A., and Pineau, J. A bayesian reinforcement learning approach for customizing human-robot interfaces. In International Conference on Intelligent User Interfaces (IUI), 2009. Publication in refereed conference proceedings (5 pages).
[3] Ross, S., Chaib-draa, B., and Pineau, J. Bayes-adaptive POMDPs. In Neural Information Processing Systems (NIPS), 2008. Publication in refereed conference proceedings. Co-authors: Brahim Chaib-draa (Universite Laval) (8 pages).
[4] Ross, S., Chaib-draa, B., and Pineau, J. Bayesian reinforcement learning in continuous POMDPs with application to robot navigation. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2008. Publication in refereed conference proceedings. Co-authors: Brahim Chaib-draa (Universite Laval) (7 pages).
[5] Ross, S., Pineau, J., and Chaib-draa, B. Theoretical analysis of heuristic search methods for online POMDPs. In Neural Information Processing Systems (NIPS), 2008. Publication in refereed conference proceedings. Co-authors: Brahim Chaib-draa (Universite Laval) (8 pages).
[6] Doshi, F., Pineau, J., and Roy, N. Reinforcement learning with limited reinforcement: Using bayes risk for active learning in POMDPs. In International Conference on Machine Learning (ICML), 2008. Publication in refereed conference proceedings. Co-authors: Finale Doshi (MIT), Nicholas Roy (MIT) (8 pages).
[7] Guez, A., Vincent, R., Avoli, M., and Pineau, J. Adaptive treatment of epilepsy via batch-mode reinforcement learning. In Innovative Applications of Artificial Intelligence (IAAI), 2008. Publication in refereed conference proceedings. Co-authors: Massimo Avoli (Montreal Neurological Institute) (8 pages).
[8] Fard, M. M., Pineau, J., and Sun, P. A variance analysis for POMDP policy evaluation. In AAAI Conference on Artificial Intelligence, 2008. Publication in refereed conference proceedings. Co-authors: Peng Sun (Duke University) (6 pages).
[9] Ross, S., and Pineau, J. Model-based bayesian reinforcement learning in large structured domains. In Conference on Uncertainty in Artificial Intelligence (UAI), 2008. Publication in refereed conference proceedings (8 pages).
[10] Ross, S., Pineau, J., Paquet, S., and Chaib-draa, B. Online planning algorithms for POMDPs. Journal of Artificial Intelligence Research (JAIR), 2008, v. 32, pp. 663-704. Co-authors: Sebastien Paquet (Universite Laval) and Brahim Chaib-draa (Universite Laval).
[11] Paduraru, C., Precup, D., Ross, S., and Pineau, J. Model-based bayesian reinforcement learning with tree-based state aggregation. (Abstract). NIPS workshop on Model Uncertainty and Risk in Reinforcement Learning, 2008.
[12] Paduraru, C., Kaplow, R., Precup, D., and Pineau, J. Model-based reinforcement learning with state aggregation. (Abstract). European Workshop on Reinforcement Learning, 2008.
[13] Guez, A., Vincent, R., Avoli, M., and Pineau, J. Adaptive treatment of epilepsy via batch-mode reinforcement learning. (Abstract). European Workshop on Reinforcement Learning, 2008.
[14] Pineau, J., Bellemare, M. G., Rush, A. J., Ghizaru, A., and Murphy, S. A. Constructing evidence-based treatment strategies using methods from computer science. Drug and Alcohol Dependence, 2007, pp. S52-S60.
[15] Jaulmes, R., Pineau, J., and Precup, D. Apprentissage actif dans les processus decisionnels de markov partiellement observables. Revue d'Intelligence Artificielle, 2007, v. 21, n. 1, pp. 9-34.
[16] Roy, N., and Pineau, J. Gerontechnology: Growing Old in a Technological Society. Charles C. Thomas Publisher Ltd., 2007.
[17] Jaulmes, R., Pineau, J., and Precup, D. A formal framework for robot learning and control under model uncertainty. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2007. Publication in refereed conference proceedings.
[18] Pineau, J., and Atrash, A. Smartwheeler: A robotic wheelchair test-bed for investigating new models of human-robot interaction. In Proceedings of the AAAI Spring Symposium on Multidisciplinary Collaboration for Socially Assistive Robotics, 2007.
[19] Pineau, J., Bellemare, M. G., Rush, A. J., Ghizaru, A., and Murphy, S. A. Constructing evidence-based treatment strategies using methods from computer science. Drug and Alcohol Dependence, 2007, v. 88S, pp. S52-S60. Co-authors: A. John Rush (University of Texas, Southwestern Medical Center), Susan A. Murphy (University of Michigan).
[20] Roy, N., and Pineau, J. Gerontechnology: Growing Old in a Technological Society, chapter Robotics and Independence for the Elderly, pp. 209-242. Charles C. Thomas Publisher Ltd., 2007. Co-authors: Nicholas Roy (MIT).
[21] Jaulmes, R., Pineau, J., and Precup, D. A formal framework for robot learning and control under model uncertainty. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2007, pp. 2104-2110. Publication in refereed conference proceedings.
[22] Vincent, R., Pineau, J., de Guzman, P., and Avoli, M. Recurrent boosting for classification of natural and synthetic time-series data. In Canadian Conference on Artificial Intelligence (CanAI), 2007, pp. 192-293. Publication in refereed conference proceedings. Co-authors: Philip de Guzman and Massimo Avoli (Montreal Neurological Institute).
[23] Pineau, J., and Atrash, A. Smartwheeler: A robotic wheelchair test-bed for investigating new models of human-robot interaction. In Proceedings of the AAAI Spring Symposium on Multidisciplinary Collaboration for Socially Assistive Robotics, 2007, pp. 59-64. Publication in refereed workshop proceedings.
[24] Ross, S., Chaib-draa, B., and Pineau, J. Bayes-adaptive POMDPs. 2007.
[25] Pineau, J., Gordon, G., and Thrun, S. Anytime point-based approximations for large POMDPs. Journal of Artificial Intelligence Research (JAIR), 2006, v. 27, pp. 335-380.
[26] Gavalda, R., Keller, P., Pineau, J., and Precup, D. PAC-learning of Markov models with hidden states. In European Conference on Machine Learning (ECML), 2006, pp. 150-161.
[27] Hundt, C., Panangaden, P., Pineau, J., and Precup, D. Representing systems with hidden state. In National Conference on Artificial Intelligence (AAAI), 2006.
[28] Burfoot, D., Pineau, J., and Dudek, G. RRT-Plan: a randomized algorithm for STRIPS planning. In International Conference on Automated Planning and Scheduling (ICAPS), 2006, pp. 362-365.
[29] Atrash, A., and Pineau, J. Efficient planning and tracking in POMDPs with large observation spaces. In AAAI-06 Workshop on Empirical and Statistical Approaches for Spoken Dialogue Systems, 2006, pp. 7-12.
[30] Vincent, R., Pineau, J., de Guzman, P., and Avoli, M. Recurrent boosting methods for time-dependent classification of epileptiform signals. 2006.
[31] Vlassis, N., Gordon, G., and Pineau, J. Planning under uncertainty in robotics. Robotics and Autonomous Systems, 2006, v. 11, pp. 885-886.
[32] Vincent, R., Pineau, J., and Courville, A. Modeling and control of neural network dynamics. Symposium on Computational Neurosciences, 2006.

Last Update:   2013/08/05 08:52:21.205 GMT-4