Jia Yuan Yu - Concordia University
Nov. 27, 2015, 2:30 p.m. - Nov. 27, 2015, 3:30 p.m.
Whereas classical Markov decision processes maximize the expected
reward, we consider minimizing the risk. We propose to evaluate the
risk associated with a given policy over a sufficiently long time horizon
with the help of a central limit theorem. The proposed approach
works whether the transition probabilities are known or not. We also
provide a gradient-based policy improvement algorithm that converges
to a local optimum of the risk objective.
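
As a rough illustration of the idea of evaluating risk through a central limit theorem, the sketch below simulates a small Markov reward chain under a fixed policy and uses a normal approximation to estimate the probability that the cumulative reward over a long horizon falls below a threshold. This is not the speaker's algorithm: the two-state chain, the reward values, the batch-means variance estimator, and the function names `simulate_rewards` and `clt_risk` are all assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist, mean, variance

# Hypothetical two-state chain induced by some fixed policy (illustrative numbers).
P = [[0.9, 0.1],
     [0.5, 0.5]]
R = [1.0, -2.0]  # reward collected in each state (assumed values)


def simulate_rewards(n_steps, seed=0):
    """Simulate the chain from state 0 and return the per-step rewards."""
    rng = random.Random(seed)
    state = 0
    rewards = []
    for _ in range(n_steps):
        rewards.append(R[state])
        state = 0 if rng.random() < P[state][0] else 1
    return rewards


def clt_risk(rewards, horizon, threshold, batch=200):
    """Approximate P(sum of `horizon` rewards < threshold) by a normal law."""
    mu = mean(rewards)
    # Batch means: batch size times the variance of the batch averages
    # estimates the asymptotic variance of the reward process.
    n_batches = len(rewards) // batch
    batch_means = [mean(rewards[i * batch:(i + 1) * batch])
                   for i in range(n_batches)]
    sigma2 = batch * variance(batch_means)
    z = (threshold - horizon * mu) / math.sqrt(horizon * sigma2)
    return mu, sigma2, NormalDist().cdf(z)


rewards = simulate_rewards(100_000)
mu, sigma2, risk = clt_risk(rewards, horizon=1000, threshold=400.0)
print(f"mean reward per step: {mu:.3f}")
print(f"asymptotic variance:  {sigma2:.3f}")
print(f"estimated risk:       {risk:.4f}")
```

Because the estimate uses only an observed trajectory, the same computation applies whether or not the transition matrix is known; here `P` is used only to generate the data.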
Jia Yuan Yu is an associate professor at the Concordia Institute for Information Systems Engineering. He received his Bachelor's, Master's, and Ph.D. degrees from McGill University. He was previously a faculty member in Computer Science at Dublin City University, and completed postdocs at Stanford and HEC Paris. He also spent four years as a research scientist at IBM in Dublin, where his main interests were applying machine learning, statistics, and game theory to multi-agent systems, and designing smart cities to mitigate climate change.