Research Interests

My interests are mainly in fully and partially observable Markov Decision Processes, Reinforcement Learning and Artificial Intelligence in general, although I am also interested in control, operations research and verification.

I really enjoy math for its sheer beauty; fields I have dabbled in, greatly enjoyed, and often used in my research include linear programming (in finite and infinite spaces), topology, measure theory and category theory.

Publications

[P.S. Castro, D. Zhang, C. Chen, S. Li, G. Pan] (2013): From Taxi GPS Traces to Social and Community Dynamics: A Survey. To appear in ACM Computing Surveys.
[C. Chen, D. Zhang, P.S. Castro, N. Li, S. Li, Z. Wang] (2013): iBOAT: Isolation-based On-line Anomalous Trajectory Detection. To appear in Transactions on Intelligent Transportation Systems.
[L. Sun, D. Zhang, C. Chen, P.S. Castro, S. Li, Z. Wang] (2012): Real Time Anomalous Trajectory Detection and Analysis. To appear in Mobile Networks and Applications.
[P.S. Castro, D. Zhang, S. Li] (2011): Urban traffic modelling and prediction using large scale taxi GPS traces. In Proceedings of the 10th International Conference on Pervasive Computing.
[C. Chen, D. Zhang, P.S. Castro, N. Li, L. Sun, S. Li] (2011): Real-time Detection of Anomalous Taxi Trajectories from GPS Traces. In Proceedings of the 8th International ICST Conference on Mobile and Ubiquitous Systems. Runner-up for best paper award.
[P.S. Castro, D. Precup] (2011): Automatic construction of temporally extended actions for MDPs using bisimulation metrics. In Proceedings of the 9th European Workshop on Reinforcement Learning (EWRL 2011).
[P.S. Castro, D. Precup] (2010): Smarter Sampling in Model-Based Bayesian Reinforcement Learning. In Proceedings of the European Conference on Machine Learning and Practice of Knowledge Discovery in Databases (ECML-PKDD 2010).
[P.S. Castro, D. Precup] (2010): Using bisimulation for policy transfer in MDPs. In Proceedings of the 24th AAAI Conference (AAAI-10).
[P.S. Castro, D. Precup] (2010): Using bisimulation for policy transfer in MDPs (Extended abstract). In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-10).
[P.S. Castro, D. Precup] (2010): Using bisimulation for policy transfer in MDPs. In 10th Adaptive and Learning Agents Workshop (ALA-10).
[P.S. Castro, P. Panangaden, D. Precup] (2009): Equivalence relations in Fully and Partially Observable Markov Decision Processes. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09). pp. 1653--1658.
[P.S. Castro, D. Precup] (2007): Using linear programming for Bayesian exploration in Markov Decision Processes. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI-07). pp. 2437--2442. Poster.
[N. Ferns, P.S. Castro, D. Precup, P. Panangaden] (2006): Methods for Computing State Similarity in MDPs. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006). pp. 174--181. AUAI Press.

Theses

[P.S. Castro] (2011): On planning, prediction and knowledge transfer in Fully and Partially Observable Markov Decision Processes. PhD thesis, McGill University.
[P.S. Castro] (2007): Bayesian exploration in Markov Decision Processes. Master's thesis, McGill University.

Postdoc work

I led a project that aimed to extract the underlying behaviours and dynamics of a city's road network, using a year's worth of GPS logs from around 5000 taxis in a large city in China. I developed a novel algorithm, based on inverse Reinforcement Learning, for learning behaviours from multiple ranked experts; this algorithm was used to extract good passenger-finding strategies.

I developed a mechanism for predicting future traffic conditions, as well as automatically detecting the capacity of the different roads in a city.

Finally, I co-supervised two Ph.D. students working on characterizing and automatically detecting anomalous taxi routes.

PhD work

During my PhD, I investigated equivalence notions and state metrics for fully and partially observable Markov Decision Processes (MDPs and POMDPs, respectively). Most of this work has revolved around bisimulation, which is a well-known equivalence relation coming from concurrency theory. I constructed a hierarchy relating many equivalence notions (some of which were introduced for the first time) in MDPs and POMDPs.
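For the finite case, the coarsest bisimulation can be computed by naive partition refinement: start by grouping states with identical rewards, then keep splitting blocks until all states in a block send the same probability mass into every block, under every action. The sketch below is illustrative only (it is not code from any of the papers above, and the dictionary-based MDP representation is my own choice for clarity):

```python
def bisimulation_classes(states, actions, R, P):
    """Coarsest bisimulation of a finite MDP via naive partition refinement.

    R[s][a] is the immediate reward for action a in state s;
    P[s][a] is a dict mapping next states to transition probabilities.
    Two states end up in the same block iff they agree on immediate
    rewards and on the probability of reaching every block, per action.
    """
    # Initial partition: group states with identical reward profiles.
    groups = {}
    for s in states:
        groups.setdefault(tuple(R[s][a] for a in actions), []).append(s)
    blocks = list(groups.values())

    while True:
        index = {s: i for i, block in enumerate(blocks) for s in block}
        refined = {}
        for s in states:
            # Signature: current block, plus the probability mass sent
            # into each block under each action (rounded for float noise).
            key = (index[s],) + tuple(
                round(sum(P[s][a].get(t, 0.0) for t in block), 9)
                for a in actions for block in blocks
            )
            refined.setdefault(key, []).append(s)
        new_blocks = list(refined.values())
        if len(new_blocks) == len(blocks):  # fixed point: no block split
            return blocks
        blocks = new_blocks
```

For example, two zero-reward states that both transition deterministically to the same goal state land in one block, since no signature can ever tell them apart.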

I have also investigated using bisimulation metrics to transfer policies from a small MDP to a larger one. Because bisimulation metrics are expensive to compute, we provided approximations that scale relatively well to larger spaces while still providing good empirical performance.
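The quantitative version of this idea is a metric: the fixed point of an operator combining reward differences with a Kantorovich (1-Wasserstein) distance between transition distributions. The sketch below is illustrative, not the approximations from the papers; it assumes one common weighting, (1-gamma) on rewards and gamma on transitions, and solves each Kantorovich term as a small transportation LP, which is precisely why exact computation gets expensive:

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich(p, q, d):
    """1-Wasserstein distance between distributions p and q over n
    states, with ground metric d (n x n), as a transportation LP."""
    n = len(p)
    c = d.reshape(-1)  # cost of moving one unit of mass from i to j
    A_eq, b_eq = [], []
    for i in range(n):              # row marginals must equal p
        row = np.zeros((n, n))
        row[i, :] = 1.0
        A_eq.append(row.reshape(-1))
        b_eq.append(p[i])
    for j in range(n):              # column marginals must equal q
        col = np.zeros((n, n))
        col[:, j] = 1.0
        A_eq.append(col.reshape(-1))
        b_eq.append(q[j])
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None))
    return res.fun

def bisim_metric(R, P, gamma=0.9, iters=50):
    """Iterate d(s,t) <- max_a [(1-gamma)|R[s,a] - R[t,a]|
                               + gamma * W1(P[s,a], P[t,a]; d)].
    R: (n, k) reward array; P: (n, k, n) transition array. One LP per
    state pair, per action, per iteration -- hence the cost."""
    n, k = R.shape
    d = np.zeros((n, n))
    for _ in range(iters):
        new = np.zeros((n, n))
        for s in range(n):
            for t in range(s + 1, n):
                new[s, t] = new[t, s] = max(
                    (1 - gamma) * abs(R[s, a] - R[t, a])
                    + gamma * kantorovich(P[s, a], P[t, a], d)
                    for a in range(k)
                )
        d = new
    return d
```

States at distance zero under this metric are exactly bisimilar, which is what makes it a natural tool for deciding which states of the large MDP can safely inherit a policy from the small one.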

I have also worked on sampling methods for Bayesian reinforcement learning that are fast and perform very well empirically.

You can download my thesis here. I defended it on September 16th, 2011.

Master's work

In my Master's thesis I worked on Bayesian exploration in MDPs. It has been shown that an optimal exploratory policy can be defined by constructing a tree of all possible futures, starting from an initial prior. Unsurprisingly, this tree is infinite, so computing the optimal exploratory policy is infeasible except in special cases such as the gambling problem.
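To see what that tree looks like in the simplest gambling setting, consider a two-armed Bernoulli bandit: each node branches on the chosen arm and on the observed outcome, and each outcome updates the posterior. The sketch below is purely illustrative (it assumes a uniform Beta(1, 1) prior per arm, which is my choice, not a detail from the thesis) and evaluates a depth-truncated version of the tree:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def bayes_value(counts, depth):
    """Expected return of acting Bayes-optimally for `depth` more steps
    in a two-armed Bernoulli bandit, with an independent Beta(1, 1)
    prior per arm updated by the observed (successes, failures) counts.

    Each node branches on the chosen arm AND the observed outcome; the
    untruncated tree is infinite, which is what makes exact Bayesian
    exploration intractable in general.
    """
    if depth == 0:
        return 0.0
    best = 0.0
    for arm in (0, 1):
        s, f = counts[arm]
        p = (s + 1) / (s + f + 2)  # posterior mean success probability
        succ = counts[:arm] + ((s + 1, f),) + counts[arm + 1:]
        fail = counts[:arm] + ((s, f + 1),) + counts[arm + 1:]
        value = (p * (1 + bayes_value(succ, depth - 1))
                 + (1 - p) * bayes_value(fail, depth - 1))
        best = max(best, value)
    return best
```

Even with memoization the number of distinct belief states grows quickly with depth, which is what motivates approximating this computation rather than performing it exactly.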

Using the theory of linear programming, I developed a series of approximations that converge to the optimal policy in the limit, and showed empirically that they outperform state-of-the-art exploration algorithms.

You can download my thesis here. It was accepted in September 2007.

Contact

Email me at: pcastr AT cs DOT mcgill DOT ca

Here is my CV (in PDF format).