Temporal-difference Learning and the Coming of Artificial Intelligence
- University of Alberta
Oct. 3, 2014, 2:30 p.m.
When mankind finally comes to understand the principles of intelligence, and how
they can be embodied in machines, it will be the most important discovery of our
age, perhaps of any age. The coming of AI is not imminent, but for almost
everybody now alive the chance of it happening in their lifetimes is
nonnegligible. For AI researchers, it is a great prize; though we rarely talk
about it, we should be discussing now how our efforts might contribute to
attaining it. In this talk, I review these considerations and how they have led me
to focus on general learning algorithms for real-time prediction and control. In
particular, I focus on temporal-difference (TD) learning, a method specialized for
making long-term predictions from unprepared data, such as could be obtained by a
robot interacting with its environment without human supervision. I present recent
results that have deepened our understanding of TD learning and that suggest how
it may be relevant to perception and to the acquisition of world knowledge
generally. TD learning may not be key to the coming of AI, but it is a good
example of the kind of research that could make a fundamental contribution to it.
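To make the method named in the abstract concrete: tabular TD(0) updates a value estimate toward a "bootstrapped" target built from the next reward and the next state's own estimate. The sketch below runs TD(0) on a small random-walk task (the environment, step size, and episode count here are illustrative assumptions, not details from the talk):

```python
import random

# Tabular TD(0) on a 5-state random walk (an illustrative task):
# states 1..5 are non-terminal, 0 and 6 are terminal, and the agent
# receives reward 1 only on reaching state 6. The step size and
# episode count below are assumed values for the sketch.

ALPHA = 0.1   # step size
GAMMA = 1.0   # no discounting in this episodic task

def td0(num_episodes=1000, seed=0):
    rng = random.Random(seed)
    V = [0.0] * 7  # value estimates; V[0] and V[6] stay 0 (terminal)
    for _ in range(num_episodes):
        s = 3  # every episode starts in the middle state
        while s not in (0, 6):
            s_next = s + rng.choice((-1, 1))  # move left or right
            r = 1.0 if s_next == 6 else 0.0
            # TD(0) update: nudge V[s] toward the bootstrapped
            # target r + gamma * V[s_next]
            V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])
            s = s_next
    return V

values = td0()
# The true values of states 1..5 are 1/6, 2/6, ..., 5/6, and the
# learned estimates approach them as episodes accumulate.
```

Note that each update uses only the current transition, with no model of the environment and no waiting for the episode's final outcome; this is what lets TD learning make long-term predictions from a raw stream of experience.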
Richard S. Sutton is a professor and iCORE Chair in the Department of Computing Science at the University of Alberta. He is a Fellow of the Association for the Advancement of Artificial Intelligence and co-author of the textbook Reinforcement Learning: An Introduction (MIT Press). Before joining the University of Alberta in 2003, he worked in industry at AT&T and GTE Labs, and in academia at the University of Massachusetts. He received a PhD in computer science from the University of Massachusetts in 1984 and a BA in psychology from Stanford University in 1978.

Rich's research interests center on the learning problems facing a decision-maker interacting with its environment, which he sees as central to artificial intelligence. He is also interested in animal learning psychology, in connectionist networks, and generally in systems that continually improve their representations and models of the world.