|
We consider the problem of learning the behavior of a POMDP (Partially Observable Markov Decision Process) with deterministic actions and observations. The problem is challenging because observations only partially identify the underlying states. Recent work by Holmes and Isbell offers an approach for inferring the hidden states from experience in deterministic POMDP environments. Our goal is an alternative algorithm that produces more accurate predictions while keeping the size of its learned machine close to minimal.
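To make the setting concrete, below is a small illustrative sketch (my own toy example, not part of the project or of Holmes and Isbell's method) of a deterministic POMDP in Python. Actions and observations are both deterministic, yet two distinct states emit the same observation, so an agent must rely on the history of actions and observations to tell where it is.

# Toy deterministic POMDP: a 4-state corridor.
# Transitions and observations are deterministic, but states 1 and 2
# both emit "corridor", so the observation alone cannot identify the state.

class DeterministicPOMDP:
    def __init__(self):
        # transition[state][action] -> next state (deterministic)
        self.transition = {
            0: {"left": 0, "right": 1},
            1: {"left": 0, "right": 2},
            2: {"left": 1, "right": 3},
            3: {"left": 2, "right": 3},
        }
        # observation[state] -> symbol (deterministic, but not unique)
        self.observation = {0: "wall", 1: "corridor", 2: "corridor", 3: "wall"}
        self.state = 0

    def step(self, action):
        """Apply an action and return the resulting observation."""
        self.state = self.transition[self.state][action]
        return self.observation[self.state]


if __name__ == "__main__":
    env = DeterministicPOMDP()
    # The observation "corridor" is produced by two different hidden states,
    # so the agent must use its action/observation history to disambiguate.
    for action in ["right", "right", "right", "left"]:
        print(action, "->", env.step(action))

A learning algorithm for this setting must build a predictive model (a "learned machine") from such action/observation sequences; the aim of this project is for that machine to predict accurately while staying close to the minimal size needed to represent the environment.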
|
Dorna Kashef Haghighi |
|