May 30th – June 5th

After discussing the previous week's results in our meeting, we concluded that it does not make sense to look for a map from primal states to the double dual in POPAs and POMDPs. I also came to understand that an automaton is minimal when no two of its states are equivalent. By that definition the double dual is still minimal even though it has infinitely many states; the reason the primal has finitely many states while the double dual has infinitely many is that they represent two different kinds of machines (the primal is nondeterministic, while the double dual is deterministic). Doina Precup pointed out that a double dual with infinitely many states is not useful for learning, and that what we should do next is study ways to approximate it.

Now that we have gained sufficient background in duality, it is time to apply it to learning algorithms. For next week I should read an ECML paper entitled “PAC-Learning of Markov Models with Hidden State”.

June 6th – June 12th

This week was spent mainly on reading the assigned paper. The paper first approximates an HMM (Hidden Markov Model) with a PDFA, and then uses a PAC-learning algorithm to learn the structure of the model from the given trajectories. The main idea of the algorithm is based on splitting, merging, and promoting states. It keeps a list of safe states and a list of candidate states: the safe states are those that will appear in the final model, while the candidate states are those we are not yet sure represent unique states in the model or should instead be merged with existing safe states. The decision to either merge or promote a candidate state is made once we have seen enough trajectories from it (at which point the state is declared large). A candidate state is merged with a safe state if the difference between the probability distributions over the trajectories observed from the two states falls within the given bound. The problem with this algorithm is that it needs a very large number of trajectories to learn the model, and we want to find a way to improve this. The sketch below shows the merge/promote test as I currently understand it.
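
To make sure I understood the merge/promote step, I wrote a small sketch of the test. This is only my reading of the paper, not the authors' code: the trajectory representation, the L-infinity comparison of empirical distributions, and the names `mu` and `n_large` are assumptions on my part.

```python
from collections import Counter

def empirical_dist(trajectories):
    """Empirical probability of each trajectory observed from a state."""
    counts = Counter(trajectories)          # trajectories assumed to be tuples of symbols
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def linf_distance(p, q):
    """L-infinity distance between two empirical trajectory distributions."""
    support = set(p) | set(q)
    return max(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

def decide(candidate_trajs, safe_states, mu=0.05, n_large=1000):
    """Merge a large candidate into a close safe state, or promote it.

    `mu` (distinguishability bound) and `n_large` (largeness threshold)
    are illustrative values, not the paper's.
    """
    if len(candidate_trajs) < n_large:
        return ("wait", None)               # not enough data yet: state is not large
    cand = empirical_dist(candidate_trajs)
    for name, safe_trajs in safe_states.items():
        if linf_distance(cand, empirical_dist(safe_trajs)) <= mu:
            return ("merge", name)          # indistinguishable at level mu: merge
    return ("promote", None)                # distinct from every safe state: promote
```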

June 13th – June 19th

Ricard Gavaldà, a co-author of the HMM PAC-learning paper, joined our meeting this week. He had some interesting ideas about how to make the current algorithm more efficient. His main idea was to eliminate the distinguishability parameter of the algorithm and, at each step, merge the candidate state with the most similar safe state. But we should be able to undo this operation later if, by observing more trajectories, we find out that the merge was wrong. He therefore suggested keeping the information of the merged candidate state separately, to avoid rebuilding the model from scratch in that case. The complexity of the algorithm is still a function of the hidden distinguishability of the environment, and this modification increases the computation time, but for deterministic environments we might get better complexity. Doina is going to send me the code of the current PAC-learning algorithm, so that I can study how it works and make these modifications afterwards. We are also interested in whether the current algorithm can be simplified for PDFAs; in particular, we want to see if we can eliminate the largeness condition for deterministic models. This is what I am going to explore next week. The sketch below shows one way the undoable greedy merge could be organized.
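
Here is a rough sketch of that idea, reusing `empirical_dist` and `linf_distance` from the previous sketch. The `GreedyMerger` class and its bookkeeping are purely my own illustration of the suggestion, not Ricard's design.

```python
class GreedyMerger:
    def __init__(self, safe_states):
        self.safe = safe_states   # safe state name -> list of its trajectories
        self.pending = {}         # candidate name -> (merge target, its own sample)

    def merge_closest(self, cand_name, cand_trajs):
        """Merge the candidate into the most similar safe state (no mu threshold),
        keeping the candidate's own sample aside so the merge can be undone."""
        cand = empirical_dist(cand_trajs)
        target = min(self.safe,
                     key=lambda s: linf_distance(cand, empirical_dist(self.safe[s])))
        self.pending[cand_name] = (target, list(cand_trajs))
        self.safe[target] = self.safe[target] + list(cand_trajs)
        return target

    def undo(self, cand_name):
        """Reinstate a candidate if later trajectories show the merge was wrong."""
        target, cand_trajs = self.pending.pop(cand_name)
        for t in cand_trajs:
            self.safe[target].remove(t)    # take back the candidate's contribution
        self.safe[cand_name] = cand_trajs  # the candidate becomes a state of its own
```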

June 20th – June 26th

Working through some examples, we found that the distinguishability parameter and the largeness condition are still necessary for DFAs. If we eliminate the distinguishability parameter, then two states are merged if and only if their trajectory sets are exactly the same. But this is problematic: even two samples drawn from the very same state will almost never produce exactly the same set of trajectories, so equivalent states would essentially never be merged (the toy sketch below illustrates this). Although these parameters cannot simply be dropped, we might be able to simplify their definitions for deterministic systems. For next week I am going to read an ICML paper entitled “Looping Suffix Tree-Based Inference of Partially Observable Hidden State”, written by Michael P. Holmes and Charles Lee Isbell. The paper presents an algorithm for inferring hidden state from experience in environments of the POMDP form, but with deterministic transition and observation functions. What we are interested in is comparing the merge-split algorithm with this looping PST algorithm, to see which one is more appropriate to modify for learning DASOs.
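
A toy experiment (my own construction, not the example we discussed in the meeting) makes the problem concrete: two trajectory samples drawn from the exact same state almost never coincide as sets, so an exact-equality merge rule would leave equivalent states unmerged.

```python
import random

random.seed(0)

def sample_trajectories(n, p_stop=0.5):
    """Draw n trajectories from a toy state: emit 'a' or 'b' until stopping."""
    trajs = set()
    for _ in range(n):
        t = []
        while random.random() > p_stop:
            t.append(random.choice("ab"))
        trajs.add(tuple(t))
    return trajs

# Two samples from the SAME underlying state:
s1 = sample_trajectories(100)
s2 = sample_trajectories(100)
print(s1 == s2)   # almost always False: exact equality would never merge them

```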
Dorna Kashef Haghighi |