Cassandra's survey summarizes applications of partially observable Markov decision processes (POMDPs). MDPs are widely used in artificial intelligence and planning. These problems are structured as states and transitions between states, with costs (or rewards) attached to the transitions and states. The goal is to find an optimal policy: a rule that tells the agent which action to take in each state so as to maximize expected reward (or minimize expected cost) over time.
The POMDP model consists of:
- States
- Actions
- Observations
- A state transition function (the probability of each next state, given the current state and action)
- An observation function (the probability of each observation, given the resulting state and action)
- An immediate reward function
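To make these pieces concrete, here is a minimal sketch (not from the survey; the two-state machine-maintenance numbers are invented) of how the components fit together, along with the Bayes-filter belief update a POMDP agent performs after each action and observation:

```python
import numpy as np

# A toy two-state POMDP (all numbers are illustrative, not from the survey).
states = ["ok", "broken"]          # S
actions = ["run", "repair"]        # A
observations = ["quiet", "noisy"]  # Z

# State transition function: T[a][s, s'] = P(s' | s, a)
T = {
    "run":    np.array([[0.9, 0.1],    # an ok machine usually stays ok
                        [0.0, 1.0]]),  # a broken machine stays broken
    "repair": np.array([[1.0, 0.0],
                        [0.8, 0.2]]),  # repair usually fixes the machine
}

# Observation function: O[a][s', z] = P(z | s', a)
O = {
    "run":    np.array([[0.8, 0.2],    # an ok machine is usually quiet
                        [0.3, 0.7]]),  # a broken one is usually noisy
    "repair": np.array([[0.9, 0.1],
                        [0.4, 0.6]]),
}

# Immediate reward function: R[a][s] = reward for taking action a in state s
R = {
    "run":    np.array([ 1.0, -5.0]),  # running a broken machine is costly
    "repair": np.array([-2.0, -2.0]),  # repairs cost the same either way
}

def belief_update(b, a, z):
    """Bayes-filter update of the belief state b after action a and observation z."""
    zi = observations.index(z)
    # Predict: push the belief through the transition model.
    predicted = b @ T[a]
    # Correct: weight each state by how likely it is to emit the observation.
    unnormalized = predicted * O[a][:, zi]
    return unnormalized / unnormalized.sum()

b = np.array([0.5, 0.5])             # start maximally uncertain
b = belief_update(b, "run", "noisy")
print(b)                             # belief shifts toward "broken"
```

The belief vector is what a POMDP policy actually conditions on, rather than the true (hidden) state, and planning over that continuous belief space is where the extra cost over a plain MDP comes from.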
Some example applications include:
- Machine maintenance - the condition of a machine's parts is modeled as states, and the goal is to minimize repair costs or maximize the machine's up-time.
- Autonomous robots - robots need to navigate or accomplish a goal with a set of actions, while the world is only partially observable through noisy sensors.
- Machine vision - deciding where to direct the high-resolution fovea of a camera image, e.g., toward specific parts of a scene such as people's hands and heads.
Discussion:
This paper has little to do with what we've been discussing in class. Although POMDPs are interesting from a theoretical standpoint, their intractability is a major reason to avoid them in practical domains. I've been trying to think of how to apply them to gesture recognition; one idea was to model hand positions as states for a single gesture, but then the model essentially becomes an HMM with a reward function, and I'm not sure how much a reward function buys you once the computational cost is taken into account.
1 comment:
Yeah, the only reward I can think of is one based on context. For example, maybe reward a sign language recognizer for recognizing a sequence of letters that forms an actual word over a sequence that forms a non-word.
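A rough sketch of that idea (hypothetical; the word list and reward values are invented): the reward function could simply check whether a decoded letter sequence spells a dictionary word.

```python
# A hypothetical context-based reward for a sign language recognizer:
# prefer letter sequences that form real words (word list and scores invented).
DICTIONARY = {"cat", "hat", "car"}

def context_reward(letters):
    """Reward a decoded letter sequence more if it spells an actual word."""
    word = "".join(letters)
    return 1.0 if word in DICTIONARY else -0.1

print(context_reward(["c", "a", "t"]))  # 1.0  -> real word
print(context_reward(["c", "q", "t"]))  # -0.1 -> non-word
```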