To choose the right action, a robot needs an accurate estimate of the state. Information-gathering tasks, such as robot exploration, exist precisely to improve state estimation, i.e., to reduce uncertainty. There are two types of uncertainty: uncertainty in action and uncertainty in perception.
- Uncertainty in action (deterministic versus stochastic action effects): The stochastic nature of the robot and its environment mandates that the robot sense at execution time and react to unanticipated situations, even if the environment state is fully observable.
- Uncertainty in perception (fully observable versus partially observable systems): Classical robotics often assumes that sensors can measure the full state of the environment, which is an unrealistic assumption.
In an MDP, only the first type of uncertainty, uncertainty in action, is considered. MDPs assume that the state of the environment can be fully measured at all times; in other words, the perceptual model P(o|s) is deterministic and bijective. However, an MDP does allow for stochastic action effects, that is, the action model P(s'|s,a) may be non-deterministic.
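A minimal sketch of these two MDP assumptions, with hypothetical names (TRANSITIONS, step, observe) invented for illustration: the action model is a probability distribution over successor states, while the observation simply reveals the state.

```python
import random

# Hypothetical world fragment: a stochastic action model P(s'|s,a)
# paired with the deterministic, bijective perceptual model P(o|s)
# that MDPs assume.
TRANSITIONS = {
    # (state, action) -> {successor state: probability}
    ("s0", "forward"): {"s1": 0.8, "s0": 0.2},  # the action may fail
    ("s1", "forward"): {"s2": 0.9, "s1": 0.1},
}

def step(state, action):
    """Sample a successor from the stochastic action model P(s'|s,a)."""
    dist = TRANSITIONS[(state, action)]
    return random.choices(list(dist), weights=list(dist.values()))[0]

def observe(state):
    """MDP assumption: the observation reveals the state exactly."""
    return state  # P(o|s) is deterministic and bijective
```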
How can one devise an algorithm for action selection that can cope with this type of uncertainty?
One solution might be to solve the problem of what to do by analyzing each possible situation that might hold under the current state of knowledge. However, the planning problem in a partially observable environment cannot be solved by considering all possible environments and averaging the solutions: an action whose sole purpose is to gather information has no value in any single fully known environment, so averaging over separately solved environments can never produce it.
Instead, the key idea is to generate plans in belief space. The belief space is the space of all belief distributions b that the robot might encounter; a belief reflects what the robot knows about the state of the world.
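To make this concrete, here is a sketch of how a belief is maintained over the discrete models from the sketch above; the helper names (T, O, states) are assumptions for illustration. After executing action a and receiving observation o, the belief is updated by the standard Bayes filter b'(s') ∝ P(o|s') Σ_s P(s'|s,a) b(s).

```python
def belief_update(b, a, o, T, O, states):
    """Discrete Bayes filter: b'(s') ∝ P(o|s') Σ_s P(s'|s,a) b(s).

    b:      dict state -> probability (current belief)
    T:      dict (s, a, s') -> P(s'|s,a)  (action model)
    O:      dict (s, o) -> P(o|s)         (perceptual model)
    states: iterable of all states
    """
    states = list(states)
    # Prediction: push the belief through the action model.
    predicted = {s2: sum(T.get((s, a, s2), 0.0) * b[s] for s in states)
                 for s2 in states}
    # Correction: weight each state by the observation likelihood.
    unnormalized = {s2: O.get((s2, o), 0.0) * predicted[s2] for s2 in states}
    eta = sum(unnormalized.values())  # normalizer, equal to P(o | b, a)
    return {s2: p / eta for s2, p in unnormalized.items()}
```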
By conditioning the action on the belief state, as opposed to the most likely actual state, the robot can actively pursue information gathering. In fact, the optimal plan in belief space optimally gathers information, in that it seeks new information only to the extent that the information is actually beneficial to the expected utility of the robot's actions.
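A hedged sketch of what such a belief-conditioned policy might look like, reusing belief_update from above. All parameter names here (R, value_fn, and so on) are assumed placeholders; a real POMDP solver would compute the value function over beliefs itself (e.g., by value iteration in belief space) rather than take it as an argument.

```python
def greedy_belief_policy(b, actions, observations, R, T, O, states,
                         value_fn, gamma=0.95):
    """One-step lookahead in belief space (hypothetical helper names).

    Scores each action by its expected immediate reward under the belief
    plus the discounted value of the successor beliefs. Because the
    successor belief depends on the observation received, an action that
    sharpens the belief scores higher exactly when value_fn rewards the
    sharper belief, i.e., information is sought only when it pays off.
    """
    states = list(states)

    def q_value(a):
        # Expected immediate reward under the current belief.
        q = sum(b[s] * R.get((s, a), 0.0) for s in states)
        for o in observations:
            # P(o | b, a) = Σ_{s'} P(o|s') Σ_s P(s'|s,a) b(s)
            p_o = sum(O.get((s2, o), 0.0) * T.get((s, a, s2), 0.0) * b[s]
                      for s2 in states for s in states)
            if p_o > 0.0:
                q += gamma * p_o * value_fn(belief_update(b, a, o, T, O, states))
        return q

    return max(actions, key=q_value)
```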