- Robot decision making: choosing the actions a robot performs in the physical world
- Mathematical framework of sequential decision making
  - Markov Decision Process (MDP)
- Learning for decision making
  - Reinforcement learning (model-free vs. model-based, online vs. offline)
    - Optimizes the policy by trial and error in an MDP
    - Goal: maximize the expected long-term (cumulative) reward
  - Imitation learning (behavior cloning, DAgger, IRL, and adversarial learning)
    - Optimizes the policy by imitating an expert in an MDP
    - Goal: match the expert's behavioral distribution
    - Types
      - Direct estimation of the expert policy from expert data (behavior cloning, the supervised-learning version)
      - Reconstructing a reward function (inverse RL, IRL) and then learning a policy from that reward (RL)
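
The contrast between reinforcement learning and imitation learning in the outline above can be sketched on a toy MDP. This is a minimal illustration, not taken from the course material: the 5-state chain, the `step` function, the hyperparameters, and the always-move-right expert are all assumptions chosen for brevity. Tabular Q-learning stands in for model-free, online RL, and a per-state majority vote stands in for behavior cloning.

```python
import random
from collections import Counter, defaultdict

# Assumed toy MDP: a chain of 5 states; state 4 is the terminal goal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor


def step(state, action):
    """Deterministic transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, max_steps=100, seed=0):
    """Model-free RL: learn Q-values by trial and error, from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Uniformly random exploration; Q-learning is off-policy,
            # so it can still learn the greedy (optimal) values.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    # Greedy policy over the non-terminal states.
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


def behavior_cloning(demos):
    """Supervised estimate of the expert policy: most frequent expert action per state."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


# Hypothetical expert demonstrations: the expert always moves right.
expert_demos = [(s, 1) for s in range(N_STATES - 1)]

rl_policy = q_learning()
bc_policy = behavior_cloning(expert_demos)
print("RL greedy policy:", rl_policy)
print("BC policy:", bc_policy)
```

Note the difference in what each learner consumes: `q_learning` needs the reward signal and its own interaction with the environment, while `behavior_cloning` never sees a reward and only covers states that appear in the expert data. Inverse RL, the second imitation type listed above, would instead recover a reward from `expert_demos` and then run an RL step such as `q_learning` on it; that step is not shown here.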