Robotics & Perception

    [Seminar Series on Artificial General Intelligence (AGI) #1] Dr. Jim Fan: Generalist Agents in Open-Ended Worlds

    This is my own summary of the AGI seminar hosted by Professor Sungjin Ahn at KAIST, which I attended on 2023-11-17. Very forward-looking!! It was a deeply impressive seminar, and among the seminars I attended this year it was one of the highest-quality and most enjoyable. Seminars Open-ended environment: How can we combine the world knowledge? Foundation model for agents --> Issue: How can we ground it in real life? We should pursue it in a language manner, because prompting and delivering the concept are both straightforward. Emb..

    [Advanced Topics] 02. Representation Learning for RL

    This post is a summary I wrote for my own review and understanding after taking Professor Kimin Lee's AI707 course in the Fall 2023 semester. RL from pixels is difficult: poor sample efficiency, poor generalization to different environments. Representation learning resolves these issues. What is representation learning? Learning a representation of the data that contains useful information for ML methods. However, it is also true that optimizing a main task objective might n..
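    One common instantiation of this idea is to train the pixel encoder with an auxiliary objective alongside the RL loss. The sketch below is my own toy example, not from the lecture: it uses a reconstruction loss on hypothetical 64x64 grayscale observations, and the architecture and dimensions are made up for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical 64x64 grayscale observations, encoded to a 50-D latent state.
encoder = nn.Sequential(
    nn.Conv2d(1, 32, 4, stride=2), nn.ReLU(),
    nn.Conv2d(32, 32, 4, stride=2), nn.ReLU(),
    nn.Flatten(), nn.Linear(32 * 14 * 14, 50),
)
decoder = nn.Sequential(nn.Linear(50, 64 * 64), nn.Unflatten(1, (1, 64, 64)))

def aux_representation_loss(obs):
    """Auxiliary reconstruction loss, added to the RL objective during training."""
    z = encoder(obs)            # latent state the RL agent would consume
    recon = decoder(z)          # reconstruct the observation from the latent
    return ((recon - obs) ** 2).mean()

obs = torch.rand(8, 1, 64, 64)
loss = aux_representation_loss(obs)
```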

    [Advanced Topics] 01. RL with human feedback

    This post is a summary I wrote for my own review and understanding after taking Professor Kimin Lee's AI707 course in the Fall 2023 semester. (I consider it an honor to attend lectures given in person by world-renowned professors!!) Rough introduction to Reinforcement Learning Reinforcement Learning: finding an optimal policy for a sequential decision-making problem through interactions and learning By interacting with the environment, the agent generates roll-outs of the form $\tau = \{(s_0, a_0, r_0), \cdots, (s_H, a_H, r_H)..

    Debugging tips (continuously updated)

    The problem itself may not be well posed. The dataset may be too vague, e.g. a classification dataset where success vs. fail is hard to distinguish from the input; noisy data may have crept in, e.g. the data collected is not actually the data needed to solve the problem, or the class balance is off for the classification you need to solve. Fix: collect the dataset properly (e.g. constrain data collection so that only unambiguous data is gathered) or deal with the dataset noise. The network itself may have a bug. How to check: add an indicator that makes the answer easy for the network to predict, and see whether it predicts it correctly. --> If it correctly pre..

    (Work in progress) Optimal Estimation Algorithms: Kalman and Particle Filters

    Kalman filter: maintains a Gaussian distribution. Particle filter: a sampling-based algorithm.
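    As a rough illustration of the contrast (my own toy example, not from the post), the sketch below shows one predict/update cycle of a 1-D Kalman filter, which keeps a Gaussian belief, next to one step of a particle filter, which keeps weighted samples. The linear-Gaussian model and noise values are made-up placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Kalman filter: belief is a single Gaussian (mean, variance) ---
def kalman_step(mean, var, u, z, a=1.0, b=1.0, c=1.0, q=0.1, r=0.5):
    """One predict/update cycle for x' = a*x + b*u + N(0, q), z = c*x + N(0, r)."""
    mean_pred = a * mean + b * u                     # predict
    var_pred = a * a * var + q
    k = var_pred * c / (c * c * var_pred + r)        # Kalman gain
    mean_new = mean_pred + k * (z - c * mean_pred)   # correct with measurement z
    var_new = (1.0 - k * c) * var_pred
    return mean_new, var_new

# --- Particle filter: belief is a set of samples, updated by reweighting/resampling ---
def particle_step(particles, u, z, q=0.1, r=0.5):
    particles = particles + u + rng.normal(0.0, np.sqrt(q), size=particles.shape)  # motion
    weights = np.exp(-0.5 * (z - particles) ** 2 / r)                              # likelihood
    weights /= weights.sum()
    return rng.choice(particles, size=particles.size, p=weights)                   # resample

mean, var = kalman_step(0.0, 1.0, u=1.0, z=1.2)
particles = particle_step(np.zeros(100), u=1.0, z=1.2)
```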

    [AI614] 03. Introduction to Task planning

    This is a review post covering lectures 8-10 of a course I took at school. Outline: What does this review contain? Representations for task planning -Logic-based AI, First-order logic, and STRIPS Heuristic search for task planning 🚧 Representations for task planning -Logic-based AI, First-order logic, and STRIPS 🚧 Heuristic search for task planning

    [AI614] 02. Introduction to Motion planning

    This is a review post covering lectures 5-7 of a course I took at school. Outline: What does this review contain? Problem statement Discretization-based methods -A* Sampling-based algorithms -RRTs, PRMs Probabilistic completeness -RRT* Basic notion Configuration: A specification of the position of all points of a robot The positions of all parts of the robot - but it is the center position, so is knowing the shape enough? What is the difference between robot shape and robot configuration? If we know the joint angles (e..
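    To make the configuration question concrete: for a 2-link planar arm, the configuration is just the joint-angle vector, while the link lengths (the robot's shape) are fixed, known geometry; forward kinematics then recovers the position of every point of the robot. A minimal sketch with hypothetical link lengths:

```python
import numpy as np

def forward_kinematics(theta1, theta2, l1=1.0, l2=0.7):
    """Elbow and end-effector positions of a 2-link planar arm.

    The configuration is (theta1, theta2); the link lengths l1, l2 are part of the
    fixed, known robot geometry, not part of the configuration.
    """
    elbow = np.array([l1 * np.cos(theta1), l1 * np.sin(theta1)])
    ee = elbow + np.array([l2 * np.cos(theta1 + theta2),
                           l2 * np.sin(theta1 + theta2)])
    return elbow, ee

elbow, ee = forward_kinematics(np.pi / 4, -np.pi / 6)
```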

    [AI614] 01. Introduction to TAMP, Basics of robot manipulation

    This is a review post covering Lectures 1-4 of a course I took at school. For brevity and clarity, equation-level explanations are omitted. Outline: What does this review contain? Introduction of TAMP Planning vs. Learning based approach Planning -TAMP Positions and orientations of rigid bodies Manipulator forward and inverse kinematics Manipulator velocities and dynamics 🛝 Introduction of TAMP Difference between industrial robots and intelligent robot agents i..

    [Probabilistic Robotics] Planning and Control: Partially Observable Markov Decision Processes

    This post is a summary written after reading Probabilistic Robotics, Chapter 16: Partially Observable Markov Decision Processes. To choose the right action, we need accurate state estimation. We need information-gathering tasks, such as robot exploration, for precise state estimation (i.e. reduce uncertainty). There are two types of uncertainty: uncertainty in action, and uncertainty in perception. Uncertainty in action) D..

    [Probabilistic Robotics] Planning and Control: Uncertainty in action/Belief space

    This post is a summary written after reading Probabilistic Robotics, Chapter 15: Markov Decision Processes. To choose the right action, we need accurate state estimation. We need information-gathering tasks, such as robot exploration, for precise state estimation (i.e. reduce uncertainty). There are two types of uncertainty: uncertainty in action, and uncertainty in perception. Uncertainty..

    [AI614] Overview of Robot Task and Motion planning

    Introduction to Robotics: Mechanics and Control Positions and orientations of rigid bodies Manipulator forward and inverse kinematics Manipulator velocities and dynamics Manipulator equation Motion planning Discretization-based methods -A* Sampling-based algorithms -RRTs and PRMs Probabilistic completeness -RRT* Task planning Representations for task planning -Logic-based AI, First-order logic, and ..

    Sim2Real transfer: Domain randomization, domain adaptation, System identification

    This post was written to help with understanding domain randomization. It is an interpretation of Reference 1, which covers the topic in more detail, so I recommend reading it! One of the hard problems in robotics is getting the model itself to work in the real environment. Because of the sample inefficiency of reinforcement learning algorithms and the difficulty of data collection on real robots, we have to train by feeding large amounts of data from a simulator. However, the gap between the simulator and the real environment shows up strongly when the robot is deployed in the real world. This gap comes from physical parameters (e.g. friction, kp, damping, mass, density) or, more critically, from non-physical modeling (i.e. between surfaces..
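    A minimal sketch of what domain randomization looks like in code: at the start of every training episode, sample the uncertain physical parameters from broad ranges and push them into the simulator. The parameter ranges and the sim.set_physics(...) call below are hypothetical placeholders, not a real simulator API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ranges for the physical parameters mentioned above.
PARAM_RANGES = {
    "friction": (0.5, 1.5),
    "kp":       (50.0, 200.0),   # position gain
    "damping":  (0.5, 2.0),
    "mass":     (0.8, 1.2),      # scale factor on nominal link masses
}

def randomize_physics(sim):
    """Sample a fresh set of physical parameters and apply them to the simulator."""
    params = {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
    sim.set_physics(**params)    # hypothetical simulator call
    return params

# for episode in range(num_episodes):
#     randomize_physics(sim)     # new dynamics every episode
#     rollout = collect_episode(sim, policy)
#     policy.update(rollout)
```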

    PDDL (Planning Domain Definition Language)

    This post was written to aid my own understanding. Reference 1, Reference 2. PDDL is the standard encoding language for "classical" planning tasks. A PDDL planning task consists of the following. problem.pddl Objects: the objects that exist Initial state: the starting state Goal specification: the goal we want # gripper-four.pddl (define (problem ) (:domain ) [...] ) domain.pddl Predicates: constraints on the objects Actions/Operators: ways to change the world; for each action, a description, precondition, and e..

    [Paper Review] Inner Monologue

    Methodology: Closed-loop (with feedback) Semantic constraints and objectives Intractable number of rules Limitations Question: Humans need to reason instead. Does not consider the scene. Cannot solve long-horizon problems Commands consist only of pick and place Fails to ground contextually dependent actions: How to plan well while being contextually aware For example, awareness of geometric constrain..

    Imitation learning

    The simplest version of imitation learning is behavior cloning.
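    Behavior cloning reduces imitation to supervised learning: regress the expert's actions from the recorded states. A minimal sketch with hypothetical dimensions and continuous actions, written in PyTorch:

```python
import torch
import torch.nn as nn

# Hypothetical expert dataset of (state, action) pairs.
states = torch.randn(1024, 10)    # 10-D states
actions = torch.randn(1024, 3)    # 3-D continuous expert actions

policy = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 3))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

for epoch in range(100):
    pred = policy(states)
    loss = ((pred - actions) ** 2).mean()   # MSE between policy and expert actions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```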

    [CS330] 03. Supervised solution of Meta-learning problem: Black-Box vs. Optimization-based vs. Non-Parametric

    The Meta-Learning Problem Given data from $\mathcal{T}_1, \cdots, \mathcal{T}_n$, quickly solve new task $\mathcal{T}_\textrm{test}$. Assume that the meta-training tasks and the meta-test task are drawn i.i.d. from the same task distribution: $\mathcal{T}_1, \cdots, \mathcal{T}_n \sim p(\mathcal{T})$, $\mathcal{T}_\textrm{test} \sim p(\mathcal{T})$. For example, the task can be: a robot performing different tasks or giving fe..

    [CS330] 02. Multi-Task Learning & Transfer learning Basics

    What is a "Task"? More formally, a task can be described in this format: $\mathcal{T}_i \equiv \{p_i(\textbf{x}), p_i(\textbf{y} \vert \textbf{x}), \mathcal{L}_i\}$, based on the data-generating distribution. Multi-task learning: Learn $\mathcal{T}_1, \mathcal{T}_2, \cdots, \mathcal{T}_T$ at once Transfer learning: Solve target task $\mathcal{T}_b$ after solving source task $\mathcal{T}_a$ by transferring knowledge..
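    A minimal sketch of the plain multi-task setup this definition suggests (my own illustration, with hypothetical dimensions): a shared encoder with one head per task, trained on the sum of the per-task losses $\mathcal{L}_i$.

```python
import torch
import torch.nn as nn

num_tasks, in_dim, out_dim = 3, 16, 1

shared = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())                   # shared across tasks
heads = nn.ModuleList([nn.Linear(64, out_dim) for _ in range(num_tasks)])  # task-specific heads
optimizer = torch.optim.Adam(list(shared.parameters()) + list(heads.parameters()), lr=1e-3)

def multitask_loss(batches):
    """batches[i] = (x_i, y_i) drawn from p_i(x), p_i(y|x); returns the sum of per-task losses."""
    total = 0.0
    for i, (x, y) in enumerate(batches):
        total = total + ((heads[i](shared(x)) - y) ** 2).mean()   # L_i for task i
    return total

batches = [(torch.randn(32, in_dim), torch.randn(32, out_dim)) for _ in range(num_tasks)]
loss = multitask_loss(batches)
optimizer.zero_grad(); loss.backward(); optimizer.step()
```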

    [CS330] 01. Course Introduction

    Their point of view of why multi-task learning and meta-learning are important Robots can teach us things about intelligence. Faced with the real world Must generalize across tasks, objects, environments, etc Need some common sense understanding to do well Supervision can't be taken for granted Specialists vs. Generalists Specialist: Learn one task in one environment, starting from scratch using..

    [CS391R] Overview of Robot Decision Making

    Robot Decision making: Choosing the actions a robot performs in the physical world Mathematical Framework of Sequential Decision Making Markov Decision Process Learning for Decision Making reinforcement learning (model-free vs. model-based, online vs offline) Optimizes the policy by trial and error in an MDP. Goal: To maximize the long-term rewards imitation learning (behavior cloning, DAgger, I..

    [CS391R] Overview of Robot perception

    Robot perception: seeing and understanding the physical world by multimodal robot sensors Robot vision vs. Computer vision Robot vision is embodied, active and environmentally situated. Embodied: Robots have physical bodies and experience the world directly. Their actions are part of a dynamic with the world and have immediate feedback on their own sensation. Active: Robots are active perceivers..

    Proximal Policy Optimization Algorithms (PPO) Hyper-parameters

    🔖 Questions What is the difference between the advantage function and the reward and value functions? PPO-clip supposedly does not use a KL divergence, so why is there an approx_kl? 🔖 Notions to understand Reward Loss entropy loss: entropy bonus that ensures sufficient exploration. value loss $L_t^{VF} (\theta) = {(V_\theta(s_t)-V_t^{targ})}^2$ Policy gradient loss $L_t^{CLIP}$ Procedure Epoch: one pass over the entire dataset Mini batch/one batch: one mini bat..
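    To keep the three loss terms straight, here is a minimal sketch of how they are typically combined into a single PPO loss (my own paraphrase of the paper's objective, not code from the post); ratio stands for $\pi_\theta(a_t|s_t)/\pi_{\theta_\textrm{old}}(a_t|s_t)$ and adv for the advantage estimate.

```python
import torch

def ppo_loss(ratio, adv, values, value_targets, entropy,
             clip_eps=0.2, vf_coef=0.5, ent_coef=0.01):
    """Clipped surrogate + value loss - entropy bonus (sign convention: minimize)."""
    # Policy (clipped surrogate) term: L^CLIP
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * adv
    policy_loss = -torch.min(unclipped, clipped).mean()
    # Value term: L^VF = (V_theta(s_t) - V_t^targ)^2
    value_loss = ((values - value_targets) ** 2).mean()
    # Entropy bonus encourages exploration
    return policy_loss + vf_coef * value_loss - ent_coef * entropy.mean()
```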

    [CS391R] Introduction of Robot Learning

    Course website Types of robot automation custom-built robots -> human expert programming -> Special-purpose behaviors General-purpose robots -> Robot learning -> General-purpose behaviors What is robot learning? The study of methods and principles that make robots learn from data -> Learning is critical for taking robots to the real world. Robot perception: seeing and understanding the physical ..

    [Policy Gradient] Vanilla Policy Gradient, Trust region policy optimization (TRPO), Proximal Policy Optimization Algorithms (PPO)

    This post organizes the equations from the papers (without derivations). I referred to the linked document. 🔖 Simplest Policy Gradient We consider the case of a stochastic, parameterized policy $\pi_\theta$. We aim to maximize the expected return $J(\pi_\theta) = \mathbb{E}_{\tau \sim \pi_\theta} [R(\tau)]$. For this, we want to optimize the policy by gradient ascent. $\theta_{k+1} = \theta_k + \alpha \nabla_\theta J(\pi_\theta)|_{\theta_k}$ The gradient ..
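    A minimal sketch of the resulting update (my own illustration, hypothetical tensor shapes): with the standard estimator $\nabla_\theta J(\pi_\theta) = \mathbb{E}_{\tau \sim \pi_\theta} [\sum_t \nabla_\theta \log \pi_\theta(a_t|s_t) R(\tau)]$, gradient ascent on $J$ is implemented by minimizing the negative surrogate objective.

```python
import torch

def vanilla_pg_loss(log_probs, returns):
    """log_probs: log pi_theta(a_t|s_t) for each step of each sampled trajectory;
    returns: R(tau) broadcast to the same shape. Minimizing this loss performs
    gradient ascent on the sample estimate of J(pi_theta)."""
    return -(log_probs * returns).mean()

# One update step theta_{k+1} = theta_k + alpha * grad J, via an optimizer:
# optimizer.zero_grad()
# vanilla_pg_loss(log_probs, returns).backward()
# optimizer.step()
```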

    [CS294 Pieter Abbeel] 5. Implicit Models - GANs

    This post is a summary I wrote after taking Pieter Abbeel's Deep Unsupervised Learning 2020 course. 🗿Implicit Models? 🗿Original GAN 🗿GAN Progression

    [CS294 Pieter Abbeel] 4. Latent Variable Models - Variational AutoEncoder (VAE)

    This post is a summary I wrote after taking Pieter Abbeel's Deep Unsupervised Learning 2020 course. 🖲️ Training Latent Variable Models 🖲️ Variations of VAE 🖲️ Related Ideas

    [CS294 Pieter Abbeel] 3. Likelihood Models: Flow Models

    This post is a summary I wrote after taking Pieter Abbeel's Deep Unsupervised Learning 2020 course. This lecture deals with using a latent representation to obtain a density model $p_\theta(x)$. 🪸 Foundations of Flows (1-D) 🪸 2-D Flows 🪸 N-D Flows 🪸 Dequantization

    [CS294 Pieter Abbeel] 2. Likelihood Models: Autoregressive Models

    This post is a summary I wrote after taking Pieter Abbeel's Deep Unsupervised Learning 2020 course. This lecture is about how to obtain the data distribution, which is the most primitive goal of generative models. Generative models first came up with histograms, the most basic likelihood-based model. Then, for the neural approach, they use autoregressive models. 🫠 Likelihood-based models 🫠 Sampling-based: Hist..

    [CS330 Chelsea Finn] Deep Multi-task learning and Meta-learning Contents

    Goal: Check a higher version of the perception for robotics Contents Course introduction & start of multi-task learning 43m Supervised multi-task learning, transfer learning 1h 19m Meta-learning problem statement, black-box meta-learning 1h 18m Optimization-based meta-learning 1h 18m Few-shot learning via metric learning 1h 25m Advanced meta-learning topics 1h 28m Bayesian meta-learning 1h 27m R..

    [CS294 Pieter Abbeel] 1. Intro

    This post is a summary I wrote after taking Pieter Abbeel's Deep Unsupervised Learning 2020 course. This lecture shares the goal and pursuit of deep unsupervised learning. Deep Unsupervised Learning: capture rich patterns in raw data with deep networks in a label-free way → But how? Recreate the raw data distribution → Generative models "Puzzle" tasks that require semantic understanding → Self-supervised Learning With Pu..