Events

[Takeaways] Robot Brains Podcast S3 E20: Jitendra Malik on Building AI from the ground up: Sensorimotor learning before language

Notes after watching this impressive interview: https://youtu.be/k_Wrd1kI1B0?si=QIqUl3Qrcx7y1FEs
These are my personal thoughts.
  1. [Grounding LLM] could still be proven wrong.
    1. According to how the brain develops, words are acquired later.
    2. I actually agree with this. It implies that [incorporating control with LLM] is the key, and this area is pretty hot right now.
  2. Instead, brain development starts with the development of the hands.
    Physical interaction:
    - Begins with control -> by age 5, skills are acquired
    - During this period, language is learned in context
    - Integrating such skills, in latent form, into the LLM is important.
  3. Skills
    1. Acquired skills get reused, and the reused skills then focus on meta-phrasing tasks.
      Skill: a type of motion behavior.
      Effective rapid motor adaptation is based on simulation --> rollouts for the child
    2. Both top-down (concept-generative model) and bottom-up components are vital.
      1. I totally agree with this. I am thinking about how to integrate such naturally discovered skills into robot learning while also connecting them to long-horizon tasks.
  4. Computer Vision:
    - Classical CV (3R): recognition, reconstruction, reorganization (such as segmentation and grouping)
        - Currently, these are being scaled up and applied across different areas.
        - 3D reconstruction remains partly unsolved (at a human level, only partially solved).
    The next step is integration!
    1. Vision in robotics
    2. Vision leading to cognition: NLP
    Lately, there's a lot of interest in healthcare and medicine.
  5. Advice
    1. It's important to stay narrow in focus (connecting fields, being adaptive) and concise in information, but passion is crucial.