[Lex Computer & Tech Group/LCTG] Robots are learning how to learn
Ted Kochanski
tedpkphd at gmail.com
Thu Aug 8 16:16:23 PDT 2024
All,
Robots are learning how to learn -- should we be very afraid or just
afraid?
The following video is a brief summary of a paper
*Practice Makes Perfect: Planning to Learn Skill Parameter Policies*
Tom Silver
Feb 15, 2024
Nishanth Kumar∗, Tom Silver∗, Willie McClinton, Linfeng Zhao, Stephen
Proulx, Tomás Lozano-Pérez, Leslie Pack Kaelbling and Jennifer Barry
The AI Institute, MIT CSAIL, Northeastern University
RSS 2024
https://youtu.be/123DXatw1V8?si=yaS8UKoVZvXi_Nle
The technical paper
Practice Makes Perfect: Planning to Learn Skill Parameter Policies
Nishanth Kumar∗†‡, Tom Silver∗†‡, Willie McClinton†‡, Linfeng Zhao†§,
Stephen Proulx†, Tomás Lozano-Pérez‡, Leslie Pack Kaelbling‡ and
Jennifer Barry†
†The AI Institute,
‡MIT CSAIL,
§Northeastern University
*Indicates equal contribution.
We enable a robot to rapidly and autonomously *specialize* parameterized
skills by *planning to practice* them. The robot decides *what* skills to
practice and *how* to practice them. The robot is left alone for hours,
repeatedly practicing and improving.
Abstract
One promising approach towards effective robot decision making in complex,
long-horizon tasks is to sequence together *parameterized skills*. We
consider a setting where a robot is initially equipped with (1) a library
of parameterized skills, (2) an AI planner for sequencing together the
skills given a goal, and (3) a very general prior distribution for
selecting skill parameters. Once deployed, the robot should rapidly and
autonomously learn to improve its performance by specializing its skill
parameter selection policy to the particular objects, goals, and
constraints in its environment. In this work, we focus on the active
learning problem of choosing which skills to *practice* to maximize
expected future task success. We propose that the robot should *estimate* the
competence of each skill, *extrapolate* the competence (asking: "how much
would the competence improve through practice?"), and *situate* the skill
in the task distribution through competence-aware planning. This approach
is implemented within a fully autonomous system where the robot repeatedly
plans, practices, and learns without any environment resets. Through
experiments in simulation, we find that our approach learns effective
parameter policies more sample-efficiently than several baselines.
Experiments in the real world demonstrate our approach's ability to handle
noise from perception and control and improve the robot's ability to solve
two long-horizon mobile-manipulation tasks after a few hours of autonomous
practice.
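For anyone who wants a feel for the estimate / extrapolate / situate loop the abstract describes, here is a toy sketch in Python. It is not the authors' code or their model. Everything in it is an assumption made for illustration: the function names, the smoothed success-rate estimate, the fixed `gain` used to guess how much one more practice session helps, and the idea that a task succeeds only if every skill in its plan succeeds.

```python
from math import prod

def estimate(successes, attempts):
    # ASSUMPTION: competence is a smoothed success rate (Laplace smoothing),
    # not the estimator used in the paper.
    return (successes + 1) / (attempts + 2)

def extrapolate(competence, gain=0.2):
    # ASSUMPTION: one more round of practice closes a fixed fraction `gain`
    # of the remaining gap to perfect competence.
    return competence + gain * (1.0 - competence)

def expected_task_success(competence, tasks):
    # ASSUMPTION: a plan succeeds only if every skill in it succeeds, and
    # skills succeed independently. `tasks` is a list of (weight, plan).
    return sum(weight * prod(competence[s] for s in plan)
               for weight, plan in tasks)

def choose_skill_to_practice(history, tasks, gain=0.2):
    # Estimate the current competence of every skill from its track record.
    competence = {s: estimate(*h) for s, h in history.items()}

    def score(skill):
        # Situate: how good would the whole task distribution look if
        # only this skill improved by the extrapolated amount?
        improved = dict(competence)
        improved[skill] = extrapolate(competence[skill], gain)
        return expected_task_success(improved, tasks)

    return max(sorted(competence), key=score)

# Hypothetical skills and track records: (successes, attempts).
history = {"pick": (8, 8), "place": (1, 8), "sweep": (0, 8)}
# Hypothetical task distribution: 90% of goals need pick+place.
tasks = [(0.9, ["pick", "place"]), (0.1, ["pick", "sweep"])]

chosen = choose_skill_to_practice(history, tasks)
print(chosen)
```

With these made-up numbers the sketch picks "place" rather than "sweep", even though "sweep" has the worse track record, because "place" shows up in far more of the tasks the robot expects to face. That is the point of situating a skill in the task distribution instead of just practicing whatever is currently weakest.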
https://arxiv.org/pdf/2402.15025
After you've watched the video a couple of times .....
Should we plan to be very afraid or just afraid?
Ted