240119 AA289 Annie Chen
03 Feb 2024

Reinforcement Learning for Autonomous Robots
- Recent advances have produced autonomous robots that can perform tasks in controlled environments.
 - However, these robots often struggle to adapt to unexpected circumstances and novel scenarios during real-world deployment.
 - Reinforcement learning provides a framework for robots to adapt autonomously, but it is hard to apply directly during deployment because it typically relies on feedback, repeated retries, and learning from scratch.
 
Reset-Free Reinforcement Learning
- Reset-free reinforcement learning addresses some of these challenges by letting robots practice both performing the task and undoing it without human intervention.
 - Single-life reinforcement learning is a paradigm in which the agent is given prior experience and must adapt to a new scenario within a single episode, without human intervention or supervision.
 
Robust Autonomous Modulation (ROAM)
- The proposed method, Robust Autonomous Modulation (ROAM), uses each pre-trained behavior's value function to guide behavior selection during adaptation.
 - ROAM fine-tunes the value functions of the pre-trained behaviors to correct for overestimation in out-of-distribution states.
 - The selection mechanism in ROAM quickly identifies the appropriate behavior for a given situation, eliminating the need for a separate high-level controller or adaptation module.
 - ROAM is agnostic to how the policies and value functions of the prior behaviors are trained, and it can provide improvements in new situations with either a small or a large number of pre-trained behaviors.
 - Adaptation in ROAM happens within a single episode at test time, so robots can adapt to a variety of situations without extensive online training.
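
The value-based selection described above can be sketched roughly as follows. This is a minimal illustration, not the method's actual implementation: the softmax sampling rule, the `temperature` parameter, the cross-entropy form of the fine-tuning penalty, and the toy value functions `v_flat` and `v_slope` are all assumptions made for the example.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_behavior(value_fns, state, temperature=1.0, rng=random):
    """Sample a behavior index with probability proportional to
    exp(V_i(state) / temperature). Assumed selection rule for illustration."""
    probs = softmax([v(state) for v in value_fns], temperature)
    index = rng.choices(range(len(value_fns)), weights=probs, k=1)[0]
    return index, probs


def behavior_cross_entropy(value_fns, state, behavior_index):
    """Hypothetical fine-tuning penalty: the negative log-probability that
    the behavior which actually visited `state` has the highest value.
    Minimizing it pushes down other behaviors' values on states they
    never saw, which is one way to counter overestimation."""
    probs = softmax([v(state) for v in value_fns])
    return -math.log(probs[behavior_index])


# Toy stand-ins for learned value functions over a 1-D state (assumptions):
# each behavior is "familiar" with states near a different point.
def v_flat(state):
    return -abs(state - 0.0)


def v_slope(state):
    return -abs(state - 1.0)


if __name__ == "__main__":
    index, probs = select_behavior([v_flat, v_slope], state=0.9, temperature=0.1)
    print("selected behavior:", index, "probabilities:", probs)
```

A lower temperature makes the choice closer to picking the highest-value behavior outright; a higher one keeps more exploration across behaviors.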
 
ROAM: A Simple Algorithm for Autonomous Deployment-Time Adaptation
- ROAM is a simple algorithm for autonomous deployment-time adaptation.
 - ROAM outperforms prior methods in simulated and real-world experiments.
 - ROAM can adapt to novel situations within a single episode.
 - ROAM can handle dynamically changing payloads and unseen objects.
 - ROAM can leverage parts of each relevant behavior to complete tasks.
 - ROAM provides a mechanism for single-life test-time adaptation to unseen situations.
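
To make the idea of using parts of several behaviors within one episode concrete, here is a toy sketch. Everything in it is hypothetical: the 1-D track, the "ice" patch, the `walk` and `crawl` behaviors, and their hand-written value functions stand in for learned policies and values, and greedy (highest-value) selection is used as a simplification.

```python
def on_ice(position):
    """Hypothetical unseen condition: a slippery patch on a 1-D track."""
    return 4.0 <= position < 6.0


# Each behavior pairs a policy (next position) with a value function.
BEHAVIORS = {
    "walk": {
        # Fast on normal ground, makes no progress on ice.
        "policy": lambda p: p if on_ice(p) else p + 1.0,
        "value": lambda p: 0.0 if on_ice(p) else 1.0,
    },
    "crawl": {
        # Slow but works everywhere.
        "policy": lambda p: p + 0.5,
        "value": lambda p: 0.5,
    },
}


def run_single_episode(behaviors, start=0.0, goal=10.0, max_steps=100):
    """Run one episode with no resets, re-selecting the behavior at every
    step by picking the one whose value is highest at the current state."""
    position = start
    trace = []
    for _ in range(max_steps):
        if position >= goal:
            break
        name = max(behaviors, key=lambda n: behaviors[n]["value"](position))
        position = behaviors[name]["policy"](position)
        trace.append(name)
    return position, trace


if __name__ == "__main__":
    final_position, trace = run_single_episode(BEHAVIORS)
    print(final_position, trace)
```

In this toy setup the agent walks until it reaches the ice, switches to crawling across it, and then walks again, whereas `walk` on its own would stall at the edge of the ice for the rest of the episode.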