Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment


Annie S. Chen*, Govind Chada*, Laura Smith, Archit Sharma, Zipeng Fu,
Sergey Levine, Chelsea Finn


Abstract. To succeed in the real world, robots must cope with situations that differ from those seen during training. We study the problem of adapting on-the-fly to such novel scenarios during deployment by drawing upon a diverse repertoire of previously learned behaviors. Our approach, RObust Autonomous Modulation (ROAM), introduces a mechanism based on the perceived value of pre-trained behaviors to select and adapt those behaviors to the situation at hand. Crucially, this entire adaptation process happens within a single episode at test time, without any human supervision. We provide theoretical analysis of our selection mechanism and demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped, even successfully moving forward with roller skates on its feet. When facing a variety of out-of-distribution situations during deployment, our approach adapts over 2x as efficiently as existing methods by effectively choosing and adapting relevant behaviors on-the-fly.


On-The-Go Adaptation via Robust Autonomous Modulation


Method Overview

Key Idea: Use the value functions of the behaviors to identify an appropriate behavior at every timestep during deployment. With proper regularization, value functions provide a good indication of how well different behaviors will perform in a given situation.

(1) Fine-tune each behavior's value function with an additional behavior classification loss to encourage identifiability in familiar states.

(2) At deployment time, sample a behavior in proportion to its classification probability, execute its action, and optionally fine-tune further.
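The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the released implementation: it assumes the classification probability is a softmax over the per-behavior value estimates at the current state, and the function names and the `temperature` parameter are hypothetical.

```python
import math
import random
from typing import List, Sequence


def behavior_probs(values: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over per-behavior value estimates V_i(s) at the current state.

    `temperature` is an illustrative knob, not something the page specifies.
    """
    scaled = [v / temperature for v in values]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(values: Sequence[float], behavior_idx: int) -> float:
    """Cross-entropy term for step (1): the behavior that actually visited the
    state should receive the highest value there. Computed via log-sum-exp."""
    m = max(values)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in values))
    return log_sum_exp - values[behavior_idx]


def select_behavior(values: Sequence[float], rng: random.Random) -> int:
    """Step (2): sample a behavior index in proportion to its probability."""
    probs = behavior_probs(values)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    values = [1.2, 0.3, 2.5]  # made-up value estimates for three behaviors
    print(behavior_probs(values))
    print(select_behavior(values, rng))
```

In an actual fine-tuning setup, the classification term would presumably be added to each value function's usual training objective with some weighting; the page does not give that weight, so it is left out here.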

Benefits of ROAM: (1) Doesn't require learning a separate high-level controller, (2) Agnostic to how pre-trained policies and value functions are obtained, (3) Provides a simple mechanism for adapting within a single episode to a variety of situations.


Real-World Trials on the Go1

These evaluation trials show that ROAM can adapt on-the-go to out-of-distribution (OOD) situations in the real world. ROAM enables the robot to slide forward on roller skates without ever having seen roller skates during training. It can also pull heavy luggage and loads of changing weight without having been trained to pull any object.

Roller Skates: Walking (No Behavior Modulation) | High-Level Classifier | ROAM (ours)

Heavy Luggage (13.6 lb): Walking (No Behavior Modulation) | High-Level Classifier | ROAM (ours)

Dynamic Load: Walking (No Behavior Modulation) | High-Level Classifier | ROAM (ours)


ROAM reacts quickly to changing situations

In our simulated experiments, we find that ROAM is over 2x as efficient as prior methods designed for fast adaptation. Here we plot ROAM's behavior distribution over the course of a single-life trial in which the agent must adapt to different stiffness levels on-the-go. Green bars indicate behaviors relevant to the current situation, while red bars indicate irrelevant behaviors. ROAM reacts quickly to changing situations by choosing and adapting relevant behaviors on-the-fly.
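A behavior distribution like the one plotted can be computed from a log of which behavior was selected at each timestep. The sketch below is only an assumption about how such a summary might be produced: the helper name, the non-overlapping windows, and the example log are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def windowed_behavior_distribution(
    choices: Sequence[int], num_behaviors: int, window: int
) -> List[List[float]]:
    """Fraction of timesteps each behavior was selected, per non-overlapping window."""
    distributions = []
    for start in range(0, len(choices), window):
        chunk = choices[start:start + window]
        counts = Counter(chunk)
        distributions.append([counts[b] / len(chunk) for b in range(num_behaviors)])
    return distributions


if __name__ == "__main__":
    # Made-up log: behavior 0 dominates early, behavior 2 after a dynamics change.
    log = [0, 0, 1, 0, 2, 2, 2, 1]
    for row in windowed_behavior_distribution(log, num_behaviors=3, window=4):
        print(row)
```

Each row could then be drawn as one group of bars, colored by whether the behavior is relevant to the situation active during that window.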

Simulation