Annie S. Chen

Hi! I am a third-year computer science PhD student at Stanford University advised by Prof. Chelsea Finn and affiliated with the Stanford Artificial Intelligence Laboratory (SAIL). My research goal is to create robust and adaptable machine learning systems that are prepared for distribution shifts and can efficiently respond to new information. My work spans both supervised learning and reinforcement learning settings, with a focus on creating models that can generalize or adapt to changing environments. I am supported by an NSF Graduate Research Fellowship.

Previously, I received a B.S. in math and an M.S. in computer science, both from Stanford. I was also a research intern at Google Brain, where I learned a lot working with Pete Florence.

I am originally from Boulder, Colorado, and outside of research, I enjoy spending time outdoors, playing tennis, and learning to play the guitar. I care about creating an inclusive research culture and co-organize the Stanford CS Undergraduate Mentoring Program, which matches undergraduate students with graduate student mentors and aims to increase the participation of underrepresented minorities in computer science research.

Please feel free to reach out about research or any advice I can help with!

[Email] [CV] [Google Scholar] [Twitter] [LinkedIn] [GitHub]

Selected Research

Please see my CV or Google Scholar for a full list of work.

Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment
Annie S. Chen*, Govind Chada*, Laura Smith, Archit Sharma, Zipeng Fu, Sergey Levine, Chelsea Finn
NeurIPS Robot Learning Workshop, 2023
[PDF] [Website] [Code]
We propose Robust Autonomous Modulation (ROAM), a simple framework for efficiently leveraging pre-trained behaviors to adapt to changing situations at deployment time.
Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts
Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
NeurIPS DistShift Workshop, 2023
[PDF]
We propose COnfidence-baSed MOdel Selection (COSMOS), where we adaptively choose among models with different strengths to achieve high performance on both majority and minority subpopulations. COSMOS does not require any target labels or group annotations and can even be used for hyperparameter tuning.
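As a rough illustration of the idea (the paper's exact selection criterion may differ), here is one way confidence-based selection could look: among candidate models, keep the one whose softmax predictions are most confident on unlabeled target data. All names in this sketch are illustrative.

```python
# Hedged sketch of confidence-based model selection: no target labels
# or group annotations are used, only the models' own confidence.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_model(models, target_loader):
    """Return the candidate model with the highest mean softmax
    confidence over an unlabeled target data loader."""
    scores = []
    for model in models:
        model.eval()
        confs = [F.softmax(model(x), dim=-1).max(dim=-1).values.mean()
                 for x, *_ in target_loader]
        scores.append(torch.stack(confs).mean())
    return models[int(torch.argmax(torch.stack(scores)))]
```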
Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features
Annie S. Chen*, Yoonho Lee*, Amrith Setlur, Sergey Levine, Chelsea Finn
International Conference on Learning Representations (ICLR), 2024 (Spotlight, top 5%)
[PDF]
We propose Project and Probe (Pro^2), a lightweight and data-efficient approach for domain adaptation. Pro^2 first learns a linear projection that maps a pre-trained embedding onto orthogonal directions while being predictive of labels in the source dataset. The goal of this step is to learn a variety of predictive features, so that at least some of them remain useful after distribution shift. Pro^2 then learns a linear classifier on top of these projected features using a small target dataset.
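A minimal sketch of the two-stage recipe, assuming frozen pre-trained embeddings and substituting a soft orthogonality penalty for the paper's exact projection-learning procedure; all function names here are illustrative:

```python
# Stage 1 (project): learn a D x k linear projection whose columns are
# near-orthogonal and jointly predictive of source labels.
import torch
import torch.nn.functional as F

def learn_projection(src_feats, src_labels, k, steps=1000, lr=1e-2):
    D = src_feats.shape[1]
    W = torch.nn.Parameter(torch.randn(D, k) * 0.01)
    head = torch.nn.Linear(k, int(src_labels.max()) + 1)
    opt = torch.optim.Adam([W, *head.parameters()], lr=lr)
    for _ in range(steps):
        z = src_feats @ W                      # project the embeddings
        loss = F.cross_entropy(head(z), src_labels)
        gram = W.T @ W                         # soft orthogonality penalty
        loss = loss + 1e-2 * (gram - torch.eye(k)).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return W.detach()

# Stage 2 (probe): fit a small linear classifier on the projected
# features using the few available target examples.
def probe(W, tgt_feats, tgt_labels):
    from sklearn.linear_model import LogisticRegression
    clf = LogisticRegression(max_iter=1000)
    clf.fit((tgt_feats @ W).numpy(), tgt_labels.numpy())
    return clf
```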
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang
Robotics: Science and Systems (RSS), 2023 (Best Paper Finalist)
[PDF] [Website] [Code]
We propose Voltron, which uses language to learn better visual representations for a diverse range of robotics problems by trading off language conditioning and language generation.
Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Yoonho Lee*, Annie S. Chen*, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn
International Conference on Learning Representations (ICLR), 2023
[PDF] [Code]
We show that selectively fine-tuning a subset of layers (which we term surgical fine-tuning) matches or outperforms fine-tuning all layers. Moreover, the type of distribution shift influences which subset is most effective to tune: for example, for image corruptions, fine-tuning only the first few layers works best.
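As a hedged illustration, surgical fine-tuning of the "first few layers" of a torchvision ResNet-50 might look like the following; the choice of blocks and the hyperparameters are illustrative, not the paper's exact configuration:

```python
# Freeze the whole network, then unfreeze only the earliest layers --
# the subset that tends to help for input-level shifts like corruptions.
import torch
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False
for module in (model.conv1, model.bn1, model.layer1):
    for p in module.parameters():
        p.requires_grad = True

# Optimize only the unfrozen parameters; a standard fine-tuning loop
# over the target-distribution data follows.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
```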
You Only Live Once: Single-Life Reinforcement Learning
Annie S. Chen, Archit Sharma, Sergey Levine, Chelsea Finn
Neural Information Processing Systems (NeurIPS), 2022
[PDF] [Code]
Agents operating in the real world must often contend with novel situations that differ from their prior experience, and in such situations they have only a single trial to complete the given task, adapting on the fly without human intervention. To formalize this setting, we study single-life reinforcement learning (SLRL): given prior data, an agent must complete a task in a single trial under a novel distribution shift, without any human interventions or supervision.
Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos
Annie S. Chen, Suraj Nair, Chelsea Finn
Robotics: Science and Systems (RSS), 2021
[PDF] [Website] [Code]
We propose a simple approach, Domain-agnostic Video Discriminator (DVD), that learns multitask reward functions by training a discriminator to classify whether two videos are performing the same task. These reward functions can generalize to unseen environments and tasks by learning from a small amount of robot data and a large, diverse dataset of in-the-wild human videos.
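A minimal sketch of the same-task discriminator objective, assuming an arbitrary clip encoder; the class and function names are placeholders rather than the released DVD code:

```python
# Discriminator that scores whether two video clips show the same task.
import torch
import torch.nn as nn

class SameTaskDiscriminator(nn.Module):
    def __init__(self, video_encoder, feat_dim=512):
        super().__init__()
        self.encoder = video_encoder            # any clip -> feature encoder
        self.head = nn.Linear(2 * feat_dim, 1)  # binary same-task logit

    def forward(self, clip_a, clip_b):
        z = torch.cat([self.encoder(clip_a), self.encoder(clip_b)], dim=-1)
        return self.head(z).squeeze(-1)

# Training pairs mix robot and human videos; positives share a task label:
#   loss = F.binary_cross_entropy_with_logits(model(a, b), same_task_labels)
# At deployment, the reward for an observed clip can be scored against a
# task video: reward ~ sigmoid(model(obs_clip, task_clip)).
```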
Just Train Twice: Improving Group Robustness without Training Group Information
Evan Z. Liu*, Behzad Haghgoo*, Annie S. Chen*, Aditi Raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, Chelsea Finn
International Conference on Machine Learning (ICML), 2021 (Long Talk, top 3%)
[PDF] [Code]
We propose Just Train Twice (JTT), a simple method that improves worst-group classification performance on datasets with spurious correlations without requiring training group annotations. JTT first detects informative training examples, which are often minority examples, by training an initial ERM classifier and extracting the misclassified examples. It then trains a final classifier by upsampling the selected examples.
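A compact sketch of the two-stage procedure, with build_model and train_fn as hypothetical helpers standing in for a standard ERM training loop:

```python
# Stage 1: a short ERM run; its mistakes flag informative (often
# minority-group) examples. Stage 2: retrain from scratch with those
# examples upsampled lambda_up times (a tunable hyperparameter).
import torch
from torch.utils.data import ConcatDataset, Subset

def jtt(train_set, build_model, train_fn, lambda_up=20):
    first = train_fn(build_model(), train_set)
    first.eval()
    with torch.no_grad():
        error_idx = [i for i, (x, y) in enumerate(train_set)
                     if first(x.unsqueeze(0)).argmax(1).item() != int(y)]
    upweighted = ConcatDataset(
        [train_set] + [Subset(train_set, error_idx)] * (lambda_up - 1)
    )
    return train_fn(build_model(), upweighted)
```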
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
Annie S. Chen*, Hyunji Nam*, Suraj Nair*, Chelsea Finn
Robotics and Automation Letters (RA-L), 2021
[PDF] [Website] [Code]
We propose a framework for leveraging weak human supervision to enable better robotic exploration for scalable data collection. Under this framework, the robot autonomously collects high-quality data with only a few minutes of human supervision, providing better data for downstream offline RL.

Website template from here.