Policy Learning

On this page you'll learn to train a policy on your processed data (or on an existing dataset) through Behavior Cloning (BC), using an algorithm such as VQ-BeT. If you'd like to learn more about supervised policy learning, this tutorial serves as a nice introduction.

Preliminary Steps

  1. SSH into Greene. I recommend doing so using the VS Code IDE. If it's your first time, you can see instructions for setting up SSH on VS Code here.
    ssh netid@greene.hpc.nyu.edu
    Note: You'll need to either be on NYU's network or connected to the NYU VPN to SSH into Greene. Your password is your standard NYU password.
  2. Optional, but highly recommended: sign up for WandB for logging your training runs.

Clone and Enter Repository in Greene

  1. In your terminal, enter the scratch directory
    cd $SCRATCH
  2. Clone
    git clone https://github.com/NYU-robot-learning/min-stretch.git
  3. Change directory
    cd min-stretch
  4. Run setup script (this sets a config in the configs/env_vars/env_vars.yaml file)
    ./setup.sh
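
After running the setup script, it helps to know roughly what env_vars.yaml contains. The snippet below is a sketch assembled from the keys used later in this tutorial (data_root, data_original_root, wandb); the exact contents and paths on your system will differ, and the values shown are placeholders, not defaults written by setup.sh.

```
# Sketch of configs/env_vars/env_vars.yaml -- values are placeholders
data_root:
    train: /scratch/<netid>/min-stretch/<your_dataset>
    val: /scratch/<netid>/min-stretch/<your_dataset_val>

data_original_root:
    train: /path/where/data/was/collected
    val: /path/where/val/data/was/collected

wandb:
    entity: <your_wandb_username>
```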

Request Resources and Set Up Environment on Greene

  1. Enter your scratch directory
    cd $SCRATCH
  2. Request CPU resources in an interactive session
    srun --nodes=1 --tasks-per-node=1 --cpus-per-task=16 --mem=64GB --time=2:00:00 --pty /bin/bash
  3. Setup a Mamba environment
      • Copy an overlay filesystem to your Scratch:
        cp /scratch/work/public/overlay-fs-ext3/overlay-50G-10M.ext3.gz $SCRATCH/overlay-home-robot-env.ext3.gz
      • Unzip the overlay filesystem
        gunzip overlay-home-robot-env.ext3.gz
      • Note: This may take ~5 minutes
      • Enter singularity container
        singularity exec --overlay $SCRATCH/overlay-home-robot-env.ext3:rw /scratch/work/public/singularity/cuda11.8.86-cudnn8.7-devel-ubuntu22.04.2.sif /bin/bash
      • Install Mamba: Miniforge
          • Instructions from the website above pasted below for convenience:
          • curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
            bash Miniforge3-$(uname)-$(uname -m).sh
          • Install it in /ext3/miniforge3 when prompted
          • You will also be prompted with "Do you wish to update your shell profile to automatically initialize conda?". The default is no, but I recommend entering yes so that the Mamba initialization script gets added to your ~/.bashrc.
      • Enter min-stretch
        cd min-stretch
      • Create environment from config file
        mamba env create -f conda_env.yaml
      • Note: This can take ~10 minutes

Setting up the Dataset

If you'd like to follow along with an existing dataset, and don't have your own already, you can download the existing "Bag Pick Up" dataset here.


  1. Enter the min-stretch directory
    cd $SCRATCH/min-stretch
  2. Set the dataset configs in min-stretch/imitation-in-homes/configs/env_vars/env_vars.yaml to appropriate paths.
      • data_root.train: The path to the training dataset.
      • data_root.val: The path to the test dataset (if you don't have a split, you can just use the training dataset).
      • data_original_root.train: The original path to the training dataset (in the folder's r3d_files.txt file). This may be the same as data_root.train if you haven't moved the data after creating r3d_files.txt.
      • data_original_root.val: The original path to the test dataset (in the folder's r3d_files.txt file). This may be the same as data_root.val if you haven't moved the data after creating r3d_files.txt.
      • Example:
        data_root: 
            train: /scratch/hre7290/test/min-stretch/Stick_Data
            val: /scratch/hre7290/test/min-stretch/Stick_Data_val
        
        data_original_root: 
            train: /vast/hre7290/Stick_Data
            val: /vast/hre7290/Stick_Data_val
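
The data_original_root entries exist because r3d_files.txt records the paths where the data originally lived. To illustrate that mapping (this is not part of the repo's tooling), the hypothetical sketch below rewrites the stored prefix in a sample r3d_files.txt with sed, using the placeholder paths from the example above:

```shell
# Illustrative only: map the old dataset prefix in r3d_files.txt to the new one.
# Both paths are placeholders from the example config; substitute your own.
old_root="/vast/hre7290/Stick_Data"
new_root="/scratch/hre7290/test/min-stretch/Stick_Data"

# Build a sample file so nothing real is modified.
mkdir -p /tmp/r3d_demo
printf '%s\n' "$old_root/Env1/task.r3d" "$old_root/Env2/task.r3d" > /tmp/r3d_demo/r3d_files.txt

# Replace the prefix on every line ('|' as the sed delimiter, since paths contain '/').
sed -i "s|^$old_root|$new_root|" /tmp/r3d_demo/r3d_files.txt
cat /tmp/r3d_demo/r3d_files.txt
```

In practice you would leave r3d_files.txt alone and set data_original_root instead; the sed rewrite above just shows what the prefix substitution amounts to.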

Training

  1. Set wandb.entity config in min-stretch/imitation-in-homes/configs/env_vars/env_vars.yaml to your WandB username.
  2. Recommended, especially the first time: test whether the code runs on CPU before submitting a GPU job
      • Activate the Mamba environment
        mamba activate home_robot
      • Note: If this is your first time you may need to run mamba init, then exit Singularity, and then re-enter Singularity.
      • Test RVQ training
        • Set "include_task" and "wandb" configs in test_rvq_cpu.sh. If you're using the "Bag Pick Up" dataset, you can set "include_task" to "bag_pick_up" (it should be the same as the task folder name in the dataset).
        • Run the script
          ./test_rvq_cpu.sh
        • Quit the program (Ctrl+C) once the first epoch begins
      • Test VQ-BeT training
        • Set "include_task" and "wandb" configs in test_vqbet_cpu.sh.
        • Run the script
          ./test_vqbet_cpu.sh
        • Quit the program (Ctrl+C) once the first epoch begins
  3. If step 2 runs without errors, set "include_task" in train_vqbet_model.sh
  4. Submit GPU training job
    sbatch train_vqbet.slurm
  5. Note: You can see the status of your jobs with squeue -u <netid>
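
If you want to check on jobs from a script rather than by eye, you can filter the squeue output. The snippet below is a hypothetical sketch: it parses a canned sample of squeue-style output with awk (on Greene you would pipe squeue -u <netid> in directly; the job names and IDs shown are made up).

```shell
# Illustrative only: count pending vs. running jobs from squeue-style output.
# sample_squeue stands in for `squeue -u <netid>` on Greene; its rows are fabricated.
sample_squeue() {
  cat <<'EOF'
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          12345678       rtx vqbet_tr  hre7290  R      10:02      1 gr001
          12345679       rtx vqbet_tr  hre7290 PD       0:00      1 (Priority)
EOF
}

# Column 5 (ST) is the job state: R = running, PD = pending.
sample_squeue | awk 'NR > 1 { counts[$5]++ } END { printf "running=%d pending=%d\n", counts["R"], counts["PD"] }'
# -> running=1 pending=1
```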