Policy Learning

On this page you'll learn to train a policy on your processed data (or on an existing dataset) through Behavior Cloning (BC), using an algorithm such as VQ-BeT. If you'd like to learn more about supervised policy learning, this tutorial serves as a nice introduction.

Preliminary Steps

  1. SSH into Greene. I recommend doing so using the VS Code IDE. If it's your first time, you can see instructions for setting up SSH on VS Code here.
    ssh netid@greene.hpc.nyu.edu
    Note: You'll need to either be on NYU's network or connected to the NYU VPN to SSH into Greene. Your password is your standard NYU password.
  2. Optional, but highly recommended: sign up for WandB for logging your training runs.

Clone and Enter Repository in Greene

  1. In your terminal, enter the scratch directory
    cd $SCRATCH
  2. Clone
    git clone https://github.com/NYU-robot-learning/min-stretch.git
  3. Change directory
    cd min-stretch
  4. Run setup script (this sets a config in the configs/env_vars/env_vars.yaml file)
    ./setup.sh
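
After running the setup script, it helps to know roughly what env_vars.yaml contains. The snippet below is a sketch assembled from the keys used later in this tutorial (data_root, data_original_root, wandb); the exact contents and paths on your system will differ, and the values shown are placeholders, not defaults written by setup.sh.

```
# Sketch of configs/env_vars/env_vars.yaml -- values are placeholders
data_root:
    train: /scratch/<netid>/min-stretch/<your_dataset>
    val: /scratch/<netid>/min-stretch/<your_dataset_val>

data_original_root:
    train: /path/where/data/was/collected
    val: /path/where/val/data/was/collected

wandb:
    entity: <your_wandb_username>
```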

Request Resources and Set Up Environment on Greene

  1. Enter your scratch directory
    cd $SCRATCH
  2. Request CPU resources in an interactive session
    srun --nodes=1 --tasks-per-node=1 --cpus-per-task=16 --mem=64GB --time=2:00:00 --pty /bin/bash
  3. Setup a Mamba environment
      • Copy an overlay filesystem to your Scratch:
        cp /scratch/work/public/overlay-fs-ext3/overlay-50G-10M.ext3.gz $SCRATCH/overlay-home-robot-env.ext3.gz
      • Unzip the overlay filesystem
        gunzip overlay-home-robot-env.ext3.gz
      • Note: This may take ~5 minutes
      • Enter singularity container
        singularity exec --overlay $SCRATCH/overlay-home-robot-env.ext3:rw /scratch/work/public/singularity/cuda11.8.86-cudnn8.7-devel-ubuntu22.04.2.sif /bin/bash
      • Install Mamba: Miniforge
          • Instructions from the website above pasted below for convenience:
          • curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
            bash Miniforge3-$(uname)-$(uname -m).sh
          • Install it in /ext3/miniforge3 when prompted
          • You will also be prompted with "Do you wish to update your shell profile to automatically initialize conda?". The default is no, but I recommend entering yes so that the Mamba initialization script gets added to your ~/.bashrc.
      • Enter min-stretch
        cd min-stretch
      • Create environment from config file
        mamba env create -f conda_env.yaml
      • Note: This can take ~10 minutes

Setting up the Dataset

If you'd like to follow along with an existing dataset, and don't have your own already, you can download the existing "Bag Pick Up" dataset here.


  1. Enter the min-stretch directory
    cd $SCRATCH/min-stretch
  2. Set the dataset configs in min-stretch/imitation-in-homes/configs/env_vars/env_vars.yaml to appropriate paths.
      • data_root.train: The path to the training dataset.
      • data_root.val: The path to the test dataset (if you don't have a split, you can just use the training dataset).
      • data_original_root.train: The original path to the training dataset (in the folder's r3d_files.txt file). This may be the same as data_root.train if you haven't moved the data after creating r3d_files.txt.
      • data_original_root.val: The original path to the test dataset (in the folder's r3d_files.txt file). This may be the same as data_root.val if you haven't moved the data after creating r3d_files.txt.
      • Example:
        data_root: 
            train: /scratch/hre7290/test/min-stretch/Stick_Data
            val: /scratch/hre7290/test/min-stretch/Stick_Data_val
        
        data_original_root: 
            train: /vast/hre7290/Stick_Data
            val: /vast/hre7290/Stick_Data_val
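
The data_original_root entries exist because r3d_files.txt records the paths where the data originally lived. To illustrate that mapping (this is not part of the repo's tooling), the hypothetical sketch below rewrites the stored prefix in a sample r3d_files.txt with sed, using the placeholder paths from the example above:

```shell
# Illustrative only: map the old dataset prefix in r3d_files.txt to the new one.
# Both paths are placeholders from the example config; substitute your own.
old_root="/vast/hre7290/Stick_Data"
new_root="/scratch/hre7290/test/min-stretch/Stick_Data"

# Build a sample file so nothing real is modified.
mkdir -p /tmp/r3d_demo
printf '%s\n' "$old_root/Env1/task.r3d" "$old_root/Env2/task.r3d" > /tmp/r3d_demo/r3d_files.txt

# Replace the prefix on every line ('|' as the sed delimiter, since paths contain '/').
sed -i "s|^$old_root|$new_root|" /tmp/r3d_demo/r3d_files.txt
cat /tmp/r3d_demo/r3d_files.txt
```

In practice you would leave r3d_files.txt alone and set data_original_root instead; the sed rewrite above just shows what the prefix substitution amounts to.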

Training

  1. Set wandb.entity config in min-stretch/imitation-in-homes/configs/env_vars/env_vars.yaml to your WandB username.
  2. Recommended, especially the first time: test whether the code runs on CPU before submitting a GPU job
      • Activate the Mamba environment
        mamba activate home_robot
      • Note: If this is your first time you may need to run mamba init, then exit Singularity, and then re-enter Singularity.
      • Test RVQ training
        • Set "include_task" and "wandb" configs in test_rvq_cpu.sh. If you're using the "Bag Pick Up" dataset, you can set "include_task" to "bag_pick_up" (it should be the same as the task folder name in the dataset).
        • Run the script
          ./test_rvq_cpu.sh
        • Quit the program (Ctrl+C) once the first epoch begins
      • Test VQ-BeT training
        • Set "include_task" and "wandb" configs in test_vqbet_cpu.sh.
        • Run the script
          ./test_vqbet_cpu.sh
        • Quit the program (Ctrl+C) once the first epoch begins
  3. If step 2 runs without errors, set "include_task" in train_vqbet_model.sh
  4. Submit GPU training job
    sbatch train_vqbet.slurm
  5. Note: You can see the status of your jobs with squeue -u <netid>
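
If you want to check on jobs from a script rather than by eye, you can filter the squeue output. The snippet below is a hypothetical sketch: it parses a canned sample of squeue-style output with awk (on Greene you would pipe squeue -u <netid> in directly; the job names and IDs shown are made up).

```shell
# Illustrative only: count pending vs. running jobs from squeue-style output.
# sample_squeue stands in for `squeue -u <netid>` on Greene; its rows are fabricated.
sample_squeue() {
  cat <<'EOF'
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          12345678       rtx vqbet_tr  hre7290  R      10:02      1 gr001
          12345679       rtx vqbet_tr  hre7290 PD       0:00      1 (Priority)
EOF
}

# Column 5 (ST) is the job state: R = running, PD = pending.
sample_squeue | awk 'NR > 1 { counts[$5]++ } END { printf "running=%d pending=%d\n", counts["R"], counts["PD"] }'
# -> running=1 pending=1
```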