University of Minnesota¹, Google DeepMind²
This repository contains code for data collection, data augmentation, policy training, and policy evaluation on our object assembly task. Our work builds on and borrows code from the Robosuite and Robomimic repositories, and we are grateful for their open-source resources.
Setup
Data Collection
Policy Training
Policy Evaluation
After cloning this repository into your environment, install the required dependencies in a Conda/Mamba environment:
cd auginsert
# Can also use conda in place of mamba
mamba env create -f environment.yml
Afterwards, activate the created Python environment:
# Can also use conda in place of mamba
mamba activate capthebottle
Finally, install the locally modified robomimic repository in the existing Mamba environment:
pip install -e .
To collect human expert demonstrations in the no-variations Canonical environment, run:
python ctb_env/human_demo.py [--record] [--dual_arm] [--name NAME]
- --record: If provided, records successful trajectories in a .hdf5 file
- --dual_arm: If provided, enables dual-arm control (6-dimensional actions) (not used in the paper)
- --name NAME: If --record is provided, determines the name of the .hdf5 file created
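For example, a typical recording session might look like the command below; the output name demo_test is just an illustration, not a file shipped with the repository:

# Example only: demo_test is a placeholder output name
python ctb_env/human_demo.py --record --name demo_test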
The environment will automatically terminate and reset upon a successful demonstration. If you want to skip the current demonstration, press q; the simulation will reset without recording the current trajectory. To end the collection process, use Ctrl+C (note that the output .hdf5 file is updated after each new demonstration you record).
After recording the dataset, you must convert it into a format compatible with the Robomimic training framework (done in-place):
python robomimic/scripts/conversion/convert_robosuite.py --dataset PATH_TO_DATASET
If you would like to skip the demo recording process, we also provide our own set of 57 human expert trajectories (50 for training, 7 for validation) from which simulation datasets can be generated. This dataset can be found in ctb_data/datasets/demo_exp.hdf5. These trajectories were used in the experiments reported in our paper, so they are provided for reproducibility purposes.
Once a dataset of human expert trajectories is collected, we need to extract observations from these trajectories to create a dataset compatible with our training pipeline. This is also the step where online augmentation with subsets of our task variations can be applied. We do this using the robomimic/scripts/ctb_trajectory_cloning.py script. For example commands, refer to Step 1 of the pipeline_helper.sh file.
The full list of arguments for applying task variations to the environment is provided below; note that these arguments also apply to the policy evaluation script (discussed in a later section), and an example command combining several of them follows the list:
- --obj_vars: Grasp Pose variations. Can be a subset of [xt, zt, yr, zr] (default None)
- --obj_shape_vars: Peg/Hole Shape variations. Can be a subset of [arrow, line, pentagon, hexagon, diamond, u, key, cross, circle] (default key)
- --obj_body_shape_vars: Object Body Shape variations. Can be a subset of [cube, cylinder, octagonal, cube-thin, cylinder-thin, octagonal-thin] (default cube)
- --visual_vars: Scene Appearance and Camera Pose variations. Can be a subset of [lighting, texture, camera, arena-train, arena-eval] (note that either arena-train or arena-eval can be provided, but not both) (default None)
- --ft_noise_std: Force-Torque Noise (part of Sensor Noise) variations. Given in the form FORCE_STD TORQUE_STD, the standard deviations of Gaussian noise added to the corresponding input dimensions (default 0.0 0.0)
- --prop_noise_std: Proprioceptive Noise (part of Sensor Noise) variations. Given in the form POSITION_STD ROTATION_STD, the standard deviations of Gaussian noise added to the corresponding input dimensions (default 0.0 0.0)
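As a rough sketch of how these variation flags compose, an extraction-plus-augmentation run could look like the command below. The --dataset and --output arguments and the space-separated value format are assumptions for illustration; refer to Step 1 of pipeline_helper.sh for the exact interface used in this repository.

# Sketch only: --dataset/--output flags and value formatting are assumptions
python robomimic/scripts/ctb_trajectory_cloning.py \
    --dataset ctb_data/datasets/demo_exp.hdf5 \
    --output ctb_data/datasets/demo_exp_aug.hdf5 \
    --obj_vars xt zt \
    --visual_vars lighting texture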
After extracting observations, you can visualize the collected observations using the robomimic/scripts/ctb_visualize_dataset.py script. For an example command, refer to Step 1.5 of the pipeline_helper.sh file.
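An invocation along the following lines is a reasonable sketch; the --dataset flag and the dataset path are assumptions, so consult Step 1.5 of pipeline_helper.sh for the exact command:

# Sketch only: the --dataset flag and path are assumptions
python robomimic/scripts/ctb_visualize_dataset.py --dataset ctb_data/datasets/demo_exp_aug.hdf5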
To train a policy using a collected dataset of extracted observations, use the robomimic/scripts/train.py script. We have provided the configs/ctb_base.json file containing training parameters that can be modified. For an example command, refer to Step 2 of the pipeline_helper.sh file.
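A typical run follows the standard Robomimic pattern of pointing train.py at a config file, as sketched below; treat this as an approximation and check Step 2 of pipeline_helper.sh for the arguments actually used:

# Sketch of a training run; see pipeline_helper.sh for the exact arguments
python robomimic/scripts/train.py --config configs/ctb_base.json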
To evaluate a trained policy on a subset of task variations, use the robomimic/scripts/ctb_rollout.py script. For an example command, refer to Step 3 of the pipeline_helper.sh file. Note that the task variation arguments detailed in the Extracting Observations section also apply here.
You can also visualize attention maps during rollouts (as was done in a supplemental experiment shown on the website) by adding the --visualize_attns flag.
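Putting the pieces together, an evaluation run with attention visualization might look like the sketch below. The --agent checkpoint flag and its path are assumptions (Robomimic-style), while the variation flags and --visualize_attns are documented above; Step 3 of pipeline_helper.sh has the exact command.

# Sketch only: the --agent flag and checkpoint path are assumptions
python robomimic/scripts/ctb_rollout.py \
    --agent path/to/model.pth \
    --obj_vars xt zt \
    --visual_vars lighting camera \
    --visualize_attns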
If you find this project useful, consider citing our work:
@inproceedings{diaz2025auginsert,
title={AugInsert: Learning Robust Visual-Force Policies via Data Augmentation for Object Assembly Tasks},
author={Diaz, Ryan and Imdieke, Adam and Veeriah, Vivek and Desingh, Karthik},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2025}
}