🦾 ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
Guanxing Lu, Shiyi Zhang, Ziwei Wang, Changliu Liu, Jiwen Lu, Yansong Tang
[Project Page] | [Paper]
ManiGaussian is an end-to-end behavior cloning agent that learns to perform various language-conditioned robotic manipulation tasks, which consists of a dynamic Gaussian Splatting framework and a Gaussian world model to model scene-level spatiotemporal dynamics. The dynamic Gaussian Splatting framework models the propagation of semantic features in the Gaussian embedding space for manipulation, and the Gaussian world model parameterizes distributions to provide supervision by reconstructing the future scene.
- Release pretrained checkpoints.
- Provide a Dockerfile for installation.
NOTE: ManiGaussian is mainly built upon the GNFactor repo by Ze et al.
See INSTALL.md for installation instructions.
See ERROR_CATCH.md for error catching.
The following steps are structured in order.
To generate demonstrations for all 10 tasks we use in our paper, run:
bash scripts/gen_demonstrations_all.sh
We use wandb to log some curves and visualizations. Login to wandb before running the scripts.
wandb login
To train our ManiGaussian without semantic features and deformation predictor (the fastest version), run:
bash scripts/train_and_eval_w_geo.sh ManiGaussian_BC 0,1 12345 ${exp_name}
where the exp_name
can be specified as you like. You can also train other baselines such as GNFACTOR_BC
and PERACT_BC
.
To train our ManiGaussian without semantic features, run:
bash scripts/train_and_eval_w_geo_dyna.sh ManiGaussian_BC 0,1 12345 ${exp_name}
To train our ManiGaussian without deformation predictor, run:
bash scripts/train_and_eval_w_geo_sem.sh ManiGaussian_BC 0,1 12345 ${exp_name}
To train our vanilla ManiGaussian, run:
bash scripts/train_and_eval_w_geo_sem_dyna.sh ManiGaussian_BC 0,1 12345 ${exp_name}
- We train our ManiGaussian on two NVIDIA RTX 4090 GPUs for <2 days.
- If there is an unexpected crash, enter 'q' to exit the python debugger (pdb) smoothly.
To evaluate the checkpoint, you can use:
bash scripts/eval.sh ManiGaussian_BC ${exp_name} 0
NOTE: The performances on push_buttons
and stack_blocks
may fluctuate slightly due to different variations.
After evaluation, the following command is used to compute the average success rates. For example, to compute the average success rate of our provided csv files, run:
python scripts/compute_results.py --file_paths ManiGaussian_results/w_geo/0.csv ManiGaussian_results/w_geo/1.csv ManiGaussian_results/w_geo/2.csv --method last
Checkpoint.
After downloading it to your logs/
folder, you can run the following command to check the result:
python scripts/compute_results.py --file_paths logs/gs_rgb_emb_001_dyna_01_0305/seed0/eval_data.csv --method last
This repository is released under the MIT license.
Our code is built upon GNFactor, LangSplat, GPS-Gaussian, splatter-image, PerAct, RLBench, pixelNeRF, ODISE, and CLIP. We thank all these authors for their nicely open sourced code and their great contributions to the community.
If you find this repository helpful, please consider citing:
@article{lu2024manigaussian,
title={ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation},
author={Lu, Guanxing and Zhang, Shiyi and Wang, Ziwei and Liu, Changliu and Lu, Jiwen and Tang, Yansong},
journal={arXiv preprint arXiv:2403.08321},
year={2024}
}