
EmbodiedUNIT: Embodied Multi-task Learning for Visual Navigation

Requirements

The source code was developed and tested with the following setup:

  • Python 3.7
  • PyTorch 1.7.1
  • habitat-sim 0.2.1
  • habitat 0.2.1

Please refer to habitat-sim and habitat-lab for installation instructions.

To install requirements:

pip install -r requirements.txt
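
After installation, a quick sanity check (a minimal sketch) confirms that the tested versions are in place:

# Sanity check (sketch): confirm the tested versions are installed.
import torch
import habitat
import habitat_sim

print(torch.__version__)        # expected: 1.7.1
print(habitat.__version__)      # expected: 0.2.1
print(habitat_sim.__version__)  # expected: 0.2.1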

Data Setup

Scene Datasets

The scene datasets used for training should be organized as follows:

Any path
  └── data
      └── scene_datasets
          ├── gibson_habitat
          │   └── *.glb, *.navmesh
          └── mp3d
              ├── 1LXtFkjw3qL
              │   ├── 1LXtFkjw3qL.glb
              │   └── 1LXtFkjw3qL.navmesh
              └── ... (other scenes)
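
To verify this layout, here is a minimal sketch (the data root below is a placeholder; substitute your own path):

# Sketch: check that each MP3D scene folder has its .glb and .navmesh.
from pathlib import Path

data_root = Path("path/to/data")  # placeholder; use your own data root
for scene_dir in sorted((data_root / "scene_datasets" / "mp3d").iterdir()):
    if not scene_dir.is_dir():
        continue
    for ext in (".glb", ".navmesh"):
        f = scene_dir / (scene_dir.name + ext)
        if not f.exists():
            print("missing:", f)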

Then modify the task configuration file as follows so that the Habitat simulator can load these datasets:

For example, in objectnav_mp3d_il.yaml:
SCENES_DIR: "path/to/data/scene_datasets"

Training and Evaluation Dataset Setup

Any path
  └── data
      ├── datasets
      │   ├── pointnav
      │   │   └── gibson
      │   │       └── v1
      │   │           ├── train
      │   │           └── val
      │   ├── objectnav
      │   │   └── mp3d
      │   │       └── mp3d_70k
      │   │           ├── train
      │   │           │   ├── train.json.gz
      │   │           │   └── content
      │   │           │       └── 1LXtFkjw3qL.json.gz
      │   │           ├── val
      │   │           └── sample
      │   ├── imagenav
      │   │   └── gibson
      │   │       └── v1
      │   │           ├── train
      │   │           └── val
      │   └── VLN
      │       └── VLNCE_R2R
      │           ├── sample
      │           ├── train
      │           └── val
      └── scene_datasets

Then modify the task configuration file as follows so that the Habitat simulator can load these datasets:

For example, in objectnav_mp3d_il.yaml:
DATA_PATH: "path/to/data/datasets/objectnav/mp3d/mp3d_70k/{split}/{split}.json.gz"
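
To confirm a split is readable before training, here is a minimal sketch (the path is a placeholder) that counts its episodes:

# Sketch: count episodes in one ObjectNav split file.
import gzip
import json

path = "path/to/data/datasets/objectnav/mp3d/mp3d_70k/train/train.json.gz"
with gzip.open(path, "rt") as f:
    data = json.load(f)
print(len(data.get("episodes", [])), "episodes")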

ImageNav

The ImageNav datasets originate from the Habitat-Web human demonstration dataset, which was originally collected for ObjectNav. We convert each episode's target from an object category to an image to generate the ImageNav training data.
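
A purely illustrative sketch of that conversion, assuming a habitat_sim.Simulator with an RGB sensor already loaded with the episode's scene; the function name and episode fields follow habitat-lab's ObjectNav schema, and this is not the repo's actual conversion code:

# Illustrative only: render an RGB frame at an ObjectNav goal viewpoint
# and use it as the ImageNav target image.
def episode_to_image_goal(sim, episode):
    # Pose of a valid viewpoint facing the goal object.
    view_point = episode.goals[0].view_points[0].agent_state
    agent = sim.get_agent(0)
    state = agent.get_state()
    state.position = view_point.position
    state.rotation = view_point.rotation
    agent.set_state(state)
    obs = sim.get_sensor_observations()
    return obs["rgb"]  # this image replaces the object-category goal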

ObjectNav

We use the Habitat-Web 70k demonstrations for training and the official ObjectNav dataset for evaluation.

Training

Model Definition

(1) Baseline Model Architecture

The policy used in this project is the CNNRNN model from the Habitat-Web paper, adapted here for ImageNav and VLN.

You can find the pretrained RedNet semantic segmentation model weights here and the pretrained depth encoder weights here.

Please modify SEMANTIC_ENCODER.rednet_ckpt and DEPTH_ENCODER.ddppo_checkpoint in the config accordingly.

(2) Define your own model

Every navigation policy must be defined in custom_habitat_baselines/il/env_based/policy and must contain a method with the signature def act(self, *args).

To specify the policy class you will use, modify the POLICY: *** entry in your model configuration file in the configs directory.
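
A minimal sketch of such a policy, assuming the conventions above (the class name, constructor arguments, and act() inputs are illustrative, not the repo's actual interface):

# Minimal policy sketch; place it under
# custom_habitat_baselines/il/env_based/policy and point POLICY at it.
import torch
import torch.nn as nn

class MyNavPolicy(nn.Module):  # hypothetical name
    def __init__(self, obs_dim, num_actions, hidden_size=512):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size)
        self.action_head = nn.Linear(hidden_size, num_actions)

    def act(self, observations, rnn_hidden_states, prev_actions, masks):
        # Encode observations, update the recurrent state, sample an action.
        x = torch.relu(self.encoder(observations))
        x, rnn_hidden_states = self.rnn(x.unsqueeze(0), rnn_hidden_states)
        logits = self.action_head(x.squeeze(0))
        actions = torch.distributions.Categorical(logits=logits).sample()
        return actions, rnn_hidden_states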

Simulator Settings

The navigation environment is defined in custom_habitat_baselines/common/environments.py.

The agent's sensors, measures, actions, tasks, and goals are defined in ./custom_habitat/tasks/nav/nav.py.
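
For instance, new measures are typically added there through habitat's registry; here is a sketch assuming ./custom_habitat mirrors upstream habitat-lab conventions (the measure itself is hypothetical):

# Sketch: registering a hypothetical custom measure, assuming
# ./custom_habitat mirrors upstream habitat-lab's registry and Measure API.
from custom_habitat.core.embodied_task import Measure
from custom_habitat.core.registry import registry

@registry.register_measure
class StepCount(Measure):
    def _get_uuid(self, *args, **kwargs):
        return "step_count"

    def reset_metric(self, *args, **kwargs):
        self._metric = 0

    def update_metric(self, *args, **kwargs):
        self._metric += 1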

Training Pipeline

The imitation learning pipeline is defined in ./custom_habitat_baselines/il/env_based/il_trainer.py.

Use this command to train an agent for ImageNav:

python run.py --cfg ./configs/ImageNav/CNNRNN/CNNRNN_woPose_envbased.yaml --split train [--debug 1]

Use this command to train an agent for ObjectNav:

python run.py --cfg ./configs/ObjectNav/CNNRNN/Objnav_wopano.yaml --split train

Evaluation

Use this command to evaluate an agent for ObjectNav:

python run.py --cfg ./configs/ObjectNav/CNNRNN/Objnav_wopano.yaml --run-type eval --split val --ckpt path/to/ckpt

Pre-trained Models

TODO

Results

TODO
