NOLO: Navigate Only Look Once

arXiv | Website | BibTeX

0. Overview 🙌

We aim to tackle the video navigation problem: training an in-context policy that can find, in a new scene, objects that appear in a context video. After watching a 30-second egocentric video, the agent is expected to reason about how to reach the target object specified by a goal image. Please refer to our website for videos of real-world deployment.
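Concretely, an episode in this setting can be pictured as follows. This is an illustrative sketch only: the class, field names, and the 224×224 resolution are our own assumptions, not the repo's API (only the 900-frame horizon comes from the dataset layout used in this repo).

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class VideoNavEpisode:
    """Illustrative container for one video-navigation episode (hypothetical names)."""
    context_video: np.ndarray  # (T, H, W, 3): ~30 s egocentric context video
    goal_image: np.ndarray     # (H, W, 3): image specifying the target object

# The in-context policy maps (context_video, goal_image, current_obs) -> action,
# with no further training in the new scene.
episode = VideoNavEpisode(
    context_video=np.zeros((900, 224, 224, 3), dtype=np.uint8),  # 900 frames ≈ 30 s at 30 fps
    goal_image=np.zeros((224, 224, 3), dtype=np.uint8),
)
print(episode.context_video.shape)  # (900, 224, 224, 3)
```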

1. Install 🚀

1.1 Install required packages 🛰️

conda create -n nolo python=3.9
conda activate nolo
cd nolo
pip install -r docs/requirements.txt

1.2 Install RoboTHOR and Habitat 🍔

Refer to RoboTHOR to install RoboTHOR, and to Habitat-sim to install the Habitat simulator.

1.3 Download Pretrained Models 📑

  • SuperGlue: https://github.com/magicleap/SuperGluePretrainedNetwork. Place the downloaded checkpoints in scripts/superglue/weights.
  • GMFlow: https://github.com/haofeixu/gmflow. Place the downloaded checkpoints in scripts/gmflow/gmflow-pretrained.
  • Detic: https://github.com/facebookresearch/Detic. Place the whole repository in scripts/Detic.

2. Dataset 📚

1. Download Scene Dataset in Habitat 🗂️

Refer to Habitat-lab to install the Matterport3D datasets. Change the path in scripts/collect_habitat_all.py to where the dataset is stored.

2. Create Dataset 📥

$domain$ can be chosen from 'robothor' or 'habitat'.

python -m scripts.collect_$domain$_all

The generated offline datasets will be in the following structure:

offline-dataset
├── robothor-dataset
│   ├── 900
│   │   ├── train
│   │   │   ├── FloorPlan1_1
│   │   │   ├── FloorPlan1_2
│   │   │   ├── ...
│   │   ├── val
│   │   │   ├── FloorPlan1_5
│   │   │   ├── FloorPlan2_5
│   │   │   ├── ...
├── mp3d-dataset
│   ├── 900
│   │   ├── train
│   │   │   ├── 1LXtFkjw3qL
│   │   │   ├── ...
│   │   ├── val
│   │   │   ├── 2azQ1b91cZZ
│   │   │   ├── ...

3. Offline Reinforcement Learning 🎮

Train a VN-Bert policy using BCQ in 'robothor' or 'habitat'.

python -m recbert_policy.train_vnbert --exp_name bcq_rank_0.5_9_SA --domain $domain$
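In the discrete-action setting, BCQ (Batch-Constrained Q-learning) restricts the agent to actions the behavior policy assigns sufficient probability to. A generic numpy sketch of that selection rule follows; this is our own illustration of the standard algorithm, not this repo's implementation, and the 0.5 threshold is an assumption:

```python
import numpy as np

def bcq_select_action(q_values, bc_probs, tau=0.5):
    """Discrete BCQ action selection (generic sketch, not this repo's code).

    Only actions with behavior-cloning probability >= tau * max(bc_probs)
    are eligible; among those, pick the one with the highest Q-value.
    """
    eligible = bc_probs >= tau * bc_probs.max()
    masked_q = np.where(eligible, q_values, -np.inf)
    return int(np.argmax(masked_q))

q = np.array([2.0, 1.5, 0.8, 3.0])       # critic's Q-estimates
pi = np.array([0.40, 0.35, 0.05, 0.15])  # behavior policy probabilities
# Action 3 has the highest Q but is filtered out (0.15 < 0.5 * 0.40 = 0.20),
# so the constrained choice falls back to action 0.
print(bcq_select_action(q, pi, tau=0.5))  # 0
```

With tau=0 every action is eligible and the rule reduces to plain greedy Q-learning; larger tau keeps the policy closer to the dataset's action distribution.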

4. Run Evaluation! 🏆

  • Evaluate Random policy in 'robothor' or 'habitat':
bash bash/eval_$domain$_random.sh
  • Evaluate baseline LMM policy 'gpt4o' or 'videollava' in 'robothor' or 'habitat':
bash bash/eval_$domain$_baseline.sh  $baseline$

Note: provide an API key when using gpt4o for evaluation.

  • Evaluate VN-Bert policy (NOLO) in 'robothor' or 'habitat'. Ablation variants and cross-domain evaluation are also supported.
bash bash/eval_habitat_policymode.sh  "nolo-bert" $checkpoint_path$ "Q" "SA"

5. Real World Experiments! 🤖

  • Collect random RGB frames and action sequences:
python scripts/collect_maze.py
  • Decode actions from the recorded video:
python scripts/label_maze.py
  • Train a policy using BCQ in the real-world maze environment:
python -m recbert_policy.train_vnbert_real --exp_name maze
  • Evaluate the trained policy:
python -m scripts.inference_maze_transformer
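Step 2 above decodes action labels from raw video, plausibly using optical flow (GMFlow is among the pretrained models downloaded earlier). One simple labeling heuristic looks like the following; this is our own sketch under assumed sign conventions and thresholds, not the repo's actual labeling code:

```python
import numpy as np

def flow_to_action(flow, turn_thresh=1.0):
    """Map a dense optical-flow field (H, W, 2) to a discrete action label.

    Heuristic sketch: dominant horizontal flow implies the camera rotated,
    otherwise assume forward motion. Sign convention and threshold are
    assumptions, not the repo's actual labeling logic.
    """
    mean_dx = float(flow[..., 0].mean())
    if mean_dx > turn_thresh:
        return "turn_left"     # scene appears to shift right when yawing left
    if mean_dx < -turn_thresh:
        return "turn_right"
    return "move_forward"

flow = np.zeros((480, 640, 2), dtype=np.float32)
flow[..., 0] = 2.0  # uniform rightward pixel motion
print(flow_to_action(flow))  # turn_left
```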

About

[IROS 2025 oral] Official implementation of NOLO: Navigate Only Look Once
