We are still gradually removing redundant code and refining the implementation. More documentation and examples will be added soon.
1. Install TALO (Based on VGGT-SLAM)
```bash
sudo apt-get install git python3-pip libboost-all-dev cmake gcc g++ unzip  # required by VGGT-SLAM
git clone https://github.com/TODO/talo
cd talo
conda create -n talo python=3.11
conda activate talo
./setup.sh  # installs third-party dependencies required by VGGT-SLAM
```
TALO currently supports the following 3D Vision Foundation Models (3DVFMs) as interchangeable backbones:
| Backbone | Installation |
|---|---|
| VGGT | Already installed through VGGT-SLAM |
| Pi3 | Clone the repo into the TALO directory (e.g., `TALO/Pi3/pi3`) |
| MapAnything | Install `mapanything` as a package into the created `talo` conda environment (instructions) |
It is also easy to integrate more advanced 3DVFMs: simply format the prediction as a Python dictionary containing the following keys (see `VFMs_adaptor.py` for example implementations):

`"org_images"`, `"images"`, `"cam2world"`, `"intrinsic"`, `"world_points"`, `"world_points_conf"`
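As an illustration, an adaptor for a new backbone might return a dictionary shaped like the following. The key names come from the list above; the array shapes and dtypes are assumptions for a hypothetical sequence of `N` frames at `H×W` resolution, so check `VFMs_adaptor.py` for the actual conventions:

```python
import numpy as np

N, H, W = 4, 294, 518  # hypothetical: 4 frames at 294x518

# Illustrative prediction dictionary; only the key names are taken from
# the README — shapes and dtypes here are assumptions.
prediction = {
    "org_images": np.zeros((N, H, W, 3), dtype=np.uint8),          # raw input frames
    "images": np.zeros((N, H, W, 3), dtype=np.float32),            # preprocessed frames
    "cam2world": np.tile(np.eye(4, dtype=np.float32), (N, 1, 1)),  # per-frame camera poses
    "intrinsic": np.tile(np.eye(3, dtype=np.float32), (N, 1, 1)),  # per-frame intrinsics
    "world_points": np.zeros((N, H, W, 3), dtype=np.float32),      # per-pixel 3D points
    "world_points_conf": np.zeros((N, H, W), dtype=np.float32),    # per-pixel confidence

}

expected_keys = {"org_images", "images", "cam2world", "intrinsic",
                 "world_points", "world_points_conf"}
assert set(prediction) == expected_keys
```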
TALO currently supports:
- nuScenes
- Waymo Open Dataset
- Waymo Open Dataset: download raw `.tfrecord` sequences from https://waymo.com/open/
- nuScenes: download the full dataset from https://www.nuscenes.org/download
After downloading, convert datasets using our extraction scripts:
```bash
python dataset/extract_waymo.py
python dataset/extract_nuscenes.py
```
Note that parsing Waymo requires `waymo-open-dataset-tf-2-6-0`, which depends on older package versions (e.g., NumPy 1.x) and is not compatible with the TALO environment. Please create a separate Python environment specifically for Waymo extraction.

Please modify `data_root` and `save_root` accordingly in each script.
These scripts will:
- Extract RGB images (as model input)
- Extract camera intrinsics, extrinsics, and LiDAR (used as GT for evaluation)
```
scene_dir/
    image/
        FRONT/
            000.jpg
            ...
    cam2world/
        FRONT/
            000.txt  # 4x4 matrix
            ...
        ...
    intrinsic/
        FRONT.txt  # 3x3 matrix
        ...
    lidar/
        000.bin
        ...
```
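Assuming the layout above, the converted ground-truth files can be read back with NumPy. This is a sketch: the text matrices follow the 4x4/3x3 shapes stated above, but the per-point layout of the LiDAR `.bin` dump (float32 x/y/z here) is an assumption, not confirmed by the extraction scripts:

```python
import numpy as np

def load_scene_frame(scene_dir, camera, frame_id):
    """Read one frame's GT from the converted scene layout (sketch).

    Assumes whitespace-separated 4x4 / 3x3 text matrices and a raw
    float32 LiDAR dump; the per-point layout (x, y, z) is an assumption.
    """
    cam2world = np.loadtxt(f"{scene_dir}/cam2world/{camera}/{frame_id}.txt").reshape(4, 4)
    intrinsic = np.loadtxt(f"{scene_dir}/intrinsic/{camera}.txt").reshape(3, 3)
    lidar = np.fromfile(f"{scene_dir}/lidar/{frame_id}.bin", dtype=np.float32).reshape(-1, 3)
    return cam2world, intrinsic, lidar
```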
To run the system on your own data, format it as follows:
```
custom_data/
    example_scene/
        image/
            cam0/
                000.jpg
                ...
            ...
```
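A quick sanity check of this layout can be sketched as follows. It assumes only the `image/<cam>/<frame>.jpg` convention shown above; `check_scene` is a hypothetical helper, not part of TALO:

```python
import os

def check_scene(scene_dir):
    """Verify a custom scene follows image/<cam>/<frame>.jpg (sketch)."""
    image_dir = os.path.join(scene_dir, "image")
    assert os.path.isdir(image_dir), "missing image/ directory"
    cams = sorted(os.listdir(image_dir))
    assert cams, "no camera folders under image/"
    for cam in cams:
        frames = sorted(f for f in os.listdir(os.path.join(image_dir, cam))
                        if f.endswith(".jpg"))
        assert frames, f"no .jpg frames for camera {cam}"
    return cams
```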
Then run:

```bash
python main.py --data_folder ./Data/custom_data/example_scene/ --log_path ./Save/custom_data/example_scene/VGGT+60+tps
```

for reconstruction, and

```bash
python eval_vis_pcd_traj.py --GT ./Data/custom_data/example_scene/ --pred ./Save/custom_data/example_scene/VGGT+60+tps --vis
```

for visualization.
We provide a quick-start script that runs TALO on both Waymo and nuScenes, and summarizes results as reported in the paper.
```bash
bash run.sh
```
| Argument | Description |
|---|---|
| `--data_folder` | Path to the prepared scene directory |
| `--log_path` | Directory to save logs/results |
| `--model` | Choose from {VGGT, Pi3, MapAnything} |
| `--conf_threshold` | Confidence threshold for filtering |
| `--interframe_solver_choice` | Choose from {sim3, sl4, tps} |
| `--submap_size` | Number of frames per submap |
| `--cam_num` | Number of cameras to use |
| `--disable_sky_mask` | Disable sky mask (e.g., for indoor scenes) |
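For reference, the flags above could be declared roughly as in the following `argparse` sketch. It is based only on the table; the default values are illustrative assumptions, not the actual defaults in `main.py`:

```python
import argparse

def build_parser():
    # Sketch of TALO's CLI based on the argument table; defaults are
    # illustrative assumptions, not the real values used by main.py.
    p = argparse.ArgumentParser(description="TALO (sketch of documented flags)")
    p.add_argument("--data_folder", required=True, help="Path to the prepared scene directory")
    p.add_argument("--log_path", required=True, help="Directory to save logs/results")
    p.add_argument("--model", choices=["VGGT", "Pi3", "MapAnything"], default="VGGT")
    p.add_argument("--conf_threshold", type=float, default=0.0, help="Confidence threshold for filtering")
    p.add_argument("--interframe_solver_choice", choices=["sim3", "sl4", "tps"], default="tps")
    p.add_argument("--submap_size", type=int, default=60, help="Number of frames per submap")
    p.add_argument("--cam_num", type=int, default=1, help="Number of cameras to use")
    p.add_argument("--disable_sky_mask", action="store_true", help="Disable sky mask (e.g., indoor scenes)")
    return p

args = build_parser().parse_args([
    "--data_folder", "./Data/custom_data/example_scene/",
    "--log_path", "./Save/custom_data/example_scene/VGGT+60+tps",
    "--model", "Pi3",
    "--interframe_solver_choice", "tps",
])
```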
TALO provides both online and offline visualization modes.
Enable online VGGT-SLAM visualization by adding `--vis_map` to `main.py` (inside `run.sh`).

Enable offline reconstruction visualization by adding `--vis` to `eval_vis_pcd_traj.py` (inside `run.sh`).
To ensure fair comparisons between different submap alignment methods (SL4 from VGGT-SLAM and Sim3 from VGGT-Long), TALO is built upon the same framework (VGGT-SLAM) and extended to support multi-camera settings as well as additional 3D Vision Foundation Models (3DVFMs), including VGGT, Pi3, and MapAnything. All rights to these projects remain with their respective authors.
We sincerely thank the authors and maintainers of these outstanding open-source projects. If you find TALO useful, please consider citing and starring our work, and supporting the projects that made it possible.



