A modular Python implementation of Visual SLAM (Simultaneous Localization and Mapping) focused on monocular visual odometry. The project processes TUM RGB-D benchmark datasets and estimates the camera trajectory using robust feature tracking and scale estimation.
- Feature-based Visual Odometry: Uses ORB features and robust matching to track camera motion
- Automatic Scale Initialization: Computes scale factor from initial frames for proper trajectory scaling
- Comprehensive Visualization: 2D/3D trajectory plots and feature matching visualizations
- TUM Dataset Integration: Full support for TUM RGB-D benchmark datasets
- Robust Error Handling: Graceful handling of insufficient features and matching failures
- Configurable via YAML: Easily customize parameters through configuration files
Example of estimated trajectory (blue) compared with ground truth (red)
.
├── config/
│ └── config.yaml # Configuration file with system parameters
├── src/
│ ├── slam/ # Main SLAM package
│ │ ├── core/ # Core SLAM components (VO, camera, frame)
│ │ ├── utils/ # Utility functions (config, data handling)
│ │ └── visualization/ # Visualization tools
│ └── main.py # Main script for running the system
├── data/
│ ├── raw/ # Raw input data (TUM RGB-D datasets)
│ └── processed/ # Processed output data (trajectories, visualizations)
├── scripts/ # Utility scripts (evaluation, dataset preparation)
├── docs/ # Documentation
├── requirements.txt # Project dependencies
└── README.md # This file
- Clone the repository:

```bash
git clone https://github.com/yourusername/visual-slam-tum.git
cd visual-slam-tum
```

- Create and activate a virtual environment (optional but recommended):

```bash
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate     # Windows
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

This system is designed to work with the TUM RGB-D benchmark datasets. To prepare the data:
- Download one of the TUM RGB-D datasets (e.g., `rgbd_dataset_freiburg2_pioneer_360`)
- Extract the dataset to `data/raw/tum_rgbd/`
- Update the dataset path in `config/config.yaml` if needed
Example directory structure:
data/raw/tum_rgbd/rgbd_dataset_freiburg2_pioneer_360/
├── rgb/ # RGB images
├── depth/ # Depth images (not used in monocular mode)
├── groundtruth.txt # Ground truth trajectory
├── rgb.txt # RGB image timestamps
└── depth.txt # Depth image timestamps
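
The `rgb.txt` and `depth.txt` index files list one `timestamp filename` pair per line, preceded by `#` comment headers. A minimal parser sketch (the project's own loader in `src/slam/utils/` may differ):

```python
def load_tum_image_list(path):
    """Parse a TUM rgb.txt/depth.txt file into (timestamp, filename) pairs."""
    entries = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment headers
            timestamp, filename = line.split()[:2]
            entries.append((float(timestamp), filename))
    return entries

# Example:
# rgb_list = load_tum_image_list(
#     "data/raw/tum_rgbd/rgbd_dataset_freiburg2_pioneer_360/rgb.txt")
```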
```bash
python src/main.py
```

This will:
- Load the dataset specified in the configuration
- Process frames sequentially to estimate camera motion
- Generate visualizations and trajectory files in the output directory
```bash
python evaluate_slam.py \
  data/processed/DATASET_NAME/trajectories/txt/estimated_trajectory.txt \
  data/processed/DATASET_NAME/trajectories/txt/groundtruth.txt \
  --gt-has-timestamps \
  --output-dir data/processed/DATASET_NAME/evaluation
```

This will:
- Align and compare the estimated trajectory with ground truth
- Calculate Absolute Pose Error (APE) and Relative Pose Error (RPE)
- Generate visualizations of trajectory comparison and error metrics
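
The alignment step is commonly implemented as a least-squares similarity transform (Umeyama alignment). The following is a sketch under that assumption, not necessarily what `evaluate_slam.py` does internally:

```python
import numpy as np

def align_umeyama(est, gt, with_scale=True):
    """Least-squares similarity alignment of `est` onto `gt` (Umeyama 1991).

    Both inputs are (N, 3) position arrays; returns (s, R, t) such that
    gt_i ≈ s * R @ est_i + t for each row. Sketch only.
    """
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g

    # Cross-covariance between the centred point sets
    cov = G.T @ E / est.shape[0]
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # handle the reflection case

    R = U @ S @ Vt
    var_e = (E ** 2).sum() / est.shape[0]  # variance of the estimated points
    s = float((D * np.diag(S)).sum() / var_e) if with_scale else 1.0
    t = mu_g - s * R @ mu_e
    return s, R, t

# Apply the transform: aligned = (s * (R @ est.T)).T + t
```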
The system is configured through `config/config.yaml`. Key parameters include:

```yaml
camera:
  # Camera intrinsics
  fx: 520.9   # Focal length x (for TUM Freiburg2)
  fy: 521.0   # Focal length y
  cx: 325.1   # Principal point x
  cy: 249.7   # Principal point y
  width: 640
  height: 480

dataset:
  path: "data/raw/tum_rgbd/rgbd_dataset_freiburg2_pioneer_360"
  max_frames: 500   # Maximum number of frames to process

feature_detection:
  n_features: 1000
  scale_factor: 1.2
  n_levels: 8

matching:
  distance_threshold: 0.75
  min_matches: 8
  matcher: "flann"  # Options: flann, bf

motion_estimation:
  min_inliers: 15
  ransac_threshold: 2.0

visualization:
  save_matches_video: true
  save_trajectory_2d: true
  save_trajectory_3d: true
```

The system generates several outputs in `data/processed/DATASET_NAME/`:
- Trajectories:
  - `trajectories/txt/estimated_trajectory.txt`: Camera trajectory in TUM format (format shown below)
  - `trajectories/txt/groundtruth.txt`: Ground truth trajectory
  - `trajectories/ply/pointcloud.ply`: Sparse point cloud of camera positions
- Visualizations:
  - `visualizations/trajectory_comparison.png`: Comparison of estimated and ground truth trajectories
  - `visualizations/trajectory_2d.png`: 2D top-down view of the trajectory
  - `visualizations/feature_matches.avi`: Video showing feature matches between frames
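
Trajectory files use the TUM format: one pose per line as `timestamp tx ty tz qx qy qz qw` (position in meters, orientation as a unit quaternion). A minimal writer sketch, assuming poses are stored as timestamp/position/quaternion tuples (hypothetical helper, not necessarily the project's API):

```python
def save_tum_trajectory(path, poses):
    """Write a trajectory in TUM format: timestamp tx ty tz qx qy qz qw.

    `poses` is an iterable of (timestamp, (tx, ty, tz), (qx, qy, qz, qw)) tuples;
    this helper and its signature are assumptions for illustration.
    """
    with open(path, "w") as f:
        for timestamp, (tx, ty, tz), (qx, qy, qz, qw) in poses:
            f.write(f"{timestamp:.6f} {tx:.6f} {ty:.6f} {tz:.6f} "
                    f"{qx:.6f} {qy:.6f} {qz:.6f} {qw:.6f}\n")
```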
- Feature Detection: ORB features are detected in each frame
- Feature Matching: Features are matched between consecutive frames
- Pose Estimation: The essential matrix is computed and decomposed to recover camera motion
- Scale Initialization: Scale is initialized from the first few frames
- Trajectory Building: Camera poses are accumulated to build the complete trajectory
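
The sketch below illustrates steps 1–3 for a single pair of frames with OpenCV, using the parameter values from `config/config.yaml`. It is a simplified stand-in for the classes in `src/slam/core/`, not their actual API:

```python
import cv2
import numpy as np

# Camera intrinsics for TUM Freiburg2 (matching config/config.yaml)
K = np.array([[520.9, 0.0, 325.1],
              [0.0, 521.0, 249.7],
              [0.0, 0.0, 1.0]])

orb = cv2.ORB_create(nfeatures=1000, scaleFactor=1.2, nlevels=8)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

def estimate_relative_pose(img_prev, img_curr, ratio=0.75, min_matches=8):
    """Return (R, t) of the current frame w.r.t. the previous one, or None on failure.

    t has unit norm; the monocular scale is resolved separately (see the next section).
    """
    # 1. Feature detection
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    if des1 is None or des2 is None:
        return None

    # 2. Feature matching with Lowe's ratio test
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < min_matches:
        return None  # not enough matches to estimate motion

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # 3. Essential matrix with RANSAC, decomposed to recover camera motion
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=2.0)
    if E is None:
        return None
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```

Each relative pose is then scaled by the initialized scale factor and chained onto the previous absolute pose to build the full trajectory (steps 4–5).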
Monocular visual odometry inherently suffers from scale ambiguity. To address this:
- The system initializes the absolute scale using ground truth in the first few frames
- This scale factor is then applied to all subsequent motion estimations
- The median scale is computed to be robust against outliers
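
A minimal sketch of the median-ratio initialization described above, assuming the first few estimated and ground-truth camera positions are available as (N, 3) arrays (hypothetical helper name):

```python
import numpy as np

def initialize_scale(est_positions, gt_positions):
    """Median ratio of ground-truth to estimated frame-to-frame translation norms."""
    est = np.asarray(est_positions, dtype=float)
    gt = np.asarray(gt_positions, dtype=float)
    est_step = np.linalg.norm(np.diff(est, axis=0), axis=1)
    gt_step = np.linalg.norm(np.diff(gt, axis=0), axis=1)
    valid = est_step > 1e-6  # ignore near-zero estimated steps
    if not np.any(valid):
        return 1.0  # fall back to unit scale
    return float(np.median(gt_step[valid] / est_step[valid]))
```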
The system uses the following metrics for evaluation:
- Absolute Pose Error (APE): Measures the global error between estimated poses and ground truth
- Relative Pose Error (RPE): Measures local accuracy and drift
- Scale Factor: The estimated scale compared to ground truth
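
Simplified, translation-only versions of these metrics are sketched below; the full evaluation also considers rotation and operates on SE(3) poses. Both helpers assume time-associated, already-aligned (N, 3) position arrays:

```python
import numpy as np

def ape_rmse(est_xyz, gt_xyz):
    """Absolute Pose Error (translation RMSE) between aligned trajectories."""
    err = np.linalg.norm(np.asarray(est_xyz) - np.asarray(gt_xyz), axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

def rpe_rmse(est_xyz, gt_xyz, delta=1):
    """Relative Pose Error: drift over a fixed frame offset `delta` (translation only)."""
    est, gt = np.asarray(est_xyz), np.asarray(gt_xyz)
    d_est = est[delta:] - est[:-delta]
    d_gt = gt[delta:] - gt[:-delta]
    err = np.linalg.norm(d_est - d_gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```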
Contributions are welcome! To contribute:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
