Visual SLAM System for TUM RGB-D Dataset

A modular Python implementation of Visual SLAM (Simultaneous Localization and Mapping) focused on monocular visual odometry. The system processes TUM RGB-D benchmark sequences to estimate the camera trajectory using robust feature tracking and scale estimation.

Features

  • Feature-based Visual Odometry: Uses ORB features and robust matching to track camera motion
  • Automatic Scale Initialization: Computes scale factor from initial frames for proper trajectory scaling
  • Comprehensive Visualization: 2D/3D trajectory plots and feature matching visualizations
  • TUM Dataset Integration: Full support for TUM RGB-D benchmark datasets
  • Robust Error Handling: Graceful handling of insufficient features and matching failures
  • Configurable via YAML: Easily customize parameters through configuration files

Results

Trajectory Comparison

Example of estimated trajectory (blue) compared with ground truth (red)

Project Structure

.
├── config/
│   └── config.yaml         # Configuration file with system parameters
├── src/
│   ├── slam/               # Main SLAM package
│   │   ├── core/           # Core SLAM components (VO, camera, frame)
│   │   ├── utils/          # Utility functions (config, data handling)
│   │   └── visualization/  # Visualization tools
│   └── main.py             # Main script for running the system
├── data/
│   ├── raw/                # Raw input data (TUM RGB-D datasets)
│   └── processed/          # Processed output data (trajectories, visualizations)
├── scripts/                # Utility scripts (evaluation, dataset preparation)
├── docs/                   # Documentation
├── requirements.txt        # Project dependencies
└── README.md               # This file

Installation

  1. Clone the repository:
git clone https://github.com/pvdsan/Visual_PySLAM_Engine.git
cd Visual_PySLAM_Engine
  2. Create and activate a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate     # Windows
  3. Install dependencies:
pip install -r requirements.txt

Dataset Preparation

This system is designed to work with the TUM RGB-D benchmark datasets. To prepare the data:

  1. Download one of the TUM RGB-D datasets (e.g., rgbd_dataset_freiburg2_pioneer_360)
  2. Extract the dataset to data/raw/tum_rgbd/
  3. Update the dataset path in config/config.yaml if needed

Example directory structure:

data/raw/tum_rgbd/rgbd_dataset_freiburg2_pioneer_360/
├── rgb/               # RGB images
├── depth/             # Depth images (not used in monocular mode)
├── groundtruth.txt    # Ground truth trajectory
├── rgb.txt            # RGB image timestamps
└── depth.txt          # Depth image timestamps
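
The rgb.txt and depth.txt files list one image per line as "timestamp filename" pairs, preceded by a few # comment lines. The project's own loaders live in src/slam/utils; the snippet below is only a minimal, self-contained sketch of how such a list can be parsed:

# Minimal sketch for parsing a TUM-format image list (rgb.txt or depth.txt);
# the project's data-handling utilities in src/slam/utils may differ.
from pathlib import Path

def read_tum_image_list(list_path):
    """Return (timestamp, image_path) tuples from an rgb.txt/depth.txt file."""
    base = Path(list_path).parent
    entries = []
    with open(list_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):   # skip header comments
                continue
            timestamp, filename = line.split()[:2]
            entries.append((float(timestamp), base / filename))
    return entries

frames = read_tum_image_list(
    "data/raw/tum_rgbd/rgbd_dataset_freiburg2_pioneer_360/rgb.txt")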

Usage

Running the SLAM System

python src/main.py

This will:

  1. Load the dataset specified in the configuration
  2. Process frames sequentially to estimate camera motion
  3. Generate visualizations and trajectory files in the output directory

Evaluating Results

python evaluate_slam.py data/processed/DATASET_NAME/trajectories/txt/estimated_trajectory.txt data/processed/DATASET_NAME/trajectories/txt/groundtruth.txt --gt-has-timestamps --output-dir data/processed/DATASET_NAME/evaluation

This will:

  1. Align and compare the estimated trajectory with ground truth (see the alignment sketch after this list)
  2. Calculate Absolute Pose Error (APE) and Relative Pose Error (RPE)
  3. Generate visualizations of trajectory comparison and error metrics
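
Step 1 relies on aligning the two trajectories before comparing them, commonly via a Umeyama-style least-squares fit over corresponding positions. The function below is a self-contained numpy sketch of that alignment idea, not necessarily how evaluate_slam.py implements it:

# Umeyama similarity alignment between corresponding (N, 3) position arrays.
# Sketch of the alignment step only; evaluate_slam.py's implementation may differ.
import numpy as np

def umeyama_align(est, gt):
    """Return scale s, rotation R, translation t so that gt_i is approximately s * R @ est_i + t."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    U, D, Vt = np.linalg.svd(G.T @ E / len(est))   # cross-covariance SVD
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # guard against reflections
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_e = (E ** 2).sum() / len(est)              # variance of the estimated positions
    s = (D * np.diag(S)).sum() / var_e             # optimal similarity scale
    t = mu_g - s * (R @ mu_e)
    return s, R, t

# est_aligned = s * est @ R.T + t can then be compared point-wise against gt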

Configuration

The system is configured through config/config.yaml. Key parameters include:

camera:
  # Camera intrinsics
  fx: 520.9      # Focal length x (for TUM Freiburg2)
  fy: 521.0      # Focal length y
  cx: 325.1      # Principal point x
  cy: 249.7      # Principal point y
  width: 640
  height: 480

dataset:
  path: "data/raw/tum_rgbd/rgbd_dataset_freiburg2_pioneer_360"
  max_frames: 500  # Maximum number of frames to process

feature_detection:
  n_features: 1000
  scale_factor: 1.2
  n_levels: 8

matching:
  distance_threshold: 0.75
  min_matches: 8
  matcher: "flann"  # Options: flann, bf
 
motion_estimation:
  min_inliers: 15
  ransac_threshold: 2.0
  
visualization:
  save_matches_video: true
  save_trajectory_2d: true
  save_trajectory_3d: true
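
For reference, these parameters can be read with a standard YAML loader. A minimal sketch, assuming PyYAML is available (the project's own config utilities live in src/slam/utils):

# Minimal config-loading sketch; assumes PyYAML is installed.
import yaml

with open("config/config.yaml") as f:
    cfg = yaml.safe_load(f)

fx, fy = cfg["camera"]["fx"], cfg["camera"]["fy"]
cx, cy = cfg["camera"]["cx"], cfg["camera"]["cy"]
print(f"Intrinsics: fx={fx}, fy={fy}, cx={cx}, cy={cy}")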

Output

The system generates several outputs in data/processed/DATASET_NAME/:

  • Trajectories:

    • trajectories/txt/estimated_trajectory.txt: Camera trajectory in TUM format (see the format sketch after this list)
    • trajectories/txt/groundtruth.txt: Ground truth trajectory
    • trajectories/ply/pointcloud.ply: Sparse point cloud of camera positions
  • Visualizations:

    • visualizations/trajectory_comparison.png: Comparison of estimated and ground truth trajectories
    • visualizations/trajectory_2d.png: 2D top-down view of trajectory
    • visualizations/feature_matches.avi: Video showing feature matches between frames
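
The TUM trajectory format used for these files stores one pose per line as "timestamp tx ty tz qx qy qz qw" (translation plus orientation quaternion). A minimal writer sketch; the project's own exporter may format things differently:

# Minimal TUM-format trajectory writer sketch; the project's exporter may differ.
def write_tum_trajectory(path, poses):
    """poses: iterable of (timestamp, (tx, ty, tz), (qx, qy, qz, qw)) tuples."""
    with open(path, "w") as f:
        for timestamp, t, q in poses:
            f.write(f"{timestamp:.6f} {t[0]:.6f} {t[1]:.6f} {t[2]:.6f} "
                    f"{q[0]:.6f} {q[1]:.6f} {q[2]:.6f} {q[3]:.6f}\n")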

Implementation Details

Visual Odometry Process

  1. Feature Detection: ORB features are detected in each frame
  2. Feature Matching: Features are matched between consecutive frames
  3. Pose Estimation: The essential matrix is computed and decomposed to recover camera motion (a minimal OpenCV sketch of steps 1-3 follows this list)
  4. Scale Initialization: Scale is initialized from the first few frames
  5. Trajectory Building: Camera poses are accumulated to build the complete trajectory
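
The two-frame core of steps 1-3 can be expressed directly with OpenCV. This is a minimal sketch with placeholder image paths and a brute-force matcher; the project's implementation adds FLANN matching, ratio and inlier thresholds, and scale handling as configured in config.yaml:

# Minimal two-frame sketch of feature detection, matching, and pose recovery.
# Image paths are placeholders; the project's pipeline differs in detail.
import cv2
import numpy as np

K = np.array([[520.9, 0.0, 325.1],
              [0.0, 521.0, 249.7],
              [0.0, 0.0, 1.0]])                 # TUM Freiburg2 intrinsics from config.yaml

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000, scaleFactor=1.2, nlevels=8)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Essential matrix with RANSAC, then decomposition into rotation R and
# unit-scale translation t (hence the need for scale initialization below).
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=2.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)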

Scale Estimation

Monocular visual odometry inherently suffers from scale ambiguity. To address this:

  1. The absolute scale is initialized from ground-truth poses over the first few frames
  2. This scale factor is then applied to all subsequent motion estimates
  3. The median of the per-frame scale ratios is used, which makes the initialization robust to outliers (see the sketch after this list)
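
A minimal sketch of this idea, comparing ground-truth step lengths against the unit-scale estimated step lengths and taking the median ratio (variable names are illustrative, not the project's API):

# Sketch of median-based scale initialization from ground truth.
import numpy as np

def initialize_scale(gt_positions, est_positions, n_init=10):
    """gt_positions, est_positions: (N, 3) arrays of positions for the first frames."""
    ratios = []
    for i in range(1, min(n_init, len(gt_positions), len(est_positions))):
        gt_step = np.linalg.norm(gt_positions[i] - gt_positions[i - 1])
        est_step = np.linalg.norm(est_positions[i] - est_positions[i - 1])
        if est_step > 1e-6:                     # skip near-zero estimated motion
            ratios.append(gt_step / est_step)
    return float(np.median(ratios)) if ratios else 1.0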

Evaluation Metrics

The system uses the following metrics for evaluation:

  1. Absolute Pose Error (APE): Measures the global error between estimated poses and ground truth
  2. Relative Pose Error (RPE): Measures local accuracy and drift
  3. Scale Factor: The estimated scale compared to ground truth
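
For the translational part, both APE and RPE reduce to RMS errors over position differences. A simplified sketch over already-aligned (N, 3) position arrays; evaluate_slam.py additionally handles alignment and visualization:

# Simplified translational APE/RPE sketch over aligned position arrays.
import numpy as np

def ape_rmse(est, gt):
    """Root-mean-square absolute translational error between aligned trajectories."""
    return float(np.sqrt(np.mean(np.sum((est - gt) ** 2, axis=1))))

def rpe_rmse(est, gt, delta=1):
    """Root-mean-square error of relative displacements over a frame step delta."""
    d_est = est[delta:] - est[:-delta]
    d_gt = gt[delta:] - gt[:-delta]
    return float(np.sqrt(np.mean(np.sum((d_est - d_gt) ** 2, axis=1))))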

Contributing

Contributions are welcome! To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
