dgymjol/MaskCLIP_SegFormer

Improving MaskCLIP and MaskCLIP+ with Class Weights and SegFormer

Code base: the official MaskCLIP repo and mmsegmentation.

This repository contains the implementation and results of two improved versions of MaskCLIP.

Improvement 1: a new classifier that places greater weight on the classes predicted by CLIP.
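The class-weight idea can be sketched as follows: derive image-level class weights from CLIP's own predictions and use them as a prior on the per-pixel logits. This is an illustrative NumPy sketch, not the repository's implementation; the exact weighting scheme (class frequencies passed through a temperature-`tau` softmax and added as a log-prior) is an assumption based on the description above.

```python
import numpy as np

def class_weighted_prediction(logits, tau=0.25):
    """Reweight per-pixel logits toward classes that CLIP already predicts.

    logits : (C, H, W) per-pixel class scores from MaskCLIP.
    tau    : temperature; smaller values sharpen the class weights.
    Returns the reweighted (C, H, W) logits.
    """
    num_classes = logits.shape[0]

    # Image-level evidence per class: the fraction of pixels each class wins.
    pixel_pred = logits.argmax(axis=0)                        # (H, W)
    counts = np.bincount(pixel_pred.ravel(), minlength=num_classes)
    freq = counts / counts.sum()

    # A temperature softmax turns the frequencies into class weights.
    w = np.exp(freq / tau)
    w = w / w.sum()

    # Adding log-weights acts as an image-level class prior on the logits.
    return logits + np.log(w)[:, None, None]
```

Note that a smaller `tau` (0.25, the RN50 setting in the tables below) yields a sharper prior than `tau=1` (the ViT16 setting).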

Improvement 2: the SegFormer backbone in place of DeepLabv2-ResNet101.
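In MMSegmentation this swap is a config change: the DeepLabv2 backbone and head are replaced by the MiT encoder and SegFormer head. An illustrative (not verbatim) fragment, using upstream mmsegmentation component names; the exact hyperparameters used in this repo may differ.

```python
# Illustrative mmseg-style config fragment: MiT-b5 encoder + SegFormer head.
# Component names follow upstream mmsegmentation; values are assumptions.
model = dict(
    type='EncoderDecoder',
    backbone=dict(
        type='MixVisionTransformer',      # MiT (SegFormer) encoder
        embed_dims=64,
        num_layers=[3, 6, 40, 3],         # b5 depth configuration
        init_cfg=dict(type='Pretrained',
                      checkpoint='pretrain/mit_b5_weight.pth')),
    decode_head=dict(
        type='SegformerHead',             # lightweight all-MLP decoder
        in_channels=[64, 128, 320, 512],
        channels=256,
        num_classes=59))                  # e.g. PASCAL Context (59 classes)
```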

Results of Improvement 1

MaskCLIP Performance

| MaskCLIP(RN50)            | mIoU  | config | json |
|---------------------------|-------|--------|------|
| base                      | 18.46 | config | json |
| + class weight (tau=0.25) | 20.54 | config | json |

| MaskCLIP(ViT16)           | mIoU  | config | json |
|---------------------------|-------|--------|------|
| base                      | 21.68 | config | json |
| + class weight (tau=1)    | 24.96 | config | json |

Data

MaskCLIP+ Annotation-Free Segmentation Performance

| MaskCLIP+(RN50)           | mIoU  | config | log  |
|---------------------------|-------|--------|------|
| base                      | 24.82 | config | log  |
| + class weight (tau=0.25) | 25.96 | config | json |

| MaskCLIP+(ViT16)          | mIoU  | config | log  |
|---------------------------|-------|--------|------|
| base                      | 31.56 | config | log  |
| + class weight (tau=1)    | 32.42 | config | json |

Results of Improvement 2

MaskCLIP+ Annotation-Free Segmentation Performance

| CLIP backbone  | Segmentor           | mIoU  | Total Params | config | log |
|----------------|---------------------|-------|--------------|--------|-----|
| CLIP(ResNet50) | DeepLabv2-ResNet101 | 24.82 | 156M         | config | log |
|                | SegFormer-b5        | 22.87 | 125M         | config | log |
| CLIP(ViT16)    | DeepLabv2-ResNet101 | 31.56 | 166M         | config | log |
|                | SegFormer-b5        | 33.88 | 169M         | config | log |

Data

Setup

Step 0. Create a conda environment:

bash env_install.sh

Step 1. Prepare the dataset (see dataset_prepare.md):

bash pascal_context_preparation.sh

Step 2. Download and convert the CLIP models, and prepare the text embeddings:

bash download_weights.sh

Step 3. Download the SegFormer weights pretrained on ImageNet-1K here and place them in the pretrain folder.

Step 4. Convert the pretrained MiT models to MMSegmentation style:

python tools/model_converters/mit2mmseg.py pretrain/mit_b0.pth pretrain/mit_b0_weight.pth
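Conceptually, this conversion step renames checkpoint (state-dict) keys from the official SegFormer naming to the naming MMSegmentation expects. A simplified sketch of that kind of key remapping; the regex rules below are hypothetical and are not the actual mapping in tools/model_converters/mit2mmseg.py.

```python
import re

def convert_keys(state_dict, rename_rules):
    """Remap checkpoint keys via (pattern, replacement) regex rules.

    Illustrative only: the real converter script defines its own mapping
    between official SegFormer key names and MMSegmentation key names.
    """
    converted = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, repl in rename_rules:
            new_key = re.sub(pattern, repl, new_key)
        converted[new_key] = value
    return converted

# Hypothetical example rules: route encoder keys under 'backbone.' and
# rename the classification head to the segmentation decode head.
rules = [
    (r'^head\.', 'decode_head.'),
    (r'^(patch_embed|block|norm)', r'backbone.\1'),
]
```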

MaskCLIP

Inference only (MaskCLIP extracts dense labels from a frozen CLIP, so no training is required).

Get quantitative results (mIoU):

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --eval mIoU

Get qualitative results:

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show-dir ${OUTPUT_DIR}

MaskCLIP+

MaskCLIP+ trains another segmentation model (SegFormer) with pseudo labels extracted from MaskCLIP.
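The pseudo-label step can be sketched as: the frozen MaskCLIP teacher predicts a class per pixel, and the student (SegFormer) is trained on those labels with cross-entropy, skipping unreliable pixels. A minimal NumPy illustration, not the repository's actual trainer; the confidence threshold and its value are assumptions.

```python
import numpy as np

IGNORE_INDEX = 255  # conventional 'ignore' label in mmsegmentation losses

def make_pseudo_labels(teacher_probs, conf_thresh=0.9):
    """Turn teacher softmax probabilities into hard pseudo labels.

    teacher_probs: (C, H, W) per-pixel class probabilities from MaskCLIP.
    Pixels whose max probability is below conf_thresh are marked
    IGNORE_INDEX so the student loss skips them (threshold is an assumption).
    """
    labels = teacher_probs.argmax(axis=0)       # (H, W) hard labels
    confidence = teacher_probs.max(axis=0)      # (H, W) max probability
    labels[confidence < conf_thresh] = IGNORE_INDEX
    return labels
```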

Train (please refer to train.md).

# single GPU (examples in exp_1.sh)
python tools/train.py ${CONFIG_FILE}

# multiple GPUs (examples in exp_2.sh)
bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}

Inference.

Get quantitative results (mIoU):

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --eval mIoU

Get qualitative results:

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show-dir ${OUTPUT_DIR}

Troubleshooting

Error 0. ImportError: libGL.so.1: cannot open shared object file: No such file or directory

sudo apt-get update
sudo apt-get install libgl1

Error 1. ImportError: MagickWand shared library not found.

sudo apt-get update
sudo apt-get install libmagickwand-dev

Error 2. ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.29 not found

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get install --only-upgrade libstdc++6

Citation

The code base is MaskCLIP:

@InProceedings{zhou2022maskclip,
    author = {Zhou, Chong and Loy, Chen Change and Dai, Bo},
    title = {Extract Free Dense Labels from CLIP},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2022}
}
