We present VenusWSL, a weakly supervised learning framework that addresses label noise in protein property prediction by leveraging a Gaussian Mixture Model to separate clean and noisy labels, enabling more robust training through a teacher-student approach.
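
As an illustration of the clean/noisy separation idea, here is a minimal sketch of one common approach: fit a two-component Gaussian mixture to per-sample training losses and treat the lower-mean component as the clean one. This is an assumption about the general technique, not a description of VenusWSL's actual code. The function name `fit_two_gaussians`, the choice of per-sample losses as the input, and the 0.5 threshold in the usage example are all hypothetical.

```python
import math
import random


def fit_two_gaussians(losses, n_iter=50, eps=1e-6):
    """Fit a 1-D two-component Gaussian mixture with EM.

    Returns, for each loss, the posterior probability of belonging to the
    lower-mean component (assumed here to correspond to clean labels).
    """
    lo, hi = min(losses), max(losses)
    means = [lo, hi]  # initialise the components at the extremes
    var0 = max(((hi - lo) / 4) ** 2, eps)
    variances = [var0, var0]
    weights = [0.5, 0.5]

    def pdf(x, mean, var):
        return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

    resp = [[0.5, 0.5] for _ in losses]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in losses:
            p = [weights[k] * pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or eps  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate mean, variance, and weight of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp) + eps
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(var, eps)
            weights[k] = nk / len(losses)

    clean = 0 if means[0] <= means[1] else 1
    return [r[clean] for r in resp]


if __name__ == "__main__":
    # Synthetic losses only: low for "clean" samples, high for "noisy" ones.
    random.seed(0)
    losses = [random.gauss(0.2, 0.05) for _ in range(200)]
    losses += [random.gauss(1.5, 0.3) for _ in range(50)]
    p_clean = fit_two_gaussians(losses)
    kept = [i for i, p in enumerate(p_clean) if p > 0.5]  # hypothetical threshold
    print(f"kept {len(kept)} of {len(losses)} samples as likely clean")
```

In a teacher-student setup, probabilities like these could be used to decide which labels the student trusts directly and which it replaces or down-weights with teacher predictions; how VenusWSL does this is defined in the repository's training scripts.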

Please make sure you have installed Anaconda3 or Miniconda3.
conda env create -f environment.yaml
conda activate protein

We recommend using a GPU with at least 12GB memory.
bash script/get_plm_embed.sh
bash script/prepare_dataset.sh
bash script/train.sh

If you find this work useful, please cite:

This project is licensed under the terms of the CC-BY-NC-ND-4.0 license.