Updated on 2024.07.24
This repository provides the official implementation of Prime (Protein language model for Intelligent Masked pretraining and Environment (temperature) prediction).
Key features:
- Zero-shot mutant effect prediction
- OGT prediction
Pro-Prime is a protein language model for predicting the Optimal Growth Temperature (OGT) and for zero-shot prediction of protein thermostability and activity. It leverages temperature-guided language modeling.
Main Requirements
biopython==1.81
torch (2.4)
Installation
pip install -r requirements.txt
https://drive.google.com/file/d/1AEpK3TmgFNszZXJQWwRPkHUugrdHrTgk/view?usp=sharing
- To run the ProteinGym benchmark or zero-shot mutant effect prediction, see this notebook.
- For OGT prediction, see this notebook.
- For Tm prediction, see this notebook.
- For Topt prediction, see this notebook.
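For readers unfamiliar with zero-shot mutant effect prediction, the sketch below shows one common way masked language models are used for it: a mutant is scored by summing, over its substitutions, the log-probability of the mutant residue minus that of the wild-type residue. This is an illustration only and is not taken from this repository. The function names, the `A2V:K3R` mutation format, and the 20-letter alphabet ordering are assumptions, and the per-position logits are expected to come from whatever model you run. The notebooks remain the reference for how Prime itself is invoked.

```python
import math
import re

# Assumed alphabet ordering for the per-position logits (hypothetical; a real
# model's tokenizer defines its own vocabulary and ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def parse_mutation(mutation):
    """Parse a substitution such as 'A2V' into (wild_type, zero_based_index, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation string: {mutation!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position) - 1, mutant


def score_mutant(sequence, mutant, logits):
    """Score a mutant as the sum of log p(mutant) - log p(wild type).

    sequence: wild-type amino-acid sequence.
    mutant:   colon-separated substitutions, e.g. 'A2V:K3R' (1-based positions).
    logits:   one list of 20 floats per residue, ordered as AMINO_ACIDS.
    """
    score = 0.0
    for mutation in mutant.split(":"):
        wild_type, index, new_residue = parse_mutation(mutation)
        if not 0 <= index < len(sequence):
            raise ValueError(f"Position out of range in {mutation!r}")
        if sequence[index] != wild_type:
            raise ValueError(
                f"{mutation!r} expects {wild_type} at position {index + 1}, "
                f"found {sequence[index]}"
            )
        log_probs = log_softmax(logits[index])
        score += (
            log_probs[AMINO_ACIDS.index(new_residue)]
            - log_probs[AMINO_ACIDS.index(wild_type)]
        )
    return score


if __name__ == "__main__":
    # Toy input: uniform logits everywhere, except V is favored at position 2.
    wild_type_sequence = "MAK"
    toy_logits = [[0.0] * len(AMINO_ACIDS) for _ in wild_type_sequence]
    toy_logits[1][AMINO_ACIDS.index("V")] = 1.0
    print(score_mutant(wild_type_sequence, "A2V", toy_logits))
```

A positive score means the model assigns the mutant residue a higher probability than the wild-type residue at that position; under this scheme the scores of multiple substitutions simply add.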
This project is under the MIT license. See LICENSE for details.
Much of the code is adapted from 🤗 transformers and esm.
If you find this repository useful, please consider citing this paper:
@article{jiang2024general,
title={A general temperature-guided language model to design proteins of enhanced stability and activity},
author={Jiang, Fan and Li, Mingchen and Dong, Jiajun and Yu, Yuanxi and Sun, Xinyu and Wu, Banghao and Huang, Jin and Kang, Liqi and Pei, Yufeng and Zhang, Liang and others},
journal={Science Advances},
volume={10},
number={48},
pages={eadr2641},
year={2024},
publisher={American Association for the Advancement of Science}
}
This project is licensed under the terms of the CC-BY-NC-ND-4.0 license.

