Fine-tuning-UMLFFS

Installation

To install, clone this repository, create the conda environment from environment.yml, and fetch the MLFF-distill dependency:

    git clone https://github.com/M3RG-IITD/Fine-tuning-UMLFFS.git
    cd ./Fine-tuning-UMLFFS
    conda env create -f ./environment.yml
    git clone https://github.com/ishanthewizard/MLFF-distill.git  # MLFF-distill repository

If you run into issues with particular libraries, consult environment.yml for the pinned versions.
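As a quick post-install sanity check, the snippet below verifies the key imports. This is a sketch: the exact packages are pinned in environment.yml, and torch, ase, and mace-torch are assumptions based on the teacher model being MACE.

    # Sanity-check the environment (assumed packages: torch, ase, mace-torch).
    import torch
    import ase
    from mace.calculators import mace_mp  # import check only; loads MACE foundation models

    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    print("ase", ase.__version__)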

Running the main tasks

Preprocessing

  • To split the dataset into train/val/test sets, run
        python data_split.py
    
    Update the input and output paths inside the script before running.
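For orientation, a minimal sketch of this kind of split is shown below; the input path and the 80/10/10 ratio are illustrative, not data_split.py's actual defaults.

    # Illustrative train/val/test split over an extxyz trajectory.
    import random
    from ase.io import read, write

    frames = read("data.extxyz", index=":")  # hypothetical input path
    random.seed(42)
    random.shuffle(frames)

    n = len(frames)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    write("train.extxyz", frames[:n_train])
    write("val.extxyz", frames[n_train:n_train + n_val])
    write("test.extxyz", frames[n_train + n_val:])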

Dataset Label Generation

  • To sort xyz files by groups, execute

        python sort_xyz_by_group.py --input_path </path/to/xyz/file> \
                                    --output_path </path/to/save/dataset/labels> \
                                    --dataset_type dataset-type \
                                    --group_name group-name
    
    • dataset_type is one of train, val, or test.
    • group_name is the property by which frames are grouped; it must be present in the xyz file. In the sample MPMorph data, chemical_system is stored for use as the group name (see the sketches after this list).
  • To convert xyz files into lmdb, use:

        python xyz_to_lmdb.py --xyz_path <path/to/xyz/file> \
                              --output_dir <path/to/output/dir>
    
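Conceptually, the grouping step buckets frames on an info key read from the xyz header. The sketch below illustrates the idea with ASE; the actual script's internals and file layout may differ.

    # Group frames by an info key such as chemical_system (illustrative).
    from collections import defaultdict
    from ase.io import read, write

    frames = read("train.extxyz", index=":")
    groups = defaultdict(list)
    for atoms in frames:
        groups[atoms.info["chemical_system"]].append(atoms)

    for name, members in groups.items():
        write(f"train_{name}.extxyz", members)

The LMDB conversion follows a similar pattern: each frame is serialized under an integer key. Again a sketch, assuming pickled ASE Atoms as the stored records; xyz_to_lmdb.py may store a different record type.

    # Serialize frames into an LMDB file (illustrative record format).
    import lmdb
    import pickle
    from ase.io import read

    frames = read("train.extxyz", index=":")
    env = lmdb.open("train.lmdb", map_size=2**30, subdir=False)
    with env.begin(write=True) as txn:
        for i, atoms in enumerate(frames):
            txn.put(str(i).encode(), pickle.dumps(atoms))
        txn.put(b"length", pickle.dumps(len(frames)))
    env.close()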

Hessian Label Generation

To generate Hessian labels from the teacher model, run:

      python get_maceMPA0_labels.py --labels_folder </path/to/save/hessian/labels> \
                                    --dataset_path <path/to/specific/dataset/label> \
                                    --model_path <path/to/model/checkpoint> \
                                    --device 'cuda'
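For intuition, the Hessian of a force field is H = -dF/dx, so labels can in principle be obtained by differentiating the teacher's forces. The sketch below uses central finite differences with an ASE calculator; get_maceMPA0_labels.py likely uses its own (possibly autograd-based) scheme, and the mace_mp loader and paths here are assumptions.

    # Finite-difference Hessian from a teacher force field (conceptual).
    import numpy as np
    from ase.io import read
    from mace.calculators import mace_mp

    atoms = read("train.extxyz", index=0)
    atoms.calc = mace_mp(model="path/to/model/checkpoint", device="cuda")  # assumed usage

    n = 3 * len(atoms)
    hessian = np.zeros((n, n))
    eps = 1e-3  # displacement in Angstrom
    x0 = atoms.get_positions().copy()
    for i in range(n):
        for sign in (+1, -1):
            x = x0.copy()
            x[i // 3, i % 3] += sign * eps
            atoms.set_positions(x)
            f = atoms.get_forces().ravel()
            if sign > 0:
                f_plus = f
            else:
                f_minus = f
        hessian[i] = -(f_plus - f_minus) / (2 * eps)  # row i of -dF/dx
    atoms.set_positions(x0)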

Model Distillation

Run ./MLFF-distill/main.py as follows:

python main.py --mode train --config-yml <path/to/config/file>

For model distillation, config files are available in ./configs/MPMorph/Li/hessian. Some of the available command-line arguments are:

  • mode: sets the mode of operation.
    • train: trains the model. MLFF-distill uses a custom DistillTrainer class for model distillation.
    • predict: runs inference on the provided data with the loaded model checkpoint.
    • run-relaxation: runs structure relaxation.
  • config-yml: path to the config file.
    • The distillation setup uses the following configs:
      • ./configs/base_wandb.yml: base settings for any distillation run; adjust data paths and optimizer settings here.
      • ./configs/gemnet-dt-small.yml: model-specific config.
      • ./configs/hessian/gemnet-dt-small.yml: model-specific config needed for model distillation.
    • Example config files are provided in ./configs.
  • run_dir: working directory that keeps relevant logs, results, and checkpoints in one place.
  • debug: runs in debug mode.
  • print-every: number of epochs between metric printouts (default: 10).
  • checkpoint: path to a saved distilled-model checkpoint to load.

The current implementation uses MACE MPA-0 as the teacher model and GemNet-dt-small as the student model. PaiNN can also be used as a student model.
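To make the Hessian-distillation objective concrete, here is a minimal sketch of computing student Hessian rows by double backpropagation and matching them to teacher labels. This is illustrative only; the DistillTrainer in MLFF-distill defines its own losses, row sampling, and weighting.

    # Conceptual Hessian-matching distillation loss (not the DistillTrainer).
    import torch
    import torch.nn.functional as F

    def distill_loss(energy, positions, dft_forces, teacher_rows, row_idx, beta=0.1):
        """energy: scalar student energy computed from `positions` (which must
        have requires_grad=True); teacher_rows: precomputed teacher Hessian
        rows for the sampled coordinate indices `row_idx`."""
        # Student forces via backprop; create_graph enables second derivatives.
        forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]
        flat = forces.reshape(-1)
        # Hessian rows H[r] = -dF_r/dx via one extra backward pass per row;
        # sampling rows avoids forming the full 3N x 3N Hessian.
        hess_rows = torch.stack([
            -torch.autograd.grad(flat[r], positions, create_graph=True)[0].reshape(-1)
            for r in row_idx
        ])
        return F.mse_loss(forces, dft_forces) + beta * F.mse_loss(hess_rows, teacher_rows)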
