Fine-tuning-UMLFFS

Installation

To install, clone this repository, create the conda environment from environment.yml, and fetch the MLFF-distill dependency:

    git clone https://github.com/M3RG-IITD/Fine-tuning-UMLFFS.git
    cd ./Fine-tuning-UMLFFS
    conda env create -f ./environment.yml
    git clone https://github.com/ishanthewizard/MLFF-distill.git  # MLFF-distill repository

If you run into issues with particular libraries, consult environment.yml for the pinned versions.
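As a quick post-install sanity check, the snippet below verifies the key imports. This is a sketch: the exact packages are pinned in environment.yml, and torch, ase, and mace-torch are assumptions based on the teacher model being MACE.

    # Sanity-check the environment (assumed packages: torch, ase, mace-torch).
    import torch
    import ase
    from mace.calculators import mace_mp  # import check only; loads MACE foundation models

    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    print("ase", ase.__version__)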

Running the main tasks

Preprocessing

  • To split the dataset into train/val/test sets, run
        python data_split.py
    
    Update the input and output paths inside the script before running.
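For orientation, a minimal sketch of this kind of split is shown below; the input path and the 80/10/10 ratio are illustrative, not data_split.py's actual defaults.

    # Illustrative train/val/test split over an extxyz trajectory.
    import random
    from ase.io import read, write

    frames = read("data.extxyz", index=":")  # hypothetical input path
    random.seed(42)
    random.shuffle(frames)

    n = len(frames)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    write("train.extxyz", frames[:n_train])
    write("val.extxyz", frames[n_train:n_train + n_val])
    write("test.extxyz", frames[n_train + n_val:])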

Dataset Label Generation

  • To sort xyz files by groups, execute

        python sort_xyz_by_group.py --input_path </path/to/xyz/file> \
                                    --output_path </path/to/save/dataset/labels> \
                                    --dataset_type dataset-type \
                                    --group_name group-name
    
    • dataset_type is one of train, val, or test.
    • group_name is the property by which frames are grouped; it must be present in the xyz file. In the sample MPMorph data, chemical_system is stored for use as the group name (see the sketches after this list).
  • To convert xyz files into lmdb, use:

        python xyz_to_lmdb.py --xyz_path <path/to/xyz/file> \
                              --output_dir <path/to/output/dir>
    
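Conceptually, the grouping step buckets frames on an info key read from the xyz header. The sketch below illustrates the idea with ASE; the actual script's internals and file layout may differ.

    # Group frames by an info key such as chemical_system (illustrative).
    from collections import defaultdict
    from ase.io import read, write

    frames = read("train.extxyz", index=":")
    groups = defaultdict(list)
    for atoms in frames:
        groups[atoms.info["chemical_system"]].append(atoms)

    for name, members in groups.items():
        write(f"train_{name}.extxyz", members)

The LMDB conversion follows a similar pattern: each frame is serialized under an integer key. Again a sketch, assuming pickled ASE Atoms as the stored records; xyz_to_lmdb.py may store a different record type.

    # Serialize frames into an LMDB file (illustrative record format).
    import lmdb
    import pickle
    from ase.io import read

    frames = read("train.extxyz", index=":")
    env = lmdb.open("train.lmdb", map_size=2**30, subdir=False)
    with env.begin(write=True) as txn:
        for i, atoms in enumerate(frames):
            txn.put(str(i).encode(), pickle.dumps(atoms))
        txn.put(b"length", pickle.dumps(len(frames)))
    env.close()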

Hessian Label Generation

To generate Hessian labels from the teacher model, run:

      python get_maceMPA0_labels.py --labels_folder </path/to/save/hessian/labels> \
                                    --dataset_path <path/to/specific/dataset/label> \
                                    --model_path <path/to/model/checkpoint> \
                                    --device 'cuda'
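For intuition, the Hessian of a force field is H = -dF/dx, so labels can in principle be obtained by differentiating the teacher's forces. The sketch below uses central finite differences with an ASE calculator; get_maceMPA0_labels.py likely uses its own (possibly autograd-based) scheme, and the mace_mp loader and paths here are assumptions.

    # Finite-difference Hessian from a teacher force field (conceptual).
    import numpy as np
    from ase.io import read
    from mace.calculators import mace_mp

    atoms = read("train.extxyz", index=0)
    atoms.calc = mace_mp(model="path/to/model/checkpoint", device="cuda")  # assumed usage

    n = 3 * len(atoms)
    hessian = np.zeros((n, n))
    eps = 1e-3  # displacement in Angstrom
    x0 = atoms.get_positions().copy()
    for i in range(n):
        for sign in (+1, -1):
            x = x0.copy()
            x[i // 3, i % 3] += sign * eps
            atoms.set_positions(x)
            f = atoms.get_forces().ravel()
            if sign > 0:
                f_plus = f
            else:
                f_minus = f
        hessian[i] = -(f_plus - f_minus) / (2 * eps)  # row i of -dF/dx
    atoms.set_positions(x0)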

Model Distillation

Run ./MLFF-distill/main.py as follows:

python main.py --mode train --config-yml <path/to/config/file>

For model distillation, config files are available in ./configs/MPMorph/Li/hessian. Some of the available command-line arguments are:

  • mode: sets the mode of operation.
    • train: trains the model. MLFF-distill uses a custom DistillTrainer class for model distillation.
    • predict: runs inference on the provided data with the loaded model checkpoint.
    • run-relaxation: runs structure relaxation.
  • config-yml: path to the config file.
    • The distillation setup uses the following configs:
      • ./configs/base_wandb.yml: base settings for any distillation run; adjust data paths and optimizer settings here.
      • ./configs/gemnet-dt-small.yml: model-specific config.
      • ./configs/hessian/gemnet-dt-small.yml: model-specific config needed for model distillation.
    • Example config files are provided in ./configs.
  • run_dir: working directory that keeps relevant logs, results, and checkpoints in one place.
  • debug: runs in debug mode.
  • print-every: number of epochs between metric printouts (default: 10).
  • checkpoint: path to a saved distilled-model checkpoint to load.

The current implementation uses MACE MPA-0 as the teacher model and GemNet-dt-small as the student model. PaiNN can also be used as a student model.
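To make the Hessian-distillation objective concrete, here is a minimal sketch of computing student Hessian rows by double backpropagation and matching them to teacher labels. This is illustrative only; the DistillTrainer in MLFF-distill defines its own losses, row sampling, and weighting.

    # Conceptual Hessian-matching distillation loss (not the DistillTrainer).
    import torch
    import torch.nn.functional as F

    def distill_loss(energy, positions, dft_forces, teacher_rows, row_idx, beta=0.1):
        """energy: scalar student energy computed from `positions` (which must
        have requires_grad=True); teacher_rows: precomputed teacher Hessian
        rows for the sampled coordinate indices `row_idx`."""
        # Student forces via backprop; create_graph enables second derivatives.
        forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]
        flat = forces.reshape(-1)
        # Hessian rows H[r] = -dF_r/dx via one extra backward pass per row;
        # sampling rows avoids forming the full 3N x 3N Hessian.
        hess_rows = torch.stack([
            -torch.autograd.grad(flat[r], positions, create_graph=True)[0].reshape(-1)
            for r in row_idx
        ])
        return F.mse_loss(forces, dft_forces) + beta * F.mse_loss(hess_rows, teacher_rows)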
