With this code, one can perform hybrid machine learning, combining unsupervised learning with an autoencoder (AE) and prediction of the storage capacity with an MLP, on the hydrogen storage data given in the file train.dat. Here are the descriptions of the programs:
- B1.py: Uses the AE+MLP approach.
- MLP-B1.py: Predicts the property directly with the MLP, without any feature transformation.
- Corr.py: Studies the Pearson correlation between the features in the latent space and the real features.
- train.dat: Training data for 1483 materials [first column: target; next 36 columns: features].
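The column layout of train.dat and the kind of latent-versus-feature Pearson analysis attributed to Corr.py can be sketched as follows. This is a minimal illustration, not the code in the scripts: the function names `split_target_features` and `pearson_cross` are hypothetical, the latent array is whatever the encoder produces, and the file is assumed to be whitespace-delimited numeric text (the delimiter is not stated here).

```python
import numpy as np


def split_target_features(data):
    """Split a (n_samples, 37) array into the target (first column)
    and the 36 feature columns that follow it."""
    data = np.asarray(data, dtype=float)
    if data.ndim != 2 or data.shape[1] != 37:
        raise ValueError(f"expected 37 columns, got shape {data.shape}")
    return data[:, 0], data[:, 1:]


def pearson_cross(latent, features):
    """Pearson r between every latent column and every feature column.

    Returns an array of shape (n_latent, n_features). A constant column
    has zero variance and yields NaN in its row or column.
    """
    latent = np.asarray(latent, dtype=float)
    features = np.asarray(features, dtype=float)
    lz = latent - latent.mean(axis=0)      # centre each latent column
    fz = features - features.mean(axis=0)  # centre each feature column
    cov = lz.T @ fz                        # unnormalised cross-covariance
    norm = np.outer(np.linalg.norm(lz, axis=0), np.linalg.norm(fz, axis=0))
    return cov / norm


# Typical use (assumption: whitespace-delimited file; not executed here):
#   data = np.loadtxt("train.dat")
#   y, X = split_target_features(data)
#   r = pearson_cross(latent, X)   # latent: encoder output, one row per material
```

Each entry `r[i, j]` is the correlation between latent dimension `i` and real feature `j`, so large absolute values point to the input features a latent dimension tracks most closely.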
The directory Unknown-materials contains:
- U.py: Predicts the hydrogen storage capacity of materials from their features alone.
- set1.dat: [TiAlN2, V2H2, Zr2TiAl, MgC, NLi]
- set2.dat: [NMn2Ti, MgCHF, CAlB, MgCHF, MgMnVTi]
To predict for a set, copy its file to unknown.dat and run U.py.
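The copy-and-run step for the unknown materials can be sketched in Python as below. The helper name `stage_unknown` is hypothetical and not part of the repository; it only copies the chosen set file to unknown.dat, after which U.py is run as usual.

```python
import shutil
from pathlib import Path


def stage_unknown(set_file, workdir="."):
    """Copy one set file to unknown.dat inside workdir and return the new path."""
    src = Path(set_file)
    if not src.is_file():
        raise FileNotFoundError(src)
    dst = Path(workdir) / "unknown.dat"
    shutil.copyfile(src, dst)  # overwrites any existing unknown.dat
    return dst


# Typical use from inside Unknown-materials (not executed here):
#   stage_unknown("set1.dat")
#   then, in the shell:  python U.py
```

Because the copy overwrites unknown.dat, each set is staged and predicted separately.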
The directory LLM contains:
- GPT-2.py: Trains the GPT-2 model and saves it in a directory called TrainedModel (python GPT-2.py).
- Generators.py: Generates chemical formulas with the loaded model (python Generators.py).
- Note that the generated materials depend on the parameters used.
- To generate more materials at a time, change the parameters.