
Commit

Merge pull request #9 from usnistgov/dev
Forward model fix.
knc6 authored Oct 4, 2024
2 parents ebd19af + f171615 commit a37ce71
Showing 5 changed files with 47 additions and 37 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -40,7 +40,7 @@ pip install atomgpt

## Forward model example (structure to property)

Forward models are used to develop surrogate models for atomic structure to property prediction. They require text input, which can be either raw POSCAR-type files or a text description of the material. After that, models such as Google T5 or OpenAI GPT-2 can be used with a custom language head to accomplish this task. The description of a material is generated with the [ChemNLP/describer](https://github.com/usnistgov/jarvis/blob/master/jarvis/core/atoms.py#L1567) function. If you set [`convert`](https://github.com/usnistgov/atomgpt/blob/main/atomgpt/forward_models/forward_models.py#L64) to `False`, you can also train on bare POSCAR files.
Forward models are used to develop surrogate models for atomic structure to property prediction. They require text input, which can be either raw POSCAR-type files or a text description of the material. After that, models such as Google T5 or OpenAI GPT-2 can be used with a custom language head to accomplish this task.
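
As a rough, hypothetical sketch (not part of the repository or this commit): the command below points `--config_name` at a JSON settings file. Based only on fields referenced in `forward_models.py` in this diff (`id_prop_path`, `convert`, `desc_type`, `model_name`), such a file might look roughly like the following; the values here are placeholders, and the actual example lives at `atomgpt/examples/forward_model/config.json`.

```
{
  "id_prop_path": "atomgpt/examples/forward_model/id_prop.csv",
  "model_name": "gpt2",
  "convert": true,
  "desc_type": "..."
}
```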

```
atomgpt_forward --config_name atomgpt/examples/forward_model/config.json
@@ -61,7 +61,8 @@ More detailed examples/case-studies would be added here soon.

| Notebooks | Google Colab | Descriptions |
| ---------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Forward/Inverse Model training](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_example.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_example.ipynb) | Example of installing AtomGPT, inverse model training for 5 sample materials, using the trained model for inference, relaxing structures with ALIGNN-FF, generating a database of atomic structures, and training a forward prediction model. |
| [Forward Model training](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_forward_example.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_forward_example.ipynb) | Example of forward model training for exfoliation energy. |
| [Inverse Model training](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_example.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_example.ipynb) | Example of installing AtomGPT, inverse model training for 5 sample materials, using the trained model for inference, relaxing structures with ALIGNN-FF, generating a database of atomic structures. |
| [HuggingFace model inference](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_example_huggingface.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/atomgpt_example_huggingface.ipynb) | AtomGPT Structure Generation/Inference example with a model hosted on Huggingface. |


2 changes: 1 addition & 1 deletion atomgpt/__init__.py
@@ -1,3 +1,3 @@
"""Version number."""

__version__ = "2024.9.18"
__version__ = "2024.9.30"
21 changes: 14 additions & 7 deletions atomgpt/forward_models/forward_models.py
@@ -29,7 +29,6 @@
import pprint
import sys
import argparse
from alignn.pretrained import get_figshare_model

parser = argparse.ArgumentParser(
description="Atomistic Generative Pre-trained Transformer."
@@ -287,10 +286,10 @@ def run_atomgpt(config_file="config.json"):
pprint.pprint(config)
id_prop_path = config.id_prop_path
convert = config.convert
if convert:
model = get_figshare_model(
model_name="jv_formation_energy_peratom_alignn"
)
# if convert:
# model = get_figshare_model(
# model_name="jv_formation_energy_peratom_alignn"
# )
if ".zip" in id_prop_path:
zp = zipfile.ZipFile(id_prop_path)
dat = json.loads(zp.read(id_prop_path.split(".zip")[0]))
@@ -310,7 +309,8 @@
)
if convert:
atoms = Atoms.from_poscar(pth)
lines = atoms.describe(model=model)[config.desc_type]
lines = atoms.describe()[config.desc_type]
# lines = atoms.describe(model=model)[config.desc_type]
else:

with open(pth, "r") as f:
@@ -529,7 +529,9 @@ def run_atomgpt(config_file="config.json"):
train_loss = 0
# train_result = []
input_ids = batch[0]["input_ids"].squeeze() # .squeeze(0)
# print('input_ids',input_ids.shape)
if "t5" in model_name:
input_ids = batch[0]["input_ids"].squeeze(1) # .squeeze(0)
predictions = (
model(
input_ids.to(device),
@@ -571,7 +573,8 @@ def run_atomgpt(config_file="config.json"):
f.write("id,target,predictions\n")
with torch.no_grad():
for batch in val_dataloader:
input_ids = batch[0]["input_ids"].squeeze() # .squeeze(0)
# input_ids = batch[0]["input_ids"].squeeze() # .squeeze(0)
input_ids = batch[0]["input_ids"].squeeze(1) # .squeeze(0)
ids = batch[1]
if "t5" in model_name:
predictions = (
@@ -645,6 +648,9 @@ def run_atomgpt(config_file="config.json"):
for batch in test_dataloader:
input_ids = batch[0]["input_ids"].squeeze() # .squeeze(0)
if "t5" in model_name:
input_ids = batch[0]["input_ids"].squeeze(
1
) # .squeeze(0)
predictions = (
model(
input_ids.to(device),
@@ -721,6 +727,7 @@ def run_atomgpt(config_file="config.json"):
optimizer.zero_grad()
input_ids = batch[0]["input_ids"].squeeze() # .squeeze(0)
if "t5" in model_name:
input_ids = batch[0]["input_ids"].squeeze(1) # .squeeze(0)
predictions = (
model(
input_ids.to(device),
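
As context for the repeated `.squeeze(1)` change in `forward_models.py` above, here is a minimal, hypothetical PyTorch sketch (not repository code), assuming the tokenized batches carry a singleton middle dimension of shape `[batch, 1, seq_len]`:

```
import torch

# Hypothetical shapes: tokenizing one sample with return_tensors="pt" gives
# [1, seq_len]; a default collate stacks a batch into [batch, 1, seq_len].
batch_input_ids = torch.randint(0, 32000, (2, 1, 128))

# .squeeze(1) removes only the singleton middle dimension.
print(batch_input_ids.squeeze(1).shape)   # torch.Size([2, 128])

# A bare .squeeze() removes every size-1 dimension, so a batch containing a
# single sample would lose its batch dimension too.
single = torch.randint(0, 32000, (1, 1, 128))
print(single.squeeze().shape)             # torch.Size([128])
print(single.squeeze(1).shape)            # torch.Size([1, 128])
```
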
54 changes: 28 additions & 26 deletions atomgpt/inverse_models/inverse_models.py
@@ -19,6 +19,8 @@
from pydantic_settings import BaseSettings
import sys
import argparse
from peft import PeftModelForCausalLM


parser = argparse.ArgumentParser(
description="Atomistic Generative Pre-trained Transformer."
@@ -38,7 +40,7 @@ class TrainingPropConfig(BaseSettings):
prefix: str = "atomgpt_run"
model_name: str = "unsloth/mistral-7b-bnb-4bit"
batch_size: int = 2
num_epochs: int = 5
num_epochs: int = 2
seed_val: int = 42
num_train: Optional[int] = 2
num_val: Optional[int] = 2
@@ -164,7 +166,7 @@ def text2atoms(response):
return atoms


def gen_atoms(prompt="", max_new_tokens=512, model="", tokenizer=""):
def gen_atoms(prompt="", max_new_tokens=2048, model="", tokenizer=""):
inputs = tokenizer(
[
alpaca_prompt.format(
@@ -179,9 +181,7 @@ def gen_atoms(prompt="", max_new_tokens=512, model="", tokenizer=""):
outputs = model.generate(
**inputs, max_new_tokens=max_new_tokens, use_cache=True
)
response = tokenizer.batch_decode(outputs)
print("response", response)
response = response[0].split("# Output:")[1]
response = tokenizer.batch_decode(outputs)[0].split("# Output:")[1]
atoms = None
try:
atoms = text2atoms(response)
@@ -204,7 +204,7 @@ def run_atomgpt_inverse(config_file="config.json"):
num_train = config.num_train
num_test = config.num_test
num_val = config.num_val
id_prop_path = os.path.join(run_path, id_prop_path)
# id_prop_path = os.path.join(run_path, id_prop_path)
with open(id_prop_path, "r") as f:
reader = csv.reader(f)
dt = [row for row in reader]
@@ -263,26 +263,28 @@
# token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

model = FastLanguageModel.get_peft_model(
model,
r=16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
target_modules=[
"q_proj",
"k_proj",
"v_proj",
"o_proj",
"gate_proj",
"up_proj",
"down_proj",
],
lora_alpha=16,
lora_dropout=0, # Supports any, but = 0 is optimized
bias="none", # Supports any, but = "none" is optimized
use_gradient_checkpointing=True,
random_state=3407,
use_rslora=False, # We support rank stabilized LoRA
loftq_config=None, # And LoftQ
)
if not isinstance(model, PeftModelForCausalLM):

model = FastLanguageModel.get_peft_model(
model,
r=16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
target_modules=[
"q_proj",
"k_proj",
"v_proj",
"o_proj",
"gate_proj",
"up_proj",
"down_proj",
],
lora_alpha=16,
lora_dropout=0, # Supports any, but = 0 is optimized
bias="none", # Supports any, but = "none" is optimized
use_gradient_checkpointing=True,
random_state=3407,
use_rslora=False, # We support rank stabilized LoRA
loftq_config=None, # And LoftQ
)

EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN

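A minimal, hypothetical sketch (not repository code) of the guard introduced above in `inverse_models.py`: LoRA adapters are attached only when the loaded model is not already a `PeftModelForCausalLM`, so an already-adapted model (for example one reloaded from a saved fine-tune) is not wrapped a second time.

```
from peft import PeftModelForCausalLM

def maybe_add_lora(model, get_peft_model_fn):
    # Hypothetical helper for illustration; get_peft_model_fn stands in for
    # FastLanguageModel.get_peft_model with its LoRA arguments already bound.
    if isinstance(model, PeftModelForCausalLM):
        # Already carries adapters; wrapping again would nest adapters and
        # change which weights are trainable.
        return model
    return get_peft_model_fn(model)
```
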
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setuptools.setup(
name="atomgpt",
version="2024.9.18",
version="2024.9.30",
author="Kamal Choudhary",
author_email="[email protected]",
description="atomgpt",
