Added Japanese Llama ELYZA #1310

Open

wants to merge 9 commits into base: master
1 change: 1 addition & 0 deletions README.md
@@ -331,6 +331,7 @@ The collection of pre-trained, state-of-the-art AI models.
|[bert_insert_punctuation](/natural_language_processing/bert_insert_punctuation) | [bert-japanese](https://github.com/cl-tohoku/bert-japanese) | Pytorch | 1.2.15 and later |
|[t5_whisper_medical](/natural_language_processing/t5_whisper_medical) | error correction of medical terms using t5 | Pytorch | 1.2.13 and later | |
|[t5_base_summarization](/natural_language_processing/t5_base_japanese_summarization) | [t5-japanese](https://github.com/sonoisa/t5-japanese) | Pytorch | 1.2.13 and later |
|[elyza-japanese-llama-2-7b](/natural_language_processing/elyza-japanese-llama-2-7b) | [ELYZA-japanese-Llama-2-7b](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b) | Pytorch | 1.2.16 and later |

## Neural Rendering

@@ -0,0 +1,125 @@
LLAMA 2 COMMUNITY LICENSE AGREEMENT
Llama 2 Version Release Date: July 18, 2023

"Agreement" means the terms and conditions for use, reproduction, distribution and
modification of the Llama Materials set forth herein.

"Documentation" means the specifications, manuals and documentation
accompanying Llama 2 distributed by Meta at ai.meta.com/resources/models-and-
libraries/llama-downloads/.

"Licensee" or "you" means you, or your employer or any other person or entity (if
you are entering into this Agreement on such person or entity's behalf), of the age
required under applicable laws, rules or regulations to provide legal consent and that
has legal authority to bind your employer or such other person or entity if you are
entering in this Agreement on their behalf.

"Llama 2" means the foundational large language models and software and
algorithms, including machine-learning model code, trained model weights,
inference-enabling code, training-enabling code, fine-tuning enabling code and other
elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-
libraries/llama-downloads/.

"Llama Materials" means, collectively, Meta's proprietary Llama 2 and
Documentation (and any portion thereof) made available under this Agreement.

"Meta" or "we" means Meta Platforms Ireland Limited (if you are located in or, if you
are an entity, your principal place of business is in the EEA or Switzerland) and Meta
Platforms, Inc. (if you are located outside of the EEA or Switzerland).

By clicking "I Accept" below or by using or distributing any portion or element of the
Llama Materials, you agree to be bound by this Agreement.

1. License Rights and Redistribution.

a. Grant of Rights. You are granted a non-exclusive, worldwide, non-
transferable and royalty-free limited license under Meta's intellectual property or
other rights owned by Meta embodied in the Llama Materials to use, reproduce,
distribute, copy, create derivative works of, and make modifications to the Llama
Materials.

b. Redistribution and Use.

i. If you distribute or make the Llama Materials, or any derivative works
thereof, available to a third party, you shall provide a copy of this Agreement to such
third party.
ii. If you receive Llama Materials, or any derivative works thereof, from
a Licensee as part of an integrated end user product, then Section 2 of this
Agreement will not apply to you.

iii. You must retain in all copies of the Llama Materials that you
distribute the following attribution notice within a "Notice" text file distributed as a
part of such copies: "Llama 2 is licensed under the LLAMA 2 Community License,
Copyright (c) Meta Platforms, Inc. All Rights Reserved."

iv. Your use of the Llama Materials must comply with applicable laws
and regulations (including trade compliance laws and regulations) and adhere to the
Acceptable Use Policy for the Llama Materials (available at
https://ai.meta.com/llama/use-policy), which is hereby incorporated by reference into
this Agreement.

v. You will not use the Llama Materials or any output or results of the
Llama Materials to improve any other large language model (excluding Llama 2 or
derivative works thereof).

2. Additional Commercial Terms. If, on the Llama 2 version release date, the
monthly active users of the products or services made available by or for Licensee,
or Licensee's affiliates, is greater than 700 million monthly active users in the
preceding calendar month, you must request a license from Meta, which Meta may
grant to you in its sole discretion, and you are not authorized to exercise any of the
rights under this Agreement unless or until Meta otherwise expressly grants you
such rights.

3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE
LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE
PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY
WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR
FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE
FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING
THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR
USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.

4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE
LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT,
NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS
AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL,
CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN
IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF
ANY OF THE FOREGOING.

5. Intellectual Property.

a. No trademark licenses are granted under this Agreement, and in
connection with the Llama Materials, neither Meta nor Licensee may use any name
or mark owned by or associated with the other or any of its affiliates, except as
required for reasonable and customary use in describing and redistributing the
Llama Materials.

b. Subject to Meta's ownership of Llama Materials and derivatives made by or
for Meta, with respect to any derivative works and modifications of the Llama
Materials that are made by you, as between you and Meta, you are and will be the
owner of such derivative works and modifications.

c. If you institute litigation or other proceedings against Meta or any entity
(including a cross-claim or counterclaim in a lawsuit) alleging that the Llama
Materials or Llama 2 outputs or results, or any portion of any of the foregoing,
constitutes infringement of intellectual property or other rights owned or licensable
by you, then any licenses granted to you under this Agreement shall terminate as of
the date such litigation or claim is filed or instituted. You will indemnify and hold
harmless Meta from and against any claim by any third party arising out of or related
to your use or distribution of the Llama Materials.

6. Term and Termination. The term of this Agreement will commence upon your
acceptance of this Agreement or access to the Llama Materials and will continue in
full force and effect until terminated in accordance with the terms and conditions
herein. Meta may terminate this Agreement if you are in breach of any term or
condition of this Agreement. Upon termination of this Agreement, you shall delete
and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the
termination of this Agreement.

7. Governing Law and Jurisdiction. This Agreement will be governed and
construed under the laws of the State of California without regard to choice of law
principles, and the UN Convention on Contracts for the International Sale of Goods
does not apply to this Agreement. The courts of California shall have exclusive
jurisdiction of any dispute arising out of this Agreement.
46 changes: 46 additions & 0 deletions natural_language_processing/elyza-japanese-llama-2-7b/README.md
@@ -0,0 +1,46 @@
# Text Generation Japanese Llama ELYZA

### input
A `SENTENCE` in Japanese, for example: クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。

### output
The generated text, consisting of up to `--outlength` tokens, is written as a TXT file to the `SAVE_PATH` location.

### Usage
The ONNX and prototxt files are downloaded automatically on the first run; an Internet connection is required during the download.

To run with the sample input, use the following commands:
```
pip3 install -r requirements.txt

python3 elyza.py
```

You can use an arbitrary sentence with:

```
python3 elyza.py --input クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。
```

If you want to specify the maximum number of generated tokens (e.g. 300 tokens), use the following command:

```
python3 elyza.py --outlength 300
```

If you want to benchmark the model inference, you can use:

```
python3 elyza.py --benchmark
```
### Framework
PyTorch, HuggingFace Transformers

### Model Format
ONNX opset = 13

### License

[Elyza Llama model](LICENSE_LLama)

### Netron

- [decoder_model.onnx.prototxt](https://netron.app/?url=https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/decoder_model.onnx.prototxt)
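Before tokenization, the script wraps the input sentence in the Llama-2 instruct template. A minimal standalone sketch of that template (the `<s>` BOS token is an assumption here; the actual code takes it from the tokenizer, and the system prompt below is the default from `elyza.py`):

```python
# Sketch of the Llama-2 instruct prompt the script builds.
# Hypothetical standalone version: BOS token hard-coded instead of
# being read from the tokenizer.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

bos_token = "<s>"  # assumed Llama tokenizer BOS token
system = "あなたは誠実で優秀な日本人のアシスタントです。"
user_text = "クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。"

# <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
prompt = f"{bos_token}{B_INST} {B_SYS}{system}{E_SYS}{user_text} {E_INST} "
print(prompt)
```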
116 changes: 116 additions & 0 deletions natural_language_processing/elyza-japanese-llama-2-7b/elyza.py
@@ -0,0 +1,116 @@
import time
import sys
import platform

from transformers import AutoTokenizer

from utils_elyza import generate_prompt, generate_text

import ailia

sys.path.append("../../util")
from arg_utils import get_base_parser, update_parser # noqa: E402
from model_utils import check_and_download_models, check_and_download_file # noqa: E402

# logger
from logging import getLogger # noqa: E402
logger = getLogger(__name__)

# ======================
# PARAMETERS
# ======================
WEIGHT_PATH = "decoder_model.onnx"
MODEL_PATH = "decoder_model.onnx.prototxt"
WEIGHT_PB_PATH = "decoder_model.onnx_data"
tokenizer_path = "elyza/ELYZA-japanese-Llama-2-7b-instruct"
REMOTE_PATH = "https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/"
SAVE_PATH = "output.txt"


DEFAULT_SYSTEM_PROMPT = "あなたは誠実で優秀な日本人のアシスタントです。"
DEFAULT_TEXT = "クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。"



# ======================
# Argument Parser Config
# ======================

parser = get_base_parser("elyza text generation", None, SAVE_PATH)
# overwrite
parser.add_argument(
    "--input", "-i", default=DEFAULT_TEXT,
    help="input text"
)
parser.add_argument(
    "--outlength", "-o", type=int, default=256,
    help="number of tokens to generate"
)
parser.add_argument(
    "--onnx",
    action="store_true",
    help="By default, the ailia SDK is used, but with this option, you can switch to using ONNX Runtime"
)
args = update_parser(parser, check_input_type=False)


# ======================
# Main function
# ======================
def main():
    check_and_download_models(WEIGHT_PATH, MODEL_PATH, REMOTE_PATH)
    check_and_download_file(WEIGHT_PB_PATH, REMOTE_PATH)

    pf = platform.system()
    if pf == "Darwin":
        logger.info("This model is not optimized for macOS GPU currently, so BLAS will be used (env_id = 1).")
        args.env_id = 1

    if args.onnx:
        import onnxruntime
        # Create an ONNX Runtime session
        options = onnxruntime.SessionOptions()
        options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL
        options.execution_mode = onnxruntime.ExecutionMode.ORT_SEQUENTIAL

        # Use CUDAExecutionProvider when available, otherwise fall back to CPU
        providers = ["CUDAExecutionProvider"] if "CUDAExecutionProvider" in onnxruntime.get_available_providers() else ["CPUExecutionProvider"]
        ailia_model = onnxruntime.InferenceSession(WEIGHT_PATH, providers=providers, sess_options=options)
    else:
        memory_mode = ailia.get_memory_mode(True, True, False, True)
        ailia_model = ailia.Net(MODEL_PATH, WEIGHT_PATH, env_id=args.env_id, memory_mode=memory_mode)

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)

    # Generate the prompt
    logger.info("Input : " + args.input)
    input_prompt = generate_prompt(tokenizer, DEFAULT_SYSTEM_PROMPT, args.input)

    # Inference
    logger.info("Start of text generation")
    if args.benchmark:
        logger.info("BENCHMARK mode")
        for i in range(5):
            start = int(round(time.time() * 1000))
            output = generate_text(tokenizer, ailia_model, input_prompt, int(args.outlength), args.onnx)
            end = int(round(time.time() * 1000))
            logger.info("\tailia processing time {} ms".format(end - start))
    else:
        output = generate_text(tokenizer, ailia_model, input_prompt, int(args.outlength), args.onnx)

    logger.info("output : " + output)
    with open(SAVE_PATH, "w") as fo:
        fo.write(output)
    logger.info("Script finished successfully.")


if __name__ == "__main__":
    main()

@@ -0,0 +1 @@
transformers
@@ -0,0 +1,66 @@
import numpy as np


def generate_prompt(tokenizer, system_prompt, text):
    # Build the Llama-2 instruct prompt:
    # <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
    B_INST, E_INST = "[INST]", "[/INST]"
    B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

    prompt = "{bos_token}{b_inst} {system}{prompt} {e_inst} ".format(
        bos_token=tokenizer.bos_token,
        b_inst=B_INST,
        system=f"{B_SYS}{system_prompt}{E_SYS}",
        prompt=text,
        e_inst=E_INST)

    return prompt


def generate_text(tokenizer, model, span, outputlength, onnx_runtime=False):
    # Produce the initial tokens
    encoding = tokenizer.encode_plus(span, return_tensors="np", add_special_tokens=False)
    input_ids = encoding["input_ids"]
    attention_mask = encoding["attention_mask"].reshape(1, -1)

    # Initialize the generated tokens with the prompt tokens
    generated_tokens = input_ids.copy()
    eos_token_id = tokenizer.eos_token_id

    for _ in range(outputlength):
        input_dict = {
            "input_ids": np.array(generated_tokens, dtype=np.int64),
            "attention_mask": np.array(attention_mask, dtype=np.int64)
        }

        # Run the inference to get logits
        if onnx_runtime:
            logits = model.run(None, input_dict)
        else:
            logits = model.run(input_dict)
        logits = np.array(logits[0])

        # Greedy decoding: take the argmax of the logits at the last position
        next_token_id = int(np.argmax(logits[0, -1, :]))
        generated_tokens = np.concatenate((generated_tokens, [[next_token_id]]), axis=-1)

        # Extend the attention mask to cover the newly generated token
        attention_mask = np.concatenate((attention_mask, np.ones((1, 1), dtype=np.int64)), axis=1)

        if next_token_id == eos_token_id:
            break

    # Decode only the newly generated part, skipping special tokens
    out_str = tokenizer.decode(generated_tokens[0][input_ids.shape[1]:], skip_special_tokens=True)

    return out_str
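The loop in `generate_text` is greedy argmax decoding: at each step the highest-scoring token at the last position is appended, and generation stops at EOS. A toy sketch of that control flow with a stand-in scoring function (hypothetical; not the actual ONNX model):

```python
import numpy as np

# Stand-in for the model: returns logits of shape (1, seq_len, VOCAB)
# that peak at (last_token + 1), so the greedy choice is predictable.
VOCAB, EOS = 10, 5

def fake_logits(tokens):
    logits = np.zeros((1, len(tokens), VOCAB))
    logits[0, -1, (tokens[-1] + 1) % VOCAB] = 1.0
    return logits

tokens = [1]
for _ in range(8):
    # Greedy step: argmax over the last position's logits
    next_id = int(np.argmax(fake_logits(tokens)[0, -1, :]))
    tokens.append(next_id)
    if next_id == EOS:  # stop at the EOS token
        break
print(tokens)  # [1, 2, 3, 4, 5]
```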


1 change: 1 addition & 0 deletions scripts/download_all_models.sh
@@ -201,6 +201,7 @@ cd ../../natural_language_processing/t5_base_japanese_title_generation; python3
cd ../../natural_language_processing/multilingual-e5; python3 multilingual-e5.py ${OPTION}
cd ../../natural_language_processing/t5_whisper_medical; python3 t5_whisper_medical.py ${OPTION}
cd ../../natural_language_processing/t5_base_japanese_summarization; python3 t5_base_japanese_summarization.py ${OPTION}
cd ../../natural_language_processing/elyza-japanese-llama-2-7b; python3 elyza.py ${OPTION}
cd ../../neural_rendering/nerf; python3 nerf.py ${OPTION}
cd ../../object_detection/centernet; python3 centernet.py ${OPTION}
cd ../../object_detection/m2det; python3 m2det.py ${OPTION}
2 changes: 1 addition & 1 deletion util/model_utils.py
@@ -102,5 +102,5 @@ def check_and_download_file(file_path, remote_path):

if not os.path.exists(file_path):
logger.info('Downloading %s...' % file_path)
urlretrieve(remote_path, file_path, progress_print)
urlretrieve(remote_path + os.path.basename(file_path), file_path, progress_print)
logger.info('%s is prepared!' % file_path)
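The one-line fix in `model_utils.py` appends the file's basename to the remote directory URL, so `urlretrieve` fetches the actual file rather than the directory path. A quick sketch of the URL construction (paths taken from this PR):

```python
import os

# Remote directory hosting the model files (from elyza.py in this PR)
REMOTE_PATH = "https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/"
file_path = "decoder_model.onnx_data"

# Before the fix, only REMOTE_PATH was passed to urlretrieve; after the
# fix, the basename is appended to form the full file URL.
url = REMOTE_PATH + os.path.basename(file_path)
print(url)  # https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/decoder_model.onnx_data
```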