Added Japanese Llama ELYZA #1310

Open

wants to merge 9 commits into base: master
1 change: 1 addition & 0 deletions README.md
@@ -331,6 +331,7 @@ The collection of pre-trained, state-of-the-art AI models.
|[bert_insert_punctuation](/natural_language_processing/bert_insert_punctuation) | [bert-japanese](https://github.com/cl-tohoku/bert-japanese) | Pytorch | 1.2.15 and later |
|[t5_whisper_medical](/natural_language_processing/t5_whisper_medical) | error correction of medical terms using t5 | Pytorch | 1.2.13 and later | |
|[t5_base_summarization](/natural_language_processing/t5_base_japanese_summarization) | [t5-japanese](https://github.com/sonoisa/t5-japanese) | Pytorch | 1.2.13 and later |
|[elyza-japanese-llama-2-7b](/natural_language_processing/elyza-japanese-llama-2-7b) | [ELYZA-japanese-Llama-2-7b](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b) | Pytorch | 1.2.16 and later |

## Neural Rendering

@@ -0,0 +1,125 @@
LLAMA 2 COMMUNITY LICENSE AGREEMENT
Llama 2 Version Release Date: July 18, 2023

"Agreement" means the terms and conditions for use, reproduction, distribution and
modification of the Llama Materials set forth herein.

"Documentation" means the specifications, manuals and documentation
accompanying Llama 2 distributed by Meta at ai.meta.com/resources/models-and-
libraries/llama-downloads/.

"Licensee" or "you" means you, or your employer or any other person or entity (if
you are entering into this Agreement on such person or entity's behalf), of the age
required under applicable laws, rules or regulations to provide legal consent and that
has legal authority to bind your employer or such other person or entity if you are
entering in this Agreement on their behalf.

"Llama 2" means the foundational large language models and software and
algorithms, including machine-learning model code, trained model weights,
inference-enabling code, training-enabling code, fine-tuning enabling code and other
elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-
libraries/llama-downloads/.

"Llama Materials" means, collectively, Meta's proprietary Llama 2 and
Documentation (and any portion thereof) made available under this Agreement.

"Meta" or "we" means Meta Platforms Ireland Limited (if you are located in or, if you
are an entity, your principal place of business is in the EEA or Switzerland) and Meta
Platforms, Inc. (if you are located outside of the EEA or Switzerland).

By clicking "I Accept" below or by using or distributing any portion or element of the
Llama Materials, you agree to be bound by this Agreement.

1. License Rights and Redistribution.

a. Grant of Rights. You are granted a non-exclusive, worldwide, non-
transferable and royalty-free limited license under Meta's intellectual property or
other rights owned by Meta embodied in the Llama Materials to use, reproduce,
distribute, copy, create derivative works of, and make modifications to the Llama
Materials.

b. Redistribution and Use.

i. If you distribute or make the Llama Materials, or any derivative works
thereof, available to a third party, you shall provide a copy of this Agreement to such
third party.
ii. If you receive Llama Materials, or any derivative works thereof, from
a Licensee as part of an integrated end user product, then Section 2 of this
Agreement will not apply to you.

iii. You must retain in all copies of the Llama Materials that you
distribute the following attribution notice within a "Notice" text file distributed as a
part of such copies: "Llama 2 is licensed under the LLAMA 2 Community License,
Copyright (c) Meta Platforms, Inc. All Rights Reserved."

iv. Your use of the Llama Materials must comply with applicable laws
and regulations (including trade compliance laws and regulations) and adhere to the
Acceptable Use Policy for the Llama Materials (available at
https://ai.meta.com/llama/use-policy), which is hereby incorporated by reference into
this Agreement.

v. You will not use the Llama Materials or any output or results of the
Llama Materials to improve any other large language model (excluding Llama 2 or
derivative works thereof).

2. Additional Commercial Terms. If, on the Llama 2 version release date, the
monthly active users of the products or services made available by or for Licensee,
or Licensee's affiliates, is greater than 700 million monthly active users in the
preceding calendar month, you must request a license from Meta, which Meta may
grant to you in its sole discretion, and you are not authorized to exercise any of the
rights under this Agreement unless or until Meta otherwise expressly grants you
such rights.

3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE
LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE
PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY
WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR
FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE
FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING
THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR
USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.

4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE
LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT,
NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS
AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL,
CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN
IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF
ANY OF THE FOREGOING.

5. Intellectual Property.

a. No trademark licenses are granted under this Agreement, and in
connection with the Llama Materials, neither Meta nor Licensee may use any name
or mark owned by or associated with the other or any of its affiliates, except as
required for reasonable and customary use in describing and redistributing the
Llama Materials.

b. Subject to Meta's ownership of Llama Materials and derivatives made by or
for Meta, with respect to any derivative works and modifications of the Llama
Materials that are made by you, as between you and Meta, you are and will be the
owner of such derivative works and modifications.

c. If you institute litigation or other proceedings against Meta or any entity
(including a cross-claim or counterclaim in a lawsuit) alleging that the Llama
Materials or Llama 2 outputs or results, or any portion of any of the foregoing,
constitutes infringement of intellectual property or other rights owned or licensable
by you, then any licenses granted to you under this Agreement shall terminate as of
the date such litigation or claim is filed or instituted. You will indemnify and hold
harmless Meta from and against any claim by any third party arising out of or related
to your use or distribution of the Llama Materials.

6. Term and Termination. The term of this Agreement will commence upon your
acceptance of this Agreement or access to the Llama Materials and will continue in
full force and effect until terminated in accordance with the terms and conditions
herein. Meta may terminate this Agreement if you are in breach of any term or
condition of this Agreement. Upon termination of this Agreement, you shall delete
and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the
termination of this Agreement.

7. Governing Law and Jurisdiction. This Agreement will be governed and
construed under the laws of the State of California without regard to choice of law
principles, and the UN Convention on Contracts for the International Sale of Goods
does not apply to this Agreement. The courts of California shall have exclusive
jurisdiction of any dispute arising out of this Agreement.
46 changes: 46 additions & 0 deletions natural_language_processing/elyza-japanese-llama-2-7b/README.md
@@ -0,0 +1,46 @@
# Text Generation Japanese Llama ELYZA

### input
A `SENTENCE` in Japanese, for example: クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。

### output
The generated text, consisting of up to `--outlength` tokens, is written as a TXT file to the `SAVE_PATH` location.

### Usage
The ONNX and prototxt files are downloaded automatically on the first run; an Internet connection is required during the download.

To run with the sample input, use the following commands:
```
pip3 install -r requirements.txt

python3 elyza.py
```

You can use an arbitrary sentence with:

```
python3 elyza.py --input クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。
```

If you want to specify the maximum number of generated tokens (e.g. 300 tokens), use the following command:

```
python3 elyza.py --outlength 300
```

If you want to benchmark the model inference, you can use:

```
python3 elyza.py --benchmark
```
### Framework
PyTorch, HuggingFace Transformers

### Model Format
ONNX opset = 13

### License

[Elyza Llama model](LICENSE_LLama)

### Netron

- [decoder_model.onnx.prototxt](https://netron.app/?url=https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/decoder_model.onnx.prototxt)
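Before tokenization, the script wraps the input sentence in the Llama-2 instruct template. A minimal standalone sketch of that template (the `<s>` BOS token is an assumption here; the actual code takes it from the tokenizer, and the system prompt below is the default from `elyza.py`):

```python
# Sketch of the Llama-2 instruct prompt the script builds.
# Hypothetical standalone version: BOS token hard-coded instead of
# being read from the tokenizer.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

bos_token = "<s>"  # assumed Llama tokenizer BOS token
system = "あなたは誠実で優秀な日本人のアシスタントです。"
user_text = "クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。"

# <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
prompt = f"{bos_token}{B_INST} {B_SYS}{system}{E_SYS}{user_text} {E_INST} "
print(prompt)
```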
116 changes: 116 additions & 0 deletions natural_language_processing/elyza-japanese-llama-2-7b/elyza.py
@@ -0,0 +1,116 @@
import time
import sys
import platform

from transformers import AutoTokenizer

from utils_elyza import generate_prompt, generate_text

import ailia

sys.path.append("../../util")
from arg_utils import get_base_parser, update_parser # noqa: E402
from model_utils import check_and_download_models, check_and_download_file # noqa: E402

# logger
from logging import getLogger # noqa: E402
logger = getLogger(__name__)

# ======================
# PARAMETERS
# ======================
WEIGHT_PATH = "decoder_model.onnx"
MODEL_PATH = "decoder_model.onnx.prototxt"
WEIGHT_PB_PATH = "decoder_model.onnx_data"
tokenizer_path = "elyza/ELYZA-japanese-Llama-2-7b-instruct"
REMOTE_PATH = "https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/"
SAVE_PATH = "output.txt"


DEFAULT_SYSTEM_PROMPT = "あなたは誠実で優秀な日本人のアシスタントです。"
DEFAULT_TEXT = "クマが海辺に行ってアザラシと友達になり、最終的には家に帰るというプロットの短編小説を書いてください。"



# ======================
# Argument Parser Config
# ======================

parser = get_base_parser("elyza text generation", None, SAVE_PATH)
# overwrite
parser.add_argument(
    "--input", "-i", default=DEFAULT_TEXT,
    help="input text"
)
parser.add_argument(
    "--outlength", "-o", type=int, default=256,
    help="number of tokens to generate"
)
parser.add_argument(
    "--onnx",
    action="store_true",
    help="By default, the ailia SDK is used, but with this option, you can switch to using ONNX Runtime"
)
args = update_parser(parser, check_input_type=False)


# ======================
# Main function
# ======================
def main():
    check_and_download_models(WEIGHT_PATH, MODEL_PATH, REMOTE_PATH)
    check_and_download_file(WEIGHT_PB_PATH, REMOTE_PATH)

    pf = platform.system()
    if pf == "Darwin":
        logger.info("This model is not optimized for macOS GPU currently, so BLAS will be used (env_id = 1).")
        args.env_id = 1

    if args.onnx:
        import onnxruntime
        # Create an ONNX Runtime session
        options = onnxruntime.SessionOptions()
        options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL
        options.execution_mode = onnxruntime.ExecutionMode.ORT_SEQUENTIAL

        # Use CUDAExecutionProvider when available, otherwise fall back to CPU
        providers = ["CUDAExecutionProvider"] if "CUDAExecutionProvider" in onnxruntime.get_available_providers() else ["CPUExecutionProvider"]
        ailia_model = onnxruntime.InferenceSession(WEIGHT_PATH, providers=providers, sess_options=options)
    else:
        memory_mode = ailia.get_memory_mode(True, True, False, True)
        ailia_model = ailia.Net(MODEL_PATH, WEIGHT_PATH, env_id=args.env_id, memory_mode=memory_mode)

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)

    # Generate the prompt
    logger.info("Input : " + args.input)
    input_prompt = generate_prompt(tokenizer, DEFAULT_SYSTEM_PROMPT, args.input)

    # Inference
    logger.info("Start of text generation")
    if args.benchmark:
        logger.info("BENCHMARK mode")
        for i in range(5):
            start = int(round(time.time() * 1000))
            output = generate_text(tokenizer, ailia_model, input_prompt, int(args.outlength), args.onnx)
            end = int(round(time.time() * 1000))
            logger.info("\tailia processing time {} ms".format(end - start))
    else:
        output = generate_text(tokenizer, ailia_model, input_prompt, int(args.outlength), args.onnx)

    logger.info("output : " + output)
    with open(SAVE_PATH, "w") as fo:
        fo.write(output)
    logger.info("Script finished successfully.")


if __name__ == "__main__":
    main()

@@ -0,0 +1 @@
transformers
@@ -0,0 +1,66 @@
import numpy as np


def generate_prompt(tokenizer, system_prompt, text):
    # Build the Llama-2 instruct prompt:
    # <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
    B_INST, E_INST = "[INST]", "[/INST]"
    B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

    prompt = "{bos_token}{b_inst} {system}{prompt} {e_inst} ".format(
        bos_token=tokenizer.bos_token,
        b_inst=B_INST,
        system=f"{B_SYS}{system_prompt}{E_SYS}",
        prompt=text,
        e_inst=E_INST)

    return prompt


def generate_text(tokenizer, model, span, outputlength, onnx_runtime=False):
    # Produce the initial tokens
    encoding = tokenizer.encode_plus(span, return_tensors="np", add_special_tokens=False)
    input_ids = encoding["input_ids"]
    attention_mask = encoding["attention_mask"].reshape(1, -1)

    # Initialize the generated tokens with the prompt tokens
    generated_tokens = input_ids.copy()
    eos_token_id = tokenizer.eos_token_id

    for _ in range(outputlength):
        input_dict = {
            "input_ids": np.array(generated_tokens, dtype=np.int64),
            "attention_mask": np.array(attention_mask, dtype=np.int64)
        }

        # Run the inference to get logits
        if onnx_runtime:
            logits = model.run(None, input_dict)
        else:
            logits = model.run(input_dict)
        logits = np.array(logits[0])

        # Greedy decoding: take the argmax of the logits at the last position
        next_token_id = int(np.argmax(logits[0, -1, :]))
        generated_tokens = np.concatenate((generated_tokens, [[next_token_id]]), axis=-1)

        # Extend the attention mask to cover the newly generated token
        attention_mask = np.concatenate((attention_mask, np.ones((1, 1), dtype=np.int64)), axis=1)

        if next_token_id == eos_token_id:
            break

    # Decode only the newly generated part, skipping special tokens
    out_str = tokenizer.decode(generated_tokens[0][input_ids.shape[1]:], skip_special_tokens=True)

    return out_str
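The loop in `generate_text` is greedy argmax decoding: at each step the highest-scoring token at the last position is appended, and generation stops at EOS. A toy sketch of that control flow with a stand-in scoring function (hypothetical; not the actual ONNX model):

```python
import numpy as np

# Stand-in for the model: returns logits of shape (1, seq_len, VOCAB)
# that peak at (last_token + 1), so the greedy choice is predictable.
VOCAB, EOS = 10, 5

def fake_logits(tokens):
    logits = np.zeros((1, len(tokens), VOCAB))
    logits[0, -1, (tokens[-1] + 1) % VOCAB] = 1.0
    return logits

tokens = [1]
for _ in range(8):
    # Greedy step: argmax over the last position's logits
    next_id = int(np.argmax(fake_logits(tokens)[0, -1, :]))
    tokens.append(next_id)
    if next_id == EOS:  # stop at the EOS token
        break
print(tokens)  # [1, 2, 3, 4, 5]
```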


1 change: 1 addition & 0 deletions scripts/download_all_models.sh
@@ -201,6 +201,7 @@ cd ../../natural_language_processing/t5_base_japanese_title_generation; python3
cd ../../natural_language_processing/multilingual-e5; python3 multilingual-e5.py ${OPTION}
cd ../../natural_language_processing/t5_whisper_medical; python3 t5_whisper_medical.py ${OPTION}
cd ../../natural_language_processing/t5_base_japanese_summarization; python3 t5_base_japanese_summarization.py ${OPTION}
cd ../../natural_language_processing/elyza-japanese-llama-2-7b; python3 elyza.py ${OPTION}
cd ../../neural_rendering/nerf; python3 nerf.py ${OPTION}
cd ../../object_detection/centernet; python3 centernet.py ${OPTION}
cd ../../object_detection/m2det; python3 m2det.py ${OPTION}
2 changes: 1 addition & 1 deletion util/model_utils.py
@@ -102,5 +102,5 @@ def check_and_download_file(file_path, remote_path):

if not os.path.exists(file_path):
logger.info('Downloading %s...' % file_path)
urlretrieve(remote_path, file_path, progress_print)
urlretrieve(remote_path + os.path.basename(file_path), file_path, progress_print)
logger.info('%s is prepared!' % file_path)
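The one-line fix in `model_utils.py` appends the file's basename to the remote directory URL, so `urlretrieve` fetches the actual file rather than the directory path. A quick sketch of the URL construction (paths taken from this PR):

```python
import os

# Remote directory hosting the model files (from elyza.py in this PR)
REMOTE_PATH = "https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/"
file_path = "decoder_model.onnx_data"

# Before the fix, only REMOTE_PATH was passed to urlretrieve; after the
# fix, the basename is appended to form the full file URL.
url = REMOTE_PATH + os.path.basename(file_path)
print(url)  # https://storage.googleapis.com/ailia-models/elyza-japanese-llama-2-7b/decoder_model.onnx_data
```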