I am trying to convert Qwen2-72B-Instruct to FP8 locally.
However, I am hitting a runtime error and I don't know how to solve it.
This is my code:
from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "qwen/Qwen2-72B-Instruct"
quantized_model_dir = "qwen/Qwen2-72B-Instruct_FP8"

# Define quantization config with dynamic activation scales
quantize_config = BaseQuantizeConfig(quant_method="fp8", activation_scheme="dynamic")

# With dynamic activation scales, no calibration examples are needed
examples = []

# Load the model, quantize, and save the checkpoint
model = AutoFP8ForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
model.quantize(examples)
model.save_quantized(quantized_model_dir)
and the error is:
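As a side note: the script above follows the dynamic-activation-scheme usage documented for AutoFP8, and checkpoints produced this way are intended to be served with vLLM. Below is a minimal loading sketch, assuming a recent vLLM release with FP8 support; the tensor_parallel_size value is illustrative and should match your GPU count.

from vllm import LLM, SamplingParams

# Load the AutoFP8 checkpoint saved by the quantization script above
llm = LLM(model="qwen/Qwen2-72B-Instruct_FP8",
          quantization="fp8",
          tensor_parallel_size=4)  # illustrative: a 72B model needs several GPUs

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Give me a short introduction to large language models."], params)
print(outputs[0].outputs[0].text)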