How should LoRA finetuning be set up under model parallelism? #130
Comments
If you are using VisualGLM, you can refer to this issue: THUDM/VisualGLM-6B#209 (comment). With model parallelism, to use LoRA you need to build the model first and only then call model.add_mixin(lora).
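A minimal sketch of that order of operations. This assumes LoraMixin is available under sat.model.finetune.lora2 (as in the VisualGLM-6B finetuning code); the model class, checkpoint name, and mixin arguments below are illustrative, not the exact setup from this thread:

```python
# Sketch only: construct the (model-parallel) model first, then attach LoRA,
# so the injected low-rank weights match the already-partitioned linear layers.
# Module paths and argument names are assumptions based on SAT/VisualGLM examples.
from sat import get_args
from sat.model import GLMModel
from sat.model.finetune.lora2 import LoraMixin

args = get_args()

# Build the model with model parallelism already configured in args.
model, args = GLMModel.from_pretrained("glm-large-zh", args=args)

# Only after construction: add the LoRA mixin (rank value here is illustrative).
model.add_mixin("lora", LoraMixin(args.num_layers, 10))
```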
I'm finetuning GLM-10B-zh. Following that approach, I still run into new problems. If I use model, args = FineTuneGLMModel.from_pretrained("glm-large-zh", args=args, overwrite_args={'model_parallel_size':2}), I get ValueError: model_parallel_size is inconsistent with prior configuration. We currently do not support changing model_parallel_size. If I remove overwrite_args={'model_parallel_size':2} and adjust layer_range, the original dimension problem is still there.
You need to set it in the training script.
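For context, a hedged sketch of what "set it in the training script" could look like: pass the model-parallel size as a launch argument instead of via overwrite_args at from_pretrained time, so the value is already consistent when the checkpoint configuration is loaded. The flag name --model-parallel-size follows SAT's Megatron-style arguments; the script name and other flags are placeholders:

```python
# Sketch: launch the finetuning script with model parallelism set up front, e.g.
#
#   torchrun --nproc_per_node=2 finetune_glm.py \
#       --model-parallel-size 2 \
#       ... other training flags ...
#
# Inside the script, the value then arrives through the parsed arguments and
# from_pretrained no longer needs overwrite_args={'model_parallel_size': 2}.
from sat import get_args

args = get_args()
assert args.model_parallel_size == 2  # matches the intended parallel layout
```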
Thank you very much for your patient help; I've got it working now!
One more problem: I hit if not (attention_mask.shape[-2] == 1 and (attention_mask > 0).all()): IndexError: tuple index out of range. I checked, and the attention_mask my model receives has shape torch.Size([2, 1, 520, 520]); but when I print the attention_mask shape just before line 29 of transformer_defaults.py, it is torch.Size([2, 1, 520, 520]) at first and later becomes torch.Size([540800]).
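For what it's worth, 2 * 1 * 520 * 520 = 540800, so the reported shape is consistent with the 4-D mask being flattened to 1-D somewhere upstream; indexing shape[-2] on a 1-D tensor is exactly what raises that IndexError. A tiny illustration of the symptom (not a fix):

```python
import torch

# Reproduce the reported symptom: a [2, 1, 520, 520] mask flattened to 1-D.
mask = torch.ones(2, 1, 520, 520)
flat = mask.reshape(-1)   # torch.Size([540800]) == 2 * 1 * 520 * 520

print(mask.shape[-2])     # 520 -- fine on the 4-D mask
print(flat.shape[-2])     # IndexError: tuple index out of range
```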
After I set mp_size=2 and run LoRA training, I get dimension mismatch errors.