[QUESTION] vicuna-7b-v1.5 weight conversion from huggingface to megatron-lm format #1181
Replies: 6 comments
-
I'm also interested in this, and more generally in how Megatron-LM can be used to convert from HF, continue pretraining, and convert back to HF.
-
Same issue with a different model.
-
My understanding is that
-
Also, if you do need
-
Thanks, man. I used that, and I changed the convert command accordingly. Anyway, your answer really helped me; before that, I had spent a long time trying to figure out the problem.
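For readers landing on this thread: the original commands were not preserved above, but a conversion invocation generally looks like the sketch below. The loader name (`llama_mistral`; older trees ship `llama2_hf`) and the exact flag set are assumptions that vary across Megatron-LM versions — check `python tools/checkpoint/convert.py --help` in your checkout before running anything.

```shell
# Hypothetical sketch only -- loader name and flags are assumptions that
# differ between Megatron-LM versions; verify against convert.py --help.
CONVERT_CMD="python tools/checkpoint/convert.py \
  --model-type GPT \
  --loader llama_mistral \
  --saver megatron \
  --target-tensor-parallel-size 1 \
  --target-pipeline-parallel-size 1 \
  --load-dir ./vicuna-7b-v1.5 \
  --save-dir ./vicuna-7b-v1.5-megatron \
  --tokenizer-model ./vicuna-7b-v1.5/tokenizer.model"
echo "$CONVERT_CMD"
```

The `--target-*-parallel-size` flags pick the TP/PP sharding of the saved Megatron checkpoint; `1`/`1` produces an unsharded checkpoint you can later re-split.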
-
Marking as stale. No activity in 60 days.
-
I am trying to convert the weights for vicuna-7b-v1.5 from Hugging Face Transformers ( https://huggingface.co/lmsys/vicuna-7b-v1.5 ) so they can be used with Megatron-LM. I am using tools/checkpoint/convert.py to do the conversion. The command I used is as follows:

When I run it, I get an error like this:

I looked into it, and the error seems to happen here:

Megatron-LM/megatron/core/parallel_state.py
Lines 563 to 569 in 7fe863f

because _TENSOR_MODEL_PARALLEL_GROUP does not have a value set. However, I found that _TENSOR_MODEL_PARALLEL_GROUP is set in only one place in the whole codebase:

Megatron-LM/megatron/core/parallel_state.py
Line 379 in 7fe863f

and that function, initialize_model_parallel, does not seem to be called during the weight conversion. How can I correctly do the weight conversion?
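The failing assertion reflects a common module-level guard pattern: a global process-group handle starts as None and a getter asserts it was initialized. Below is a minimal, self-contained sketch of that pattern — illustrative only, not Megatron's actual implementation; the names are borrowed from the report above, and the real initializer builds torch.distributed process groups rather than a placeholder object:

```python
# Simplified sketch of the guard pattern seen in megatron/core/parallel_state.py.
# Not the real code: the actual initializer creates torch.distributed groups.
_TENSOR_MODEL_PARALLEL_GROUP = None  # set only by initialize_model_parallel()

def initialize_model_parallel():
    """Stand-in for Megatron's initializer; the real one needs torch.distributed."""
    global _TENSOR_MODEL_PARALLEL_GROUP
    _TENSOR_MODEL_PARALLEL_GROUP = object()  # real code stores a process group

def get_tensor_model_parallel_group():
    # Megatron raises an assertion like this when the group was never set,
    # which is the symptom described above: a code path reaches the getter
    # without initialize_model_parallel() ever having been called.
    assert _TENSOR_MODEL_PARALLEL_GROUP is not None, \
        "tensor model parallel group is not initialized"
    return _TENSOR_MODEL_PARALLEL_GROUP
```

Any path that reaches the getter before the initializer trips the assertion, so the question reduces to which conversion code path is responsible for calling the initializer (in real Megatron-LM, the loader/saver subprocesses set up model parallelism before touching checkpoint state).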