Replies: 5 comments
-
I also encountered this issue. Were you able to resolve the cuBLAS error?
-
Hello, I have received your message and will reply as soon as possible after reading it. Best wishes.
-
NVIDIA/TransformerEngine#845 may help. Try installing TE (Transformer Engine) from a version after this commit.
-
Marking as stale. No activity in 60 days.
-
Hello, I have received your message and will reply as soon as possible after reading it. Best wishes.
-
Your question
Ask a clear and concise question about Megatron-LM.
The part of my train_bert_340m_distributed.sh that I customized is as follows:
CHECKPOINT_PATH=/proj/bert/checkpoints/null/
TENSORBOARD_LOGS_PATH=/proj/bert/logs
VOCAB_FILE=/proj/bert/checkpoints/bert-large-uncased-vocab.txt
DATA_PATH=/proj/bert/dataset/ag_news_text_sentence
The error is: