[QUESTION] How to pre-build the dataset's index? #1185
Unanswered
etiennemlb asked this question in Q&A
Replies: 2 comments
-
You can use `--data-cache-path` to specify where the dataset index cache is written, and precompute the index on a single node. See Megatron-LM/megatron/training/arguments.py, lines 1349 to 1350 in 9de386d.
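To make the suggestion concrete, here is a minimal sketch of assembling a single-process launch that writes the index into a fixed cache directory. Only `--data-cache-path`, `--data-path`, and the `pretrain_gpt.py` entry point come from Megatron-LM; the helper name `build_index_prebuild_command`, the `torchrun` layout, and the example paths are assumptions for illustration. A real run would also need the usual model, tokenizer, and training arguments passed through `extra_args`, and these should probably match the later training job so the cached index is reused rather than rebuilt.

```python
import shlex
from pathlib import Path


def build_index_prebuild_command(data_path, cache_dir, extra_args=()):
    """Assemble a one-process launch command that targets a fixed cache directory.

    Hypothetical helper: only --data-path and --data-cache-path are
    Megatron-LM arguments; everything else here is an illustrative assumption.
    """
    cache_dir = Path(cache_dir)
    # The cache should live on storage that the later multi-node job can read.
    cache_dir.mkdir(parents=True, exist_ok=True)
    return [
        "torchrun", "--nproc_per_node=1", "pretrain_gpt.py",
        "--data-path", str(data_path),
        "--data-cache-path", str(cache_dir),
        *extra_args,  # model, tokenizer, and training arguments go here
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not values from the thread.
    cmd = build_index_prebuild_command("my-gpt2_text_document", "index_cache")
    print(shlex.join(cmd))
```

The training job would then be launched with the same `--data-cache-path` value so that it can pick up the prebuilt index files.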
-
Marking as stale. No activity in 60 days.
-
How to pre-build the dataset's index?
I want to avoid using a compute node for this task: