Pull requests: microsoft/DeepSpeed
Pull requests list
Avoid poisoning process with CUDA calls as soon as importing
#6810 opened Nov 29, 2024 by HollowMan6
Zero2: avoid graph breaks in torch.compile by using param_idx
#6803 opened Nov 28, 2024 by nelyahu
Inference UTs check for triton support from accelerator
#6782 opened Nov 25, 2024 by raza-sikander
Add the missing view operations from sequence parallel(async).
#6750 opened Nov 14, 2024 by inkcherry
Training ops kernels: Speeding up the Llama-based MoE architectures
#6734 opened Nov 8, 2024 by RezaYazdaniAminabadi • Draft
Allow launcher to include --include=node3, not just --include=node3:1,2,3,4,5,6,7,8
#6698 opened Nov 1, 2024 by stephen-nju
Reduce the device bubble introduced by heavy loop synchronization in coalesced fetch/release(z3_leaf_module)
#6694 opened Oct 31, 2024 by inkcherry
Support the parallel conversion from ZeRO checkpoints to FP32/FP16/BF16 param weight
#6655 opened Oct 23, 2024 by xylian86
5 tasks done