Actions: neuralmagic/AutoFP8
62 workflow runs
kv_scale into k_scale and v_scale (#25)
test #65: Commit 2cd265f pushed by mgoin
kv_scale into k_scale and v_scale
test #54: Pull request #25 synchronize by mgoin
kv_scale into k_scale and v_scale
test #53: Pull request #25 opened by mgoin
torch.inference_mode() for lower memory usage during calibratio…
test #43: Commit b1c6ad6 pushed by mgoin
torch.inference_mode() for lower memory usage during calibration
test #42: Pull request #20 opened by mgoin