Pull requests: mit-han-lab/llm-awq
Replace FasterTransformers like KV cache layout and kernel with flash attention for better support for longer sequence
#239 opened Nov 16, 2024 by JerryGJX
Suggest: Add Bayesian optimization support for ratio search
#104 opened Oct 26, 2023 by trotsky1997