Gguf cleanup #230
Please ensure at least as much green as before landing!
I am looking at the CI test cases. They are still very basic.
Can we add the following e2e test?
- Download gguf fp version.
- Load gguf and convert to nn.Module's state_dict and save the checkpoint.
- Run generate.py on the new checkpoint.
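The checkpoint round trip in the steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TinyModel` and `roundtrip_checkpoint` are hypothetical stand-ins, the GGUF download and conversion are not shown, and a forward pass stands in for running generate.py.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the nn.Module built from a GGUF file."""

    def __init__(self) -> None:
        super().__init__()
        self.embed = nn.Embedding(16, 8)
        self.out = nn.Linear(8, 16, bias=False)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.out(self.embed(tokens))


def roundtrip_checkpoint(model: TinyModel, path: str) -> TinyModel:
    # Save the (already converted) weights as an ordinary state_dict checkpoint.
    torch.save(model.state_dict(), path)
    # Reload them the way a regular checkpoint would be loaded.
    state_dict = torch.load(path, map_location="cpu", weights_only=True)
    reloaded = TinyModel()
    reloaded.load_state_dict(state_dict)
    return reloaded


torch.manual_seed(0)
original = TinyModel().eval()
with tempfile.TemporaryDirectory() as tmp:
    reloaded = roundtrip_checkpoint(original, os.path.join(tmp, "model.pth")).eval()

# Stand-in for "run generate.py": both models must produce identical logits.
tokens = torch.tensor([[1, 2, 3]])
with torch.no_grad():
    same_output = torch.equal(original(tokens), reloaded(tokens))
```

A real CI job would replace `TinyModel` with the model loaded from the downloaded GGUF file and call generate.py on the saved checkpoint.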
build/builder.py
    if builder_args.gguf_path:
        model = _load_model_gguf(builder_args)
    else:
        model = _load_model_not_gguf(builder_args)
Can we rename this function to `_load_model_default()`?
@mergennachin I'll add the CI test you outlined in a follow-up PR. We do have an existing CI test that calls generate on a quantized GGUF file here: https://github.com/pytorch/torchchat/blob/main/.github/workflows/compile-gguf.yml.
* clean up gguf loading. Move model loading to meta.
* remove cpu
* Fix CI and validation scripts (#154)
* missing device (#232)
* Use generator args to group all arguments to generator (#231)
* prompt
* chat_mode, num_samples
* Move more generator args to use dataclass (#233)
* prompt
* chat_mode, num_samples
* move more args
* more gen args
* update
* args
* undo some changes
* typos
* Minor lint fixes (#236)
* remove redundancy & remove int4 linear test from ET tests (#237)
* remove redundancy
* no int4 linear on ET
* small changes

---------

Co-authored-by: Guang Yang <[email protected]>
Co-authored-by: Michael Gschwind <[email protected]>
Co-authored-by: Mergen Nachin <[email protected]>
Clean up the GGUF loading code, move model loading to the meta device, and share more code with the non-GGUF path.