OpenCoder, Request for adding a model #2058

Open
insop opened this issue Nov 23, 2024 · 2 comments

insop commented Nov 23, 2024

The OpenCoder team has released OpenCoder 1.5B and 8B models. They seem very promising.
I would like to request that the team add OpenCoder to torchtune.

Thank you!

Contributor

ebsmothers commented Nov 23, 2024

Hi @insop, thanks for creating the issue. Given that the model is still relatively new, we would like to wait and see a bit before onboarding it as part of our core offering. Fortunately, that shouldn't stop you from finetuning it with torchtune. We encourage folks to plug in custom components, and for this model it should be relatively easy to do so. Since the architecture is the same as Llama, you should be able to do the following:

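from torchtune.models.llama3._component_builders import llama3

# Reuse the Llama 3 component builder with OpenCoder 8B's hyperparameters.
opencoder_8b = llama3(
    vocab_size=96640,
    num_layers=32,
    num_heads=32,
    num_kv_heads=8,  # fewer KV heads than query heads (grouped-query attention)
    embed_dim=4096,
    max_seq_len=8192,
    rope_base=500000.0,
    intermediate_dim=14336,
)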

I will need to look at the tokenizer a bit more closely, but given that it appears to use SentencePiece with some additional preprocessing, I suspect it should be a small modification of our Llama2Tokenizer. Happy to provide more detailed pointers to help you get started here.
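
One way to sanity-check the hyperparameters above before loading any weights is to estimate the parameter count they imply. The sketch below is only an illustration: `OPENCODER_8B`, `opencoder_8b()`, and `estimate_params()` are hypothetical names, not torchtune APIs, and the arithmetic assumes a Llama-style layout with no bias terms, a three-projection gated MLP, one scale vector per norm, and an output projection that is not tied to the token embedding.

```python
# Hyperparameters copied from the llama3(...) call above.
OPENCODER_8B = dict(
    vocab_size=96640,
    num_layers=32,
    num_heads=32,
    num_kv_heads=8,
    embed_dim=4096,
    max_seq_len=8192,
    rope_base=500000.0,
    intermediate_dim=14336,
)


def opencoder_8b():
    """Hypothetical builder wrapping the call above; not part of torchtune."""
    # Imported lazily so the arithmetic below runs without torchtune installed.
    from torchtune.models.llama3._component_builders import llama3

    return llama3(**OPENCODER_8B)


def estimate_params(cfg):
    """Estimate the parameter count of a Llama-style decoder.

    Assumptions (not confirmed by the thread): no bias terms, a gated MLP
    with three projections, one scale vector per norm, and an output
    projection that is separate from the token embedding.
    """
    d = cfg["embed_dim"]
    head_dim = d // cfg["num_heads"]
    kv_dim = cfg["num_kv_heads"] * head_dim

    attn = 2 * d * d + 2 * d * kv_dim      # q and output proj, plus k and v proj
    mlp = 3 * d * cfg["intermediate_dim"]  # gate, up, and down projections
    norms = 2 * d                          # attention norm and MLP norm
    per_layer = attn + mlp + norms

    embeddings = 2 * cfg["vocab_size"] * d  # token embedding + output projection
    final_norm = d
    return cfg["num_layers"] * per_layer + embeddings + final_norm


total = estimate_params(OPENCODER_8B)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
# prints "7,771,262,976 parameters (~7.77B)"
```

If torchtune is installed, comparing this estimate against `sum(p.numel() for p in opencoder_8b().parameters())` is a cheap check that the builder arguments were entered as intended; a mismatch would more likely point to one of the assumptions listed above than to the model itself.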

Author

insop commented Nov 24, 2024

Hi @ebsmothers,

Thank you so much for explaining how I could approach this. Your reasoning about when to bring the model in makes sense.

I am new to torchtune, so any pointers would be helpful and appreciated.
Please do let me know if you have any pointers on the tokenizer or on preparing a PR that brings in a new model.

Thank you,
