🚀 The feature, motivation and pitch
The torchchat framework provides an excellent platform for embedding models in many different edge-centric environments.
The Granite Code models, specifically the 3B-128k and 8B-128k variants, are a family of models from IBM that support a wide variety of code-related tasks. The models are released under the Apache-2.0 license and are therefore well-suited to embedded use-cases where code intelligence is needed.
The request here is to extend torchchat's model support to run the 3B and 8B long-context variants of Granite Code, enabling the use of these models across embedded use-cases.
Alternatives
Depending on the goals of the torchchat framework, extending support to non-llama models may or may not be in scope. There are other embedded frameworks out there (notably llama.cpp and the many projects that wrap it) that can be used to run Granite Code in embedded environments. Our goal at IBM is to give users as many choices as possible for how to run all of our Granite family models, and our hope is that torchchat can be a strong piece of this story!
Additional context
The 3B and 8B models use the llama architecture in transformers, so they are close to fully supported as-is. There are a few crucial pieces that are present in the transformers implementation but missing in torchchat:

- tokenizers: tokenizers #1251
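As a quick sanity check (a sketch only, not torchchat code; the ibm-granite model IDs on the Hugging Face Hub are assumed), both the architecture claim and the tokenizer gap can be verified from Python:

```python
# Sanity-check sketch: verify that the Granite Code checkpoints report the
# llama architecture, and that their tokenizers load via the Hugging Face
# `tokenizers` library rather than SentencePiece.
from tokenizers import Tokenizer
from transformers import AutoConfig

for model_id in (
    "ibm-granite/granite-3b-code-instruct-128k",  # assumed Hub IDs
    "ibm-granite/granite-8b-code-instruct-128k",
):
    cfg = AutoConfig.from_pretrained(model_id)
    # Expected: model_type == "llama", architectures == ["LlamaForCausalLM"]
    print(model_id, cfg.model_type, cfg.architectures)

# The tokenizer ships as a BPE tokenizer.json, which torchchat would need
# to handle (the gap referenced by tokenizers #1251 above).
tok = Tokenizer.from_pretrained("ibm-granite/granite-3b-code-instruct-128k")
enc = tok.encode("def fibonacci(n):")
print(enc.ids)
print(tok.decode(enc.ids))
```

If both checks pass, the remaining work is on the torchchat side rather than in the models themselves.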
RFC (Optional)

I've worked through the initial steps of solving all of these outstanding issues (see the corresponding issues). Once these are solved, the addition of these Granite Code models should consist of the following steps: