Feature Request: Add OLMoE #9317
Comments
Voted! It is ideal for mobile solutions. A quantized version will be even better :)
I may try this in my free time, but I'm kind of chaotic these days with research + writing, so I'm not too optimistic on the timeline. I'll also try to get the OLMoE lead on it after vLLM.
Any updates?
#9462 is doing this. I have also implemented this in ChatLLM.cpp.
Looks like this is done?
Indeed. It should have been completed by this PR: GGUFs are already available on HF.
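For anyone who wants to try the published GGUFs, here is a minimal sketch using the llama-cpp-python bindings. The repo id and quant filename below are assumptions for illustration; check the model page on HF for the files that actually exist.

```python
# Minimal sketch: load an OLMoE GGUF via llama-cpp-python and run a prompt.
# The repo id and filename are assumptions; check HF for the actual files.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="allenai/OLMoE-1B-7B-0924-Instruct-GGUF",  # assumed repo id
    filename="*Q4_K_M.gguf",  # assumed quant; glob patterns are supported
    n_ctx=4096,
)

out = llm("Q: What is a mixture-of-experts model? A:", max_tokens=128)
print(out["choices"][0]["text"])
```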
Does this mean that it will be easier to implement compatibility with https://allenai.org/blog/olmo2 ?
Already there: #10500 :) |
Prerequisites
Feature Description
Add this model (and its other variants): https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct
Motivation
We recently released the OLMoE model at Ai2: a 1.3B-active / 6.9B-total-parameter MoE model. It seems solid, and we'd love for people to use it.
Possible Implementation
It should be possible to quickly combine the existing OLMo implementation with the Transformers version: https://github.com/huggingface/transformers/blob/main/src/transformers/models/olmoe/modeling_olmoe.py (a condensed sketch of the MoE block is below).
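For reference, here is a condensed sketch of the OLMoE-style sparse MoE feed-forward block (softmax routing, top-k expert selection, SwiGLU experts), paraphrased from the Transformers implementation linked above. The dimensions follow the published OLMoE config (2048 hidden, 64 experts, 8 active per token) but should be treated as illustrative rather than the exact HF code.

```python
# Condensed sketch of an OLMoE-style sparse MoE block: softmax over router
# logits, top-k expert selection, SwiGLU experts. Paraphrased from the
# Transformers implementation; names and dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEBlock(nn.Module):
    def __init__(self, hidden=2048, ffn=1024, n_experts=64, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden, n_experts, bias=False)
        # Each expert is a SwiGLU MLP: down(silu(gate(x)) * up(x))
        self.gate = nn.ModuleList(nn.Linear(hidden, ffn, bias=False) for _ in range(n_experts))
        self.up   = nn.ModuleList(nn.Linear(hidden, ffn, bias=False) for _ in range(n_experts))
        self.down = nn.ModuleList(nn.Linear(ffn, hidden, bias=False) for _ in range(n_experts))

    def forward(self, x):                                  # x: (tokens, hidden)
        probs = F.softmax(self.router(x), dim=-1)          # (tokens, n_experts)
        weights, experts = probs.topk(self.top_k, dim=-1)  # (tokens, top_k) each
        out = torch.zeros_like(x)
        for k in range(self.top_k):                        # accumulate the k-th expert choice
            for e in experts[:, k].unique().tolist():
                rows = (experts[:, k] == e).nonzero(as_tuple=True)[0]
                h = x[rows]
                h = self.down[e](F.silu(self.gate[e](h)) * self.up[e](h))
                out[rows] += weights[rows, k].unsqueeze(-1) * h
        return out
```

As far as I can tell from the config, OLMoE does not renormalize the top-k routing weights after selection; the sketch mirrors that by weighting each expert's output with the raw softmax probabilities.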