Feature Request: Add OLMoE #9317

Open
4 tasks done
natolambert opened this issue Sep 4, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@natolambert

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Add this model (and other variants): https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct

Motivation

We recently released the OLMoE model at Ai2. It is an MoE model with 1.3B active / 6.9B total parameters. It seems solid, and we'd love for people to use it.

Possible Implementation

It should be possible to do this quickly by combining the existing OLMo implementation with the Transformers version: https://github.com/huggingface/transformers/blob/main/src/transformers/models/olmoe/modeling_olmoe.py

@natolambert natolambert added the enhancement New feature or request label Sep 4, 2024
@rkinas

rkinas commented Sep 5, 2024

Voted! It is ideal for mobile solutions. A quantized version would be even better :)

@natolambert
Author

I may try this in my free time, but things are kind of chaotic these days with research + writing, so I'm not too optimistic about the timeline. I'll also try to get the OLMoE lead on it after vLLM.

@Meshwa428

Any updates?

@foldl
Contributor

foldl commented Sep 13, 2024

#9462 is doing this.

I have also implemented this in ChatLLM.cpp.

@github-actions github-actions bot added the stale label Oct 14, 2024
@dirkgr

dirkgr commented Oct 16, 2024

Looks like this is done?

@felladrin
Contributor

> Looks like this is done?

Indeed. Should have been completed by this PR:

GGUFs are already available on HF

@github-actions github-actions bot removed the stale label Oct 17, 2024
@github-actions github-actions bot added the stale label Nov 16, 2024
@mseri

mseri commented Nov 27, 2024

Does this mean that it will be easier to implement compatibility with https://allenai.org/blog/olmo2?

@mseri

mseri commented Nov 27, 2024

Already there: #10500 :)

@github-actions github-actions bot removed the stale label Nov 28, 2024
7 participants