
The Open Language Models List

📄 Introduction

This is a list of permissively licensed language models released under MIT, Apache 2.0, or similar licenses. We use the term language model broadly here to include not only autoregressive models but also models trained with other objectives, such as masked language modeling (MLM).
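
Every entry in the table below points to a Hugging Face repository (🤗), so most of these models can be loaded with the transformers library. The minimal sketch below only illustrates the autoregressive vs. MLM distinction mentioned above, using two permissively licensed models from the list; the repository ids are the usual Hub ids and are given purely as examples.

```python
# Minimal sketch (assumes the `transformers` package is installed and the Hub is reachable).
from transformers import AutoModelForCausalLM, AutoModelForMaskedLM, AutoTokenizer

# GPT-2 (MIT) was trained with an autoregressive, next-token-prediction objective.
gpt2_tokenizer = AutoTokenizer.from_pretrained("gpt2")
gpt2 = AutoModelForCausalLM.from_pretrained("gpt2")

# BERT (Apache 2.0) was trained with a masked language modeling (MLM) objective.
bert_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
```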

This work was largely inspired by Stella Biderman's Directory of Generative AI and The Foundation Model Development Cheatsheet. Unlike those two very comprehensive sources, however, this list is meant to be a quicker, more focused reference.

The badge next to a model's name indicates what was openly released along with it:

  • 👑: Model + Data + Code
  • ⭐: Model + Data
  • ⚡: Model + Code

Important

This is still a work in progress. Contributions, corrections, and feedback are very welcome!

🤖 Models

A ✅ in the Encoder, Decoder, and MoE columns marks whether a model has an encoder, a decoder, or uses a mixture-of-experts (MoE) architecture.

| Model | Parameters | Architecture | Encoder | Decoder | MoE | Year | Hugging Face | License |
|---|---|---|---|---|---|---|---|---|
| GPT-1 | 120M | Transformer | - | ✅ | - | 2018 | 🤗 | MIT |
| BERT-Base-Cased | 110M | Transformer | ✅ | - | - | 2018 | 🤗 | Apache 2.0 |
| BERT-Base-Uncased | 110M | Transformer | ✅ | - | - | 2018 | 🤗 | Apache 2.0 |
| BERT-Large-Cased | 340M | Transformer | ✅ | - | - | 2018 | 🤗 | Apache 2.0 |
| BERT-Large-Uncased | 340M | Transformer | ✅ | - | - | 2018 | 🤗 | Apache 2.0 |
| GPT-2-Small | 124M | Transformer | - | ✅ | - | 2019 | 🤗 | MIT |
| GPT-2-Medium | 355M | Transformer | - | ✅ | - | 2019 | 🤗 | MIT |
| GPT-2-Large | 774M | Transformer | - | ✅ | - | 2019 | 🤗 | MIT |
| GPT-2-XL | 1.5B | Transformer | - | ✅ | - | 2019 | 🤗 | MIT |
| T5-Small👑 | 60M | Transformer | ✅ | ✅ | - | 2019 | 🤗 | Apache 2.0 |
| T5-Base👑 | 220M | Transformer | ✅ | ✅ | - | 2019 | 🤗 | Apache 2.0 |
| T5-Large👑 | 770M | Transformer | ✅ | ✅ | - | 2019 | 🤗 | Apache 2.0 |
| T5-3B👑 | 3B | Transformer | ✅ | ✅ | - | 2019 | 🤗 | Apache 2.0 |
| T5-11B👑 | 11B | Transformer | ✅ | ✅ | - | 2019 | 🤗 | Apache 2.0 |
| XLM-RoBERTa-Large | 560M | Transformer | ✅ | - | - | 2019 | 🤗 | MIT |
| XLM-RoBERTa-Base | 250M | Transformer | ✅ | - | - | 2019 | 🤗 | MIT |
| RoBERTa-Base | 125M | Transformer | ✅ | - | - | 2019 | 🤗 | MIT |
| RoBERTa-Large | 355M | Transformer | ✅ | - | - | 2019 | 🤗 | MIT |
| DistilBERT-Base-Cased | 66M | Transformer | ✅ | - | - | 2019 | 🤗 | Apache 2.0 |
| DistilBERT-Base-Uncased | 66M | Transformer | ✅ | - | - | 2019 | 🤗 | Apache 2.0 |
| ALBERT-Base | 12M | Transformer | ✅ | - | - | 2019 | 🤗 | Apache 2.0 |
| ALBERT-Large | 18M | Transformer | ✅ | - | - | 2019 | 🤗 | Apache 2.0 |
| ALBERT-XLarge | 60M | Transformer | ✅ | - | - | 2019 | 🤗 | Apache 2.0 |
| ALBERT-XXLarge | 235M | Transformer | ✅ | - | - | 2019 | 🤗 | Apache 2.0 |
| DeBERTa-Base | 134M | Transformer | ✅ | - | - | 2020 | 🤗 | MIT |
| DeBERTa-Large | 350M | Transformer | ✅ | - | - | 2020 | 🤗 | MIT |
| DeBERTa-XLarge | 750M | Transformer | ✅ | - | - | 2020 | 🤗 | MIT |
| ELECTRA-Small-Discriminator | 14M | Transformer | ✅ | - | - | 2020 | 🤗 | Apache 2.0 |
| ELECTRA-Base-Discriminator | 110M | Transformer | ✅ | - | - | 2020 | 🤗 | Apache 2.0 |
| ELECTRA-Large-Discriminator | 335M | Transformer | ✅ | - | - | 2020 | 🤗 | Apache 2.0 |
| GPT-Neo-125M👑 | 125M | Transformer | - | ✅ | - | 2021 | 🤗 | MIT |
| GPT-Neo-1.3B👑 | 1.3B | Transformer | - | ✅ | - | 2021 | 🤗 | MIT |
| GPT-Neo-2.7B👑 | 2.7B | Transformer | - | ✅ | - | 2021 | 🤗 | MIT |
| GPT-J👑 | 6B | Transformer | - | ✅ | - | 2021 | 🤗 | Apache 2.0 |
| XLM-RoBERTa-XL | 3.5B | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| XLM-RoBERTa-XXL | 10.7B | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| DeBERTa-v2-XLarge | 900M | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| DeBERTa-v2-XXLarge | 1.5B | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| DeBERTa-v3-XSmall | 22M | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| DeBERTa-v3-Small | 44M | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| DeBERTa-v3-Base | 86M | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| DeBERTa-v3-Large | 304M | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| mDeBERTa-v3-Base | 86M | Transformer | ✅ | - | - | 2021 | 🤗 | MIT |
| GPT-NeoX👑 | 20B | Transformer | - | ✅ | - | 2022 | 🤗 | Apache 2.0 |
| UL2👑 | 20B | Transformer | ✅ | ✅ | - | 2022 | 🤗 | Apache 2.0 |
| YaLM⚡ | 100B | Transformer | - | ✅ | - | 2022 | 🤗 | Apache 2.0 |
| Pythia-14M👑 | 14M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-70M👑 | 70M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-160M👑 | 160M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-410M👑 | 410M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-1B👑 | 1B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-1.4B👑 | 1.4B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-2.8B👑 | 2.8B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-6.9B👑 | 6.9B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Pythia-12B👑 | 12B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-111M⭐ | 111M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-256M⭐ | 256M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-590M⭐ | 590M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-1.3B⭐ | 1.3B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-2.7B⭐ | 2.7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-6.7B⭐ | 6.7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Cerebras-GPT-13B⭐ | 13B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| BTLM👑 | 3B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Phi-1 | 1.3B | Transformer | - | ✅ | - | 2023 | 🤗 | MIT |
| Phi-1.5 | 1.3B | Transformer | - | ✅ | - | 2023 | 🤗 | MIT |
| Phi-2 | 2.7B | Transformer | - | ✅ | - | 2023 | 🤗 | MIT |
| RedPajama-INCITE-3B👑 | 2.8B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| RedPajama-INCITE-7B👑 | 6.9B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| FLM | 101B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| MPT-1B | 1.3B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| MPT-7B | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| MPT-7B-8K | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| MPT-30B | 30B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mistral-7B-v0.1 | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mistral-7B-v0.2 | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mistral-7B-v0.3 | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Falcon-1B | 1B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Falcon-7B | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Falcon-40B | 40B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| TinyLlama | 1.1B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| OpenLLaMA-3B-v1👑 | 3B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| OpenLLaMA-7B-v1👑 | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| OpenLLaMA-13B-v1👑 | 13B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| OpenLLaMA-3B-v2👑 | 3B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| OpenLLaMA-7B-v2👑 | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| DeciLM-7B | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Amber👑 | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Solar | 10.7B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mixtral-8x7B | 46.7B | Transformer | - | ✅ | ✅ | 2023 | 🤗 | Apache 2.0 |
| OpenMoE-base-128B | 637M | Transformer | - | ✅ | ✅ | 2023 | 🤗 | Apache 2.0 |
| Mamba-130M | 130M | SSM | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mamba-370M | 370M | SSM | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mamba-790M | 790M | SSM | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mamba-1.4B | 1.4B | SSM | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mamba-2.8B | 2.8B | SSM | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Mamba-2.8B-slimpj | 2.8B | SSM | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| OpenBA | 15B | Transformer | ✅ | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Yi-6B | 6B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Yi-6B-200K | 6B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Yi-9B | 9B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Yi-9B-200K | 9B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Yi-34B-200K | 34B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Persimmon-8B | 8B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Palmyra-3B | 3B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Palmyra-Small-128M | 128M | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Palmyra-Base-5B | 5B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| Palmyra-Large-20B | 20B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| SEA-LION-3B | 3B | Transformer | - | ✅ | - | 2023 | 🤗 | MIT |
| SEA-LION-7B | 7B | Transformer | - | ✅ | - | 2023 | 🤗 | MIT |
| PLaMo-13B | 13B | Transformer | - | ✅ | - | 2023 | 🤗 | Apache 2.0 |
| LiteLlama | 460M | Transformer | - | ✅ | - | 2024 | 🤗 | MIT |
| H2O-Danube | 1.8B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| H2O-Danube2 | 1.8B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Cosmo | 1.8B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| MobiLlama-0.5B | 0.5B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| MobiLlama-0.8B | 0.8B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| MobiLlama-1B | 1.2B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| OLMo-1B👑 | 1B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| OLMo-7B👑 | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| OLMo-7B-Twin-2T👑 | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| OLMo-1.7-7B👑 | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Poro | 34B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Grok-1 | 314B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-8B-1.1T | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-8B-1T | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-8B-800B | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-8B-600B | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-8B-400B | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-8B-200B | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| OpenMoE-34B-200B | 34B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| Jamba | 52B | SSM-Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| JetMoE | 8B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| Mambaoutai | 1.6B | SSM | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Tele-FLM | 52B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Arctic-Base | 480B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| Zamba-7B | 7B | SSM-Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Mixtral-8x22B-v0.1 | 141B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| Granite-7b-base | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Chuxin-1.6B-Base👑 | 1.6B | Transformer | - | ✅ | - | 2024 | 🤗 | MIT |
| Chuxin-1.6B-1M👑 | 1.6B | Transformer | - | ✅ | - | 2024 | 🤗 | MIT |
| Neo👑 | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Yi-1.5-6B | 6B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Yi-1.5-9B | 9B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Yi-1.5-34B | 34B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| GECKO-7B | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Qwen2-0.5B | 0.5B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Qwen2-1.5B | 1.5B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Qwen2-7B | 7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Qwen2-57B-A14B | 57B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| K2👑 | 65B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Pile-T5-Base👑 | 248M | Transformer | ✅ | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Pile-T5-Large👑 | 783M | Transformer | ✅ | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Pile-T5-XL👑 | 2.85B | Transformer | ✅ | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| SmolLM-135M👑 | 135M | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| SmolLM-360M👑 | 360M | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| SmolLM-1.7B👑 | 1.7B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| GRIN | 42B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | MIT |
| OLMoE-1B-7B👑 | 7B | Transformer | - | ✅ | ✅ | 2024 | 🤗 | Apache 2.0 |
| Zamba2-1.2B | 1.2B | SSM-Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Zamba2-2.7B | 2.7B | SSM-Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
| Fox-1-1.6B | 1.6B | Transformer | - | ✅ | - | 2024 | 🤗 | Apache 2.0 |
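
If you want to verify the license of a listed model directly from the Hub, it is exposed as a repository tag. Below is a minimal sketch, assuming the huggingface_hub client library is installed; the repository ids are just illustrative examples from the table.

```python
# Minimal sketch: read the "license:<id>" tag of a few listed models from the Hugging Face Hub.
from huggingface_hub import model_info

for repo_id in ["gpt2", "bert-base-uncased", "EleutherAI/pythia-70m"]:
    info = model_info(repo_id)
    # The license is surfaced as a "license:<id>" tag on the model repository.
    licenses = [tag.split(":", 1)[1] for tag in info.tags if tag.startswith("license:")]
    print(f"{repo_id}: {', '.join(licenses) or 'no license tag found'}")
```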

📚 Resources

About Openness

  • [Blog post] What "Open" Means: A great blog post by John Shaughnessy discussing the many different incarnations of the word "open".
  • [Paper] Towards a Framework for Openness in Foundation Models: In this paper, Mozilla and Columbia University's Institute of Global Politics brought together over 40 leading scholars and practitioners working on openness and AI to discuss the much-debated definitions and benefits of open-sourcing foundation models. Among them are Victor Storchan, Yann LeCun, Justine Tunney, Nathan Lambert, and many others.
  • [Paper] Rethinking open source generative AI: This paper surveys over 45 generative AI models using an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods.
  • [Paper] Risks and Opportunities of Open-Source Generative AI: This paper analyzes the risks and opportunities of open-source generative AI models using a three-stage framework for Gen AI development (near, mid and long-term), and argues that, overall, the benefits of open-source Gen AI outweigh its risks.

📌 Citation

@misc{hamdy2024openlmlist,
  title = {The Open Language Models List},
  author = {Mohammed Hamdy},
  url = {https://github.com/mmhamdy/open-language-models},
  year = {2024},
}