v0.3.0
Models
- Added support for Lit-GPT (#1792)
- Added support for stop sequences in HuggingFaceClient (#1892, #1909)
- Added Mistral 7B model (#1906)
- Added IDEFICS model (#1871)
- Added Anthropic Claude 2 (#1900)
Scenarios
- Added 31 scenarios from CLEVA for evaluation of Chinese language models (#1824, #1864)
- Added VQA scenario (#1871)
- Added support for running MCQA scenarios from users' JSONL files (#1889); see the illustrative example after this list
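The JSONL-driven MCQA runner (#1889) reads one JSON object per line. The exact schema is described in the pull request; the snippet below is only a hedged sketch of building such a file, and the field names (`question`, `choices`, `answer`) are hypothetical placeholders rather than HELM's actual keys.

```python
import json

# Hypothetical MCQA records; these field names are illustrative only and may
# not match the schema expected by HELM -- see PR #1889 for the real format.
records = [
    {
        "question": "Which planet is known as the Red Planet?",
        "choices": ["Venus", "Mars", "Jupiter", "Saturn"],
        "answer": "Mars",
    },
]

# Write one JSON object per line (the JSONL convention).
with open("my_mcqa_scenario.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```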
Metrics
- Fixed a bug that prevented using Anthropic Claude for model critique (#1862)
Frontend
Framework
- Added support for multi-modal scenarios and Vision Language Model (VLM) evaluation (#1871)
- Added support for Python 3.9 and 3.10 (#1897)
- Added a new `Tokenizer` class in preparation for removing `tokenize()` and `decode()` from `Client` in a future release (#1874); a simplified sketch of the separation follows this list
- Made more dependencies optional instead of required, and added install command suggestions (#1834, #1961)
- Added support for configuring users' model deployments through YAML configuration files (#1861)
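The `Tokenizer` change (#1874) moves tokenization out of request handling so that `tokenize()` and `decode()` no longer need to live on `Client`. The sketch below is a minimal conceptual illustration of that separation, not HELM's actual interfaces; everything other than the `Tokenizer`, `tokenize()`, `decode()`, and `Client` names is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


class Tokenizer(ABC):
    """Owns the tokenization concerns that previously lived on Client."""

    @abstractmethod
    def tokenize(self, text: str) -> List[str]:
        ...

    @abstractmethod
    def decode(self, tokens: List[str]) -> str:
        ...


class WhitespaceTokenizer(Tokenizer):
    """Toy implementation, included only to make the sketch runnable."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()

    def decode(self, tokens: List[str]) -> str:
        return " ".join(tokens)


class Client:
    """After the split, a client issues model requests and delegates
    tokenization to an injected Tokenizer instead of implementing it."""

    def __init__(self, tokenizer: Tokenizer):
        self.tokenizer = tokenizer

    def make_request(self, prompt: str) -> dict:
        # Hypothetical request flow: count prompt tokens via the tokenizer.
        num_tokens = len(self.tokenizer.tokenize(prompt))
        return {"prompt": prompt, "prompt_tokens": num_tokens}


if __name__ == "__main__":
    client = Client(WhitespaceTokenizer())
    print(client.make_request("The quick brown fox"))
```

The point of the design is that callers depend on the `Tokenizer` interface, so client implementations can eventually drop `tokenize()` and `decode()` without breaking code that only needs tokenization.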
Evaluation Results
- Added evaluation results for Stanford Alpaca, MosaicML MPT, TII UAE Falcon, and LMSYS Vicuna
Contributors
Thank you to the following contributors for your work on this HELM release!