Polish4 temp #11
base: polish3
Commits on Feb 27, 2024
Commit 1cb35cc
fix: change the cbd_mc to be CATEGORIES-based
Restored default case for cbd_regex
Fixed typo in klej_ner_mc
Commit 55f274b
Commit 35af374
Commit 9540f16
Commit d3d7d01
Commits on Mar 3, 2024
Commit c4679ce
Commits on Mar 5, 2024
Commit 7fc327d
Commits on Mar 10, 2024
Commit e14e593
Commit bc61568
Commits on Aug 2, 2024
Commit 85eb77f
Commit 0632a05
Commit bb879de
Fix Issue regarding stderr (EleutherAI#1327)
* add fix for deciding if stderr is N/A or not
* process N/A
Commit 6414edd
Add `local-completions` support using OpenAI interface (EleutherAI#1277)
* Add `local-completions` support using OpenAI interface
* Refactor oa_completion
* Address tokenizer comments and change request chunks to batch size
* Add warning message for tiktoken backend
* fix formatting
* fix whitespace
* Update README.md
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 66783f6
Commit f0ba560
Commit 9dd448b
Commit 4f263af
Update migrated HF dataset paths (EleutherAI#1332)
* Update arc_easy.yaml
* Update flan_cot.yaml
* update HF dataset path
* Update freeform.yaml
* Update flan_cot.yaml
---------
Co-authored-by: Lintang Sutawika <[email protected]>
Commit 0d8d549
Don't use `get_task_dict()` in task registration / initialization (EleutherAI#1331)
* don't use get_task_dict() as a helper, it will download the dataset!
* pre-commit
* Update README.md
---------
Co-authored-by: lintangsutawika <[email protected]>
Commit 268d252
manage default (greedy) gen_kwargs in vllm (EleutherAI#1341)
* manage default (greedy) gen_kwargs in vllm better
* mirror HF `do_sample`
* just need to set temp=0 for greedy
Commit 82e319d
Commit 0938c13
Commit 97361ed
`Filter` docs not offset by `doc_id` (EleutherAI#1349)
* get `doc` from instance
* accelerate bugfix: get ground doc from instance
* convert filter to `process_result`
* get docs from instances in `FilterEnsemble`
* rename
* nit
* better looping
* fix typehint
Commit ca3a895
Commit 2eeaf15
Commit d467d2f
Add causalLM OpenVino models (EleutherAI#1290)
* added intel optimum
* added intel optimum in readme
* modified intel optimum
* modified intel optimum
* modified intel optimum
* modified install optimum
* modified path of IR file
* added openvino_device
* added openvino_device2
* changed optimum-causal to openvino-causal
* Update README.md
* Update README.md
* remove `lm_eval.base` import
* update openvino-causal -> openvino ; pass device through super().__init__()
* Update README.md
* Add optimum to tests dependencies
* apply pre-commit
* fix so tests pass
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
Co-authored-by: haileyschoelkopf <[email protected]>
Commit f41ac12
Apply some best practices and guideline recommendations to code (EleutherAI#1363)
* raise Exception, not a string
  Additional info:
  https://peps.python.org/pep-0352/#exception-hierarchy-changes
  https://docs.python.org/3.8/tutorial/errors.html#raising-exceptions
* Apply PEP8 recommendation to prefer isinstance
  "Object type comparisons should always use isinstance() instead of comparing types directly"
  https://peps.python.org/pep-0008/
* Remove dangerous default mutable values in arguments
  https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/dangerous-default-value.html
* Format logging messages with fstring (not with format)
  Additional info:
  https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/logging-format-interpolation.html
  There are also discussions about the speed of formatting while logging or some unintended code executions:
  pylint-dev/pylint#2395
  https://stackoverflow.com/a/54368109
  but at least one format (fstring one) will be used throughout the project
* Specify utf-8 encoding for `open` explicitly
  If not specified, it may be supposed differently in different environments, OSes, and Python versions. See
  https://peps.python.org/pep-0597/
  https://docs.python.org/3.11/library/locale.html#locale.getencoding
  https://docs.python.org/3.10/library/os.html#utf8-mode
  https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/unspecified-encoding.html
  Helps also if some code from English language tasks is taken as inspiration for tasks in non-English languages.
* Use inline-ignoring comments to pass pre-commit instead of identity process
  https://flake8.pycqa.org/en/3.0.1/user/ignoring-errors.html#in-line-ignoring-errors
  https://www.flake8rules.com/rules/F841.html
  flake8 comments are supported by ruff: https://docs.astral.sh/ruff/linter/#error-suppression
Commit 154f5fa
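Several of the recommendations listed in this commit can be illustrated in a few lines. The sketch below is only an illustration of the general Python practices named in the message (mutable default arguments, `isinstance`, raising real exceptions, explicit `encoding="utf-8"`); the function names `append_bad`, `append_good`, `describe`, and `roundtrip` are hypothetical and do not come from the repository.

```python
import os
import tempfile


def append_bad(item, bucket=[]):
    # Anti-pattern: the default list is created once and shared by every call.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Recommended pattern: use None as the sentinel, build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def describe(value):
    # isinstance() also matches subclasses, unlike `type(value) == int`.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    # Raise an Exception instance rather than a bare string.
    raise TypeError(f"unsupported type: {type(value).__name__}")


def roundtrip(text):
    # Explicit utf-8 avoids depending on the platform's default encoding.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
```

Calling `append_bad` twice without a `bucket` argument returns the same growing list, which is the behaviour the pylint `dangerous-default-value` warning is meant to catch.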
Commit b43d9d9
delay filter init; remove `*args` (EleutherAI#1369)
* delay filter init; remove `*args`
* bugfix
* optimize
* type hint
Commit 2b31cfb
Fix unintuitive `--gen_kwargs` behavior (EleutherAI#1329)
* don't override do_sample if no value for it is passed
* Update gen_kwargs override condition
* Update huggingface.py
* Update huggingface.py
* run linters
* silence an erroneous warning
Commit cdc41c4
Publish to pypi (EleutherAI#1194)
* publish to pypi
* lint
* Update publish.yml
* minor
Commit b39e8da
Make dependencies compatible with PyPI (EleutherAI#1378)
* make deps not point to github urls
* formatting
* try making PyPI only run on tag pushes
Commit 0a39c84
Add support for RWKV models with World tokenizer (EleutherAI#1374)
* Add support for RWKV models with World tokenizer
  The RWKV line of models with the World tokenizer does not allow the padding token to be configured and has its value preset to 0. This, however, fails all the "if set" checks and would cause the tokenizer to crash. A tokenizer class name check was added, in addition to a model type check, as there exist RWKV models which use the neox tokenizers.
* Update huggingface.py
  Genericized so that this supports any RWKVWorld tokenizer, and added a fall-back for if the HF implementation name changes.
* Comply with formatting guidelines
* fix format
---------
Co-authored-by: Stella Biderman <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit b7513d3
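The failure mode described in this commit message, a padding token id of 0 failing "if set" checks, is a general truthiness pitfall. The sketch below illustrates it under stated assumptions: the helper names are hypothetical, the `"RWKVWorld"` substring match and the fall-back to the EOS id are illustrative guesses, and none of this is taken from `huggingface.py`.

```python
def looks_set_truthy(pad_token_id):
    # Buggy "if set" check: a valid id of 0 is falsy, so it looks unset.
    return bool(pad_token_id)


def looks_set(pad_token_id):
    # Safer check: only None means "not configured".
    return pad_token_id is not None


def resolve_pad_token_id(tokenizer_class_name, pad_token_id, eos_token_id):
    # Hypothetical helper: check the tokenizer class name first, because
    # (per the commit message) World-tokenizer models preset the pad id to 0.
    if "RWKVWorld" in tokenizer_class_name:
        return 0
    if pad_token_id is not None:
        return pad_token_id
    # Assumed fall-back for tokenizers without a pad token.
    return eos_token_id
```

Checking the class name rather than only the model type matters because, as the message notes, some RWKV models use the neox tokenizers instead.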
add bypass metric (EleutherAI#1156)
* add bypass metric
* fixed `bypass` metric.
* add task attributes if predict_only
* add `predict_only` checks
* add docs
* added `overide_metric`, `override_config` to `Task`
* nits
* nit
* changed --predict_only to generations; nits
* nits
* nits
* change gen_kwargs warning
* add note about `--predict_only` in README.md
* added `predict_only`
* move table to bottom
* nit
* change null aggregation to bypass (conflict)
* bugfix; default `temp=0.0`
* typo
Commit 7d068d2
Expand docs, update CITATION.bib (EleutherAI#1227)
* Update CITATION.bib
* Create CONTRIBUTING.md
* add disclaimer re: multi node
* flesh out some sections more
* Flesh out contributor guide
* revert CITATION.bib
* appease pre-commit
---------
Co-authored-by: lintangsutawika <[email protected]>
Commit b284735
Hf: minor edge cases (EleutherAI#1380)
* edge cases where variable might not be assigned.
* type hint
Commit 80c158c
Enable override of printed `n-shot` in table (EleutherAI#1379)
* allow tasks to specify printed fewshot val
* fix to belebele
* update metadata field's documentation
Commit d55e918
Faster Task and Group Loading, Allow Recursive Groups (EleutherAI#1321)
* add trust_remote_code as default * task for testing recursive * changed source of ALL_TASKS * tasks should only accept TaskObjects * initialize_tasks returns list of tasks and groups * remove trust_remote_code for now * moved constructor process to inside load_yaml_config * more comprehensive way to index tasks and groups * pre-commit format * add exit after error * adjust how task objects are called * no need to use get_task_dict * load_task_or_group works but only for tasks * pre-commit format * half working for nested groups * changed variable names * allow groups and tasks to work * temp save * indexing and loading are part of a task_manager object * adapted initialize_tasks * iron out bugs * fixed typo * fixed typo * simplified code * further tidy up * remove lines for testing * removed test lines * removed unused code * remove unused import * fixed bug * removed comments * group in a list of group can accept parameter changes like `num_fewshot` * add trust_remote_code as default * task for testing recursive * changed source of ALL_TASKS * tasks should only accept TaskObjects * initialize_tasks returns list of tasks and groups * remove trust_remote_code for now * moved constructor process to inside load_yaml_config * more comprehensive way to index tasks and groups * pre-commit format * add exit after error * adjust how task objects are called * no need to use get_task_dict * load_task_or_group works but only for tasks * pre-commit format * half working for nested groups * changed variable names * allow groups and tasks to work * temp save * indexing and loading are part of a task_manager object * adapted initialize_tasks * iron out bugs * fixed typo * fixed typo * simplified code * further tidy up * remove lines for testing * removed test lines * removed unused code * remove unused import * fixed bug * removed comments * group in a list of group can accept parameter changes like `num_fewshot` * check if config is task update * add GroupConfig object * edit 
test yaml * remove args * testing returning to python task list * add weight_by_size config * describe weight_by_size in docs * fix weight by size potential error * can load individual custom python class task * moved import_function into the config loading file * remove print lines * add squadv2 yaml * temporary scroll implementation * revert back to use load_yaml_config but with modes * fix group being loaded with a None * reformat * can load unregistered tasks from a group * update scrolls * edit scrolls multiplechoice task * adjust class initialization * fix initialization * changed how to identify group and python tasks, fix logger * allow loading "include" that is nested in a group config * reworked flan benchmark * allow duplicate task in the same group to co-exist * process group_alias * removed group_alias * allow parameters set in group_config to apply to all tasks in tasklist * add function, but comment for now * reworked processing dict-base config * fixed how configs in group are processed * update to allow root group to have its alias used * remove unused classes * remove unused classes * revert some parts to original * forgot to change one variable * adapt the new process to use get_task_dict * fix for singular group call * fix variable names * add TaskManager into the evaluator * format * changed how dict tasks are loaded * add docs * Update docs/new_task_guide.md Co-authored-by: Hailey Schoelkopf <[email protected]> * Update evaluator.py * Update evaluator.py * remove groupconfig for now * changed _config to config * update interface.md to explain TaskManager * added property functions * adjusted logger * update write_out.py * updated tests * added documentation and some modifications * added docstring documentation * precommit format * updated task loading for tests * updates tests * changed arg order for load_yaml_config * update to handle scrolls and edit log message * remove unused lines * return a list of task classes and not a dict * Update 
__init__.py * Delete lm_eval/tasks/benchmarks/test.yaml * Update task.py * Update lm_eval/utils.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/utils.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update utils.py * re-added old functions with new log message * Update docs/new_task_guide.md Co-authored-by: Hailey Schoelkopf <[email protected]> * Update new_task_guide.md * added infor regarding `get_task_dict` and documentation * add get_config for Task * pre-commit formatting --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit d6b65f1
Fix for EleutherAI#1383 (EleutherAI#1384)
Fixes EleutherAI#1383
If this is okay, it will need to be propagated to SCROLLS
Commit 5810eac
Commit bad70e7
Support for Inf2 optimum class [WIP] (EleutherAI#1364)
* initial commit
* remove overwrite bs
* adding neuronx dependencies
* Update README.md
* update neuronx
Commit 09ca8ff
Update README.md (EleutherAI#1398)
Add instructions for non-MacOS users on how to compile janitor_util.cpp so that janitor.py can use it.
Commit 590bcc7
Commit 4ed48ca
Use Pooled rather than Combined Variance for calculating stderr of task groupings (EleutherAI#1390)
* update formula for stderr aggregation
* hack: see what happens when using stderr_for_metric bootstrapping on a group
* undo bootstrap_for_stderr test
* factor out variance-aggregation formulas into api.metrics
* fix failing tests
* remove stray print
* update comment
* further detail in comment
* add back initialize_tasks() call
* fix format
Commit 77b79a0
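For readers unfamiliar with the term, the following is a minimal sketch of the textbook pooled-variance calculation applied to per-subtask standard errors. It is not confirmed to match the formula in `api.metrics` (a later commit in this list adjusts that formula), and the function name `pooled_stderr` is a hypothetical one chosen for illustration.

```python
import math


def pooled_stderr(stderrs, sizes):
    # Recover each subtask's sample variance from its standard error:
    # se_i = s_i / sqrt(n_i)  =>  s_i^2 = se_i^2 * n_i
    variances = [se ** 2 * n for se, n in zip(stderrs, sizes)]
    # Textbook pooled variance: weight each variance by its degrees of freedom.
    numerator = sum((n - 1) * var for var, n in zip(variances, sizes))
    denominator = sum(sizes) - len(sizes)
    pooled_var = numerator / denominator
    # Standard error of the mean over all pooled samples.
    return math.sqrt(pooled_var / sum(sizes))
```

The sketch assumes every subtask has more than one sample; with all sizes equal to 1 the denominator would be zero.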
adding hf_transfer (EleutherAI#1400)
* add hf_transfer
* update dependencies
* Delete stale `[linting]` extra
* Update README.md with extras table
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit ca8c608
`batch_size` with `auto` defaults to 1 if `No executable batch size found` (EleutherAI#1405)
Fixes EleutherAI#1323
Commit 79378a8
Commit a04bf2b
Fixes EleutherAI#1416 (EleutherAI#1418)
* Fixes EleutherAI#1416
  Sets `do_sample = False` if `temperature == 0.0` and `do_sample = None`
* Update huggingface.py
* Update huggingface.py
  making linter happy
Commit f2220c7
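The rule stated in this commit message can be sketched as a small normalization step. This is only an illustration of the stated condition; `normalize_gen_kwargs` is a hypothetical helper name and the actual change in `huggingface.py` may be structured differently.

```python
def normalize_gen_kwargs(gen_kwargs):
    # Hypothetical helper; works on a copy so the caller's dict is untouched.
    kwargs = dict(gen_kwargs)
    do_sample = kwargs.get("do_sample")
    if kwargs.get("temperature") == 0.0 and do_sample is None:
        # Temperature 0 with no explicit do_sample is treated as greedy decoding.
        kwargs["do_sample"] = False
    return kwargs
```

An explicitly passed `do_sample` is left alone, which is in line with the earlier `--gen_kwargs` fix in this list ("don't override do_sample if no value for it is passed").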
Fix watchdog timeout (EleutherAI#1404)
* Fix watchdog timeout
* Pre-commit fix
* Timedelta
Commit 5b0db7a
* un-exclude `evaluate.py` from linting
* readability
* readability
* add task name to build info message
* fix link
* nit
* add functions for var and mean pooling
* add functions for var and mean pooling
* metadata compatibility with task
* rename `override_config` to `set_config` and move to `Task`
* add unit test
* nit
* nit
* bugfix
* nit
* nit
* nit
* add docstrings
* fix metadata-fewshot
* revert metric refactor
* nit
* type checking
* type hints
* type hints
* move `override_metric` to `Task`
* change metadata
* change name
* pre-commit
* rename
* remove
* remove
* `override_metric` backwards compatible with `Task`
* type hints
* use generic
* type hint
Commit 8d82b49
Commit 80e0a4f
Commit 5c1b249
Commit 66e9620
Added seeds to `evaluator.simple_evaluate` signature (EleutherAI#1412)
* Added seeds to `evaluator.simple_evaluate` signature
* Added CLI argument
* Updated to add arg.
Commit c2c361c
Fix: task weighting by subtask size ; update Pooled Stderr formula slightly (EleutherAI#1427)
* fix weight_by_size condition
* add tests, update stderr formula slightly
* apply pre-commit
Commit af3ca77
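Weighting a group score by subtask size amounts to a size-weighted mean. The sketch below shows that idea only; `aggregate_group_score` is a hypothetical name, and while `weight_by_size` is the config name mentioned in this PR's commit messages, how the harness actually applies it is not shown on this page.

```python
def aggregate_group_score(scores, sizes, weight_by_size=True):
    # Hypothetical helper: combine per-subtask scores into one group score.
    if not weight_by_size:
        # Unweighted case: every subtask counts equally.
        sizes = [1] * len(scores)
    total = sum(sizes)
    return sum(score * n for score, n in zip(scores, sizes)) / total
```

With two subtasks scoring 0.5 on 100 documents and 1.0 on 300 documents, the weighted result leans toward the larger subtask, while the unweighted result is the plain average.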
Commit 205c870
Commit 71bbba4
Commit d027702
Commit 8315c1f
update bbh, gsm8k, mmlu parsing logic and prompts (Orca2 bbh_cot_zeroshot 0% -> 42%) (EleutherAI#1356)
* update bbh, gsm8k, mmlu parsing logic and prompts
* remove the formatting prompt (bbh) + minor update (mmlu)
* update bbh, gsm8k, mmlu zeroshot, revert fewshots
* update bbh, gsm8k, mmlu version, forward changes to gsm8k-cot
* remove take_last, update to use docs parameters
* add newline
* ruff formatting
* Update pyproject.toml
* fix format
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit ba89cd6
Add a new task HaeRae-Bench (EleutherAI#1445)
* haerae_reimplementation
* edited Readme and add few_shot settings
* edited readme
* newlines at end of each files
* Modifying the README file
* applied pre-commit
Commit f3e993d
Group reqs by context (EleutherAI#1425)
* add key lookup for same contexts
* nit
* appease pre-commit
* nit
* use `expand` (in-place view) rather than `repeat`
* try mixed grouping
* add docs.
* nit
* nit
* nits
* fix tests
* Move greedy_tokens calculation out of cache loop
* nit
* nits
* add test
* nits
* fix name conflict
* fix name conflict
* chunk tensor
* move Collator
* nits/docstring
* fixup
* fixup
* group contexts only for decoders
* pre-commit
* fix `generate_until` test
* fix `generate_until` test
* Update lm_eval/models/huggingface.py
  Co-authored-by: Hailey Schoelkopf <[email protected]>
* add docs
* nit
* add docs
* add docs
* add 'logits_cache' arg
* bugfix
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 44254b3
Add a new task GPQA (the part without CoT) (EleutherAI#1434)
* add new task GPQA_n_shot
* add new task GPQA_zeroshot
* correct GPQA_zeroshot filename
* Add randomly shuffle choices
* Correct missing parentheses
* delete wrong tasks
* Add README
* Update lm_eval/tasks/gpqa/zeroshot/_gpqa_zeroshot_yaml
* Update lm_eval/tasks/gpqa/n_shot/utils.py
* Update lm_eval/tasks/gpqa/n_shot/utils.py
* Update lm_eval/tasks/gpqa/README.md
* placate linter
* linter
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit c51d0ce
Added KMMLU evaluation method and changed ReadMe (EleutherAI#1447)
* update kmmlu default formatting
* Update _default_kmmlu_yaml
* Delete lm_eval/tasks/kmmlu/utils.py
* new tasks implemented
* add direct tasks
* update direct evaluate
* update direct eval
* add cot sample
* update cot
* add cot
* Update _cot_kmmlu_yaml
* add kmmlu90
* Update and rename _cot_kmmlu.yaml to _cot_kmmlu_yaml
* Create kmmlu90.yaml
* Update _cot_kmmlu_yaml
* add direct
* Update _cot_kmmlu_yaml
* Update and rename kmmlu90.yaml to kmmlu90_cot.yaml
* Update kmmlu90_direct.yaml
* add kmmlu hard
* Update _cot_kmmlu_yaml
* Update _cot_kmmlu_yaml
* update cot
* update cot
* erase typo
* Update _cot_kmmlu_yaml
* update cot
* Rename dataset to match k-mmlu-hard
* removed kmmlu90
* fixed name 'kmmlu_cot' to 'kmmlu_hard_cot' and revised README
* applied pre-commit before pull requests
* rename datasets and add notes
* Remove DS_Store cache
* Update lm_eval/tasks/kmmlu/README.md
  Co-authored-by: Hailey Schoelkopf <[email protected]>
* Change citations and reflect reviews on version
* Added kmmlu_hard and fixed other errors
* fixing minor errors
* remove duplicated
* Rename files
* try ".index"
* minor fix
* minor fix again
* fix revert.
* minor fix. thank for hailey
---------
Co-authored-by: GUIJIN SON <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 7dc04ed
Add TemplateLM boilerplate LM class (EleutherAI#1279)
* loglikelihood refactor using template lm * linter * fix whitespace in target + prompt for CoT gsm8k (EleutherAI#1275) * Make `parallelize=True` vs. `accelerate launch` distinction clearer in docs (EleutherAI#1261) * Make parallelize=True distinction clearer in documentation. * run linter * Allow parameter edits for registered tasks when listed in a benchmark (EleutherAI#1273) * benchmark yamls allow minor edits of already registered tasks * add documentation * removed print * Fix data-parallel evaluation with quantized models (EleutherAI#1270) * add WIP device_map overrides * update handling outside of accelerate launcher * change .to(device) log to debug level * run linter * Rework documentation for explaining local dataset (EleutherAI#1284) * rewor documentation for explaining local dataset * fix typo * Update new_task_guide.md * Re-add citation It looks like Google Scholar has [already noticed](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C9&authuser=2&q=%22A+framework+for+few-shot+language+model+evaluation%2C+12+2023%22&btnG=) the updated citation block so let's add it back in. * Update CITATION.bib (EleutherAI#1285) Bumping CITATION.bib to match re-adding the citation in readme. 
cc @StellaAthena * Update nq_open.yaml (EleutherAI#1289) * Update README.md with custom integration doc (EleutherAI#1298) * Update README.md * punctuation --------- Co-authored-by: Hailey Schoelkopf <[email protected]> * Update nq_open.yaml (EleutherAI#1305) * Update nq_open.yaml change regex * Bump NQ version --------- Co-authored-by: Hailey Schoelkopf <[email protected]> * Update task_guide.md (EleutherAI#1306) * Update pyproject.toml (EleutherAI#1312) * Fix polemo2_in.yaml config name (EleutherAI#1313) * Update pyproject.toml (EleutherAI#1314) * Fix group register (EleutherAI#1315) * tuple should be considered as well * set option to keep callable as callable * Update task_guide.md (EleutherAI#1316) * Update polemo2_in.yaml (EleutherAI#1318) * don't pass extra kwargs to mamba any more (EleutherAI#1328) * Fix Issue regarding stderr (EleutherAI#1327) * add fix fordeciding if stderr is N/A or not * process N/A * Add `local-completions` support using OpenAI interface (EleutherAI#1277) * Add `local-completions` support using OpenAI interface * Refactor oa_completion * Address tokenizer comments and change request chunks to batch size * Add warning message for tiktoken backend * fix formatting * fix whitespace * Update README.md --------- Co-authored-by: Hailey Schoelkopf <[email protected]> * fallback to classname when LM doesnt have config (EleutherAI#1334) * fix a trailing whitespace that breaks a lint job (EleutherAI#1335) * skip "benchmarks" in changed_tasks (EleutherAI#1336) * Update migrated HF dataset paths (EleutherAI#1332) * Update arc_easy.yaml * Update flan_cot.yaml * update HF dataset path * Update freeform.yaml * Update flan_cot.yaml --------- Co-authored-by: Lintang Sutawika <[email protected]> * Don't use `get_task_dict()` in task registration / initialization (EleutherAI#1331) * don't use get_task_dict() as a helper, it will download the dataset! 
* pre-commit * Update README.md --------- Co-authored-by: lintangsutawika <[email protected]> * manage default (greedy) gen_kwargs in vllm (EleutherAI#1341) * manage default (greedy) gen_kwargs in vllm better * mirror HF `do_sample` * just need to set temp=0 for greedy * modified default gen_kwargs to work better with CLI; changed prompt_logprobs=1 (EleutherAI#1345) * update links to task_guide.md (EleutherAI#1348) * `Filter` docs not offset by `doc_id` (EleutherAI#1349) * get `doc` from instance * acceletate bugfix: get ground doc from instance * convert filter to `process_result` * get docs from instances in `FilterEnsemble` * rename * nit * better looping * fix typehint * Add FAQ on `lm_eval.tasks.initialize_tasks()` to README (EleutherAI#1330) * Update README.md * [!Tip] * Refix issue regarding stderr (EleutherAI#1357) * Add causalLM OpenVino models (EleutherAI#1290) * added intel optimum * added intel optimum in readme * modified intel optimum * modified intel optimum * modified intel optimum * modified install optimum * modified path of IR file * added openvino_device * added openvino_device2 * changed optimum-causal to openvino-causal * Update README.md * Update README.md * remove `lm_eval.base` import * update openvino-causal -> openvino ; pass device through super().__init__() * Update README.md * Add optimum to tests dependencies * apply pre-commit * fix so tests pass --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]> * Apply some best practices and guideline recommendations to code (EleutherAI#1363) * raise Exception, not a string Additional info https://peps.python.org/pep-0352/#exception-hierarchy-changes https://docs.python.org/3.8/tutorial/errors.html#raising-exceptions * Apply PEP8 recommendation to prefer isinstance "Object type comparisons should always use isinstance() instead of comparing types directly" https://peps.python.org/pep-0008/ * Remove dangerous default mutable values 
in arguments https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/dangerous-default-value.html * Format logging messages with fstring (not with format) Additional info https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/logging-format-interpolation.html There are also discussions about the speed of formatting while logging or some unintended code executions pylint-dev/pylint#2395 https://stackoverflow.com/a/54368109 but at least one format (fstring one) will be used throughout the project * Specify utf-8 encoding for `open` explicitly If not specified, it may be supposed differently in different environments, OSes, and Python versions. See https://peps.python.org/pep-0597/ https://docs.python.org/3.11/library/locale.html#locale.getencoding https://docs.python.org/3.10/library/os.html#utf8-mode https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/unspecified-encoding.html Helps also if some code from English language tasks is taken as inspiration for tasks in non-English languages. 
* Use inline-ignoring comments to pass pre-commit instead of identity process https://flake8.pycqa.org/en/3.0.1/user/ignoring-errors.html#in-line-ignoring-errors https://www.flake8rules.com/rules/F841.html flake8 comments are supported by ruff: https://docs.astral.sh/ruff/linter/#error-suppression * serialize callable functions in config (EleutherAI#1367) * delay filter init; remove `*args` (EleutherAI#1369) * delay filter init; remove `*args` * bugfix * optimize * type hint * Fix unintuitive `--gen_kwargs` behavior (EleutherAI#1329) * don't override do_sample if no value for it is passed * Update gen_kwargs override condition * Update huggingface.py * Update huggingface.py * run linters * silence an erroneous warning * Publish to pypi (EleutherAI#1194) * publish to pypi * lint * Update publish.yml * minor * Make dependencies compatible with PyPI (EleutherAI#1378) * make deps not point to github urls * formatting * try making PyPI only run on tag pushes * Add support for RWKV models with World tokenizer (EleutherAI#1374) * Add support for RWKV models with World tokenizer The RWKV line of model with the World tokenizer, does not allow the padding token to be configured, and has its value preset as 0 This however fails all the "if set" checks, and would cause the tokenizer to crash. A tokenizer class name check was added, in addition to a model type check, as there exists RWKV models which uses the neox tokenizers * Update huggingface.py Genericized so that this supports any RWKVWorld tokenizer, and added a fall-back for if the HF implementation name changes. * Comply with formatting guidelines * fix format --------- Co-authored-by: Stella Biderman <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]> * add bypass metric (EleutherAI#1156) * add bypass metric * fixed `bypass` metric. 
* add task attributes if predict_only * add `predict_only` checks * add docs * added `overide_metric`, `override_config` to `Task` * nits * nit * changed --predict_only to generations; nits * nits * nits * change gen_kwargs warning * add note about `--predict_only` in README.md * added `predict_only` * move table to bottom * nit * change null aggregation to bypass (conflict) * bugfix; default `temp=0.0` * typo * loglikelihood refactor using template lm * lint * code review * neuron optimum * Mention TemplateLM in model_guide.md * Update lm_eval/api/model.py * fix linter * fix format * fix format * fix format --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: Lintang Sutawika <[email protected]> Co-authored-by: Stella Biderman <[email protected]> Co-authored-by: Mark Saroufim <[email protected]> Co-authored-by: Hannibal046 <[email protected]> Co-authored-by: Danielle Pintz <[email protected]> Co-authored-by: Quentin Lhoest <[email protected]> Co-authored-by: kwrobel.eth <[email protected]> Co-authored-by: Michael Goin <[email protected]> Co-authored-by: Brian Vaughan <[email protected]> Co-authored-by: Baber Abbasi <[email protected]> Co-authored-by: thnkinbtfly <[email protected]> Co-authored-by: NoushNabi <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]> Co-authored-by: LSinev <[email protected]> Co-authored-by: Eugene Cheah <[email protected]>
Commit fbd9bf6
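The RWKV World tokenizer note in the commit above (padding token preset to 0, which fails the "if set" checks) can be illustrated with a minimal Python sketch. It assumes the failing checks were plain truthiness tests; the helper names are hypothetical and not taken from `huggingface.py`. The fix the commit describes is a tokenizer class name check, which this sketch does not reproduce.

```python
from typing import Optional


def pad_id_truthy(pad_token_id: Optional[int]) -> bool:
    """Hypothetical 'is it set?' check based on truthiness.

    A valid token id of 0 is falsy, so it is wrongly treated as unset.
    """
    return bool(pad_token_id)


def pad_id_is_set(pad_token_id: Optional[int]) -> bool:
    """Hypothetical safer check: only None counts as unset."""
    return pad_token_id is not None


if __name__ == "__main__":
    for candidate in (None, 0, 3):
        print(candidate, pad_id_truthy(candidate), pad_id_is_set(candidate))
```

The distinction only matters for an id of exactly 0, which is the value the commit message reports for these tokenizers.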
Log which subtasks were called with which groups (EleutherAI#1456)
* log group membership * no stray prints * Update evaluator.py
Commit 0d1af67
PR fixing the issue EleutherAI#1391 (wrong contexts in the mgsm task) (EleutherAI#1440)
* fix the issue EleutherAI#1391, wrong contexts in mgsm tasks * fix yaml issue for having two target_delimiter lines. For COT tasks, keep the one with a space (default) * regenerate all task yaml files - change naming so that file name will match with task name - task|file follows a consistent naming way, mgsm_(mode)_(lang) for three modes, i.e., direct, en_cot, and native_cot * English CoTs should have a space as target_delimiter * Update utils.py * Apply suggestions from code review --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit b8bee2c
feat: Add Weights and Biases support (EleutherAI#1339)
* add wandb as extra dependency * wandb metrics logging * refactor * log samples as tables * fix linter * refactor: put in a class * change dir * add panels * log eval as table * improve tables logging * improve reports logging * precommit run * ruff check * handle importing reports api gracefully * ruff * compare results * minor pre-commit fixes * build comparison report * ruff check * log results as artifacts * remove comparison script * update dependency * type annotate and docstring * add example * update readme * fix typo * teardown * handle outside wandb run * gracefully fail reports creation * precommit checks * add report url to summary * use wandb printer for better url stdout * fix ruff * handle N/A and groups * fix eval table * remove unused var * update wandb version req + disable reports stdout * remove reports feature to TODO * add label to multi-choice question data * log model predictions * lints * loglikelihood_rolling * log eval result for groups * log tables by group for better handling * precommit * choices column for multi-choice * graciously fail wandb * remove reports feature * track system metrics + total eval time + stdout --------- Co-authored-by: Lintang Sutawika <[email protected]>
Commit cf1577a
Fixed generation args issue affecting OpenAI completion model (EleutherAI#1458)
* Fixed generation args issue affecting openai completion model * Fixed hf unit test; removed pop attributes in OpenAi completion. * fix format * fix format --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit dd5bee9
Commit be5a419
Adding documentation for Weights and Biases CLI interface (EleutherAI#1466)
* interface docs * fix link
Commit 4024ebb
Add environment and transformers version logging in results dump (EleutherAI#1464)
* Save git_hash to results even if git is not available to call as subprocess * Store more info about environment and transformers version in results to help researchers track inconsistencies * moved added logging to logging_utils * moved get_git_commit_hash to logging_utils.py * moved add_env_info inside evaluator
Commit 8a4827a
Commit 72d40c9
Commit 053cf56
add arabic mmlu (EleutherAI#1402)
* add arabic mmlu * update the description * add readme file
Commit e112b37
Add Gemma support (Add flag to control BOS token usage) (EleutherAI#1465)
* add add_bos_token to HFLM * add BOS token flag to other local model classes --------- Co-authored-by: Lintang Sutawika <[email protected]>
Commit 420556e
Revert "setting trust_remote_code (EleutherAI#1467)" (EleutherAI#1474)
This reverts commit c1145df.
Commit 06a4347
Create a means for caching task registration and request building. Ad… (EleutherAI#1372)
* Create a means for caching task registration and request building. Add the ability to specify an args dict for simple_evaluate(). * Remove extra S in cache path in caching module Co-authored-by: Hailey Schoelkopf <[email protected]> * Rename requests cache args, make model_args polymorphic so that a dict can also be accepted. * Update docs to reflect new caching behavior, add CLI args for requests caching. Create a function for deleting items in the cache. * Update documentation, fix minor bug with arg parsing for requests caching where an undefined variable was used. * Remove line from gitignore, add to cli for caching datasets. * Add hashing suffix to .pickles. Update test script typo. * Favor isinstance() over type() in evaluator.py * Add tests for caching, gets tests working, remove unneeded arg from build_all_requests(). * Update arg description to simple_evaluate. * Update pyproject.toml * Fix typehint * Remove the use of random() for creating default cache pickle hash. * Check that cache dir exists before clearing it in request cache tests. * Fix linting problems. * Fix additional formatting errors. * Remove trailing whitespace. * Add new line to the end of .gitignore. --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit af2d9f6
Cont metrics (EleutherAI#1475)
* add brier_score * process brier_score * brier score is working for N-sized class * fxied brier score * add TED to BigBench and Brier score to MMLU * format * Update metrics.py * Update task.py * Update generate_until_template_yaml * Delete lm_eval/tasks/bigbench/aux_metric.py * Update generate_until_template_yaml * Update _default_template_yaml * Update _generate_configs.py * Update _generate_configs.py * Update _generate_configs.py * fix (format?) * format? * format, once more --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 9600d59
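As a rough illustration of the "brier score is working for N-sized class" item in the commit above, here is a minimal sketch using the standard multiclass definition: the mean, over examples, of the squared difference between the predicted probabilities and the one-hot label. The function name and signature are assumptions made for illustration and are not the harness's `metrics.py` implementation.

```python
from typing import Sequence


def brier_score(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Multiclass Brier score (illustrative sketch).

    probs[i][k] is the predicted probability of class k for example i;
    labels[i] is the index of the true class for example i.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one example is required")

    total = 0.0
    for example_probs, label in zip(probs, labels):
        # Squared error against the one-hot encoding of the true label.
        total += sum(
            (p - (1.0 if k == label else 0.0)) ** 2
            for k, p in enumerate(example_probs)
        )
    return total / len(probs)


if __name__ == "__main__":
    print(brier_score([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]], [0, 2]))
```

Lower is better: a perfect one-hot prediction scores 0, and a fully confident wrong prediction scores 2 under this definition.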
Refactor `evaluater.evaluate` (EleutherAI#1441)
* change `all_gather` to `gather` * add TaskOutput utility class * Add FilterResults class and refactor task handling. * Rename `key` to `filter_key` for clarity * Add `print_writeout` function in utils.py * Add function to calculate limit size. * Add doc_iterator method to Task class * Refactor `doc_iterator` and cleanup in Task class * remove superfluous bits * change `all_gather` to `gather` * bugfix * bugfix * fix `gather` * Refactor `gather` loop * Refactor aggregate metrics calculation * Refactor and simplify aggregate metrics calculation Removed unused code * Simplify metrics calculation and remove unused code. * simplify the metrics calculation in `utils.py` and `evaluator.py`. * Fix group metric * change evaluate to hf_evaluate * change evaluate to hf_evaluate * add docs * add docs * nits * make isslice keyword only * nit * add todo * nit * nit * nit: swap order samples_metrics tuple * move instance sorting outside loop * nit * nit * Add __repr__ for ConfigurableTask * nit * nit * Revert "nit" This reverts commit dab8d99. * fix some logging * nit * fix `predict_only` bug. thanks to `@LSinev`! * change `print_tasks` to `prepare_print_tasks` * nits * move eval utils * move eval utils * nit * add comment * added tqdm descriptions * Update lm_eval/evaluator_utils.py Co-authored-by: Hailey Schoelkopf <[email protected]> * fix mgsm bug * nit * fix `build_all_requests` * pre-commit * add ceil to limit --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 7fe8dcb
Commit 77ffeef
Commit 6093c0c
Fix AttributeError in huggingface.py When 'model_type' is Missing (EleutherAI#1489)
* model_type attribute error Getting attribute error when using a model without a 'model_type' * fix w/ and w/out the 'model_type' specification * use getattr(), also fix other config.model_type reference * Update huggingface.py --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 814f36e
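The "use getattr()" step in the commit above follows a common pattern for optional attributes. A minimal sketch, assuming a config object that may or may not define `model_type`; the `SimpleNamespace` stand-ins and the `get_model_type` helper are hypothetical and are not code from `huggingface.py`:

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_model_type(config: Any) -> Optional[str]:
    """Return config.model_type, or None when the attribute is absent.

    Direct access (config.model_type) would raise AttributeError for
    configs that do not define the attribute.
    """
    return getattr(config, "model_type", None)


if __name__ == "__main__":
    with_type = SimpleNamespace(model_type="gpt2")  # stand-in config
    without_type = SimpleNamespace()                # no model_type at all
    print(get_model_type(with_type), get_model_type(without_type))
```

Callers then branch on `None` instead of wrapping every access in `try`/`except AttributeError`.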
Commit c463825
Commit 47d0899
Commit 0413dee
Improve data-parallel request partitioning for VLLM (EleutherAI#1477)
* add undistribute + use more_itertools * remove divide() util fn * add more_itertools as dependency
Commit d579c8b
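The commit above only states that an `undistribute` helper was added alongside `more_itertools`. As a hedged sketch of the idea, assuming requests are split round-robin across data-parallel workers (the behaviour of `more_itertools.distribute`) and results must be put back in the original order, the two steps could look like this. Both functions below are pure-Python stand-ins written for illustration, not the harness's code.

```python
from itertools import zip_longest
from typing import List, Sequence, TypeVar

T = TypeVar("T")
_MISSING = object()  # sentinel for padding shorter partitions


def distribute(n: int, items: Sequence[T]) -> List[List[T]]:
    """Split items round-robin into n partitions (hypothetical stand-in)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [list(items[i::n]) for i in range(n)]


def undistribute(parts: Sequence[Sequence[T]]) -> List[T]:
    """Interleave round-robin partitions back into the original order."""
    return [
        item
        for group in zip_longest(*parts, fillvalue=_MISSING)
        for item in group
        if item is not _MISSING
    ]


if __name__ == "__main__":
    requests = list(range(7))
    parts = distribute(3, requests)
    print(parts)
    print(undistribute(parts))
```

Round-robin splitting keeps partition sizes within one item of each other, which is why the inverse step has to tolerate partitions of unequal length.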
modify `WandbLogger` to accept arbitrary kwargs (EleutherAI#1491)
* make `WandbLogger` init args optional * nit * nit * nit * move import warning to `WandbLogger` * nit * update docs * nit
Commit 8146103
Vllm update DP+TP (EleutherAI#1508)
* use `@ray.remote` with distributed vLLM * update versions * bugfix * unpin vllm * fix pre-commit * added version assertion error * Revert "added version assertion error" This reverts commit 8041e9b. * added version assertion for DP * expand DP note * add warning * nit * pin vllm * fix typos
Commit 30141ce
Setting trust_remote_code to True for HuggingFace datasets compatibility (EleutherAI#1487)
* setting trust_remote_code * dataset list no notebooks * respect trust remote code * Address changes, move cli options and change datasets * fix task for tests * headqa * remove kobest * pin datasets and address comments * clean up space
Commit 706e10b
Commit 40b0917
French Bench (EleutherAI#1500)
* add french-bench * rename arc easy * linting * update datasets for no remote code exec * fix string delimiter * add info to readmr * trim trailing whitespace * add detailed groups * add info to readme * remove orangesum title from fbench main * Force PPL tasks to be 0-shot --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 4f19431
Commit 512de72
Fix minor edge cases (EleutherAI#951 EleutherAI#1503) (EleutherAI#1520)
* Fix padding * Fix elif in model loading * format
Commit b915040
Commit 2c652b5
Add a new task GPQA (the part CoT and generative) (EleutherAI#1482)
* Add new tasks of GPQA * Add README * Remove unused functions * Remove unused functions * Linters * Add flexible match * update * Remove deplicate function * Linter * update * Update lm_eval/filters/extraction.py Co-authored-by: Hailey Schoelkopf <[email protected]> * register multi_choice_regex * Update * run precommit --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>
Commit 175bc29
Add EQ-Bench as per EleutherAI#1459 (EleutherAI#1511)
* Start adding eq-bench * Start adding to yaml and utils * Get metric working * Add README * Handle cases where answer is not parseable * Deal with unparseable answers and add percent_parseable metric * Update README
Commit 5c8105c
Add WMDP Multiple-choice (EleutherAI#1534)
* init wmdp yaml file * Add WMDP Multiple-choice * fix linter issues * Delete lm_eval/tasks/wmdp/_wmdp.yaml --------- Co-authored-by: Lintang Sutawika <[email protected]>
Commit 44f9421
Commit c9f39fa
Commit 7aedaf9
update printed num-fewshot ; prevent fewshots from erroneously being used by cot which hardcodes fewshot prompt (EleutherAI#1502)
Commit 8c1c093
Cleanup and fixes (Task, Instance, and a little bit of *evaluate) (EleutherAI#1533)
* Remove unused `decontamination_ngrams_path` and all mentions (still no alternative path provided) * Fix improper import of LM and usage of evaluator in one of scripts * update type hints in instance and task api * raising errors in task.py instead of asserts * Fix warnings from ruff * raising errors in __main__.py instead of asserts * raising errors in tasks/__init__.py instead of asserts * raising errors in evaluator.py instead of asserts * evaluator: update type hints and remove unused variables in code * Update lm_eval/__main__.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/__main__.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/api/task.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/api/task.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/api/task.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/evaluator.py Co-authored-by: Hailey Schoelkopf <[email protected]> * pre-commit induced fixes --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit f238713
Update installation commands in openai_completions.py and contributing document and, update wandb_args description (EleutherAI#1536)
* Update openai completions and docs/CONTRIBUTING.md * Update wandb args description * Update docs/interface.md --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 3b419af
Add compatibility for vLLM's new Logprob object (EleutherAI#1549)
* Add compatibility for vLLM's new Logprob object * Fix * Update lm_eval/models/vllm_causallms.py * fix format? * trailing whitespace --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 6997af7
Fix incorrect `max_gen_toks` generation kwarg default in code2_text. (EleutherAI#1551)
* update gen_kwargs in code2-text-go.yaml * update gen_kwargs in rest code2-text
Commit 74d9a95
Support jinja templating for task descriptions (EleutherAI#1553)
* Support jinja templating for "description" * Update task_guide.md * Update lm_eval/api/task.py * fix format? * whitespace errors * fix whitespace * fix bad variable reference --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>
Commit 8d5e277
Commit 7ffd0d1
Commit 58cda52
add Arabic EXAMS benchmark (EleutherAI#1498)
* add Arabic EXAMS benchmark * fixed the linter issue, and add more information on the readme * Update README.md --------- Co-authored-by: Lintang Sutawika <[email protected]>
Commit 1858b54
* add agieval * fix typo * add cloze / math exactmatch agieval tasks, rename * update exact-match agieval tasks, allow for multiple-correct answers * add more detail to readme * don't parse_math_answer twice --------- Co-authored-by: Alex Bäuerle <[email protected]>
Commit 5298fc0
Commit 94f7159
add manual tqdm disabling management (EleutherAI#1569)
* add manual tqdm disabling management * add typing to all new args * apply precommit changes --------- Co-authored-by: haileyschoelkopf <[email protected]>
Commit ee0e166
Fix README section on vllm integration (EleutherAI#1579)
* Link to vllm integration * add pip install .[vllm] cmd
Commit 28e568d
Commit df6ee7a
Proposed approach for testing CLI arg parsing (EleutherAI#1566)
* New tests for CLI args * fix spacing * change tests for parsing * add tests, fix parser * remove defaults for store_true
Commit c6edcdb
Patch for Seq2Seq Model predictions (EleutherAI#1584)
* Differentiate _encode_pair setting for decoder and enc-dec models * tok_decode to not skip special token so that eos doen't become empty string * Update model.py * Update model.py * Update huggingface.py * Update lm_eval/models/huggingface.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update model.py --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 0dc609d
Commit baa917f
Cleanup for v0.4.2 release (EleutherAI#1573)
* Update interface.md * fix: make caching reqs always work with accelerate launch * remove stale task migration checklist * remove deprecation warnings * make informative TypeErrors for get_task_dict * bump version metadata * fix num_fewshot printing bug * add fewshot value to cache key
Commit 53c11f7
Fix eval_logger import for mmlu/_generate_configs.py (EleutherAI#1593)
* Fix eval_logger import for mmlu/_generate_configs.py * linter --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 6e52d16
use BOS token in loglikelihood (EleutherAI#1588)
* use BOS token in loglikelihood * improve comments * add model arg * log prefix token id * log prefix token id * Update lm_eval/api/model.py Co-authored-by: Hailey Schoelkopf <[email protected]> * change name to prefix_token_id --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
Commit 8cd155f
Revert "Patch for Seq2Seq Model predictions (EleutherAI#1584)" (EleutherAI#1601)
This reverts commit b7923a8.
Commit 1ea55eb
Commit 5a304c9
Commit 39a0b3a
Fixes to Loglikelihood prefix token / VLLM (EleutherAI#1611)
* make vllm use prefix_token_id ; have prefix_token_id be optional method to define * custom_prefix_token_id wasn't set if not passed
Commit a513931
Add ACLUE task (EleutherAI#1614)
* Add task ACLUE * fix minor bug * fix code style * fix code style
Commit 7d8eeba
Commit 45ed815
add logging of model args (EleutherAI#1619)
* add logging of model args * nit * Add warnings. * nit * add warning * nit
Commit 9064d35
Commit 7c7e4fd
peft Version Assertion (EleutherAI#1635)
* peft Version Assertion * fix the linter issue
Commit f970123
* fix on --task list * add fixes to tokeniation * differentiate encoding for seq2seq and decoder * return token setting * format for pre-commit * Seq2seq fix, pt2 (EleutherAI#1630) * getting model class only when defined * encode_pair handles None, add_special_tokens turned into dict with default value --------- Co-authored-by: achervyakov <[email protected]>
Commit 048c0d3
Integration of NeMo models into LM Evaluation Harness library (EleutherAI#1598)
* Integration of NeMo models into LM Evaluation Harness library * rename nemo model as nemo_lm * move nemo section in readme after hf section * use self.eot_token_id in get_until() * improve progress bar showing loglikelihood requests * data replication or tensor/pipeline replication working fine within one node * run pre-commit on modified files * check whether dependencies are installed * clarify usage of torchrun in README
Commit 9f50796
Commit f0b04a0
Commit fa2acde
Add Latxa paper evaluation tasks for Basque (EleutherAI#1654)
* add basqueglue * add eus_exams * add eus_proficiency * add eus_reading * add eus_trivia * run pre-commit
Commit b948d14
Fix CLI --batch_size arg for openai-completions/local-completions (EleutherAI#1656)
The OpenAI interface supports batch size as an argument to the completions API, but does not seem to support specification of this on the CLI i.e. `lm_eval --model openai-completions --batch_size 16 ...` because of a simple lack of str->int conversion. This is confirmed by my usage and stacktrace from running `OPENAI_API_KEY=dummy lm_eval --model local-completions --tasks gsm8k --batch_size 16 --model_args model=nm-testing/zephyr-beta-7b-gptq-g128,tokenizer_backend=huggingface,base_url=http://localhost:8000/v1`:
```
Traceback (most recent call last):
  File "/home/michael/venv/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/home/michael/code/lm-evaluation-harness/lm_eval/__main__.py", line 341, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 251, in simple_evaluate
    results = evaluate(
  File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 390, in evaluate
    resps = getattr(lm, reqtype)(cloned_reqs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 263, in generate_until
    list(sameuntil_chunks(re_ord.get_reordered(), self.batch_size)),
  File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 251, in sameuntil_chunks
    if len(ret) >= size or x[1] != lastuntil:
TypeError: '>=' not supported between instances of 'int' and 'str'
```
Commit da93b8a
Commit cf10ee7
TMMLU+ implementation (EleutherAI#1394)
* implementation of TMMLU+
* implemented: TMMLU+
  **TMMLU+: large-scale Traditional Chinese Massive Multitask Language Understanding**
  - 4 categories
    - STEM
    - Social Science
    - Humanities
    - Other
  The TMMLU+ dataset, encompassing over 67 subjects and 20160 tasks, is six times larger and more balanced than its predecessor, TMMLU, and includes benchmark results from both closed-source and 20 open-weight Chinese large language models with 1.8B to 72B parameters. However, Traditional Chinese variants continue to underperform compared to major Simplified Chinese models.
  ```markdown
  Total number of tasks in the 'test' sets: 20160
  Total number of tasks in the 'validation' sets: 2247
  Total number of tasks in the 'train' sets: 335
  ```
* Remove print from __init__.py
  It was my mistake to forget to remove the debug print from the code.
* update: move TMMLU+ config generation program into default
* fix: we should use training set as few shots example
* update: README for TMMLU+
* update: a small changes of TMMLU+ README file
* pre-commit run thought
* Add README for TMMLU+ dataset
* run precommit
* trigger precommit again
* trigger precommit again
* isort is fussy
* isort is fussy
* format, again
* oops
* oops
---------
Co-authored-by: lintang
Co-authored-by: haileyschoelkopf
Commit 76a7c23
Anthropic Chat API (EleutherAI#1594)
* claude3
* supply for anthropic claude3
* supply for anthropic claude3
* anthropic config changes
* add callback options on anthropic
* line passed
* claude3 tiny change
* help anthropic installation
* mention sysprompt / being careful with format in readme
---------
Co-authored-by: haileyschoelkopf
Commit 6786e82
correction bug EleutherAI#1664 (EleutherAI#1670)
* correction bug EleutherAI#1664
* add any invalid characters for Windows filenames and Unix-like systems
  see: https://gist.github.com/doctaphred/d01d05291546186941e1b7ddc02034d3?permalink_comment_id=3958715
* Update lm_eval/__main__.py
* Update scripts/zeno_visualize.py
* fix format
---------
Co-authored-by: Hailey Schoelkopf
Co-authored-by: haileyschoelkopf
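As a rough illustration of the kind of sanitization this commit describes, the sketch below replaces characters that are not allowed in Windows filenames (plus the `/` separator used on Unix-like systems and ASCII control characters). The character set, the helper name `sanitize_filename`, and the `_` replacement are assumptions made for this example; they are not taken from the actual change in `lm_eval/__main__.py`.

```python
import re

# Assumed set for this sketch: characters reserved on Windows, the "/" path
# separator used on Unix-like systems, and ASCII control characters.
_INVALID_FILENAME_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]+')


def sanitize_filename(name: str, replacement: str = "_") -> str:
    """Replace each run of invalid characters with `replacement`.

    Hypothetical helper for illustration; not part of lm_eval.
    """
    return _INVALID_FILENAME_CHARS.sub(replacement, name)


if __name__ == "__main__":
    print(sanitize_filename("pretrained=org/model:rev"))
```

Collapsing a run of invalid characters into a single replacement keeps the resulting names short; replacing each character individually would be an equally valid design.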
Commit 98693bf
Commit c374e6f
Add delta weights model loading (EleutherAI#1712)
* added delta weights
* removed debug
* readme update
* better error handling
* autogptq warn
* warn update
* peft and delta error, explicitly deleting _model_delta
* linter fix
Commit 8518800
Add `neuralmagic` models for `sparseml` and `deepsparse` (EleutherAI#1674)
* Add neuralmagic models for SparseML and DeepSparse
* Update to latest and add test
* Format
* Fix list to List
* Format
* Add deepsparse/sparseml to automated testing
* Update pyproject.toml
* Update pyproject.toml
* Update README
* Fixes for dtype and device
* Format
* Fix test
* Apply suggestions from code review
  Co-authored-by: Hailey Schoelkopf
* Address review comments!
---------
Co-authored-by: Hailey Schoelkopf
Commit 8103925
Commit a56bf85
Commit a09b018
Commit 6687de7
Add XNLIeu: a dataset for cross-lingual NLI in Basque (EleutherAI#1694)
* add xnli_eu tasks
* update tasks readme
* update readme
Commit fe92e5a
Commit d69d54d
Support individual scrolls datasets (EleutherAI#1740)
* Support individual scrolls datasets
* Add qmsum context
* Fix formatting
Commit f38e8a1
Add filter registry decorator (EleutherAI#1750)
* Add register_filter decorator
* Add register_filter docs
Commit 7cd59dd
Commit dabce43
Pile 10k new task (EleutherAI#1758)
* Add Pile-10k readme
* Add Pile-10k task configuration file
Commit f4281a4
Fix m_arc choices (EleutherAI#1760)
* Update utils.py
  This is a 4-choice task, option_e is null for all but 3 samples
* Fix options
  Adaptive choices
* add option e
* bump multilingual arc version
---------
Co-authored-by: Hailey Schoelkopf
Commit c51925d
upload new tasks (EleutherAI#1728)
* upload new tasks
* add readmes
* run linters
---------
Co-authored-by: haileyschoelkopf
Commit e2bc623
vllm lora support (EleutherAI#1756)
* vllm lora support
* remove print
* version check, rename lora kwarg
Commit df05e78
Add option to set OpenVINO config (EleutherAI#1730)
* Add option to set OpenVINO config
* Use utils.eval_logger for logging
Commit af14500
evaluation tracker implementation (EleutherAI#1766)
* evaluation tracker implementation
* OVModelForCausalLM test fix
* typo fix
* moved methods args
* multiple args in one flag
* loggers moved to dedicated dir
* improved filename sanitization
Commit ba53c71
Commit da3067f
Commit ffc6594
remove echo parameter in OpenAI completions API (EleutherAI#1779)
* remove echo parameter in OpenAI completions API
* remove context length parameter doc string
Commit d261c2f
Fix README: change `----hf_hub_log_args` to `--hf_hub_log_args` (EleutherAI#1776)
fix `----hf_hub_log_args` to `--hf_hub_log_args`
Commit 29812e7
Commit 45c5f41
Provide ability for custom sampler for ConfigurableTask (EleutherAI#1616)
* Added fewshot sampling seeds to evaluator.simple_evaluate signature
  Way to control seed of fewshot sampling may help with EleutherAI#1591
* Added ability for custom sampler for ConfigurableTask
  May be set in config like
  ```
  fewshot_config:
    sampler: !function utils.MyFewshotSampler
  ```
* explicitly set fewshot random generator seed for HFLM generate_until_task test
* add backward compatibility for three args seed setup
* save seeds info to logs/reports
Commit 615b2dd
Commit 59c553a
Commit 4e63a32
Commit ea773e4
Re-add Hendrycks MATH (no sympy checking, no Minerva hardcoded prompt) variant (EleutherAI#1793)
* add Hendrycks MATH (no sympy checking) variant
* add readmes for MATH tasks
Commit aa4e118
Logging Updates (Alphabetize table printouts, fix eval tracker bug) (EleutherAI#1774) (EleutherAI#1791)
* fix auto-batch size bug for seq2seq models
* alphabetize task + group tables ; fix eval tracker bug
* fix eval tracker bug
Commit b3e8661
Initial integration of the Unitxt to LM eval harness (EleutherAI#1615)
* Initial support for Unitxt datasets in LM Eval Harness
  See https://github.com/IBM/unitxt
  The script 'generate_yamls.py' creates LM Eval Harness yaml files corresponding to Unitxt datasets specified in the 'unitxt_datasets' file. The glue code required to register Unitxt metrics is in 'unitxt_wrapper.py'.
* Added dataset loading check to generate_yaml
  Improved error messages.
* Speed up generate_yaml
  Added printouts and improved error message
* Added output printout
* Simplified integration of unitxt datasets
  Store all the common yaml configuration in a yaml include shared by all datasets of the same task.
* Post code review comments - part 1
  1. Made sure include files don't end with 'yaml' so they won't be marked as tasks
  2. Added more datasets and tasks (NER, GEC)
  3. Added README
* Post code review comments - part 2
  1. Added unitxt install option in pyproject.toml: pip install 'lm_eval[unitxt]'
  2. Added a check that unitxt is installed and print a clear error message if not
* Committed missing pyproject change
* Added documentation on adding datasets
* More doc changes
* add unitxt extra to readme
* run precommit
---------
Co-authored-by: haileyschoelkopf
Commit c864ea2
add task for mmlu evaluation in arc multiple choice format (EleutherAI#1745)
* add mmlu arc style evaluation
* rename arc_style to continuation
---------
Co-authored-by: Jonathan Burdge
Commit bba2bf6
Update flag `--hf_hub_log_args` in interface documentation (EleutherAI#1806)
* update interface documentation with flag --hf_hub_logs_arg
* update interface documentation with flag --hf_hub_logs_arg 2
Commit a137c3e
* add copal
* change name to copal id for clarity and the task name
* remove `copal_id...` to yaml to make it work
* checkmark on README
* change group name to `copal_id`
Commit cd0b2ba
Adding tinyBenchmarks datasets (EleutherAI#1545)
* Add tinyBenchmarks
* Add acknowledgements
* Add ordering of outputs for data-parallel
* Run pre-commit
* Add few_shot specifications
* Add tinyBenchmarks post-processing
* add conditional import ; fix task names
---------
Co-authored-by: haileyschoelkopf
Commit 6bcb05e
Commit e888fb6
Commit ab46906
Fix: support PEFT/LoRA with added tokens (EleutherAI#1828)
* resize model embeddings
* resize only
* tokenizer help
* load tokenizer before model
* add comment and run precommit lint
* Add log message
  Co-authored-by: Hailey Schoelkopf
---------
Co-authored-by: Hailey Schoelkopf
Commit 5759d86
Commit b542fd9
Commit d02eb34
Commit 21f36dd
Commit 5ca629a
Commit 2b93289
Fix `batch_size=auto` for HF Seq2Seq models (EleutherAI#1765) (EleutherAI#1790)
* fix auto-batch size bug for seq2seq models
* run linter
Commit e6223c0
Fix Brier Score (EleutherAI#1847)
`gold_one_hot` needs to follow the dimension of the predictions so that it still works when `--limit` is used and the indices present in gold do not cover all gold indices.
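A minimal pure-Python sketch of the idea, under the assumption that each prediction is a list of per-class probabilities and each gold label is a class index. The function names are hypothetical and this is not the harness's own implementation; the point is only that the one-hot width comes from the predictions rather than from the largest label seen.

```python
from typing import List, Sequence


def one_hot(index: int, num_classes: int) -> List[float]:
    """Return a one-hot vector of length `num_classes` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def brier_score(gold: Sequence[int], predictions: Sequence[Sequence[float]]) -> float:
    """Mean squared distance between predicted probabilities and one-hot gold.

    The number of classes is read from the predictions, so the score is still
    well defined when a subset of the data never contains the highest class.
    """
    num_classes = len(predictions[0])  # not max(gold) + 1
    total = 0.0
    for label, probs in zip(gold, predictions):
        target = one_hot(label, num_classes)
        total += sum((p - t) ** 2 for p, t in zip(probs, target))
    return total / len(gold)


if __name__ == "__main__":
    # Three classes, but only class 0 appears in this limited sample.
    print(brier_score([0], [[0.5, 0.5, 0.0]]))
```

If the width were derived from `max(gold) + 1` instead, the sample above would produce a one-hot vector of length 1, which no longer lines up with the three-way prediction.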
Commit 8329adb
Fix for bootstrap_iters = 0 case (EleutherAI#1715) (EleutherAI#1789)
* add handling for bootstrap_iters=0 case
* add more detail to docstring
* run precommit
Commit e3ec75f
add mmlu tasks from pile-t5 (EleutherAI#1710)
* add mmlu tasks from pile-t5
* Update _mmlu_flan_cot_fewshot_template_yaml
* Update _mmlu_flan_cot_zeroshot_template_yaml
* Update _mmlu_flan_generative_template_yaml
* Update _mmlu_flan_loglikelihood_template_yaml
* Update _default_template_yaml
---------
Co-authored-by: Hailey Schoelkopf
Commit ee44bf2
Bigbench fix (EleutherAI#1686)
* edit process multiple-choice
* split template yaml
* remove
* modified multiple_choice tasks
* update
* Update multiple_choice_template_b_yaml
* Update multiple_choice_template_a_yaml
---------
Co-authored-by: Hailey Schoelkopf
Commit 83f9d66
Rename `lm_eval.logging -> lm_eval.loggers` (EleutherAI#1858)
* rename lm_eval.logging module
* fix evaluation tracker args
Commit fe6fb1a
Updated vllm imports in vllm_causallms.py (EleutherAI#1890)
* Reorder vllm imports in vllm_causallms.py
* Update vllm_causallms.py
Commit b69aecc
[HFLM]Add support for Ascend NPU (EleutherAI#1886)
* [HFLM]Add support for Ascend NPU
  Co-authored-by: jiaqiw09
  Co-authored-by: zhabuye
* bump accelerate dependency version to 0.26.0 for NPU compat.
---------
Co-authored-by: jiaqiw09
Co-authored-by: zhabuye
Co-authored-by: Hailey Schoelkopf
Commit d177975
`higher_is_better` tickers in output table (EleutherAI#1893)
* Higher is better tickers in output table
* add extra check for `higher_is_better` not being None already
* Update lm_eval/evaluator.py
* fixup format I messed up
* add comment (and retrigger tests)
---------
Co-authored-by: Hailey Schoelkopf
Co-authored-by: haileyschoelkopf
Commit bbc1216
Add dataset card when pushing to HF hub (EleutherAI#1898)
* dataset card initial
* few fixes
* adds groups for math, mmlu, gpqa
* added summary args
* moved sanitize_list to utils
* readme update
* recreate metadata moved
* multiple model support
* results latest split fix
* readme update and small refactor
* fix grouping
* add comments
* added pathlib
* corrected pathlib approach
* check whether to create a metadata card
* convert posix paths to str
* default hf org from token
* hf token value error
* Add logs after successful upload
* logging updates
* dataset card example in the readme
---------
Co-authored-by: Nathan Habib
Co-authored-by: Alina Lozovskaia
Commit ebc3807
Making hardcoded few shots compatible with the chat template mechanism (EleutherAI#1895)
* init test 1
* fix
* this format seems to be working - need to update all other tasks with the new format
* bbh with few shot format
* fix fewshot bbh
* add mmlu flan cot
* samples of cot
* kmmlu
* fix gsm8k
* update keys for mmlu
* minerva math
* bbh
* fix
* fix samples
* small fixes to templates
* last prompt format change
* fixing prompt
* fixed minerva math format
* rm accidental committed file
* added doc for few shot samples
* Update lm_eval/loggers/evaluation_tracker.py
* Update lm_eval/loggers/evaluation_tracker.py
* Update docs/new_task_guide.md
  Co-authored-by: Hailey Schoelkopf
* added check in sampler per code review
* added the system from a function, plus an example in minerva math
* style
* Apply suggestions from code review
  Co-authored-by: Hailey Schoelkopf
* fix unit tests 1
* forcing use of test split
---------
Co-authored-by: Hailey Schoelkopf
Commit 105b516
Commit acc4029
Commit e53f271
Complete task list from pr 1727 (EleutherAI#1901)
* added tasks and task family descriptors
* continue work on task list w/ links; slightly reorganize README
* Apply suggestions from code review
* Rename file so that it'll preview in Github when viewing lm_eval/tasks folder
* Update new_task_guide.md
* Update README.md
* run linter
* Add language column to task table; Add missing tasks to task table; fix nq_open and storycloze READMEs
* fix typo
* Apply suggestions from code review
  Co-authored-by: Hailey Schoelkopf
* apply format
---------
Co-authored-by: Harish Vadaparty
Co-authored-by: Hailey Schoelkopf
Co-authored-by: haileyschoelkopf
Commit 85550b3
Add chat template (EleutherAI#1873)
* initial chat template
* tokenizer attribute check
* variable rename
* interface update
* system instruction
* system inst default update
* fewshot as multiturn
* typing update
* indent update
* added comments
* Adding a fewshot in a more readable way
* linting
* Moved apply chat template to LM
* multiturn alternation fix
* cache key update
* apply chat template method fix
* add system prompt hash to cache_key
* tokenizer name property for cache_key
* property name fix
* linting backward compatibility fix
* docs and errors update
* add documentation on adding chat template compatibility to model_guide
* fewshot as multiturn check fix
* saving system inst and chat template in results
* eval tracker update
* docs update
* Apply suggestions from code review
  Co-authored-by: Hailey Schoelkopf
---------
Co-authored-by: haileyschoelkopf
Co-authored-by: Clémentine Fourrier
Co-authored-by: Hailey Schoelkopf
Commit 0f995d9
Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data (EleutherAI#1867)
* glianorex tasks
* Create README.md
* Update README.md
* Update README.md
* fix formatting
* fix internal formatting
Commit aceb0ce
Modify pre-commit hook to check merge conflicts accidentally committed not at current merge commit (EleutherAI#1927)
Commit 55c36de
Commit 2d1ffb9
Add new Lambada translations (EleutherAI#1897)
* added tasks and task family descriptors
* configs for the new lambada translations
* continue work on task list w/ links; slightly reorganize README
* Apply suggestions from code review
* Rename file so that it'll preview in Github when viewing lm_eval/tasks folder
* Update new_task_guide.md
* Update README.md
* run linter
* Add language column to task table; Add missing tasks to task table; fix nq_open and storycloze READMEs
* fix typo
* update `lm_eval/tasks/README.md` with task description
---------
Co-authored-by: Harish Vadaparty
Co-authored-by: anthony
Co-authored-by: Hailey Schoelkopf
Co-authored-by: haileyschoelkopf
Commit c63d56a
Implement NoticIA (EleutherAI#1912)
* Noticia
* test
* Final testes implementation
* Fixes
* Fix linters
Commit 17fcd25
Commit 58264ac
Commit 66e2c9d
Update basque-glue (EleutherAI#1913)
* Update README.md
* Update bec.yaml
* Update bhtc.yaml
* Update coref.yaml
* Update qnli.yaml
* Update vaxx.yaml
* Update wic.yaml
Commit 1865671
Test output table layout consistency (EleutherAI#1916)
* sort metrics in output table
* update docstring in `consolidate_results`
* add tests for verifying consistency of table output
* update tests to account for floating point inconsistencies
* updated tests based on `pythia-14m`
Commit eaf6696
Commit a0c1aeb
Commit 4320c18
Commit ff41506
Add task definitions: 8tags, dyk, ppc, psc, belebele PL (regex), polemo2 (multiple_choice)
Commit 15950dd
Commit a107ca9
Commit 6b8e7b3
Commit 8568c6e
Commit ca605fd
Commit 18e618e
Commit 76a4f36
Commit 8f0d25c
Commit 4972634
Commit 6c4b0a1
Commit d880314
Commit f876552
Commit 02dd644
Commit 637afd1
Commit 53039c2
Commit 6d5e657
Commit 38e954a
Commit 88e0034
Commit 458cdc7
Commit fea7b68
Commit 2693979
Commit 776a4d3
Commit 53f1dfc
Commit 87b2160
Commit 8998552
Commit b6c4ac3
Commit f9ca054
Commit 29959a3
Commit aaaac9d
Commit db195a2
Commit d96cd84
Commit 2df424f
Commit badffa9
Commit c3a1bec
Commit 1ca2260
Commit ee8f8be
Commit 5d61f54
Commit 10a79a2
Commit 8819b64
Commit 45f6010
Commit 8974fc2
Commits on Aug 13, 2024
Commit 0bea423
Commits on Aug 22, 2024
Commit 21d0ea9