Polish4 temp #11

Restored default case for cbd_regex. Fixed typo in klej_ner_mc.

* add fix for deciding if stderr is N/A or not * process N/A

* Add `local-completions` support using OpenAI interface * Refactor oa_completion * Address tokenizer comments and change request chunks to batch size * Add warning message for tiktoken backend * fix formatting * fix whitespace * Update README.md --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* Update arc_easy.yaml * Update flan_cot.yaml * update HF dataset path * Update freeform.yaml * Update flan_cot.yaml --------- Co-authored-by: Lintang Sutawika <[email protected]>

…eutherAI#1331) * don't use get_task_dict() as a helper, it will download the dataset! * pre-commit * Update README.md --------- Co-authored-by: lintangsutawika <[email protected]>

* manage default (greedy) gen_kwargs in vllm better * mirror HF `do_sample` * just need to set temp=0 for greedy

…ogprobs=1 (EleutherAI#1345)

* get `doc` from instance * accelerate bugfix: get ground doc from instance * convert filter to `process_result` * get docs from instances in `FilterEnsemble` * rename * nit * better looping * fix typehint

) * Update README.md * [!Tip]

* added intel optimum * added intel optimum in readme * modified intel optimum * modified intel optimum * modified intel optimum * modified install optimum * modified path of IR file * added openvino_device * added openvino_device2 * changed optimum-causal to openvino-causal * Update README.md * Update README.md * remove `lm_eval.base` import * update openvino-causal -> openvino ; pass device through super().__init__() * Update README.md * Add optimum to tests dependencies * apply pre-commit * fix so tests pass --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

…therAI#1363) * raise Exception, not a string Additional info https://peps.python.org/pep-0352/#exception-hierarchy-changes https://docs.python.org/3.8/tutorial/errors.html#raising-exceptions * Apply PEP8 recommendation to prefer isinstance "Object type comparisons should always use isinstance() instead of comparing types directly" https://peps.python.org/pep-0008/ * Remove dangerous default mutable values in arguments https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/dangerous-default-value.html * Format logging messages with fstring (not with format) Additional info https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/logging-format-interpolation.html There are also discussions about the speed of formatting while logging or some unintended code executions pylint-dev/pylint#2395 https://stackoverflow.com/a/54368109 but at least one format (fstring one) will be used throughout the project * Specify utf-8 encoding for `open` explicitly If not specified, it may be supposed differently in different environments, OSes, and Python versions. See https://peps.python.org/pep-0597/ https://docs.python.org/3.11/library/locale.html#locale.getencoding https://docs.python.org/3.10/library/os.html#utf8-mode https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/unspecified-encoding.html Helps also if some code from English language tasks is taken as inspiration for tasks in non-English languages. * Use inline-ignoring comments to pass pre-commit instead of identity process https://flake8.pycqa.org/en/3.0.1/user/ignoring-errors.html#in-line-ignoring-errors https://www.flake8rules.com/rules/F841.html flake8 comments are supported by ruff: https://docs.astral.sh/ruff/linter/#error-suppression
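The guideline items above can be illustrated with a minimal sketch. The function names (`append_item`, `describe`, `roundtrip_utf8`) are hypothetical and are not taken from the harness; they only demonstrate the patterns the commit messages refer to.

```python
import logging
import tempfile
from pathlib import Path
from typing import List, Optional

logger = logging.getLogger(__name__)


def append_item(item: str, items: Optional[List[str]] = None) -> List[str]:
    """Use None as the default instead of a mutable `[]` shared across calls."""
    if items is None:
        items = []
    items.append(item)
    # f-string formatting, matching the single style chosen for the project.
    logger.debug(f"appended {item!r}, list now has {len(items)} item(s)")
    return items


def describe(value: object) -> str:
    """Prefer isinstance() over comparing type(value) directly."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, (int, float)):
        return "number"
    # Raise an exception instance, never a bare string.
    raise TypeError(f"unsupported type: {type(value).__name__}")


def roundtrip_utf8(text: str) -> str:
    """Pass encoding explicitly so the result does not depend on the locale."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.txt"
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
```

With the `None` default, two consecutive calls to `append_item` without a list argument return independent lists, which is the behavior the mutable-default warning is meant to protect.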

* delay filter init; remove `*args` * bugfix * optimize * type hint

* don't override do_sample if no value for it is passed * Update gen_kwargs override condition * Update huggingface.py * Update huggingface.py * run linters * silence an erroneous warning

* publish to pypi * lint * Update publish.yml * minor

* make deps not point to github urls * formatting * try making PyPI only run on tag pushes

* Add support for RWKV models with World tokenizer The RWKV line of models with the World tokenizer does not allow the padding token to be configured, and has its value preset to 0. This, however, fails all the "if set" checks and would cause the tokenizer to crash. A tokenizer class name check was added, in addition to a model type check, as there exist RWKV models which use the neox tokenizers * Update huggingface.py Genericized so that this supports any RWKVWorld tokenizer, and added a fall-back for if the HF implementation name changes. * Comply with formatting guidelines * fix format --------- Co-authored-by: Stella Biderman <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]>
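A minimal sketch of why a padding token id of 0 can fail an "if set" check. Both helper names are hypothetical, and whether the harness used exactly this truthiness pattern is an assumption; the sketch only shows the general pitfall.

```python
from typing import Optional


def has_pad_token_truthy(pad_token_id: Optional[int]) -> bool:
    # Pitfall: 0 is falsy in Python, so a valid id of 0 looks "unset".
    return bool(pad_token_id)


def has_pad_token(pad_token_id: Optional[int]) -> bool:
    # Safer: only None means "unset"; 0 is accepted as a real token id.
    return pad_token_id is not None
```

A truthiness check cannot distinguish "preset to 0" from "missing", which is one reason a tokenizer-specific branch (as the commit describes) may be needed when the surrounding checks cannot all be changed.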

* add bypass metric * fixed `bypass` metric. * add task attributes if predict_only * add `predict_only` checks * add docs * added `override_metric`, `override_config` to `Task` * nits * nit * changed --predict_only to generations; nits * nits * nits * change gen_kwargs warning * add note about `--predict_only` in README.md * added `predict_only` * move table to bottom * nit * change null aggregation to bypass (conflict) * bugfix; default `temp=0.0` * typo

* Update CITATION.bib * Create CONTRIBUTING.md * add disclaimer re: multi node * flesh out some sections more * Flesh out contributor guide * revert CITATION.bib * appease pre-commit --------- Co-authored-by: lintangsutawika <[email protected]>

* edge cases where variable might not be assigned. * type hint

* allow tasks to specify printed fewshot val * fix to belebele * update metadata field's documentation

* add trust_remote_code as default * task for testing recursive * changed source of ALL_TASKS * tasks should only accept TaskObjects * initialize_tasks returns list of tasks and groups * remove trust_remote_code for now * moved constructor process to inside load_yaml_config * more comprehensive way to index tasks and groups * pre-commit format * add exit after error * adjust how task objects are called * no need to use get_task_dict * load_task_or_group works but only for tasks * pre-commit format * half working for nested groups * changed variable names * allow groups and tasks to work * temp save * indexing and loading are part of a task_manager object * adapted initialize_tasks * iron out bugs * fixed typo * fixed typo * simplified code * further tidy up * remove lines for testing * removed test lines * removed unused code * remove unused import * fixed bug * removed comments * group in a list of group can accept parameter changes like `num_fewshot` * check if config is task update * add GroupConfig object * edit 
test yaml * remove args * testing returning to python task list * add weight_by_size config * describe weight_by_size in docs * fix weight by size potential error * can load individual custom python class task * moved import_function into the config loading file * remove print lines * add squadv2 yaml * temporary scroll implementation * revert back to use load_yaml_config but with modes * fix group being loaded with a None * reformat * can load unregistered tasks from a group * update scrolls * edit scrolls multiplechoice task * adjust class initialization * fix initialization * changed how to identify group and python tasks, fix logger * allow loading "include" that is nested in a group config * reworked flan benchmark * allow duplicate task in the same group to co-exist * process group_alias * removed group_alias * allow parameters set in group_config to apply to all tasks in tasklist * add function, but comment for now * reworked processing dict-base config * fixed how configs in group are processed * update to allow root group to have its alias used * remove unused classes * remove unused classes * revert some parts to original * forgot to change one variable * adapt the new process to use get_task_dict * fix for singular group call * fix variable names * add TaskManager into the evaluator * format * changed how dict tasks are loaded * add docs * Update docs/new_task_guide.md Co-authored-by: Hailey Schoelkopf <[email protected]> * Update evaluator.py * Update evaluator.py * remove groupconfig for now * changed _config to config * update interface.md to explain TaskManager * added property functions * adjusted logger * update write_out.py * updated tests * added documentation and some modifications * added docstring documentation * precommit format * updated task loading for tests * updates tests * changed arg order for load_yaml_config * update to handle scrolls and edit log message * remove unused lines * return a list of task classes and not a dict * Update 
__init__.py * Delete lm_eval/tasks/benchmarks/test.yaml * Update task.py * Update lm_eval/utils.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/utils.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update utils.py * re-added old functions with new log message * Update docs/new_task_guide.md Co-authored-by: Hailey Schoelkopf <[email protected]> * Update new_task_guide.md * added infor regarding `get_task_dict` and documentation * add get_config for Task * pre-commit formatting --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

Fixes EleutherAI#1383. If this is okay, it will need to be propagated to SCROLLS.

* initial commit * remove overwrite bs * adding neuronx dependencies * Update README.md * update neuronx

Add instructions for non-MacOS users on how to compile janitor_util.cpp so that janitor.py can use it.

…sk groupings (EleutherAI#1390) * update formula for stderr aggregation * hack: see what happens when using stderr_for_metric bootstrapping on a group * undo bootstrap_for_stderr test * factor out variance-aggregation formulas into api.metrics * fix failing tests * remove stray print * update comment * further detail in comment * add back initialize_tasks() call * fix format
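For readers unfamiliar with variance aggregation across subtasks, here is a sketch of the textbook pooled-variance approach. It is an assumption that the formula in this commit resembles it; the function name `pooled_stderr` is hypothetical and is not claimed to be the one added to `api.metrics`.

```python
import math
from typing import Sequence


def pooled_stderr(stderrs: Sequence[float], sizes: Sequence[int]) -> float:
    """Combine per-subtask standard errors of the mean into one group-level value.

    Each subtask i has n_i samples and stderr_i = s_i / sqrt(n_i), so its
    sample variance is s_i^2 = stderr_i^2 * n_i.
    """
    if len(stderrs) != len(sizes) or not sizes:
        raise ValueError("stderrs and sizes must be non-empty and equal in length")
    total = sum(sizes)
    dof = total - len(sizes)  # pooled degrees of freedom
    if dof <= 0:
        raise ValueError("need more samples than subtasks")
    pooled_var = sum((n - 1) * (se**2) * n for se, n in zip(stderrs, sizes)) / dof
    # Standard error of the mean over all pooled samples.
    return math.sqrt(pooled_var / total)
```

Note that pooling this way assumes the subtasks share a common variance and ignores differences between subtask means, so it should be read as an approximation.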

* add hf_transfer * update dependencies * Delete stale `[linting]` extra * Update README.md with extras table --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…ound` (EleutherAI#1405) Fixes EleutherAI#1323

* Fixes EleutherAI#1416 Sets `do_sample = False` if `temperature == 0.0` and `do_sample = None` * Update huggingface.py * Update huggingface.py making linter happy
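The rule stated above can be sketched as a small standalone function. The name `normalize_gen_kwargs` is hypothetical, and the actual change lives in `huggingface.py`, whose surrounding logic is not reproduced here.

```python
from typing import Any, Dict


def normalize_gen_kwargs(gen_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy where greedy decoding is made explicit for temperature 0.0."""
    kwargs = dict(gen_kwargs)  # avoid mutating the caller's dict
    if kwargs.get("temperature") == 0.0 and kwargs.get("do_sample") is None:
        kwargs["do_sample"] = False
    return kwargs
```

An explicit `do_sample=True` passed by the user is left untouched, since the condition only applies when no value was given.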

* Fix watchdog timeout * Pre-commit fix * Timedelta

* un-exclude `evaluate.py` from linting * readability * readability * add task name to build info message * fix link * nit * add functions for var and mean pooling * add functions for var and mean pooling * metadata compatibility with task * rename `override_config` to `set_config` and move to `Task` * add unit test * nit * nit * bugfix * nit * nit * nit * add docstrings * fix metadata-fewshot * revert metric refactor * nit * type checking * type hints * type hints * move `override_metric` to `Task` * change metadata * change name * pre-commit * rename * remove * remove * `override_metric` backwards compatible with `Task` * type hints * use generic * type hint

…utherAI#1358)

* Added seeds to `evaluator.simple_evaluate` signature * Added CLI argument * Updated to add arg.

…ightly (EleutherAI#1427) * fix weight_by_size condition * add tests, update stderr formula slightly * apply pre-commit
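As an illustration of what a `weight_by_size` option could mean for a group score, the sketch below computes either a size-weighted or an unweighted mean of subtask metrics. The function name and signature are hypothetical, and the exact condition fixed in this commit is not reproduced.

```python
from typing import Sequence


def aggregate_group_metric(
    metrics: Sequence[float], sizes: Sequence[int], weight_by_size: bool = True
) -> float:
    """Mean of subtask metrics, optionally weighted by subtask sample counts."""
    if len(metrics) != len(sizes) or not metrics:
        raise ValueError("metrics and sizes must be non-empty and equal in length")
    # Without weighting, every subtask counts once regardless of its size.
    weights = list(sizes) if weight_by_size else [1] * len(sizes)
    return sum(m * w for m, w in zip(metrics, weights)) / sum(weights)
```

With metrics `[0.5, 1.0]` and sizes `[100, 300]`, the weighted result leans toward the larger subtask, while the unweighted result treats both subtasks equally.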

…shot 0% -> 42%) (EleutherAI#1356) * update bbh, gsm8k, mmlu parsing logic and prompts * remove the formatting prompt (bbh) + minor update (mmlu) * update bbh, gsm8k, mmlu zeroshot, revert fewshots * update bbh, gsm8k, mmlu version, forward changes to gsm8k-cot * remove take_last, update to use docs parameters * add newline * ruff formatting * Update pyproject.toml * fix format --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* haerae_reimplementation * edited Readme and add few_shot settings * edited readme * newlines at end of each files * Modifying the README file * applied pre-commit

* add key lookup for same contexts * nit * appease pre-commit * nit * use `expand` (in-place view) rather than `repeat` * try mixed grouping * add docs. * nit * nit * nits * fix tests * Move greedy_tokens calculation out of cache loop * nit * nits * add test * nits * fix name conflict * fix name conflict * chunk tensor * move Collator * nits/docstring * fixup * fixup * group contexts only for decoders * pre-commit * fix `generate_until` test * fix `generate_until` test * Update lm_eval/models/huggingface.py Co-authored-by: Hailey Schoelkopf <[email protected]> * add docs * nit * add docs * add docs * add 'logits_cache' arg * bugfix --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* add new task GPQA_n_shot * add new task GPQA_zeroshot * correct GPQA_zeroshot filename * Add randomly shuffle choices * Correct missing parentheses * delete wrong tasks * Add README * Update lm_eval/tasks/gpqa/zeroshot/_gpqa_zeroshot_yaml * Update lm_eval/tasks/gpqa/n_shot/utils.py * Update lm_eval/tasks/gpqa/n_shot/utils.py * Update lm_eval/tasks/gpqa/README.md * placate linter * linter --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* update kmmlu default formatting * Update _default_kmmlu_yaml * Delete lm_eval/tasks/kmmlu/utils.py * new tasks implemented * add direct tasks * update direct evaluate * update direct eval * add cot sample * update cot * add cot * Update _cot_kmmlu_yaml * add kmmlu90 * Update and rename _cot_kmmlu.yaml to _cot_kmmlu_yaml * Create kmmlu90.yaml * Update _cot_kmmlu_yaml * add direct * Update _cot_kmmlu_yaml * Update and rename kmmlu90.yaml to kmmlu90_cot.yaml * Update kmmlu90_direct.yaml * add kmmlu hard * Update _cot_kmmlu_yaml * Update _cot_kmmlu_yaml * update cot * update cot * erase typo * Update _cot_kmmlu_yaml * update cot * Rename dataset to match k-mmlu-hard * removed kmmlu90 * fixed name 'kmmlu_cot' to 'kmmlu_hard_cot' and revised README * applied pre-commit before pull requests * rename datasets and add notes * Remove DS_Store cache * Update lm_eval/tasks/kmmlu/README.md Co-authored-by: Hailey Schoelkopf <[email protected]> * Change citations and reflect reviews on version * Added kmmlu_hard and fixed other errors * fixing minor errors * remove duplicated * Rename files * try ".index" * minor fix * minor fix again * fix revert. * minor fix. thanks to Hailey --------- Co-authored-by: GUIJIN SON <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]>

@StellaAthena

* loglikelihood refactor using template lm * linter * fix whitespace in target + prompt for CoT gsm8k (EleutherAI#1275) * Make `parallelize=True` vs. `accelerate launch` distinction clearer in docs (EleutherAI#1261) * Make parallelize=True distinction clearer in documentation. * run linter * Allow parameter edits for registered tasks when listed in a benchmark (EleutherAI#1273) * benchmark yamls allow minor edits of already registered tasks * add documentation * removed print * Fix data-parallel evaluation with quantized models (EleutherAI#1270) * add WIP device_map overrides * update handling outside of accelerate launcher * change .to(device) log to debug level * run linter * Rework documentation for explaining local dataset (EleutherAI#1284) * rewor documentation for explaining local dataset * fix typo * Update new_task_guide.md * Re-add citation It looks like Google Scholar has [already noticed](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C9&authuser=2&q=%22A+framework+for+few-shot+language+model+evaluation%2C+12+2023%22&btnG=) the updated citation block so let's add it back in. * Update CITATION.bib (EleutherAI#1285) Bumping CITATION.bib to match re-adding the citation in readme. 
cc @StellaAthena * Update nq_open.yaml (EleutherAI#1289) * Update README.md with custom integration doc (EleutherAI#1298) * Update README.md * punctuation --------- Co-authored-by: Hailey Schoelkopf <[email protected]> * Update nq_open.yaml (EleutherAI#1305) * Update nq_open.yaml change regex * Bump NQ version --------- Co-authored-by: Hailey Schoelkopf <[email protected]> * Update task_guide.md (EleutherAI#1306) * Update pyproject.toml (EleutherAI#1312) * Fix polemo2_in.yaml config name (EleutherAI#1313) * Update pyproject.toml (EleutherAI#1314) * Fix group register (EleutherAI#1315) * tuple should be considered as well * set option to keep callable as callable * Update task_guide.md (EleutherAI#1316) * Update polemo2_in.yaml (EleutherAI#1318) * don't pass extra kwargs to mamba any more (EleutherAI#1328) * Fix Issue regarding stderr (EleutherAI#1327) * add fix fordeciding if stderr is N/A or not * process N/A * Add `local-completions` support using OpenAI interface (EleutherAI#1277) * Add `local-completions` support using OpenAI interface * Refactor oa_completion * Address tokenizer comments and change request chunks to batch size * Add warning message for tiktoken backend * fix formatting * fix whitespace * Update README.md --------- Co-authored-by: Hailey Schoelkopf <[email protected]> * fallback to classname when LM doesnt have config (EleutherAI#1334) * fix a trailing whitespace that breaks a lint job (EleutherAI#1335) * skip "benchmarks" in changed_tasks (EleutherAI#1336) * Update migrated HF dataset paths (EleutherAI#1332) * Update arc_easy.yaml * Update flan_cot.yaml * update HF dataset path * Update freeform.yaml * Update flan_cot.yaml --------- Co-authored-by: Lintang Sutawika <[email protected]> * Don't use `get_task_dict()` in task registration / initialization (EleutherAI#1331) * don't use get_task_dict() as a helper, it will download the dataset! 
* pre-commit * Update README.md --------- Co-authored-by: lintangsutawika <[email protected]> * manage default (greedy) gen_kwargs in vllm (EleutherAI#1341) * manage default (greedy) gen_kwargs in vllm better * mirror HF `do_sample` * just need to set temp=0 for greedy * modified default gen_kwargs to work better with CLI; changed prompt_logprobs=1 (EleutherAI#1345) * update links to task_guide.md (EleutherAI#1348) * `Filter` docs not offset by `doc_id` (EleutherAI#1349) * get `doc` from instance * acceletate bugfix: get ground doc from instance * convert filter to `process_result` * get docs from instances in `FilterEnsemble` * rename * nit * better looping * fix typehint * Add FAQ on `lm_eval.tasks.initialize_tasks()` to README (EleutherAI#1330) * Update README.md * [!Tip] * Refix issue regarding stderr (EleutherAI#1357) * Add causalLM OpenVino models (EleutherAI#1290) * added intel optimum * added intel optimum in readme * modified intel optimum * modified intel optimum * modified intel optimum * modified install optimum * modified path of IR file * added openvino_device * added openvino_device2 * changed optimum-causal to openvino-causal * Update README.md * Update README.md * remove `lm_eval.base` import * update openvino-causal -> openvino ; pass device through super().__init__() * Update README.md * Add optimum to tests dependencies * apply pre-commit * fix so tests pass --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]> * Apply some best practices and guideline recommendations to code (EleutherAI#1363) * raise Exception, not a string Additional info https://peps.python.org/pep-0352/#exception-hierarchy-changes https://docs.python.org/3.8/tutorial/errors.html#raising-exceptions * Apply PEP8 recommendation to prefer isinstance "Object type comparisons should always use isinstance() instead of comparing types directly" https://peps.python.org/pep-0008/ * Remove dangerous default mutable values 
in arguments https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/dangerous-default-value.html * Format logging messages with fstring (not with format) Additional info https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/logging-format-interpolation.html There are also discussions about the speed of formatting while logging or some unintended code executions pylint-dev/pylint#2395 https://stackoverflow.com/a/54368109 but at least one format (fstring one) will be used throughout the project * Specify utf-8 encoding for `open` explicitly If not specified, it may be supposed differently in different environments, OSes, and Python versions. See https://peps.python.org/pep-0597/ https://docs.python.org/3.11/library/locale.html#locale.getencoding https://docs.python.org/3.10/library/os.html#utf8-mode https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/unspecified-encoding.html Helps also if some code from English language tasks is taken as inspiration for tasks in non-English languages. 
* Use inline-ignoring comments to pass pre-commit instead of identity process https://flake8.pycqa.org/en/3.0.1/user/ignoring-errors.html#in-line-ignoring-errors https://www.flake8rules.com/rules/F841.html flake8 comments are supported by ruff: https://docs.astral.sh/ruff/linter/#error-suppression * serialize callable functions in config (EleutherAI#1367) * delay filter init; remove `*args` (EleutherAI#1369) * delay filter init; remove `*args` * bugfix * optimize * type hint * Fix unintuitive `--gen_kwargs` behavior (EleutherAI#1329) * don't override do_sample if no value for it is passed * Update gen_kwargs override condition * Update huggingface.py * Update huggingface.py * run linters * silence an erroneous warning * Publish to pypi (EleutherAI#1194) * publish to pypi * lint * Update publish.yml * minor * Make dependencies compatible with PyPI (EleutherAI#1378) * make deps not point to github urls * formatting * try making PyPI only run on tag pushes * Add support for RWKV models with World tokenizer (EleutherAI#1374) * Add support for RWKV models with World tokenizer The RWKV line of model with the World tokenizer, does not allow the padding token to be configured, and has its value preset as 0 This however fails all the "if set" checks, and would cause the tokenizer to crash. A tokenizer class name check was added, in addition to a model type check, as there exists RWKV models which uses the neox tokenizers * Update huggingface.py Genericized so that this supports any RWKVWorld tokenizer, and added a fall-back for if the HF implementation name changes. * Comply with formatting guidelines * fix format --------- Co-authored-by: Stella Biderman <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]> * add bypass metric (EleutherAI#1156) * add bypass metric * fixed `bypass` metric. 
* add task attributes if predict_only * add `predict_only` checks * add docs * added `overide_metric`, `override_config` to `Task` * nits * nit * changed --predict_only to generations; nits * nits * nits * change gen_kwargs warning * add note about `--predict_only` in README.md * added `predict_only` * move table to bottom * nit * change null aggregation to bypass (conflict) * bugfix; default `temp=0.0` * typo * loglikelihood refactor using template lm * lint * code review * neuron optimum * Mention TemplateLM in model_guide.md * Update lm_eval/api/model.py * fix linter * fix format * fix format * fix format --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: Lintang Sutawika <[email protected]> Co-authored-by: Stella Biderman <[email protected]> Co-authored-by: Mark Saroufim <[email protected]> Co-authored-by: Hannibal046 <[email protected]> Co-authored-by: Danielle Pintz <[email protected]> Co-authored-by: Quentin Lhoest <[email protected]> Co-authored-by: kwrobel.eth <[email protected]> Co-authored-by: Michael Goin <[email protected]> Co-authored-by: Brian Vaughan <[email protected]> Co-authored-by: Baber Abbasi <[email protected]> Co-authored-by: thnkinbtfly <[email protected]> Co-authored-by: NoushNabi <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]> Co-authored-by: LSinev <[email protected]> Co-authored-by: Eugene Cheah <[email protected]>

* log group membership * no stray prints * Update evaluator.py

…EleutherAI#1440) * fix the issue EleutherAI#1391, wrong contexts in mgsm tasks * fix yaml issue for having two target_delimiter lines. For COT tasks, keep the one with a space (default) * regenerate all task yaml files - change naming so that file name will match with task name - task|file follows a consistent naming way, mgsm_(mode)_(lang) for three modes, i.e., direct, en_cot, and native_cot * English CoTs should have a space as target_delimiter * Update utils.py * Apply suggestions from code review --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* add wandb as extra dependency * wandb metrics logging * refactor * log samples as tables * fix linter * refactor: put in a class * change dir * add panels * log eval as table * improve tables logging * improve reports logging * precommit run * ruff check * handle importing reports api gracefully * ruff * compare results * minor pre-commit fixes * build comparison report * ruff check * log results as artifacts * remove comparison script * update dependency * type annotate and docstring * add example * update readme * fix typo * teardown * handle outside wandb run * gracefully fail reports creation * precommit checks * add report url to summary * use wandb printer for better url stdout * fix ruff * handle N/A and groups * fix eval table * remove unused var * update wandb version req + disable reports stdout * remove reports feature to TODO * add label to multi-choice question data * log model predictions * lints * loglikelihood_rolling * log eval result for groups * log tables by group for better handling * precommit * choices column for multi-choice * graciously fail wandb * remove reports feature * track system metrics + total eval time + stdout --------- Co-authored-by: Lintang Sutawika <[email protected]>

…erAI#1458) * Fixed generation args issue affecting the OpenAI completion model * Fixed hf unit test; removed pop attributes in OpenAI completion. * fix format * fix format --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…#1466) * interface docs * fix link

…utherAI#1464) * Save git_hash to results even if git is not available to call as subprocess * Store more info about environment and transformers version in results to help researchers track inconsistencies * moved added logging to logging_utils * moved get_git_commit_hash to logging_utils.py * moved add_env_info inside evaluator

…eutherAI#1469)

* add arabic mmlu * update the description * add readme file

) * add add_bos_token to HFLM * add BOS token flag to other local model classes --------- Co-authored-by: Lintang Sutawika <[email protected]>

This reverts commit c1145df.

EleutherAI#1372) * Create a means for caching task registration and request building. Add the ability to specify an args dict for simple_evaluate(). * Remove extra S in cache path in caching module Co-authored-by: Hailey Schoelkopf <[email protected]> * Rename requests cache args, make model_args polymorphic so that a dict can also be accepted. * Update docs to reflect new caching behavior, add CLI args for requests caching. Create a function for deleting items in the cache. * Update documentation, fix minor bug with arg parsing for requests caching where an undefined variable was used. * Remove line from gitignore, add to cli for caching datasets. * Add hashing suffix to .pickles. Update test script typo. * Favor isinstance() over type() in evaluator.py * Add tests for caching, gets tests working, remove unneeded arg from build_all_requests(). * Update arg description to simple_evaluate. * Update pyproject.toml * Fix typehint * Remove the use of random() for creating default cache pickle hash. * Check that cache dir exists before clearing it in request cache tests. * Fix linting problems. * Fix additional formatting errors. * Remove trailing whitespace. * Add new line to the end of .gitignore. --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* add brier_score * process brier_score * brier score is working for N-sized class * fixed brier score * add TED to BigBench and Brier score to MMLU * format * Update metrics.py * Update task.py * Update generate_until_template_yaml * Delete lm_eval/tasks/bigbench/aux_metric.py * Update generate_until_template_yaml * Update _default_template_yaml * Update _generate_configs.py * Update _generate_configs.py * Update _generate_configs.py * fix (format?) * format? * format, once more --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
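For reference, a multi-class Brier score can be sketched as below. This uses the original definition (squared error summed over classes, averaged over samples); it is an assumption that the metric added here uses the same normalization, and the signature is hypothetical.

```python
from typing import Sequence


def brier_score(probs: Sequence[Sequence[float]], gold: Sequence[int]) -> float:
    """Mean over samples of the squared distance between the predicted
    probability vector and the one-hot encoding of the gold class."""
    if len(probs) != len(gold) or not probs:
        raise ValueError("probs and gold must be non-empty and equal in length")
    total = 0.0
    for row, label in zip(probs, gold):
        total += sum(
            (p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(row)
        )
    return total / len(probs)
```

Under this definition a perfect prediction scores 0.0 and a fully confident wrong prediction scores 2.0, so lower is better.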

* change `all_gather` to `gather` * add TaskOutput utility class * Add FilterResults class and refactor task handling. * Rename `key` to `filter_key` for clarity * Add `print_writeout` function in utils.py * Add function to calculate limit size. * Add doc_iterator method to Task class * Refactor `doc_iterator` and cleanup in Task class * remove superfluous bits * change `all_gather` to `gather` * bugfix * bugfix * fix `gather` * Refactor `gather` loop * Refactor aggregate metrics calculation * Refactor and simplify aggregate metrics calculation Removed unused code * Simplify metrics calculation and remove unused code. * simplify the metrics calculation in `utils.py` and `evaluator.py`. * Fix group metric * change evaluate to hf_evaluate * change evaluate to hf_evaluate * add docs * add docs * nits * make isslice keyword only * nit * add todo * nit * nit * nit: swap order samples_metrics tuple * move instance sorting outside loop * nit * nit * Add __repr__ for ConfigurableTask * nit * nit * Revert "nit" This reverts commit dab8d99. * fix some logging * nit * fix `predict_only` bug. thanks to `@LSinev`! * change `print_tasks` to `prepare_print_tasks` * nits * move eval utils * move eval utils * nit * add comment * added tqdm descriptions * Update lm_eval/evaluator_utils.py Co-authored-by: Hailey Schoelkopf <[email protected]> * fix mgsm bug * nit * fix `build_all_requests` * pre-commit * add ceil to limit --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…eutherAI#1489) * model_type attribute error Getting attribute error when using a model without a 'model_type' * fix w/ and w/out the 'model_type' specification * use getattr(), also fix other config.model_type reference * Update huggingface.py --------- Co-authored-by: Hailey Schoelkopf <[email protected]>
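The `getattr()` approach mentioned above can be sketched as follows. `SimpleNamespace` stands in for a model config object, and the helper name `get_model_type` is hypothetical.

```python
from types import SimpleNamespace
from typing import Optional


def get_model_type(config: object) -> Optional[str]:
    """Return config.model_type if present, else None, without raising
    AttributeError for configs that lack the attribute."""
    return getattr(config, "model_type", None)


with_type = SimpleNamespace(model_type="rwkv")
without_type = SimpleNamespace()
```

Callers can then branch on `None` rather than wrapping each `config.model_type` access in a try/except.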

* add undistribute + use more_itertools * remove divide() util fn * add more_itertools as dependency

* make `WandbLogger` init args optional * nit * nit * nit * move import warning to `WandbLogger` * nit * update docs * nit

* use `@ray.remote` with distributed vLLM * update versions * bugfix * unpin vllm * fix pre-commit * added version assertion error * Revert "added version assertion error" This reverts commit 8041e9b. * added version assertion for DP * expand DP note * add warning * nit * pin vllm * fix typos

…ity (EleutherAI#1487) * setting trust_remote_code * dataset list no notebooks * respect trust remote code * Address changes, move cli options and change datasets * fix task for tests * headqa * remove kobest * pin datasets and address comments * clean up space

* add french-bench * rename arc easy * linting * update datasets for no remote code exec * fix string delimiter * add info to readme * trim trailing whitespace * add detailed groups * add info to readme * remove orangesum title from fbench main * Force PPL tasks to be 0-shot --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* Fix padding * Fix elif in model loading * format

* Add new tasks of GPQA * Add README * Remove unused functions * Remove unused functions * Linters * Add flexible match * update * Remove duplicate function * Linter * update * Update lm_eval/filters/extraction.py Co-authored-by: Hailey Schoelkopf <[email protected]> * register multi_choice_regex * Update * run precommit --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

* Start adding eq-bench * Start adding to yaml and utils * Get metric working * Add README * Handle cases where answer is not parseable * Deal with unparseable answers and add percent_parseable metric * Update README

* init wmdp yaml file * Add WMDP Multiple-choice * fix linter issues * Delete lm_eval/tasks/wmdp/_wmdp.yaml --------- Co-authored-by: Lintang Sutawika <[email protected]>

)

…used by cot which hardcodes fewshot prompt (EleutherAI#1502)

…eutherAI#1533) * Remove unused `decontamination_ngrams_path` and all mentions (still no alternative path provided) * Fix improper import of LM and usage of evaluator in one of scripts * update type hints in instance and task api * raising errors in task.py instead of asserts * Fix warnings from ruff * raising errors in __main__.py instead of asserts * raising errors in tasks/__init__.py instead of asserts * raising errors in evaluator.py instead of asserts * evaluator: update type hints and remove unused variables in code * Update lm_eval/__main__.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/__main__.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/api/task.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/api/task.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/api/task.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update lm_eval/evaluator.py Co-authored-by: Hailey Schoelkopf <[email protected]> * pre-commit induced fixes --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…g document and, update wandb_args description (EleutherAI#1536) * Update openai completions and docs/CONTRIBUTING.md * Update wandb args description * Update docs/interface.md --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* Add compatibility for vLLM's new Logprob object * Fix * Update lm_eval/models/vllm_causallms.py * fix format? * trailing whitespace --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…leutherAI#1551) * update gen_kwargs in code2-text-go.yaml * update gen_kwargs in rest code2-text

* Support jinja templating for "description" * Update task_guide.md * Update lm_eval/api/task.py * fix format? * whitespace errors * fix whitespace * fix bad variable reference --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

* add Arabic EXAMS benchmark * fixed the linter issue, and add more information on the readme * Update README.md --------- Co-authored-by: Lintang Sutawika <[email protected]>

* add agieval * fix typo * add cloze / math exactmatch agieval tasks, rename * update exact-match agieval tasks, allow for multiple-correct answers * add more detail to readme * don't parse_math_answer twice --------- Co-authored-by: Alex Bäuerle <[email protected]>

…AI#1563)

* add manual tqdm disabling management * add typing to all new args * apply precommit changes --------- Co-authored-by: haileyschoelkopf <[email protected]>

* Link to vllm integration * add pip install .[vllm] cmd

* New tests for CLI args * fix spacing * change tests for parsing * add tests, fix parser * remove defaults for store_true

* Differentiate _encode_pair setting for decoder and enc-dec models * tok_decode to not skip special token so that eos doesn't become empty string * Update model.py * Update model.py * Update huggingface.py * Update lm_eval/models/huggingface.py Co-authored-by: Hailey Schoelkopf <[email protected]> * Update model.py --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* Update interface.md * fix: make caching reqs always work with accelerate launch * remove stale task migration checklist * remove deprecation warnings * make informative TypeErrors for get_task_dict * bump version metadata * fix num_fewshot printing bug * add fewshot value to cache key

* Fix eval_logger import for mmlu/_generate_configs.py * linter --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* use BOS token in loglikelihood * improve comments * add model arg * log prefix token id * log prefix token id * Update lm_eval/api/model.py Co-authored-by: Hailey Schoelkopf <[email protected]> * change name to prefix_token_id --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…herAI#1601) This reverts commit b7923a8.

* make vllm use prefix_token_id ; have prefix_token_id be optional method to define * custom_prefix_token_id wasn't set if not passed

* Add task ACLUE * fix minor bug * fix code style * fix code style

…AI#1612)

* add logging of model args * nit * Add warnings. * nit * add warning * nit

* peft Version Assertion * fix the linter issue

* fix on --task list * add fixes to tokenization * differentiate encoding for seq2seq and decoder * return token setting * format for pre-commit * Seq2seq fix, pt2 (EleutherAI#1630) * getting model class only when defined * encode_pair handles None, add_special_tokens turned into dict with default value --------- Co-authored-by: achervyakov <[email protected]>

…erAI#1598) * Integration of NeMo models into LM Evaluation Harness library * rename nemo model as nemo_lm * move nemo section in readme after hf section * use self.eot_token_id in get_until() * improve progress bar showing loglikelihood requests * data replication or tensor/pipeline replication working fine within one node * run pre-commit on modified files * check whether dependencies are installed * clarify usage of torchrun in README

…leutherAI#1647)

* add basqueglue * add eus_exams * add eus_proficiency * add eus_reading * add eus_trivia * run pre-commit

…eutherAI#1656) The OpenAI interface supports batch size as an argument to the completions API, but does not seem to support specification of this on the CLI i.e. `lm_eval --model openai-completions --batch_size 16 ...` because of a simple lack of str->int conversion. This is confirmed by my usage and stacktrace from running `OPENAI_API_KEY=dummy lm_eval --model local-completions --tasks gsm8k --batch_size 16 --model_args model=nm-testing/zephyr-beta-7b-gptq-g128,tokenizer_backend=huggingface,base_url=http://localhost:8000/v1`:
```
Traceback (most recent call last):
  File "/home/michael/venv/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/home/michael/code/lm-evaluation-harness/lm_eval/__main__.py", line 341, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 251, in simple_evaluate
    results = evaluate(
  File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 390, in evaluate
    resps = getattr(lm, reqtype)(cloned_reqs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 263, in generate_until
    list(sameuntil_chunks(re_ord.get_reordered(), self.batch_size)),
  File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 251, in sameuntil_chunks
    if len(ret) >= size or x[1] != lastuntil:
TypeError: '>=' not supported between instances of 'int' and 'str'
```
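
The failure mode in this commit message can be reproduced in isolation. The sketch below is not the harness's code: `chunks` and `parse_batch_size` are hypothetical helpers that only mimic the comparison `len(ret) >= size` from the stack trace, assuming the CLI hands the batch size over as a string.

```python
def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        # Same shape of comparison as in the stack trace: int >= size
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def parse_batch_size(raw):
    # CLI values arrive as strings; convert before any numeric comparison
    return int(raw)


cli_value = "16"  # what `--batch_size 16` looks like before conversion

try:
    list(chunks(range(40), cli_value))
except TypeError as err:
    print(f"without conversion: {err}")

batches = list(chunks(range(40), parse_batch_size(cli_value)))
print([len(b) for b in batches])
```

Converting once at the boundary, where the argument is parsed, keeps every later comparison numeric.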

* implementation of TMMLU+ * implemented: TMMLU+ **TMMLU+ : large-scale Traditional chinese Massive Multitask language Understanding** - 4 categories - STEM - Social Science - Humanities - Other The TMMLU+ dataset, encompassing over 67 subjects and 20160 tasks, is six times larger and more balanced than its predecessor, TMMLU, and includes benchmark results from both closed-source and 20 open-weight Chinese large language models with 1.8B to 72B parameters. However, Traditional Chinese variants continue to underperform compared to major Simplified Chinese models.
```markdown
Total number of tasks in the 'test' sets: 20160
Total number of tasks in the 'validation' sets: 2247
Total number of tasks in the 'train' sets: 335
```
* Remove print from __init__.py It was my mistake to forget to remove the debug print from the code. * update: move TMMLU+ config generation program into default * fix: we should use training set as few shots example * update: README for TMMLU+ * update: small changes to the TMMLU+ README file * pre-commit run through * Add README for TMMLU+ dataset * run precommit * trigger precommit again * trigger precommit again * isort is fussy * isort is fussy * format, again * oops * oops --------- Co-authored-by: lintang <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

* claude3 * supply for anthropic claude3 * supply for anthropic claude3 * anthropic config changes * add callback options on anthropic * line passed * claude3 tiny change * help anthropic installation * mention sysprompt / being careful with format in readme --------- Co-authored-by: haileyschoelkopf <[email protected]>

* correction bug EleutherAI#1664 * add any invalid characters for Windows filenames and Unix-like systems see: https://gist.github.com/doctaphred/d01d05291546186941e1b7ddc02034d3?permalink_comment_id=3958715 * Update lm_eval/__main__.py * Update scripts/zeno_visualize.py * fix format --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>
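
As a rough illustration of the filename fix described here, the sketch below replaces characters that Windows rejects in filenames, plus `/`, which is also invalid on Unix-like systems. The character set, the `__` replacement, and the `sanitize_filename` name are assumptions for this example, not the harness's actual implementation.

```python
import re

# Characters rejected in Windows filenames; "/" is also invalid on Unix-like systems.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*]+')


def sanitize_filename(name, replacement="__"):
    """Replace each run of invalid characters with `replacement`."""
    return _INVALID_CHARS.sub(replacement, name)


print(sanitize_filename("pretrained=EleutherAI/pythia-14m,dtype=float32"))
```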

* added delta weights * removed debug * readme update * better error handling * autogptq warn * warn update * peft and delta error, explicitly deleting _model_delta * linter fix

…1674) * Add neuralmagic models for SparseML and DeepSparse * Update to latest and add test * Format * Fix list to List * Format * Add deepsparse/sparseml to automated testing * Update pyproject.toml * Update pyproject.toml * Update README * Fixes for dtype and device * Format * Fix test * Apply suggestions from code review Co-authored-by: Hailey Schoelkopf <[email protected]> * Address review comments! --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…herAI#1699)

EleutherAI#1698

* add xnli_eu tasks * update tasks readme * update readme

) * Update task.py * Update __init__.py

* Support individual scrolls datasets * Add qmsum context * Fix formatting

* Add register_filter decorator * Add register_filter docs

* Add Pile-10k readme * Add Pile-10k task configuration file

* Update utils.py This is a 4-choice task, option_e is null for all but 3 samples * Fix options Adaptive choices * add option e * bump multilingual arc version --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* upload new tasks * add readmes * run linters --------- Co-authored-by: haileyschoelkopf <[email protected]>

* vllm lora support * remove print * version check, rename lora kwarg

* Add option to set OpenVINO config * Use utils.eval_logger for logging

* evaluation tracker implementation * OVModelForCausalLM test fix * typo fix * moved methods args * multiple args in one flag * loggers moved to dedicated dir * improved filename sanitization

* remove echo parameter in OpenAI completions API * remove context length parameter doc string

…herAI#1776) fix `----hf_hub_log_args` to `--hf_hub_log_args`

) * Added fewshot sampling seeds to evaluator.simple_evaluate signature Way to control seed of fewshot sampling may help with EleutherAI#1591 * Added ability for custom sampler for ConfigurableTask May be set in config like
```
fewshot_config:
  sampler: !function utils.MyFewshotSampler
```
* explicitly set fewshot random generator seed for HFLM generate_until_task test * add backward compatibility for three args seed setup * save seeds info to logs/reports

)

…) variant (EleutherAI#1793) * add Hendrycks MATH (no sympy checking) variant * add readmes for MATH tasks

…leutherAI#1774) (EleutherAI#1791) * fix auto-batch size bug for seq2seq models * alphabetize task + group tables ; fix eval tracker bug * fix eval tracker bug

* Initial support for Unitxt datasets in LM Eval Harness See https://github.com/IBM/unitxt The script 'generate_yamls.py' creates LM Eval Harness yaml files corresponding to Unitxt datasets specified in the 'unitxt_datasets' file. The glue code required to register Unitxt metrics is in 'unitxt_wrapper.py'. * Added dataset loading check to generate_yaml Improved error messages. * Speed up generate_yaml Added printouts and improved error message * Added output printout * Simplified integration of unitxt datasets Store all the common yaml configuration in a yaml include shared by all datasets of the same task. * Post code review comments - part 1 1. Made sure include files don't end with 'yaml' so they won't be marked as tasks 2. Added more datasets and tasks (NER, GEC) 3. Added README * Post code review comments - part 2 1. Added unitxt install option in pyproject.toml: pip install 'lm_eval[unitxt]' 2. Added a check that unitxt is installed and print a clear error message if not * Committed missing pyproject change * Added documentation on adding datasets * More doc changes * add unitxt extra to readme * run precommit --------- Co-authored-by: haileyschoelkopf <[email protected]>

…I#1745) * add mmlu arc style evaluation * rename arc_style to continuation --------- Co-authored-by: Jonathan Burdge <[email protected]> Co-authored-by: Jonathan Burdge <[email protected]>

…I#1806) * update interface documentation with flag --hf_hub_logs_arg * update interface documentation with flag --hf_hub_logs_arg 2

* add copal * change name to copal id for clarity and the task name * remove `copal_id...` to yaml to make it work * checkmark on README * change group name to `copal_id`

* Add tinyBenchmarks * Add acknowledgements * Add ordering of outputs for data-parallel * Run pre-commit * Add few_shot specifications * Add tinyBenchmarks post-processing * add conditional import ; fix task names --------- Co-authored-by: haileyschoelkopf <[email protected]>

* resize model embeddings * resize only * tokenizer help * load tokenizer before model * add comment and run precommit lint * Add log message Co-authored-by: Hailey Schoelkopf <[email protected]> --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

…rAI#1865)

…#1880)

…erAI#1790) * fix auto-batch size bug for seq2seq models * run linter

`gold_one_hot` needs to follow the dimension of predictions so that it still works when `--limit` is used and the indexes in gold do not cover all gold indexes.
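
A minimal sketch of the shape problem, in plain Python. It assumes the one-hot width was previously inferred from the gold labels alone, which is a guess about the earlier behavior; the `one_hot` helper and the probabilities are hypothetical.

```python
def one_hot(golds, num_classes):
    """Return one row per gold index, each of width `num_classes`."""
    return [[1.0 if k == g else 0.0 for k in range(num_classes)] for g in golds]


# Three documents, four answer choices each (made-up probabilities).
predictions = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.6, 0.2, 0.1, 0.1],
]
golds = [0, 1, 0]  # with a small --limit, classes 2 and 3 never appear

# Width inferred from the gold labels alone is too narrow: 2 instead of 4.
narrow = one_hot(golds, max(golds) + 1)

# Width taken from the predictions always matches their dimension.
num_classes = len(predictions[0])
gold_one_hot = one_hot(golds, num_classes)

print(len(narrow[0]), len(gold_one_hot[0]))
```

Taking the width from the predictions keeps the two arrays aligned no matter which labels happen to survive the limit.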

* add handling for bootstrap_iters=0 case * add more detail to docstring * run precommit

* add mmlu tasks from pile-t5 * Update _mmlu_flan_cot_fewshot_template_yaml * Update _mmlu_flan_cot_zeroshot_template_yaml * Update _mmlu_flan_generative_template_yaml * Update _mmlu_flan_loglikelihood_template_yaml * Update _default_template_yaml --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* edit process multiple-choice * split template yaml * remove * modified multiple_choice tasks * update * Update multiple_choice_template_b_yaml * Update multiple_choice_template_a_yaml --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

* rename lm_eval.logging module * fix evaluation tracker args

* Reorder vllm imports in vllm_causallms.py * Update vllm_causallms.py

* [HFLM]Add support for Ascend NPU Co-authored-by: jiaqiw09 <[email protected]> Co-authored-by: zhabuye <[email protected]> * bump accelerate dependency version to 0.26.0 for NPU compat. --------- Co-authored-by: jiaqiw09 <[email protected]> Co-authored-by: zhabuye <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]>

* Higher is better tickers in output table * add extra check for `higher_is_better` not being None already * Update lm_eval/evaluator.py * fixup format I messed up * add comment (and retrigger tests) --------- Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

* dataset card initial * few fixes * adds groups for math, mmlu, gpqa * added summary args * moved sanitize_list to utils * readme update * recreate metadata moved * multiple model support * results latest split fix * readme update and small refactor * fix grouping * add comments * added pathlib * corrected pathlib approach * check whether to create a metadata card * convert posix paths to str * default hf org from token * hf token value error * Add logs after successful upload * logging updates * dataset card example in the readme --------- Co-authored-by: Nathan Habib <[email protected]> Co-authored-by: Alina Lozovskaia <[email protected]>

EleutherAI#1895) * init test 1 * fix * this format seems to be working - need to update all other tasks with the new format * bbh with few shot format * fix fewshot bbh * add mmlu flan cot * samples of cot * kmmlu * fix gsm8k * update keys for mmlu * minerva math * bbh * fix * fix samples * small fixes to templates * last prompt format change * fixing prompt * fixed minerva math format * rm accidental committed file * added doc for few shot samples * Update lm_eval/loggers/evaluation_tracker.py * Update lm_eval/loggers/evaluation_tracker.py * Update docs/new_task_guide.md Co-authored-by: Hailey Schoelkopf <[email protected]> * added check in sampler per code review * added the system from a function, plus an example in minerva math * style * Apply suggestions from code review Co-authored-by: Hailey Schoelkopf <[email protected]> * fix unit tests 1 * forcing use of test split --------- Co-authored-by: Hailey Schoelkopf <[email protected]>

Fix EleutherAI#1906

* added tasks and task family descriptors * continue work on task list w/ links; slightly reorganize README * Apply suggestions from code review * Rename file so that it'll preview in Github when viewing lm_eval/tasks folder * Update new_task_guide.md * Update README.md * run linter * Add language column to task table; Add missing tasks to task table; fix nq_open and storycloze READMEs * fix typo * Apply suggestions from code review Co-authored-by: Hailey Schoelkopf <[email protected]> * apply format --------- Co-authored-by: Harish Vadaparty <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

* initial chat template * tokenizer attribute check * variable rename * interface update * system instruction * system inst default update * fewshot as multiturn * typing update * indent update * added comments * Adding a fewshot in a more readable way * linting * Moved apply chat template to LM * multiturn alternation fix * cache key update * apply chat template method fix * add system prompt hash to cache_key * tokenizer name property for cache_key * property name fix * linting backward compatibility fix * docs and errors update * add documentation on adding chat template compatibility to model_guide * fewshot as multiturn check fix * saving system inst and chat template in results * eval tracker update * docs update * Apply suggestions from code review Co-authored-by: Hailey Schoelkopf <[email protected]> --------- Co-authored-by: haileyschoelkopf <[email protected]> Co-authored-by: Clémentine Fourrier <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]>

…th Fictional Medical Data (EleutherAI#1867) * glianorex tasks * Create README.md * Update README.md * Update README.md * fix formatting * fix internal formatting

…d not at current merge commit (EleutherAI#1927)

* added tasks and task family descriptors * configs for the new lambada translations * continue work on task list w/ links; slightly reorganize README * Apply suggestions from code review * Rename file so that it'll preview in Github when viewing lm_eval/tasks folder * Update new_task_guide.md * Update README.md * run linter * Add language column to task table; Add missing tasks to task table; fix nq_open and storycloze READMEs * fix typo * update `lm_eval/tasks/README.md` with task description --------- Co-authored-by: Harish Vadaparty <[email protected]> Co-authored-by: anthony <[email protected]> Co-authored-by: Hailey Schoelkopf <[email protected]> Co-authored-by: haileyschoelkopf <[email protected]>

* Noticia * test * Final tests implementation * Fixes * Fix linters

* Update README.md * Update bec.yaml * Update bhtc.yaml * Update coref.yaml * Update qnli.yaml * Update vaxx.yaml * Update wic.yaml

* sort metrics in output table * update docstring in `consolidate_results` * add tests for verifying consistency of table output * update tests to account for floating point inconsistencies * updated tests based on `pythia-14m`

…mo2 (multiple_choice)

Commits on Mar 3, 2024

fix regex tasks; add benchmark groups

djstrong committed Mar 3, 2024

c4679ce

Commits on Mar 5, 2024

fix stderr aggregation

djstrong committed Mar 5, 2024

7fc327d

Commits on Aug 13, 2024

fix belebele

djstrong committed Aug 13, 2024

0bea423

Commits on Aug 22, 2024

polish pes split

djstrong committed Aug 22, 2024

21d0ea9

Commits on Feb 27, 2024

Commits on Mar 3, 2024

Commits on Mar 5, 2024

Commits on Mar 10, 2024

Commits on Aug 2, 2024

Commits on Aug 13, 2024

Commits on Aug 22, 2024