Polish4 temp #11

djstrong · 2024-08-02T22:39:40Z

No description provided.

* Add `local-completions` support using OpenAI interface
* Refactor oa_completion
* Address tokenizer comments and change request chunks to batch size
* Add warning message for tiktoken backend
* fix formatting
* fix whitespace
* Update README.md

Co-authored-by: Hailey Schoelkopf

* added intel optimum
* added intel optimum in readme
* modified intel optimum
* modified intel optimum
* modified intel optimum
* modified install optimum
* modified path of IR file
* added openvino_device
* added openvino_device2
* changed optimum-causal to openvino-causal
* Update README.md
* Update README.md
* remove `lm_eval.base` import
* update openvino-causal -> openvino ; pass device through super().__init__()
* Update README.md
* Add optimum to tests dependencies
* apply pre-commit
* fix so tests pass

Co-authored-by: Hailey Schoelkopf
Co-authored-by: haileyschoelkopf

…therAI#1363)

* raise Exception, not a string
  Additional info: https://peps.python.org/pep-0352/#exception-hierarchy-changes https://docs.python.org/3.8/tutorial/errors.html#raising-exceptions
* Apply PEP8 recommendation to prefer isinstance
  "Object type comparisons should always use isinstance() instead of comparing types directly" https://peps.python.org/pep-0008/
* Remove dangerous default mutable values in arguments
  https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/dangerous-default-value.html
* Format logging messages with fstring (not with format)
  Additional info: https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/logging-format-interpolation.html
  There are also discussions about the speed of formatting while logging or some unintended code executions (pylint-dev/pylint#2395, https://stackoverflow.com/a/54368109), but at least one format (fstring one) will be used throughout the project.
* Specify utf-8 encoding for `open` explicitly
  If not specified, it may be supposed differently in different environments, OSes, and Python versions. See https://peps.python.org/pep-0597/ https://docs.python.org/3.11/library/locale.html#locale.getencoding https://docs.python.org/3.10/library/os.html#utf8-mode https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/unspecified-encoding.html
  Helps also if some code from English language tasks is taken as inspiration for tasks in non-English languages.
* Use inline-ignoring comments to pass pre-commit instead of identity process
  https://flake8.pycqa.org/en/3.0.1/user/ignoring-errors.html#in-line-ignoring-errors https://www.flake8rules.com/rules/F841.html
  flake8 comments are supported by ruff: https://docs.astral.sh/ruff/linter/#error-suppression
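
For illustration, the conventions listed in this commit message might look like the sketch below. `load_records` and its parameters are hypothetical names made up for this example; they are not taken from the repository.

```python
import json
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def load_records(path, extra: Optional[list] = None) -> list:
    """Hypothetical JSONL loader used only to illustrate the conventions."""
    # Use None as the sentinel instead of a mutable default such as `extra=[]`,
    # which would be shared between calls.
    records = [] if extra is None else list(extra)

    # Prefer isinstance() over comparing type() results directly.
    if not isinstance(path, (str, Path)):
        # Raise an exception instance, never a bare string.
        raise TypeError(f"path must be str or Path, got {type(path).__name__}")

    # State the encoding explicitly so the result does not depend on the
    # platform locale (relevant for non-English task data).
    with open(path, encoding="utf-8") as f:
        records.extend(json.loads(line) for line in f if line.strip())

    # f-string formatting in the log call, as chosen in the commit message.
    logger.info(f"Loaded {len(records)} records from {path}")
    return records
```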

djstrong and others added 30 commits February 27, 2024 21:10

fixes

1cb35cc

fix: change the cbd_mc to be CATEGORIES-based

55f274b

Restored default case for cbd_regex. Fixed typo in klej_ner_mc.

fix: typo in cbd_mc.yaml

35af374

fix: typo in cbd_mc.yaml

9540f16

update polish groups

d3d7d01

fix regex tasks; add benchmark groups

c4679ce

fix stderr aggregation

7fc327d

add perplexity task

e14e593

belebele mc

bc61568

Update task_guide.md (EleutherAI#1316)

85eb77f

Update polemo2_in.yaml (EleutherAI#1318)

0632a05

don't pass extra kwargs to mamba any more (EleutherAI#1328)

bb879de

Fix Issue regarding stderr (EleutherAI#1327)

6414edd

* add fix for deciding if stderr is N/A or not
* process N/A
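
A minimal sketch of the kind of check this commit describes, assuming a missing standard error is represented either as `None` or as the literal string `"N/A"`. `format_stderr` is a hypothetical helper, not the function changed in the commit.

```python
def format_stderr(stderr) -> str:
    """Return a display string for a standard error that may be unavailable."""
    # Compare against the sentinel explicitly: a truthiness test such as
    # `if not stderr` would wrongly treat a legitimate 0.0 as missing.
    if stderr is None or stderr == "N/A":
        return "N/A"
    return f"{float(stderr):.4f}"
```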

fallback to classname when LM doesn't have config (EleutherAI#1334)

f0ba560

fix a trailing whitespace that breaks a lint job (EleutherAI#1335)

9dd448b

skip "benchmarks" in changed_tasks (EleutherAI#1336)

4f263af

Update migrated HF dataset paths (EleutherAI#1332)

0d8d549

* Update arc_easy.yaml
* Update flan_cot.yaml
* update HF dataset path
* Update freeform.yaml
* Update flan_cot.yaml

Co-authored-by: Lintang Sutawika

Don't use get_task_dict() in task registration / initialization (EleutherAI#1331)

268d252

* don't use get_task_dict() as a helper, it will download the dataset!
* pre-commit
* Update README.md

Co-authored-by: lintangsutawika

manage default (greedy) gen_kwargs in vllm (EleutherAI#1341)

82e319d

* manage default (greedy) gen_kwargs in vllm better
* mirror HF `do_sample`
* just need to set temp=0 for greedy
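
As a rough sketch of the behavior these bullets describe: a HF-style `do_sample` flag can be translated into a temperature, since a temperature of 0 selects greedy decoding in vLLM. `normalize_gen_kwargs` is a hypothetical name and the exact condition is an assumption, not the code from the commit.

```python
def normalize_gen_kwargs(gen_kwargs: dict) -> dict:
    """Map HF-style generation kwargs onto temperature-only sampling kwargs."""
    kwargs = dict(gen_kwargs)  # copy so the caller's dict is not mutated

    # `do_sample` is a HF generate() argument; remove it before passing on.
    do_sample = kwargs.pop("do_sample", None)

    # Greedy is the default: an explicit do_sample=False, or no temperature
    # at all, both resolve to temperature 0.
    if do_sample is False or "temperature" not in kwargs:
        kwargs["temperature"] = 0.0
    return kwargs
```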

modified default gen_kwargs to work better with CLI; changed prompt_logprobs=1 (EleutherAI#1345)

0938c13

update links to task_guide.md (EleutherAI#1348)

97361ed

Filter docs not offset by doc_id (EleutherAI#1349)

ca3a895

* get `doc` from instance
* accelerate bugfix: get ground doc from instance
* convert filter to `process_result`
* get docs from instances in `FilterEnsemble`
* rename
* nit
* better looping
* fix typehint

Add FAQ on lm_eval.tasks.initialize_tasks() to README (EleutherAI#1330)

2eeaf15

* Update README.md
* [!Tip]

Refix issue regarding stderr (EleutherAI#1357)

d467d2f

serialize callable functions in config (EleutherAI#1367)

b43d9d9

delay filter init; remove *args (EleutherAI#1369)

2b31cfb

* delay filter init; remove `*args`
* bugfix
* optimize
* type hint

Fix unintuitive --gen_kwargs behavior (EleutherAI#1329)

cdc41c4

* don't override do_sample if no value for it is passed
* Update gen_kwargs override condition
* Update huggingface.py
* Update huggingface.py
* run linters
* silence an erroneous warning

djstrong added 30 commits August 2, 2024 16:07

polish eq-bench

637afd1

polish eq-bench

53039c2

polish eq-bench

6d5e657

polish eq-bench

38e954a

polish eq-bench

88e0034

polish eq-bench

458cdc7

polish eq-bench

fea7b68

polish eq-bench

2693979

polish eq-bench

776a4d3

polish eq-bench

53f1dfc

polish eq-bench

87b2160

polish eq-bench

8998552

polish eq-bench

b6c4ac3

polish eq-bench

f9ca054

polish eq-bench

29959a3

polish eq-bench

aaaac9d

polish eq-bench

db195a2

polish eq-bench

d96cd84

polish eq-bench

2df424f

polish eq-bench

badffa9

polish eq-bench

c3a1bec

fgd

1ca2260

fgd

ee8f8be

generate until <|im_end|>

5d61f54
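
The commit title suggests that generation stops at the ChatML end-of-turn marker `<|im_end|>`. One way to apply such a stop sequence after the fact is to cut the generated text at the first marker found. The helper below is a hypothetical sketch under that assumption, not the change made in this commit.

```python
from typing import Sequence


def truncate_at_stop(text: str, stop_sequences: Sequence[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```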

powuad; pes; hash fix

10a79a2

fix multiple choice openai

8819b64

fix multiple choice openai

45f6010

fix multiple choice openai

8974fc2

fix belebele

0bea423

polish pes split

21d0ea9
