Why do the LiveCodeBench scores for the 72B and 32B models in the official Qwen2.5 results differ so much from the 72B and 32B scores published by the LiveCodeBench authors? #1079

13416157913 · 2024-11-15T01:43:12Z

Qwen2.5

Qwen2.5-72B-Instruct,Qwen2.5-32B-Instruct

vllm

I have followed the GitHub README.
I have checked the Qwen documentation and cannot find an answer there.
I have checked the documentation of the related framework and cannot find useful information.
I have searched the issues and there is not a similar one.

Hello, could you explain why the LiveCodeBench scores for the 72B and 32B models in the official Qwen2.5 results differ so much from the 72B and 32B scores published by the LiveCodeBench authors?

13416157913 · 2024-11-19T11:18:35Z

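One more question: the LiveCodeBench benchmark has four scenarios (code generation, self-repair, code execution, and test output prediction). Which scenario's score is reported as the LiveCodeBench score in the official Qwen2.5 results?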

huybery · 2024-11-20T05:23:13Z

You need to drag the date range to the specific dates that correspond to the ones in our table.

13416157913 · 2024-11-20T05:37:45Z

> You need to drag the date range to the specific dates that correspond to the ones in our table.

With the range set to 2023-05 through 2024-09, Qwen2.5-72B-Instruct scores 50, which is still far from the 55.5 published on the official Qwen2.5 site.
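
For anyone comparing numbers, here is a minimal sketch of why the selected date window can move the score. It assumes the score is simply the percentage of problems solved among those released inside the window. The records and the `pass_at_1` helper are hypothetical, made up for illustration; they are not LiveCodeBench data or its API.

```python
from datetime import date

# Hypothetical toy records: (problem release date, solved on first attempt?).
# These values are invented for illustration and are not LiveCodeBench data.
results = [
    (date(2023, 6, 1), True),
    (date(2023, 11, 15), True),
    (date(2024, 3, 10), False),
    (date(2024, 8, 20), True),
    (date(2024, 9, 5), False),
]


def pass_at_1(records, start, end):
    """Percentage of problems solved among those released in [start, end]."""
    window = [solved for released, solved in records if start <= released <= end]
    if not window:
        raise ValueError("no problems in the selected date window")
    return 100.0 * sum(window) / len(window)


# Same model outputs, two different windows, two different scores.
wide = pass_at_1(results, date(2023, 5, 1), date(2024, 9, 30))
narrow = pass_at_1(results, date(2024, 7, 1), date(2024, 9, 30))
print(f"wide window: {wide:.1f}, narrow window: {narrow:.1f}")
```

Two reported scores are therefore only comparable when both the date window and the scenario match.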

jklj077 assigned huybery Nov 19, 2024
