Towards Event-oriented Long Video Understanding

🔥 News

2024.06.20 🌟 Benchmark, evaluation code, training data, and model are released!

👀 Overview

We introduce Event-Bench, an event-oriented long video understanding benchmark built on existing datasets and human annotations. Event-Bench covers three event understanding abilities and six event-related tasks, with 2,190 test instances in total, to comprehensively evaluate the ability to understand video events.

Event-Bench provides a systematic comparison across different kinds of capabilities for existing video MLLMs, and points out the major shortcomings of open-source MLLMs.

🔍 Benchmark Data and Instruction Dataset

Download the raw videos in Event-Bench from the Google Drive link.

Download the annotations of Event-Bench from the Hugging Face link.

Download the merged video instruction dataset from the Google Drive link.

License:

Event-Bench is for academic research use only. Commercial use in any form is prohibited.

🔮 Evaluation Pipeline

Prompt:

The common prompt used in our evaluation follows this format:

<QUESTION>
A. <OPTION1>
B. <OPTION2>
C. <OPTION3>
D. <OPTION4>
Answer with the option's letter from the given choices directly.
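As a minimal sketch, the prompt format above could be assembled as follows. The function name `build_prompt` and its arguments are hypothetical and not part of the released code; only the layout and the final instruction sentence come from the format shown above.

```python
from string import ascii_uppercase
from typing import List

# Final instruction line, copied from the prompt format above.
INSTRUCTION = "Answer with the option's letter from the given choices directly."


def build_prompt(question: str, options: List[str]) -> str:
    """Format a multiple-choice question in the layout shown above.

    `build_prompt` is a hypothetical helper for illustration only.
    """
    lines = [question]
    # Label options A, B, C, ... in the order they are given.
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}. {option}")
    lines.append(INSTRUCTION)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        "What does the person do after opening the door?",
        ["Sits down", "Picks up a bag", "Leaves the room", "Turns on the light"],
    ))
```

The example question and options are made up for demonstration and are not taken from the benchmark.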

Evaluation:

We recommend saving the inference results in the same format as example_result.jsonl. Once your model responses are in this format, run our evaluation script evaluate_em.py to get the accuracy scores.

python evaluate_em.py \
    --path $RESULTS_FILE
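For intuition, exact-match scoring over a JSONL results file might look like the sketch below. This is not the code in evaluate_em.py: the field names `response` and `answer`, the helper names, and the letter-extraction heuristic are all assumptions made for illustration. Check example_result.jsonl for the actual schema.

```python
import json
import re
from typing import Dict, List, Optional


def extract_choice(response: str) -> Optional[str]:
    """Return the option letter (A-D) if the response starts with one.

    Heuristic assumption: the model answers with the letter first,
    e.g. "B", "B. <text>" or "(B)". Anything else yields None.
    """
    match = re.match(r"\(?([A-D])\b", response.strip())
    return match.group(1) if match else None


def load_jsonl(path: str) -> List[Dict]:
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def exact_match_accuracy(records: List[Dict]) -> float:
    """Fraction of records whose extracted letter equals the gold answer.

    Assumes hypothetical keys "response" (model output) and
    "answer" (gold option letter) in every record.
    """
    if not records:
        return 0.0
    correct = sum(extract_choice(r["response"]) == r["answer"] for r in records)
    return correct / len(records)
```

A response that does not begin with an option letter counts as incorrect here, which is one reason a stricter or more lenient extraction rule may give different scores.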

To evaluate with GPT-4-turbo, use the script evaluate_gpt.py:

python evaluate_gpt.py \
    --input_file $INPUT_FILE \
    --output_file $OUTPUT_FILE

📈 Experimental Results

Evaluation results of different Video MLLMs.

Citation

If you find our work helpful for your research, please consider citing our work.

@misc{du2024eventoriented,
    title={Towards Event-oriented Long Video Understanding},
    author={Yifan Du and Kun Zhou and Yuqi Huo and Yifan Li and Wayne Xin Zhao and Haoyu Lu and Zijia Zhao and Bingning Wang and Weipeng Chen and Ji-Rong Wen},
    year={2024},
    eprint={2406.14129},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
