Update README.md
dxli94 authored Oct 3, 2024
1 parent 84b06b9 commit dcb3a74
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -13,17 +13,17 @@ Aria is the first open MoE model that is natively multimodal. It features SoTA p

 | Category | Benchmark | Aria | Pixtral 12B | Llama3 8B | Llama3-V 8B | GPT-4V | GPT-4o mini | GPT-4o | Gemini-1.5 Flash | Gemini-1.5 Pro |
 |-------------------------------------|-------------------------|-------|-------------|-----------|-------------|--------|-------------|--------|------------------|----------------|
-| **Knowledge(Multimodal)** | MMMU | 54.2 | 52.5 | - | 49.6 | 56.4 | 59.4 | 69.1 | 56.1 | 62.2 |
-| **Math(Multimodal)** | MathVista | 64.1 | 58.0 | - | - | - | 54.7 | 63.8 | 58.4 | 63.9 |
-| **Document** | DocQA | 92.9 | 90.7 | - | 84.4 | 88.4 | - | 92.8 | 89.9 | 93.1 |
-| **Chart** | ChartQA | 86.1 | 81.8 | - | 78.7 | 78.4 | - | 85.7 | 85.4 | 87.2 |
+| **Knowledge(Multimodal)** | MMMU | 54.9 | 52.5 | - | 49.6 | 56.4 | 59.4 | 69.1 | 56.1 | 62.2 |
+| **Math(Multimodal)** | MathVista | 66.1 | 58.0 | - | - | - | 54.7 | 63.8 | 58.4 | 63.9 |
+| **Document** | DocQA | 92.6 | 90.7 | - | 84.4 | 88.4 | - | 92.8 | 89.9 | 93.1 |
+| **Chart** | ChartQA | 86.4 | 81.8 | - | 78.7 | 78.4 | - | 85.7 | 85.4 | 87.2 |
 | **Scene Text** | TextVQA | 81.1 | - | - | 78.2 | 78.0 | - | - | 78.7 | 78.7 |
-| **General Visual QA** | MMBench-1.1 | 81.1 | - | - | - | 79.8 | 76.0 | 82.2 | - | 73.9 |
-| **Video Understanding** | LongVideoBench | 64.0 | 47.4 | - | - | 60.7 | 58.8 | 66.7 | 62.4 | 64.4 |
-| **Knowledge(Language)** | MMLU (5-shot) | 73.6 | 69.2 | 69.4 | - | 86.4 | - | 89.1 | 78.9 | 85.9 |
-| **Math(Language)** | MATH | 50.0 | 48.1 | 51.9 | - | - | 70.2 | 76.6 | - | - |
+| **General Visual QA** | MMBench-1.1 | 80.3 | - | - | - | 79.8 | 76.0 | 82.2 | - | 73.9 |
+| **Video Understanding** | LongVideoBench | 65.3 | 47.4 | - | - | 60.7 | 58.8 | 66.7 | 62.4 | 64.4 |
+| **Knowledge(Language)** | MMLU (5-shot) | 73.3 | 69.2 | 69.4 | - | 86.4 | - | 89.1 | 78.9 | 85.9 |
+| **Math(Language)** | MATH | 50.8 | 48.1 | 51.9 | - | - | 70.2 | 76.6 | - | - |
 | **Reasoning(Language)** | ARC Challenge | 91.0 | - | 83.4 | - | - | 96.4 | 96.7 | - | - |
-| **Coding** | HumanEval | 75.6 | 72.0 | 72.6 | - | 67.0 | 87.2 | 90.2 | 74.3 | 84.1 |
+| **Coding** | HumanEval | 73.2 | 72.0 | 72.6 | - | 67.0 | 87.2 | 90.2 | 74.3 | 84.1 |


## News
