Text@1.0 Leaderboard
# Leaderboard for UEval text generation evaluation.

| Rank | Model Name | Avg | Art | Diagram | Exercise | Life | Paper | Space | Tech | Textbook | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT 5 Thinking | 83.8% | 94.7% | 84.3% | 81.5% | 85.8% | 60.5% | 85.7% | 87.9% | 90.0% | 1/14/2025 |
| 2 | GPT 5 Instant | 77.7% | 83.4% | 80.1% | 61.0% | 78.8% | 82.8% | 79.7% | 67.1% | 88.5% | 1/14/2025 |
| 3 | Gemini 2.5 Flash | 75.5% | 81.9% | 77.2% | 63.4% | 70.8% | 83.4% | 73.5% | 67.4% | 86.6% | 1/14/2025 |
| 4 | Gemini 2.0 Flash | 73.2% | 85.2% | 73.3% | 60.4% | 71.2% | 74.4% | 70.5% | 70.4% | 80.5% | 1/14/2025 |
| 5 | Emu 3.5 | 64.6% | 79.6% | 68.9% | 53.3% | 70.4% | 50.8% | 53.6% | 57.8% | 82.3% | 1/14/2025 |
| 6 | Bagel | 48.5% | 58.2% | 71.7% | 27.7% | 48.1% | 35.2% | 31.9% | 44.6% | 70.6% | 1/14/2025 |
| 7 | Janus Pro | 39.3% | 44.0% | 73.9% | 18.9% | 40.0% | 27.2% | 28.5% | 30.8% | 51.3% | 1/14/2025 |
| 8 | Show o2 | 36.9% | 45.4% | 57.2% | 22.2% | 26.9% | 30.2% | 28.1% | 30.2% | 55.2% | 1/14/2025 |
| 9 | MMaDA | 12.8% | 12.5% | 19.2% | 3.0% | 10.1% | 11.0% | 16.7% | 7.5% | 22.4% | 1/14/2025 |
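
The Avg column is consistent with an unweighted mean of the eight category scores. The source does not state this, so treat it as an assumption. The sketch below recomputes the average for two rows copied from the table; `mean_score` is a hypothetical helper name, not part of any UEval tooling.

```python
# Category scores (percent) copied from two leaderboard rows, in column order:
# Art, Diagram, Exercise, Life, Paper, Space, Tech, Textbook.
# Each entry pairs the scores with the Avg value reported in the table.
rows = {
    "GPT 5 Thinking": ([94.7, 84.3, 81.5, 85.8, 60.5, 85.7, 87.9, 90.0], 83.8),
    "MMaDA": ([12.5, 19.2, 3.0, 10.1, 11.0, 16.7, 7.5, 22.4], 12.8),
}


def mean_score(scores):
    """Unweighted mean of the category scores (assumed definition of Avg)."""
    return sum(scores) / len(scores)


for name, (scores, reported) in rows.items():
    computed = mean_score(scores)
    print(f"{name}: computed {computed:.1f}%, reported {reported}%")
```

If a submission's reported Avg differs from this mean by more than rounding error, the categories may have been weighted differently.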

Submit your results by opening an issue on our GitHub repository.