Text@1.0 Leaderboard

Text overview
# Leaderboard for UEval text generation evaluation.
| Rank | Model Name | Avg | Art | Diagram | Exercise | Life | Paper | Space | Tech | Textbook | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT 5 Thinking | 83.8% | 94.7% | 84.3% | 81.5% | 85.8% | 60.5% | 85.7% | 87.9% | 90.0% | 1/14/2025 |
| 2 | GPT 5 Instant | 77.7% | 83.4% | 80.1% | 61.0% | 78.8% | 82.8% | 79.7% | 67.1% | 88.5% | 1/14/2025 |
| 3 | Gemini 2.5 Flash | 75.5% | 81.9% | 77.2% | 63.4% | 70.8% | 83.4% | 73.5% | 67.4% | 86.6% | 1/14/2025 |
| 4 | Gemini 2.0 Flash | 73.2% | 85.2% | 73.3% | 60.4% | 71.2% | 74.4% | 70.5% | 70.4% | 80.5% | 1/14/2025 |
| 5 | Emu 3.5 | 64.6% | 79.6% | 68.9% | 53.3% | 70.4% | 50.8% | 53.6% | 57.8% | 82.3% | 1/14/2025 |
| 6 | Bagel | 48.5% | 58.2% | 71.7% | 27.7% | 48.1% | 35.2% | 31.9% | 44.6% | 70.6% | 1/14/2025 |
| 7 | Janus Pro | 39.3% | 44.0% | 73.9% | 18.9% | 40.0% | 27.2% | 28.5% | 30.8% | 51.3% | 1/14/2025 |
| 8 | Show o2 | 36.9% | 45.4% | 57.2% | 22.2% | 26.9% | 30.2% | 28.1% | 30.2% | 55.2% | 1/14/2025 |
| 9 | MMaDA | 12.8% | 12.5% | 19.2% | 3.0% | 10.1% | 11.0% | 16.7% | 7.5% | 22.4% | 1/14/2025 |
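
The page does not say how the Avg column is computed, but the listed values are consistent with an unweighted mean of the eight category scores (Art through Textbook), with rank following that mean. The sketch below checks this assumption against the numbers in the leaderboard. The helper names `unweighted_mean`, `matches_listed_avg`, and `rank_models` are hypothetical and are not part of any UEval tooling.

```python
# Assumption: Avg is the unweighted mean of the eight category scores.
# The leaderboard does not state this; the check below only tests consistency.

# model -> (listed Avg, [Art, Diagram, Exercise, Life, Paper, Space, Tech, Textbook])
LEADERBOARD = {
    "GPT 5 Thinking":   (83.8, [94.7, 84.3, 81.5, 85.8, 60.5, 85.7, 87.9, 90.0]),
    "GPT 5 Instant":    (77.7, [83.4, 80.1, 61.0, 78.8, 82.8, 79.7, 67.1, 88.5]),
    "Gemini 2.5 Flash": (75.5, [81.9, 77.2, 63.4, 70.8, 83.4, 73.5, 67.4, 86.6]),
    "Gemini 2.0 Flash": (73.2, [85.2, 73.3, 60.4, 71.2, 74.4, 70.5, 70.4, 80.5]),
    "Emu 3.5":          (64.6, [79.6, 68.9, 53.3, 70.4, 50.8, 53.6, 57.8, 82.3]),
    "Bagel":            (48.5, [58.2, 71.7, 27.7, 48.1, 35.2, 31.9, 44.6, 70.6]),
    "Janus Pro":        (39.3, [44.0, 73.9, 18.9, 40.0, 27.2, 28.5, 30.8, 51.3]),
    "Show o2":          (36.9, [45.4, 57.2, 22.2, 26.9, 30.2, 28.1, 30.2, 55.2]),
    "MMaDA":            (12.8, [12.5, 19.2, 3.0, 10.1, 11.0, 16.7, 7.5, 22.4]),
}


def unweighted_mean(values):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(values) / len(values)


def matches_listed_avg(listed, categories, tolerance=0.05):
    """True if the mean agrees with the listed Avg to one decimal place."""
    # Small epsilon guards against floating-point noise at the tolerance edge.
    return abs(unweighted_mean(categories) - listed) <= tolerance + 1e-9


def rank_models(board):
    """Model names sorted by unweighted mean, highest first."""
    return sorted(board, key=lambda name: unweighted_mean(board[name][1]), reverse=True)


if __name__ == "__main__":
    for name, (listed, categories) in LEADERBOARD.items():
        mean = unweighted_mean(categories)
        status = "ok" if matches_listed_avg(listed, categories) else "MISMATCH"
        print(f"{name:18s} listed={listed:5.1f} mean={mean:7.3f} {status}")
```

Because the mean is unweighted, a single weak category pulls the average down as much as any other: GPT 5 Thinking leads overall even though its Paper score is lower than that of three models ranked below it.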

Submit your results by opening an issue in our GitHub repository.