Update README.md
README.md
CHANGED
@@ -77,9 +77,6 @@ The core is the use of <b style="color:red">synergy</b> as the evaluative criter
 
 
 
-
-
-
 <div align="center">
 <img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/BPqs-3UODQWvjFzvZYkI4.png' width=1000px>
 </div>
@@ -93,9 +90,9 @@ The core is the use of <b style="color:red">synergy</b> as the evaluative criter
 **A companion massive multimodal benchmark dataset, encompasses a broader spectrum of skills, modalities, formats, and capabilities, including over 700 tasks and 325K instances.**
 
 
-We set two
-- [**General-Bench-Openset**](https://huggingface.co/datasets/General-Level/General-Bench-Openset) with inputs and labels of samples all publicly open, for open-world use (e.g., academic experiment).
-- [**General-Bench-Closeset**](https://huggingface.co/datasets/General-Level/General-Bench-Closeset) with only sample inputs available, which
+We set two dataset types according to the use purpose:
+- [**General-Bench-Openset**](https://huggingface.co/datasets/General-Level/General-Bench-Openset) with inputs and labels of samples all publicly open, for **free open-world use** (e.g., for academic experiment/comparisons).
+- [**General-Bench-Closeset**](https://huggingface.co/datasets/General-Level/General-Bench-Closeset) with only sample inputs available, which is used for ranking in our leaderboard. Participants need to submit the predictions to us for internal evaluation.
 
 
 <div align="center">