Commit 22eb8e9
1 Parent(s): 11c4c4e
add
- README.md +1 -4
- src/about.py +7 -1
README.md
CHANGED
@@ -43,7 +43,4 @@ If you encounter problem on the space, don't hesitate to restart it to remove th
 You'll find
 - the main table' columns names and properties in `src/display/utils.py`
 - the logic to read all results and request files, then convert them in dataframe lines, in `src/leaderboard/read_evals.py`, and `src/populate.py`
-- the logic to allow or filter submissions in `src/submission/submit.py` and `src/submission/check_validity.py`
-
-
-The Crossword task requires inferring correct words from given clues and filling them into a grid. A key challenge lies in satisfying the constraint of shared letter intersections between horizontal and vertical words. We collected 150 Crossword samples published in 2024 from <a href="https://www.latimes.com" target="_blank"> Los Angeles Times</a> and <a href="https://www.vulture.com" target="_blank"> Vulture</a> in three sizes: \(5 \times 5\), $10\times10$, and $15\times15$, with 50 ones for each size.$n^2 \times n^2$
+- the logic to allow or filter submissions in `src/submission/submit.py` and `src/submission/check_validity.py`
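Review note: the paragraph removed from README.md describes the Crossword constraint that horizontal and vertical words must agree on every shared cell. A minimal sketch of that check follows, assuming a hypothetical placement format of `(row, col, direction, word)`; the function name and data layout are illustrative and do not come from this repository.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical placement format: (row, col, direction, word),
# where direction is "A" (across) or "D" (down).
Placement = Tuple[int, int, str, str]


def fill_grid(size: int, placements: List[Placement]) -> Optional[Dict[Tuple[int, int], str]]:
    """Return the filled cells, or None if a word leaves the grid
    or two words disagree on a shared cell."""
    cells: Dict[Tuple[int, int], str] = {}
    for row, col, direction, word in placements:
        if direction not in ("A", "D"):
            raise ValueError(f"unknown direction: {direction!r}")
        for offset, letter in enumerate(word.upper()):
            r = row + (offset if direction == "D" else 0)
            c = col + (offset if direction == "A" else 0)
            if not (0 <= r < size and 0 <= c < size):
                return None  # word runs off the grid
            # setdefault stores the letter on first visit and returns
            # the existing letter on later visits, so a mismatch means
            # an across word and a down word conflict at this cell.
            if cells.setdefault((r, c), letter) != letter:
                return None
    return cells
```

For example, `fill_grid(5, [(0, 0, "A", "CAT"), (0, 0, "D", "COW")])` succeeds because both words start with the same letter at the shared cell, while swapping `"COW"` for `"DOG"` returns `None`.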
src/about.py
CHANGED
@@ -93,6 +93,12 @@ Make sure you have followed the above steps first.
 If everything is done, check you can launch the EleutherAIHarness on your model locally, using the above command without modifications (you can add `--limit` to limit the number of examples per task).
 """
 
-CITATION_BUTTON_LABEL = "
+CITATION_BUTTON_LABEL = "BibTeX Citation"
 CITATION_BUTTON_TEXT = r"""
+@article{chen2025lr,
+title={LR$^{2}$Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems},
+author={Chen, Jianghao and Wei, Zhenlin and Ren, Zhenjiang and Li, Ziyong and Zhang, Jiajun},
+journal={arXiv preprint arXiv:2502.17848},
+year={2025}
+}
 """
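Review note: since `CITATION_BUTTON_TEXT` is a hand-pasted BibTeX entry, it may be worth checking that its braces balance and that the expected fields are present before merging. The sketch below is only an illustration under stated assumptions: `SAMPLE` is a stand-in with the title field omitted, the helper names are hypothetical, and the regular expression only recognizes simple `key={...}` fields that start a line.

```python
import re

# Stand-in for CITATION_BUTTON_TEXT; fields copied from the diff,
# title omitted for brevity (assumption for this sketch).
SAMPLE = r"""
@article{chen2025lr,
author={Chen, Jianghao and Wei, Zhenlin and Ren, Zhenjiang and Li, Ziyong and Zhang, Jiajun},
journal={arXiv preprint arXiv:2502.17848},
year={2025}
}
"""


def braces_balanced(text: str) -> bool:
    """True if every '{' has a matching '}' in order.
    Escaped braces such as '\\{' are counted like ordinary ones."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False  # closing brace with no opener
    return depth == 0


def field_names(text: str) -> list:
    """Names of fields written as key={...} at the start of a line."""
    return re.findall(r"^\s*(\w+)\s*=\s*\{", text, flags=re.MULTILINE)
```

Calling `braces_balanced(SAMPLE)` and `field_names(SAMPLE)` gives a quick signal that the entry is structurally intact; it does not validate the entry against a real BibTeX parser.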