FineWeb2 Edu Japanese: A high-quality, filtered Japanese dataset (120M texts, 89.3B tokens) for educational AI training.
Yuichi Tateno
hotchpotch
AI & ML interests
IR, Kaggle (Competitions Master)
Recent Activity
liked a model about 21 hours ago: llm-jp/llm-jp-modernbert-base
published a model about 21 hours ago: hotchpotch/query-crafter-japanese-Qwen3-4B
published a model about 21 hours ago: hotchpotch/query-crafter-japanese-Qwen3-1.7B
Collections: 4
Spaces: 4
Models: 32
hotchpotch/query-crafter-japanese-Qwen3-1.7B • 16
hotchpotch/query-crafter-japanese-Qwen3-4B
hotchpotch/tmp-mb-jp-30m-reranker • 94
hotchpotch/japanese-reranker-cross-encoder-xsmall-v1 • Text Ranking • 6.84k • 7
hotchpotch/japanese-reranker-cross-encoder-small-v1 • Text Ranking • 338 • 3
hotchpotch/japanese-reranker-cross-encoder-large-v1 • Text Ranking • 2.52k • 15
hotchpotch/japanese-reranker-cross-encoder-base-v1 • Text Ranking • 666 • 1
hotchpotch/japanese-bge-reranker-v2-m3-v1 • Text Ranking • 1.64k • 15
hotchpotch/fineweb-2-edu-japanese-classifier • 10
hotchpotch/fineweb-2-japanese-text-cleaner • 13
Datasets: 22
hotchpotch/japanese-query-crafter-reasoning-80k • 83.3k • 12 • 1
hotchpotch/tmp-5M-qa-small-tokens-cleaned • 5M • 241
hotchpotch/japanese-qa-reasoning-100k • 106k • 62 • 1
hotchpotch/fineweb-2-edu-japanese • 262M • 1.39k • 9
hotchpotch/fineweb-2-edu-japanese-noise-detect-raw • 64.2M • 205
hotchpotch/fineweb-2-japanese-noise-spans • 344k • 48
hotchpotch/fineweb-2-edu-japanese-scores • 313k • 61 • 1
hotchpotch/sentence_transformer_japanese • 13.2M • 418 • 5
hotchpotch/JQaRA • 278k • 603 • 19
hotchpotch/JaCWIR • 518k • 176 • 6