8b-class-japanese-models - a leonardlin Collection

leonardlin 's Collections

8b-class-japanese-models

speed

sota

evals

tuning

rag

context

safety

image

vision

code

prompt injection

TOREAD

data

voice

8b-class-japanese-models

updated 24 days ago

shisa-ai/shisa-v2-qwen2.5-7b

Text Generation • Updated 22 days ago • 81 • 3
shisa-ai/shisa-v2-llama3.1-8b

Text Generation • Updated 22 days ago • 65 • 1
shisa-ai/shisa-v2-llama3.1-8b-preview

Updated 24 days ago • 9
sbintuitions/sarashina2.2-3b-instruct-v0.1

Text Generation • Updated Mar 5 • 13.9k • 22

Note 2025-03 This is a 3B model, but they claim it is comparable to a 7B class perf, so let's see: https://www.sbintuitions.co.jp/blog/entry/2025/03/07/093143
llm-jp/llm-jp-3-7.2b-instruct3

Text Generation • Updated Feb 4 • 6.24k • 3

Note 2025-02 ; 4096 context window, test with `VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 vllm serve llm-jp/llm-jp-3-7.2b-instruct3 --max-model-len 8192 --rope-scaling '{"rope_type":"dynamic","factor":2.0}`
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3

Text Generation • Updated Apr 2 • 15.1k • • 19
Qwen/Qwen2.5-7B-Instruct

Text Generation • Updated Jan 12 • 3.42M • • 665

Note Still has Chinese cross-lingual-token leakage
google/gemma-2-9b-it

Text Generation • Updated Aug 27, 2024 • 355k • • 708

Note No system prompt support breaks many evals
weblab-GENIAC/Tanuki-8B-dpo-v1.0

Text Generation • Updated Oct 9, 2024 • 756 • 40
meta-llama/Llama-3.1-8B-Instruct

Text Generation • Updated Sep 25, 2024 • 5.97M • • 3.94k
shisa-ai/shisa-v1-llama3-8b

Text Generation • Updated Mar 19 • 28 • 6

Note shisa-v1 SFT only
elyza/Llama-3-ELYZA-JP-8B

Text Generation • Updated Jun 26, 2024 • 26.4k • • 115
augmxnt/shisa-gamma-7b-v1

Text Generation • Updated Mar 9 • 43.2k • • 18
augmxnt/shisa-7b-v1

Text Generation • Updated Dec 20, 2023 • 65 • 28

Note Requires chat template or BOS token for proper responses