metadata
base_model:
- google/gemma-3-12b-it
About 13.5M tokens total of mixed instruct and RP data.
Both RP datasets and the inkstruct data include system prompts to help g3 understand the system role (via <start_of_turn>system).
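A minimal sketch of what a system turn looks like in the Gemma turn format, for illustration only. The function name `format_gemma_chat` and the message layout are assumptions, not the actual customgemma-regex implementation; the only details taken from above are the `<start_of_turn>system` marker and Gemma's usual `<start_of_turn>` / `<end_of_turn>` delimiters.

```python
def format_gemma_chat(messages):
    """Render a list of {"role", "content"} dicts as Gemma-style turns.

    Hypothetical helper: the system prompt gets its own turn
    instead of being merged into the first user message.
    """
    parts = []
    for message in messages:
        # Gemma names the assistant role "model"; other roles pass through.
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    return "".join(parts)


example = format_gemma_chat([
    {"role": "system", "content": "You are a narrator for a fantasy RP."},
    {"role": "user", "content": "Describe the tavern."},
    {"role": "assistant", "content": "Smoke curls under low beams."},
])
print(example)
```

Each message becomes one delimited turn, so a model trained on data shaped like this sees the system prompt as a separate role rather than as part of the user's text.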
datasets:
- path: ToastyPigeon/some-rp-extended
  type: customgemma-regex
- path: allura-org/inkstructmix-v0.2.1a-system-reasoning-separated
  type: customgemma-regex
  data_files: inkstruct-system.json
  split: train[:750]
- path: ToastyPigeon/unalign-v2
  type: customgemma-regex
  split: train[:50%]
- path: ToastyPigeon/synth-rp
  split: train[:20%]
  type: customgemma-regex