---
language: en
license: openrail
tags:
  - text-generation
  - roleplay
  - tinystories
  - mistral
datasets:
  - roleplay4fun/aesir-v1.1
  - roleplay4fun/pippa
---

# TinyRP

A >tiny< roleplaying model, only ~30M parameters, trained from scratch with a custom tokenizer! Inspired by the success of models like Microsoft's Phi and TinyStories, this is an experiment to see whether reasonable roleplay quality can be achieved with a similar approach.

Roleplay was chosen because keeping a story consistent across multiple turns is harder than generating it once from a single prompt. Note that the training data was not sanitized, so NSFW output is definitely possible.

## Out of scope

Anything other than roleplay in English.

## Formatting

Use the ChatML format to "chat" with this model. The entire training dataset was converted to this format, so it is the only format the model understands. Use the "system" prompt to describe the AI character's name and persona; the "user" and "assistant" tags are used as normal.
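For reference, a ChatML prompt looks like the sketch below; the character description is only an illustration, not part of this model card.

```
<|im_start|>system
You are Nira, a witty elven ranger who guides the user through a fantasy forest.<|im_end|>
<|im_start|>user
Hello there! Which way to the river?<|im_end|>
<|im_start|>assistant
```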

## Recommended settings

TBD

## Model Details

- Architecture: Mistral (modified for smaller size)
- Parameters: ~30M
- Vocabulary Size: 8192
- Context Length: 1024
- Sliding Window: 2048
- Training Steps: 38000 (18 epochs)

## Training Configuration

- Hidden Size: 512
- Layers: 8
- Attention Heads: 8
- Learning Rate: 3e-4
- Batch Size: 32 (effective)
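A minimal sketch of how a Mistral configuration with these dimensions could be instantiated via `transformers`. `intermediate_size` and `num_key_value_heads` are not stated in this card, so the values below are assumptions, and the resulting parameter count will differ somewhat from the released checkpoint.

```python
from transformers import MistralConfig, MistralForCausalLM

config = MistralConfig(
    vocab_size=8192,               # custom tokenizer vocabulary
    hidden_size=512,
    num_hidden_layers=8,
    num_attention_heads=8,
    num_key_value_heads=8,         # assumption: plain multi-head attention, no GQA
    intermediate_size=2048,        # assumption: 4x hidden size
    max_position_embeddings=1024,  # context length
    sliding_window=2048,
)

model = MistralForCausalLM(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")
```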

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TinyRP")
model = AutoModelForCausalLM.from_pretrained("TinyRP")

# Generate text (do_sample=True so that temperature actually takes effect)
inputs = tokenizer("Hello there!", return_tensors="pt")
outputs = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
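For actual roleplay, wrap the request in the ChatML layout from the Formatting section. A hedged sketch, assuming the prompt string is built by hand; the character, user message, and sampling values are illustrative, not official recommendations.

```python
prompt = (
    "<|im_start|>system\n"
    "You are Nira, a witty elven ranger who guides the user through a fantasy forest.<|im_end|>\n"
    "<|im_start|>user\n"
    "Hello there! Which way to the river?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.9,
    top_p=0.95,  # assumption: a reasonable default until "Recommended settings" is filled in
)

# Decode only the newly generated tokens, i.e. the assistant's reply
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```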

## Training Details

This model was trained from scratch on a custom dataset of roleplay scenarios. Training was interrupted at approximately 38,000 steps.

### Training code

Please see this GitHub repo for the training code.

## Limitations

- Optimized for roleplay and creative writing scenarios
- May not perform well on factual or analytical tasks
- Training was interrupted, so further fine-tuning may improve performance