Snowflake-G0-Release

This is the initial release of the Snowflake-G0 series of pre-trained language models, trained on the DialogMLM-50K dataset with optimized memory usage.

Model details

  • Architecture: SnowflakeCore
  • Hidden size: 384
  • Number of attention heads: 6
  • Number of layers: 4
  • Feed-forward dimension: 768
  • Maximum sequence length: 384
  • Vocabulary size: 30522
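
These hyperparameters are consistent with the 16.6M-parameter model size reported below. A back-of-the-envelope check in Python (a sketch assuming learned absolute position embeddings and ignoring biases and LayerNorm parameters; with weight tying, the output head adds no extra weights):

# Rough parameter count from the listed hyperparameters.
V, H, L, F, S = 30522, 384, 4, 768, 384  # vocab, hidden, layers, FFN dim, max seq len

embeddings = V * H + S * H          # token + (assumed) learned position embeddings
attn = H * (3 * H) + H * H          # fused QKV projection + output projection
ffn = H * F + F * H                 # two feed-forward matrices
per_layer = attn + ffn
total = embeddings + L * per_layer  # tied output head adds nothing

print(f"~{total / 1e6:.1f}M parameters")  # ~16.6M, matching the reported size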

Hugging Face Transformers Compatibility

This model is fully compatible with the Hugging Face Transformers library. You can load it using:

from transformers import AutoConfig, AutoModel, AutoTokenizer

# Tokenizer, config, and weights all resolve from the same Hub repo id.
tokenizer = AutoTokenizer.from_pretrained("FlameF0X/Snowflake-G0-Release")
config = AutoConfig.from_pretrained("FlameF0X/Snowflake-G0-Release")
model = AutoModel.from_pretrained("FlameF0X/Snowflake-G0-Release")
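
As a quick smoke test, a minimal forward pass (a sketch assuming the model follows the standard AutoModel interface and returns last_hidden_state):

import torch

# Encode a short prompt and inspect the output shape.
inputs = tokenizer("Hello, Snowflake!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # expected: (1, seq_len, 384)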

Memory Optimization Techniques

  • Mixed precision training
  • Gradient accumulation (4 steps; illustrated in the sketch after this list)
  • Fused QKV projection
  • Pre-norm architecture
  • Weight tying between embedding and output layers
  • Half-precision model storage
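
The first two items are training-loop techniques. A minimal sketch of how they combine, using the standard torch.cuda.amp utilities (train_loader and the loss-returning forward are placeholders, not part of the released training code):

import torch
from torch.cuda.amp import GradScaler, autocast

ACCUM_STEPS = 4  # gradient accumulation factor
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scaler = GradScaler()

model.train()
for step, batch in enumerate(train_loader):  # train_loader is a placeholder
    with autocast():  # run the forward pass in mixed precision
        # assumes the batch includes labels so the model returns a loss
        loss = model(**batch).loss / ACCUM_STEPS  # normalize for accumulation
    scaler.scale(loss).backward()  # scaled backward to avoid FP16 underflow
    if (step + 1) % ACCUM_STEPS == 0:
        scaler.step(optimizer)  # unscale gradients, then take the step
        scaler.update()
        optimizer.zero_grad()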

The model weights are stored in both PyTorch (.bin) and safetensors formats; safetensors offers improved security and loading efficiency, while the .bin copy preserves compatibility with older tooling.
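
To select the safetensors weights explicitly and keep the half-precision storage dtype at load time (both are standard from_pretrained options):

import torch
from transformers import AutoModel

# Prefer the safetensors checkpoint and load the weights directly in FP16.
model = AutoModel.from_pretrained(
    "FlameF0X/Snowflake-G0-Release",
    use_safetensors=True,
    torch_dtype=torch.float16,
)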

Model size: 16.6M parameters (FP16, safetensors)
