Snowflake-G0-Release

This is the initial release of the Snowflake-G0 series of pre-trained language models, trained on the DialogMLM-50K dataset with optimized memory usage.

Model details

  • Architecture: SnowflakeCore
  • Hidden size: 384
  • Number of attention heads: 6
  • Number of layers: 4
  • Feed-forward dimension: 768
  • Maximum sequence length: 384
  • Vocabulary size: 30522
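
These hyperparameters are consistent with the 16.6M-parameter model size reported below. A back-of-the-envelope check in Python (a sketch assuming learned absolute position embeddings and ignoring biases and LayerNorm parameters; with weight tying, the output head adds no extra weights):

# Rough parameter count from the listed hyperparameters.
V, H, L, F, S = 30522, 384, 4, 768, 384  # vocab, hidden, layers, FFN dim, max seq len

embeddings = V * H + S * H          # token + (assumed) learned position embeddings
attn = H * (3 * H) + H * H          # fused QKV projection + output projection
ffn = H * F + F * H                 # two feed-forward matrices
per_layer = attn + ffn
total = embeddings + L * per_layer  # tied output head adds nothing

print(f"~{total / 1e6:.1f}M parameters")  # ~16.6M, matching the reported size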

Hugging Face Transformers Compatibility

This model is fully compatible with the Hugging Face Transformers library. You can load it using:

from transformers import AutoConfig, AutoModel, AutoTokenizer

# Tokenizer, config, and weights all resolve from the same Hub repo id.
tokenizer = AutoTokenizer.from_pretrained("FlameF0X/Snowflake-G0-Release")
config = AutoConfig.from_pretrained("FlameF0X/Snowflake-G0-Release")
model = AutoModel.from_pretrained("FlameF0X/Snowflake-G0-Release")
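
As a quick smoke test, a minimal forward pass (a sketch assuming the model follows the standard AutoModel interface and returns last_hidden_state):

import torch

# Encode a short prompt and inspect the output shape.
inputs = tokenizer("Hello, Snowflake!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # expected: (1, seq_len, 384)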

Memory Optimization Techniques

  • Mixed precision training
  • Gradient accumulation (4 steps; illustrated in the sketch after this list)
  • Fused QKV projection
  • Pre-norm architecture
  • Weight tying between embedding and output layers
  • Half-precision model storage
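
The first two items are training-loop techniques. A minimal sketch of how they combine, using the standard torch.cuda.amp utilities (train_loader and the loss-returning forward are placeholders, not part of the released training code):

import torch
from torch.cuda.amp import GradScaler, autocast

ACCUM_STEPS = 4  # gradient accumulation factor
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scaler = GradScaler()

model.train()
for step, batch in enumerate(train_loader):  # train_loader is a placeholder
    with autocast():  # run the forward pass in mixed precision
        # assumes the batch includes labels so the model returns a loss
        loss = model(**batch).loss / ACCUM_STEPS  # normalize for accumulation
    scaler.scale(loss).backward()  # scaled backward to avoid FP16 underflow
    if (step + 1) % ACCUM_STEPS == 0:
        scaler.step(optimizer)  # unscale gradients, then take the step
        scaler.update()
        optimizer.zero_grad()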

The model weights are stored in both PyTorch (.bin) and safetensors formats; safetensors offers improved security and loading efficiency, while the .bin copy preserves compatibility with older tooling.
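
To select the safetensors weights explicitly and keep the half-precision storage dtype at load time (both are standard from_pretrained options):

import torch
from transformers import AutoModel

# Prefer the safetensors checkpoint and load the weights directly in FP16.
model = AutoModel.from_pretrained(
    "FlameF0X/Snowflake-G0-Release",
    use_safetensors=True,
    torch_dtype=torch.float16,
)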

Model size: 16.6M parameters (FP16, safetensors)
