Removed custom bidirectional layer as it is not needed when using the Llama attention_masks b57b92e verified Ruurd committed 2 days ago
Create safe fallback for models not yet initialized with masking_type f2ca6a6 verified Ruurd committed 20 days ago
Overhaul code for appropriate masking for the full model instead of just the attention layers b43e862 verified Ruurd committed 20 days ago
Implement improved attention masking for bidirectional_masked 1723639 verified Ruurd committed 20 days ago
Change LoRA size from 256 to 512, also back to bidirectional_masked 620a6cd verified Ruurd committed 23 days ago
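Commits b57b92e and 1723639 suggest that bidirectional attention is now driven entirely through the mask handed to the stock Llama layers rather than a custom layer. A minimal sketch of what that could look like, assuming a transformers version that accepts a pre-built 4D additive mask via attention_mask; the function name and tensor layout are illustrative, not the repo's actual code:

```python
import torch

def bidirectional_4d_mask(padding_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    """Build a (batch, 1, seq, seq) additive mask with no causal triangle.

    padding_mask: (batch, seq) with 1 for real tokens, 0 for padding.
    Returns 0.0 where attention is allowed and dtype-min where it is blocked,
    so every token can attend to every non-padding token in both directions.
    """
    bsz, seq_len = padding_mask.shape
    keep = padding_mask[:, None, None, :].to(dtype)   # (bsz, 1, 1, seq)
    keep = keep.expand(bsz, 1, seq_len, seq_len)      # broadcast over query positions
    return (1.0 - keep) * torch.finfo(dtype).min

# e.g. model(input_ids, attention_mask=bidirectional_4d_mask(pad_mask, model.dtype))
```

How such a 4D mask is consumed varies between transformers versions and attention backends (eager vs. SDPA vs. flash attention), so the exact call site in the repo may differ.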
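Commit f2ca6a6 describes a defensive default for checkpoints saved before masking_type existed in the config. One common way to write that fallback, sketched here with an assumed default of "causal" and an illustrative set of valid values:

```python
# Assumed shape of the fallback: older configs without `masking_type`
# silently fall back to ordinary causal masking instead of raising.
masking_type = getattr(config, "masking_type", "causal")
if masking_type not in ("causal", "bidirectional", "bidirectional_masked"):
    raise ValueError(f"Unknown masking_type: {masking_type!r}")
```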
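For 620a6cd, assuming "LoRA size" refers to the adapter rank r, a PEFT configuration along these lines would capture the change; the target modules, alpha, and dropout are placeholders rather than the repo's actual settings:

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=512,                      # raised from 256 per the commit message
    lora_alpha=512,             # assumption: alpha scaled with the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder list
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)  # base_model: the Llama backbone
```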