mengfanxu
fxmeng
Recent Activity
Liked a dataset about 4 hours ago: ByteDance-Seed/MAGACorpus
Published a model about 5 hours ago: fxmeng/TransMLA-rmsnorm-llama-2-q512-kv896
Commented on a paper 3 months ago: TransMLA: Multi-head Latent Attention Is All You Need
Collections (8)
Models (56)
fxmeng/TransMLA-rmsnorm-llama-2-q512-kv896
fxmeng/PiSSA-llama-7b-commonsense-148k
fxmeng/PiSSA-Llama-3-8b-commonsense-148k
fxmeng/PiSSA-Llama-2-7b-commonsense-148k
fxmeng/PiSSA-llama-13b-commonsense-148k
fxmeng/CLOVER-llama-3-8b-commonsense-148k
fxmeng/CLOVER-llama-2-7b-commonsense-148k
fxmeng/CLOVER-llama-13b-commonsense-148k • 3
fxmeng/CLOVER-llama-7b-commonsense-148k
fxmeng/TransMLA_qwen2.5_0.5b_instruct
Datasets (9)
fxmeng/pissa-dataset • 844k • 1.34k • 3
fxmeng/big-bench-hard-continue-finetuning • 10.3k • 35
fxmeng/commonsense_filtered • 170k • 68 • 1
fxmeng/MetaMath-GSM240K • 240k • 24 • 1
fxmeng/MetaMath-MATH155K • 155k • 13
fxmeng/CodeFeedback-Python105K • 105k • 465 • 6
fxmeng/llava_finetune_336x336 • 6
fxmeng/llava_pretrain_336x336 • 5
fxmeng/WizardLM_evol_instruct_V2_143k • 143k • 33 • 2