Saeed

MLDataScientist

AI & ML interests

None yet

Recent Activity

new activity 3 days ago

btbtyler09/Qwen3-30B-A3B-gptq-8bit:4-bit

new activity 9 days ago

unsloth/Qwen3-235B-A22B-GGUF:UD quants missing some files

new activity 9 days ago

unsloth/Qwen3-235B-A22B-128K-GGUF:UD quants missing some files

Organizations

None yet

MLDataScientist's activity

upvoted a collection 9 days ago

Qwen3

Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 65 items • Updated 7 days ago • 137

upvoted a paper 3 months ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 61

upvoted a paper 4 months ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 66

upvoted a paper 6 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 147

upvoted a paper 8 months ago

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 73

upvoted a collection 10 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 665

upvoted a collection 11 months ago

Nemotron 4 340B

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 2 days ago • 162