Alan Tseng

agentlans

AI & ML interests

Small data, boring AI

Recent Activity

reacted to ProCreations's post with ❤️ about 10 hours ago

Post of the Day

I’m fine-tuning Qwen 2.5-0.5B to be extremely good at math, using high-quality datasets and some smart training strategies. The logs are looking really promising so far!

Expected release: Tomorrow morning? I’ll post as soon as it’s ready, so stay tuned.

If you want faster updates or just wanna chat about it, come join my Discord: https://discord.gg/EXsug2Ux29 (Heads up: we might ask a couple quick questions when you join, just making sure we keep the server safe.)

Also, check out one of the datasets we’re using: https://huggingface.co/datasets/ProCreations/SimpleMath

This project is also helping shape the future of IntellIte. The insights and techniques we’re developing here (better dataset curation, fine-tuning tricks, and evaluation methods) will directly contribute to making IntellIte even sharper, faster, and more reliable, especially for math and reasoning tasks.

Big progress ahead. Can’t wait to share it with you all!

replied to ProCreations's post about 10 hours ago

updated a model about 10 hours ago

agentlans/granite-3.3-2b-instruct-ethics

Organizations

None yet

Collections 1

models 104

agentlans/granite-3.3-2b-instruct-ethics

Text2Text Generation • Updated about 10 hours ago

agentlans/granite-3.3-2b-instruct-critical-thinking

Text2Text Generation • Updated about 14 hours ago

agentlans/hayao-miyazaki-quote

Video-Text-to-Text • Updated 2 days ago

agentlans/Qwen2.5-1.5B-Instruct-Keywords

Updated 8 days ago • 1

agentlans/Qwen2.5-1.5B-Refiner

Updated 8 days ago • 6 • 1

agentlans/Qwen2.5-1.5B-Instruct-Titler

Updated 8 days ago

agentlans/Qwen2.5-1.5B-Instruct-Summarizer

Updated 8 days ago

agentlans/Phi-4-mini-instruct-drill

Updated 11 days ago

agentlans/Llama-3.1-8B-Instruct-drill

Updated 11 days ago

agentlans/Llama-3.2-3B-Instruct-drill

Updated 11 days ago

datasets 83

agentlans/reddit-logic

Viewer • Updated 1 day ago • 9.94k • 101 • 1

agentlans/literary-reasoning

Viewer • Updated 1 day ago • 8.23k • 41 • 1

agentlans/reddit-ethics

Viewer • Updated 1 day ago • 9.74k • 144 • 3

agentlans/finewebedu-guru

Viewer • Updated 10 days ago • 39.9k • 105

agentlans/drill

Viewer • Updated 12 days ago • 660k • 39

agentlans/finewebedu-short-answer

Viewer • Updated 16 days ago • 9.96k • 46

agentlans/finewebedu-annotated

Viewer • Updated 18 days ago • 9.99k • 75

agentlans/noun-phrases

Viewer • Updated 18 days ago • 3.9M • 56

agentlans/facebook-anli-claims

Viewer • Updated 19 days ago • 29.4k • 53

agentlans/finewebedu-multiple-choice

Viewer • Updated 20 days ago • 9.99k • 50