Garreth Lee

garrethlee

AI & ML interests

None yet

Recent Activity

liked a Space 15 days ago

nari-labs/Dia-1.6B

upvoted a collection about 1 month ago

Llama 4

garrethlee's activity

liked a Space 15 days ago

1.21k

Dia 1.6B

👯

Generate realistic dialogue from a script, using Dia!

liked a Space 3 months ago

2.56k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked a model 4 months ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated Mar 27 • 1.31M • 12.1k

liked a dataset 5 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated Jan 8 • 12.5B • 51.3k • 476

liked 2 Spaces 5 months ago

Number Tokenization Blog

📈

Explore how tokenization affects arithmetic in LLMs

498

Synthetic Data Generator

🧬

Build datasets using natural language

liked a Space 6 months ago

Hub LFS Analysis

📈

An analysis of LFS files on the Hub.

liked a model 6 months ago

GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct

Updated Nov 6, 2024 • 4.72k • 39

liked a Space 6 months ago

Sahabat-AI Chatbot (Gemma2 9b)

😻

Chatbot

liked 2 datasets 6 months ago

indolem/IndoMMLU

Updated Oct 11, 2023 • 782 • 18

PleIAs/common_corpus

Viewer • Updated Feb 11 • 470M • 43k • 258

liked 3 Spaces 7 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

110

TxT360: Trillion Extracted Text

📖

Create a large, deduplicated dataset for LLM pre-training

937

Model Memory Utility

🚀

Calculate memory usage for training models

liked a Space 8 months ago

936

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality web text data for LLM training

liked a model about 1 year ago

mistralai/Mistral-7B-Instruct-v0.2

Text Generation • Updated Sep 27, 2024 • 1.61M • 2.75k