Mohammed Hamdy

mmhamdy

AI & ML interests

TechBio | AI4Sci | NLP | Reinforcement Learning

Recent Activity

upvoted an article 13 days ago

Tiny Agents: a MCP-powered agent in 50 lines of code

upvoted a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

posted an update about 1 month ago

What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model?

In this lengthy article, I explore the story and the origins of some of the ideas introduced in the paper. We'll explore everything from the fundamental attention mechanism that lies at its heart to the surprisingly simple explanation for its name, Transformer.

💡 Examples of ideas explored in the article:

✅ What was the inspiration for the attention mechanism?
✅ How did we go from attention to self-attention?
✅ Did the team have any other names in mind for the model?

and more...

I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates.

Read the article: https://huggingface.co/blog/mmhamdy/pandemonium-the-transformers-story

Organizations

mmhamdy's activity

liked a model about 2 months ago

sesame/csm-1b

Text-to-Speech • Updated Mar 16 • 54.2k • 1.99k

liked a Space about 2 months ago

The Distill Template

Craft Beautiful Blogs

liked a model 2 months ago

ElectricAlexis/NotaGen

Updated Feb 26 • 137

liked a model 3 months ago

microsoft/wham

Updated Feb 21 • 609 • 256

liked a Space 3 months ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

liked a model 4 months ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 28 days ago • 1.88M • 4.24k

liked a dataset 4 months ago

HuggingFaceH4/MATH-500

Viewer • Updated Nov 15, 2024 • 500 • 64k • 150

liked a model 5 months ago

answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 467k • 841

liked a Space 5 months ago

Scaling test-time compute

Enhance math problem solving by scaling test-time compute

liked a model 5 months ago

CohereLabs/c4ai-command-r7b-12-2024

Text Generation • Updated 7 days ago • 5.35k • 382

liked a Space 5 months ago

Discussion Forum

liked a dataset 5 months ago

CohereLabs/Global-MMLU

Viewer • Updated 24 days ago • 602k • 10.9k • 118

liked a Space 5 months ago

Language Leads Dashboard

View and search languages by lead status

liked 2 datasets 5 months ago

zjunlp/Mol-Instructions

Updated Mar 3, 2024 • 964 • 53

AI-MO/NuminaMath-CoT

Viewer • Updated Nov 25, 2024 • 860k • 2.96k • 446

liked a dataset 6 months ago

HuggingFaceTB/smoltalk

Viewer • Updated Feb 10 • 2.2M • 7.35k • 333

liked a dataset 8 months ago

KbsdJames/Omni-MATH

Viewer • Updated Oct 12, 2024 • 4.43k • 3.38k • 98

liked a model 9 months ago

HuggingFaceTB/SmolLM-135M-Instruct

Text Generation • Updated Sep 4, 2024 • 32.8k • 115

liked a model 11 months ago

fireworks-ai/llama-3-firefunction-v2

Text Generation • Updated Jun 18, 2024 • 105 • 145

liked a Space 11 months ago

FineWeb: decanting the web for the finest text data at scale

Generate high-quality web text data for LLM training