Zach Mustafa PRO

Zmu

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

DawnC/VisionScout

liked a Space 3 days ago

onnx-community/model-explorer

liked a Space 3 days ago

RiverZ/ICEdit

Organizations

Zmu's activity

upvoted a collection 3 days ago

D-FINE

State-of-the-art real-time object detection model with Apache 2.0 licence • 15 items • Updated 3 days ago • 46

upvoted 2 collections 6 days ago

SigLIP2

36 items • Updated Apr 3 • 69

Gemma 3 QAT

Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves quality similar to half precision while using 3x less memory • 15 items • Updated 20 days ago • 187

upvoted a paper 8 days ago

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

Paper • 2504.19413 • Published 10 days ago • 11

upvoted 2 collections 15 days ago

Perception LM

7 items • Updated 21 days ago • 42

Perception Encoder

9 items • Updated 21 days ago • 47

upvoted a paper 15 days ago

Vidi: Large Multimodal Models for Video Understanding and Editing

Paper • 2504.15681 • Published 16 days ago • 15

upvoted 2 collections about 1 month ago

LipSync and Face Operations

19 items • Updated 8 days ago • 49

Excellent SLM & SVLM

Excellent SLM (small language models) and SVLM (small vision language models). • 29 items • Updated Apr 1 • 4

upvoted a paper about 2 months ago

Gemini Embedding: Generalizable Embeddings from Gemini

Paper • 2503.07891 • Published Mar 10 • 38

upvoted an article about 2 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Mar 12

• 406

upvoted a paper 2 months ago

VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing

Paper • 2502.17258 • Published Feb 24 • 79

upvoted an article 3 months ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

Feb 19

• 70

upvoted a paper 3 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 144

upvoted a collection 3 months ago

SmolVLM2 📺 Smallest video LM ever 🤏🏻

11 items • Updated 3 days ago • 84

upvoted an article 3 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

Feb 20

• 243

upvoted a paper 3 months ago

LLM Agents Making Agent Tools

Paper • 2502.11705 • Published Feb 17 • 2

upvoted an article 3 months ago

Article

Build awesome datasets for video generation

Feb 12

• 30

upvoted a collection 3 months ago

Temporal Preference Optimization

Temporal Preference Optimization for Long-form Video Understanding • 3 items • Updated Jan 19 • 5