Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 19 items • Updated 10 days ago • 25
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 14 days ago • 248
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 15 items • Updated 10 days ago • 172
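As a quick illustration of what these collections offer, below is a minimal sketch of loading a QAT Gemma 3 checkpoint with the Hugging Face transformers API. The model id is a placeholder, not a confirmed repo name, and the sketch assumes a text-generation variant loadable via AutoModelForCausalLM; consult the collection pages for the actual checkpoint ids.

```python
# Minimal sketch: loading a QAT Gemma 3 checkpoint for text generation.
# Assumes a text-only variant usable via AutoModelForCausalLM; the model id
# below is a placeholder, substitute a real repo id from the collection.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-qat-placeholder"  # placeholder, not a real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # QAT checkpoints target low-precision inference
    device_map="auto",           # spread layers across available devices
)

prompt = "Explain quantization-aware training in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```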
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 403
Cohere Labs Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 13 days ago • 68
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality Mar 4 • 74
GemmaX2 Collection GemmaX2 language models, including pretrained and instruction-tuned variants in two sizes, 2B and 9B. • 7 items • Updated Feb 7 • 21
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 143
Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published Feb 11 • 29